Dr.  Dobb's  Journal  of  Software  Tools

Volume  12

1987


M&T  BOOKS 


A  Publication  of  M&T  Publishing,  Redwood  City,  CA 


M&T  Books 

A  Division  of  M&T  Publishing,  Inc. 
501  Galveston  Drive 
Redwood  City,  CA  94063 


M&T  Books 

General  Manager,  Ellen  Ablow 

Project  Manager,  Michelle  Hudun 

Editor,  Indexer,  Sally  J.  Brenton 

Cover  Design,  Michael  Hollister,  Barbara  Mautz 

Production,  Beth  Auten,  Lynn  Sanford 


©  1988  by  M&T  Publishing,  Inc. 


Printed  in  the  United  States  of  America 
First  Edition  published  1988 


All  rights  reserved.  No  part  of  this  book  may  be  reproduced  or  transmitted  in  any  form  or 
by  any  means,  electronic  or  mechanical,  including  photocopying,  recording,  or  by  any 
information  storage  and  retrieval  system,  without  prior  written  permission  from  the 
Publisher.  Contact  the  Publisher  for  information  on  foreign  rights. 


ISBN  0-934375-84-4 


About  M&T  Publishing 

and 

Dr.  Dobb's  Journal 


M&T  Publishing,  Inc.,  was  founded  in  1981  as  the  subsidiary  of  Markt  & 
Technik  A.G.,  the  leading  West  German  publisher  of  computer  magazines, 
books,  and  software.  In  addition  to  its  active  role  in  the  international 
microcomputer  marketplace,  M&T  publishes  two  monthly  magazines:  Dr. 
Dobb's  Journal  of  Software  Tools  and  Business  Software  Magazine.  M&T 
also  publishes  a  bimonthly  magazine,  Micro/Systems  Journal,  and  a
bimonthly  newsletter,  Turbo  Tech  Report.  In  addition,  M&T  publishes 
microcomputer  books  and  software  under  the  imprint  M&T  Books. 

Since  its  first  issue  in  1976,  Dr.  Dobb's  Journal  has  maintained  a  unique 
voice  among  microcomputer  publications.  The  first  issue  was  published  to 
provide  short-term  distribution  of  the  newly  written  Tiny  BASIC  language. 
Reader  response  was  so  strong  that  it  went  immediately  into  monthly 
production.  DDJ  has  led  the  way  since  then  by  focusing  on  important 
advances  in  microcomputers,  printing  public-domain  software,  and  fostering 
vital  reader  interaction  on  technical  matters. 

Today,  Dr.  Dobb's  Journal  remains  the  primary  source  of  information  and 
software  tools  for  advanced  programmers.  Its  early  issues  are  still  in 
demand,  and  no  part  of  the  editorial  content  of  DDJ  has  ever  gone  out  of 
print.  Volumes  One  through  Twelve  of  this  series  contain  the  entire  editorial 
content  of  DDJ's  first  twelve  years  and  are  available  in  bound  volumes
through  M&T  Books. 

Through  Dr.  Dobb's  Journal  and  its  other  software  publications,  M&T 
Publishing  is  providing  microcomputer  professionals  and  enthusiasts  with 
the  most  advanced  and  useful  software  information  available. 


Editor’s  Preface 


The  book  you  are  holding  in  your  hands — or  possibly  cradling  in  your  arms, 
considering  its  bulk — contains  the  entire  editorial  contents  of  Dr.  Dobb's 
Journal  of  Software  Tools  for  one  year — its  twelfth  year,  1987.  DDJ  is  the 
leading  programmer's  magazine,  and  this  volume  is  at  once  a  technical
reference  and  a  contemporary  history  of  technological  progress  in  computer 
programming  in  1987. 

OS/2  was  announced  in  1987,  and  DDJ  pulled  the  venerable  Resident  Intern, 
Dave  Cortesi,  out  of  retirement  to  give  a  preview  of  how  third-party 
developers  can  add  extensions  to  Microsoft's  new  operating  system. 

In  1987  new  machines  and  cheap  memory  made  techniques  developed  in 
artificial  intelligence  research  more  employable  on  personal  computers,  and 
DDJ  responded  by  publishing  a  neural  network  implementation,  instituting 
monthly  coverage  of  artificial  intelligence  and  object-oriented  programming 
techniques,  and  reviewing  implementations  of  PROLOG,  LISP,  and 
Smalltalk. 

It  was  a  year  of  change  and  maturation  for  more  traditional  language  tools. 
DDJ  evaluated  the  new,  more  structured  and  powerful  BASICs,  previewed 
the  ANSI  C  standard,  and  looked  into  the  optimization  techniques  employed 
in  a  new  generation  of  optimizing  compilers. 

And  1987  saw  more  new  programmers  than  ever  before.  So,  as  it  did  for 
the  preceding  eleven  years,  DDJ  continued  to  deliver  programs  and 
techniques  of  practical  value  to  programmers — more  useful  code  than  any 
other  magazine.  DDJ  published  practical  utilities  in  C,  Pascal,  Forth,  and 
assembly  language.  It  gave  its  readers  a  UNIX  bulletin-board  system,  an 
implementation  of  the  nroff  text  formatter,  and  a  technique  for  extending
AppleTalk.  It  showed  how  to  create  Macintosh  Buttons,  Amiga  Gadgets, 
and  DOS  device  drivers.  And  it  staked  out  as  the  magazine's  proper  territory 
a  broad  range  of  software  development  environments,  from  6502  and  8088 
killer  hacks  to  software  for  advanced  microprocessors,  from  OS-9  to  OS/2, 
from  BASIC's  resurgence  to  PROLOG's  prominence.

Nineteen  eighty-seven  was  also  my  fourth  year  at  the  helm  of  DDJ,  making
me  the  longest-tenured  editor  of  the  premier  programmer's  journal.  It's  been
rewarding,  and  I  claim  this  space  to  thank  DDJ's  readers  for  that.  Thanks.


Michael  Swaine 
Editor-in-Chief 


Software 

Availability 


All  the  listings  appearing  in  Bound  Volume  12  are  available  on  two  MS/PC-DOS 
disks.  The  set  costs  $19.95.  California  residents  must  add  the  appropriate  sales  tax. 

To  order,  please  send  a  check  made  payable  to  M&T  Books,  or  credit  card  number  with 
expiration  date,  to: 

Bound  Volume  12  Listings  Disk 

M&T  Books 
501  Galveston  Drive 
Redwood  City,  CA  94063 

Or,  you  may  order  by  calling  our  toll-free  number  between  8  a.m.  and  5  p.m.  Pacific 
Standard  Time:  800/533-4372  (800/356-2002  in  California). 


M&T  BOOKS 


Contents 


VOLUME  12,  NUMBER  123,  1987 


3  Editorial  MICHAEL  SWAINE 

4  Running  Light  NICK  TURNER 

5  Letters 

7  Viewpoint:  Logic  and  PROLOG  DICK  BUTRICK 

8  680xx  Computers:  Where  Are  They  Going?  NICK  TURNER 

11  A  Mini  Forth  for  the  68000  G.  YATES  FLETCHER

16  The  OS-9  Operating  System  BRIAN  CAPOUCH 
22  Macintosh  Buttons  and  Amiga  Gadgets  JAN  L.  HARRINGTON 
61  C  Chest  Allen  Holub 

65  Structured  Programming:  Naming  Names  MICHAEL  HAM 

72  The  Right  to  Assemble:  A  New  Project  is  Born  NICK  TURNER
78  Of  Interest 

84  Swaine's  Flames  MICHAEL  SWAINE 


VOLUME  12,  NUMBER  124,  1987 


87  Editorial  MICHAEL  SWAINE 

88  Running  Light  NICK  TURNER 

89  Letters 

91  Viewpoint:  What's  Wrong  with  High-Level  Languages?  MIKE  SUMAN 

92  Text  Editors:  In  Matters  of  Taste  LEVI  THOMAS  and  NICK  TURNER 

97  6502  Hacks  MARK  S.  ACKERMAN 

104  Hashing  for  High-Performance  Searching  EDWIN  T.  FLOYD 
139  C  Chest:  Nroff:  Hashing,  Expressions,  and  Roman  Numerals
ALLEN  HOLUB

145  16-Bit  Software  Toolbox  RAY  DUNCAN 

149  Artificial  Intelligence:  What  Progress  is  Being  Made  in  AI?
ERNEST  R.  TELLO

158  Structured  Programming:  Language  Translations
NAMIR  CLEMENT  SHAMMAS
164  DDJ  On  Line

167  The  State  of  BASIC 

169  Of  Interest

170  Swaine's  Flames  MICHAEL  SWAINE 


VOLUME  12,  NUMBER  125,  1987 


173  Editorial  MICHAEL  SWAINE 

174  Running  Light  NICK  TURNER 

175  Letters 

178  Compressing  Image  Data  With  Quadtrees  RONALD  G.  WHITE 
184  ARC  Wars:  MS-DOS  Archiving  Utilities  RUSSELL  NELSON

188  Optimizing  Integer  Multiplications  by  Constant  Multipliers
ROBERT  D.  GRAPPEL

223  C  Chest:  Nr:  A  C  Implementation  of  Nroff,  Part  2  ALLEN  HOLUB 

233  16-Bit  Software  Toolbox  RAY  DUNCAN
238  Structured  Programming:  BASIC:  Quo  Vadis?
NAMIR  CLEMENT  SHAMMAS

241  Artificial  Intelligence:  Object-Oriented  Programming  ERNEST  R.  TELLO 

245  DDJ  On  Line 

247  The  State  of  BASIC 

249  Of  Interest 

253  Swaine's  Flames  MICHAEL  SWAINE 


VOLUME  12,  NUMBER  126,  1987 


257  Editorial  MICHAEL  SWAINE 

258  Running  Light  NICK  TURNER 

259  Letters 

261  Viewpoint:  Education  and  Programming  ALLEN  HOLUB 

262  An  Artificial  Neural  Network  Experiment  ROBERT  JAY  BROWN 
270  Four  PROLOGs  for  the  Macintosh  DAN  L.  PIERSON

278  MYCIN-Like  Expert  Systems  RICHARD  W.  GRIGONIS 
316  C  Chest:  Nr:  A  C  Implementation  of  Nroff,  Part  3  ALLEN  HOLUB 
323  Structured  Programming:  People  in  Programming  MICHAEL  HAM 
327  Artificial  Intelligence:  Object-Oriented  Programming  in  AI
ERNEST  R.  TELLO
331  DDJ  On  Line 

333  The  State  of  BASIC 

335  Of  Interest 

337  Swaine's  Flames  MICHAEL  SWAINE 


VOLUME  12,  NUMBER  127,  1987 


341  Editorial  MICHAEL  SWAINE 

342  Running  Light  ALLEN  HOLUB 

343  Letters 

345  Viewpoint:  What's  Right  with  High-Level  Languages?
PHILIP  J.  ERDELSKY

346  Pushing  the  Sound  Envelope  DAVID  LEVITT
350  Designing  a  Music  Recorder  MARK  GARVIN
366  Dimensional  Data  Types  DO-WHILE  JONES

391  C  Chest:  Statistical  Applications  of  Digital  Low-Pass  Filters,  Exec
Bug  in  Microsoft  C  ALLEN  HOLUB
400  16-Bit  Software  Toolbox  RAY  DUNCAN
404  Structured  Programming:  True  BASIC  Challenges  Modula-2
NAMIR  CLEMENT  SHAMMAS

410  Artificial  Intelligence:  Object-Oriented  LISP  on  PCs  ERNEST  R.  TELLO
416  The  State  of  BASIC 

418  Books 

419  Of  Interest 

422  Swaine's  Flames  MICHAEL  SWAINE 


VOLUME  12,  NUMBER  128,  1987 


425  Editorial  MICHAEL  SWAINE 

426  Running  Light  LEVI  THOMAS 

427  Letters 

429  Viewpoint:  What's  Right  with  High-Level  Languages? 

Brian  R.  Anderson 

430  An  Efficient  Algorithm  for  Large  Priority  Queues  ROBERT  JAY  BROWN 
434  Two-Bit  Analog-to-Digital  Conversion  JOHN  MUSSELMAN 

437  The  XOR  Chain  DAVID  E.  CORTESI 

442  An  Extended  IBM  PC  COM  Port  Driver  THOMAS  A.  ZIMNIEWICZ
447  Dynamic  Memory  Overlays  for  Turbo  Pascal  STEVE  MCMAHON 
450  A  Unix  BBS  Using  Shell  Scripts  JAN  L.  HARRINGTON 
482  C  Chest:  Priority  Queues  ALLEN  HOLUB 
489  16-Bit  Software  Toolbox:  Resources  RAY  DUNCAN 
492  Artificial  Intelligence:  Object-Oriented  Programming  in  SCOOPs
ERNEST  R.  TELLO
497  The  State  of  BASIC 
499  Of  Interest 

501  Swaine's  Flames  MICHAEL  SWAINE 


VOLUME  12,  NUMBER  129,  1987 


505  Editorial  MICHAEL  SWAINE 

506  Running  Light  TYLER  SPERRY 

507  Letters 

509  Developing  80386  Applications  . . .  Today  RICHARD  RELPH 
514  8088  Assembly-Language  Programming  Techniques  TOM  DISQUE

518  Logic  and  Knowledge  Representation  in  PROLOG  RICHARD  BUTRICK
524  Multitasking  with  Turbo  Pascal  CRAIG  A.  LINDLEY
550  C  Chest:  Curses:  Unix-Compatible  Windowing  Output  Functions
ALLEN  HOLUB

557  16-Bit  Software  Toolbox:  80386  Programming  Tools  RAY  DUNCAN
561  Structured  Programming:  Software  Design  Rules  MICHAEL  HAM
566  Artificial  Intelligence:  The  Xerox  1186  LISP  Machine  ERNEST  R.  TELLO

572  Books 

573  The  State  of  BASIC:  Fundamental  Data  Types  in  the  New  BASICs 

574  Of  Interest 

575  Swaine's  Flames  MICHAEL  SWAINE 


VOLUME  12,  NUMBER  130,  1987 


579  Editorial  Michael  SWAINE 

580  Running  Light  TYLER  SPERRY 

581  Letters 

583  Preparing  for  ANSI  C  RICHARD  RELPH 
589  Backtracking  CHARLES  F.  BOWMAN 

592  What's  the  DIFF?  DON  KRANTZ 

598  Optimizing  Compilers  for  C  RICHARD  RELPH 
627  C  Chest:  Subroutines  with  A  Variable  Number  of  Arguments 
Allen  Holub 

632  16-BIT  Software  Toolbox:  MS-DOS  3.30  Ray  DUNCAN 
637  Structured  Programming:  Translating  from  MS-BASIC  to  C 
Namir  Clement  Shammas 

640  Artificial  Intelligence:  LOOPS  ERNEST  R.  TELLO 

649  The  State  of  BASIC:  The  New  Internal  Coding  Engines 

650  Of  Interest:  386  Computers 

652  Swaine’s  Flames  MICHAEL  SWAINE 


VOLUME  12,  NUMBER  131,  1987 


655  Editorial:  Stone  Age  Computing  TYLER  SPERRY 

656  Running  Light  TYLER  SPERRY 

657  Letters 

659  How  Many  Ways  Can  You  Draw  a  Circle?  JAMES  F.  BLINN 
666  File  Comparison  Algorithms  TOM  STEPPE 
671  The  XOR  Chain  Revisited  BENNETTE  R.  HARRIS
676  Writing  MS-DOS  Device  Drivers  in  C  ANDY  KLEIN 
721  C  Chest:  The  Ultimate  Metronome:  Writing  Interrupt  Service  Routines  in  C 
ALLEN  HOLUB 

729  Structured  Programming:  V.I.P.,  Clustered  Binary  Trees,  and  Clustered 
List  Data  Structures  NAMIR  CLEMENT  SHAMMAS 
734  Artificial  Intelligence:  Smalltalk/V  ERNEST  R.  TELLO 
740  Of  Interest 

742  Swaine's  Flames  MICHAEL  SWAINE 


VOLUME  12,  NUMBER  132,  1987 


745  Editorial:  Lebensraum  and  RAM  TYLER  SPERRY 

746  Running  Light  TYLER  SPERRY 

747  Letters 

749  Async  AppleTalk  RICHARD  E.  BROWN  and  STEVE  LIGETT
757  A  Fast  Forth  for  the  68000  LORI  CHAVEZ 
761  A  Forth  Standard  Prelude  MARTIN  TRACY 

765  Pattern  Matching  Using  Finite  State  Machines  CHARLES  F.  BOWMAN 
806  C  Chest:  Language  Wars  Over  Cs  ALLEN  HOLUB 
809  The  Forth  Column  MARTIN  TRACY 

813  Structured  Programming:  Data  Hiding  and  Its  Variations
NAMIR  CLEMENT  SHAMMAS
817  Of  Interest 

819  Swaine's  Flames  MICHAEL  SWAINE 


VOLUME  12,  NUMBER  133,  1987 


823  Editorial:  Mike's  Survey  MICHAEL  SWAINE

824  Running  Light  TYLER  SPERRY

825  Letters

826  3-D  Images  from  Contour  Maps  WILLIAM  D.  MAY
833  A  Graphics  Toolbox  for  Turbo  C  KENT  PORTER
840  A  Graphics  Toolkit  for  Turbo  Pascal  HUBERT  D.  CALLIHAN
845  Using  EGA  Graphics  Screens  in  Your  Programs  J.  BROOKS  BREEDEN
851  Automated  Interrupt  Handling  in  C  RON  MILLER
856  An  Alternative  to  Soundex  JIM  HOWELL
893  C  Chest:  Using  the  Unix/ANSI  Time  Functions  ALLEN  HOLUB
898  Structured  Programming:  Heuristic  Searching
NAMIR  CLEMENT  SHAMMAS
900  Artificial  Intelligence:  Object-Oriented  Programming
ERNEST  R.  TELLO
908  Of  Interest

911  Swaine's  Flames  MICHAEL  SWAINE


VOLUME  12,  NUMBER  134,  1987 

915  Editorial:  DOS  Ex  Machina  TYLER  SPERRY 

916  Running  Light  TYLER  SPERRY 

917  Letters 

919  Dynamic  Linking  in  OS/2  DAVID  E.  CORTESI 
925  A  RAM-Cache  Manager  in  C  Alan  DEIKMAN 
930  Putting  ROM  Code  in  Its  Place  Rick  Naro 
935  Integers  Don't  Float  Ray  Mariella 
937  A  Graphics  Toolbox  for  Turbo  C — Part  2  KENT  PORTER 

983  C  Chest:  A  Preemptive  Multitasking  Kernel  and  More  Mean  Subroutines 
Allen  Holub 

993  The  Forth  Column:  New  Forth  Sources,  a  Bibliography,  and  String 
Extensions  for  Forth-83  MARTIN  TRACY 
999  Of  Interest 

1001  Swaine's  Flames  MICHAEL  SWAINE 


1003  INDEX 


ANNUAL  68K  ISSUE 


68K  Mini  Forth 

OS-9  Operating 
System 

Mac  and  Amiga 

Interface 

Programming 

Languages: 

Forth  Names 

Proper  PROLOG 

Memory  Management  in  C 

BASIC  Rebirth 

68K  Assembly  Project 


The  Bandwidth 
Bottleneck 


JANUARY  1987 


CONTENTS 


VOLUME  12,  ISSUE  1 


ARTICLES 



The  68K  story  ►


680xx  PROGRAMMING:  680xx  Computers:  Where  Are
They  Going?

by  Nick  Turner

An  overview  of  the  680xx  family  of  chips:  past,  present, 
and  probable  future. 

680xx  PROGRAMMING:  A  Mini  Forth  for  the  68000 

by  G.  Yates  Fletcher 

Yates  tells  us  about  the  “no  frills"  Forth-like  interpreter  for 
the  68000  that  he  designed  to  test  the  theory  that  Forth  is 
more  naturally  understood  as  a  program  than  as  a 
language. 

680xx  PROGRAMMING:  The  OS-9  Operating  System

by  Brian  Capouch

A  look  at  the  modular,  multiprogramming,  multitasking 


Mac  and  Amiga 
assembly  ► 
language 


programmers. 


About  the  Cover 

Motorola  has  just  forged  the 
68030.  Is  it  as  hot  as  it  seems? 


Macintosh  Buttons  and  Amiga  Gadgets

by  Jan  L.  Harrington

Comparing  user  interface  programming  on  the  Macintosh 
and  Amiga,  Jan  provides  details  about  the  operating  system 
support  produced  on  both  machines  for  user  interface 
features  such  as  menus,  buttons  and  windows. 

PROCESSORS:  Series  32000  Cross  Assembler

by  Richard  Rodman 

The  listing  (in  human-readable  form)  for  Richard's  article
that  was  published  in  December. 


This  Issue 

In the beginning there were the 8080 and the 6502; programmers chose
their weapons and the battle lines were drawn. A few years later,
Motorola gave the "sixers" more power when it introduced its 680xx line
of chips. Today there is a wide range of powerful 680xx machines, and
some very interesting rumors about the future. This month we survey the
680xx family and examine a modular, multitasking operating system, a 68K
Forth-like interpreter, and the challenges of creating Amiga- and
Mac-like user interfaces.


Managing  your 
memory  ► 


by  Allen  Holub

Allen  presents  some  memory  management  techniques
plus  an  explanation  of  C  memory  organization  for
beginning  C  programmers.

STRUCTURED  PROGRAMMING 
by  Michael  Ham 

Michael  discusses  the  naming  of  names  in  Forth.

THE  RIGHT  TO  ASSEMBLE 
by  Nick  Turner 

Nick  launches  a  project  to  design  a  versatile,  easy-to-use, 


Forth  names  ► 


Next  Issue 

The choice of a text editor is based on many highly subjective
considerations as well as some "hard" pragmatic requirements. In
February, we'll present an overview of the various elements involved in
that choice and let you hear what some programmers have to say about
their favorite and least favorite editors.


The  bandwidth  3 
bottleneck  5 


by  Michael  Swaine
RUNNING  LIGHT
by  Nick  Turner
ARCHIVES
LETTERS
by  you
VIEWPOINT
by  Dick  Butrick
DDJ  ON  LINE
SWAINE'S  FLAMES
by  Michael  Swaine


DDJ  books  and  software 
OF  INTEREST: 

New  products  out  there 
ADVERTISER  INDEX: 
Where  to  find  those  ads 


Proper  PROLOG  ► 


JJ — bandwidth  topic 
^ — entry  point 


The  rebirth  of 
BASIC 


Dr.  Dobb's  Journal,  January  1987 



FORUM 


EDITORIAL 


Here's some of what we have planned for DDJ's twelfth year.

In a series of short reports, contributing editor Namir Shammas will
examine the state of the BASIC language in 1987. Yes, BASIC. DDJ is
hardly a beginner's magazine, but if the beginner's language has grown
up to long pants, as some claim, we should all know about it.

Then again, each of us is a novice in some area. Realizing this, DDJ
will use the icon at the beginning of this paragraph to flag certain
features as entry points: items such as Allen Holub's Flotsam and Jetsam
in this issue, from which the less experienced programmer can learn
useful techniques or gain familiarity with more technical subjects.

There are a lot of ways to write about artificial intelligence, nearly
all bad. Our new column on alternative programming paradigms (it starts
next month) will avoid them all as it examines such topics as
knowledge-based programming, logic programming, and object-oriented
programming from an experienced software designer's perspective.
Contributing editor Ernie Tello writes, lectures, and consults in this
area and promises to take us beyond the fields we know.

And we're going to attack the bandwidth problem.

DDJ was born to shoehorn BASIC into the hypomnemonic personal computers
of 1976. In a sense, the Cain/Hendrix versions of Small-C published in
DDJ over the succeeding years addressed this same problem of cramming
programming power into micro memory. You could say that's been the
charter of the magazine, and on one level it will continue to be our
focus. But nobody today needs another Tiny BASIC or Small-C.
Developments like the Intel 80386 processor lift the lid of a different
box of programming problems. One is the efficient transmission of
information over limited channels.

Bandwidth is already an issue in graphics output: Microsoft Windows
never made sense on 8088 machines and the PARC interface overwhelmed the
original 128K Mac. Adequate memory and processor power make a big
difference, but they may ultimately just move the bottleneck to the IC
level.

Bandwidth becomes more of an issue in mass storage as storage becomes
more massive. Once we learn what to do with CD-ROMs, retrieving
information efficiently from them will require more than increased speed
of transmission.

The potential bandwidth crunch in telecommunications and remote database
access is obvious, but when LANs start proliferating, so will in-house
bandwidth competition.

Approaches to the bandwidth crunch can range from clever
data-compression algorithms to systems that form hypotheses about
incoming data and acquire just enough data to confirm or reject the
hypotheses. As access to information becomes more technically
problematic, it will also take on sociopolitical dimensions; for
example, public libraries are becoming measurably less public as they
subscribe to commercial information provider services and pass the costs
on to their patrons. We'll delve into the technology for dealing with
the bandwidth crunch while trying to see its potential social
consequences.

Bandwidth-related items will be flagged with the icon at the end of this
line. ■

Michael  Swaine 
editor-in-chief 


Dr. Dobb's Journal of Software Tools

Editorial
Editor-in-Chief  Michael Swaine
Editor  Nick Turner
Managing Editor  Vince Leone
Assistant Editors  Sara Noah Ruddy, Levi Thomas
Technical Editor  Allen Holub
Contributing Editors  Ray Duncan, Michael Ham, Bela Lubkin,
Namir Shammas, Ernest R. Tello
Copy Editor  Rhoda Simmons

Production
Production Manager  Bob Wynne
Art Director  Michael Hollister
Assoc. Art Director  Joe Sikoryak
Typesetter  Jean Aring
Cover Artist  Michael Carr

Circulation
Circulation Director  Maureen Kaminski
Newsstand Sales Mgr.  Stephanie Barber
Book Marketing Mgr.  Jane Shaminghouse
Circulation Coordinator  Kathleen Shay

Administration
Finance Director  Kate Wheat
Business Manager  Betty Trickett
Accounts Payable Supv.  Mayda Lopez-Quintana
Accounts Receivable Mgr.  Laura Di Lazzaro

Advertising Director
Robert Horton  (415) 366-3600

Account Managers
Michele Beaty  (317) 875-4093
Lisa Boudreau  (415) 366-3600
Gary George  (404) 897-1923
Michael Wiener  (415) 366-3600
Cynthia Zuck  (718) 499-9333

Promotions/Srvcs. Mgr.  Anna Kittleson
Advertising Coordinator  Charles Shively

M&T Publishing Inc.
Chairman of the Board  Otmar Weber
Director  C.F. von Quadt
President and Publisher  Laird Foshay

Dr. Dobb's Journal of Software Tools (USPS 307690) is published monthly
by M&T Publishing Inc., 501 Galveston Dr., Redwood City, CA 94063;
(415) 366-3600. Second-class postage paid at Redwood City and at
additional entry points.

Article  Submissions:  Send  manuscripts  and  disk 
(with  article  and  listings)  to  the  Editor. 

DDJ  on  CompuServe:  Type  GO  DDJ 
Address  Correction  Requested:  Postmaster:  Send 
Form  3579  to  Dr.  Dobb's  Journal,  P.O.  Box  27809,  San 
Diego,  CA  92128.  ISSN  0888-3076 

Customer  Service:  For  subscription  problems  call: 
outside  CA  (800)  321-3333;  in  CA  (619)  485-9623  or  566- 
6947.  For  book/software  order  problems  call  (415)  366- 
3600. 

Subscriptions:  $29.97  per  year;  $56.97  for  2  years. 
Canada  and  Mexico  add  $27  per  year  airmail  or  $10 
per  year  surface.  All  other  countries  add  $45  per  year 
airmail  or  $28  per  year  surface.  Foreign  subscriptions 
must  be  prepaid  in  U.S.  funds  drawn  on  a  U.S.  bank. 
For  foreign  subscriptions,  TELEX:  752-351. 

Foreign  Newsstand  Distributor:  Worldwide  Media 
Service  Inc.,  386  Park  Ave.  South,  New  York,  NY  10016; 
(212)  686-1520  TELEX:  620430  (WUI). 

Entire contents copyright © 1987 by M&T Publishing, Inc., unless
otherwise noted on specific articles. All rights reserved.


People's  Computer  Company 

Dr. Dobb's Journal of Software Tools is published by M&T Publishing Inc.
under license from People's Computer Company, 2682 Bishop Dr., Suite
107, San Ramon, CA 94583, a nonprofit corporation.




RUNNING  LIGHT 


What is the logical end point of the current trend toward smaller and
smaller chip components? How soon will it be reached, and how reliable
will the components be at that level? These questions are being
addressed by a totally new branch of science, one that combines the
disciplines of chemistry, physics, biology, computer science, and
mathematics. The new field, called nanotechnology, concerns itself with
devices that are several orders of magnitude smaller than current
microcircuitry: components made up of individual atoms and molecules
carefully bonded one by one. This may seem too farfetched to expect to
see it in our lifetimes, but we're much, much closer than that. Several
recent developments have suddenly elevated nanotechnology from a pipe
dream to a bona fide discipline. I recently attended a seminar at which
the field's most vocal advocate, Eric Drexler, spoke eloquently about
nanotechnology. He actually laid out the design (on an atom-by-atom
scale) of a working nanocomputer, smaller than a virus particle, along
with all its support devices. I'm excited by the potential of such
incredibly small, fast devices, and I'd love to hear from any of you who
are working in this new field.

We've received excellent responses to our October 1986 issue, which
focused on the 80386 and its family. This month we focus on the other
side, the sixers. (If you're a 680xx programmer, you're probably a
sixer.) We start with an overview of the 68000 line: where it has been,
where it is now, and where it might be going. As more information
becomes available on the 68040 (and beyond), we'll keep you up to date.

Yates Fletcher has written a Forth-like interpreter for the 68000 that
he calls FLINT. His article in this issue describes it in detail and
offers some interesting insight into what makes threaded interpreters so
efficient. Also in this issue, Brian Capouch takes a look at the OS-9
operating system from Microware. OS-9 shows promise as a standard for
68K development work, and Brian explains why. Jan Harrington's article
about Amiga gadgets and Macintosh buttons is one of the clearest
comparisons I've seen of the different programming styles required on
the two machines.

This  month  I  begin  what  I  hope  will 
be  a  series  of  essays  on  the  design  of 
an  interpreted  language  for  the  68000. 
I'm  interested  in  your  comments  and 
criticism. 

Last July, in our annual Forth issue, we published an article about a
Forth-driven robot that dives into the ocean, records oceanic data, and
then pops up to send the data home via satellite. This July we'd like to
focus on embedded systems of that sort: programs that reside inside
autonomous devices. If you're working on such a device, we'd like to
hear from you.

Bela Lubkin has joined Levi Thomas and Ray Duncan as a sysop on our
CompuServe SIG (DDJFORUM) and has been stirring things up quite nicely.
You will doubtless see his name on many a message in DDJ On Line.

Finally, I'd like to thank Jerry Houston, Roger Dunn, John Berry,
Charles Marslett, and Wayne Vucenic for their help with Table 2 on
page 18.

Nick  Turner 
editor 


ARCHIVES 


The  68000  et  Famille 

[A small boy, Oliver Wendell Jones, is sitting on Santa's lap reciting
his Christmas wish list.]

Oliver: "That's the 32-bit MC68000 microprocessor ... not to be confused
with ..."
Santa: "The 16-bit 8088. Total garbage, El Stinko."
-- Bloom County (comic strip), Berke Breathed, December 12, 1985.

"Tom Pittman, the implementor of the $5/copy version of Tiny BASIC for
the 6800 and 650x is now offering an experimenter's kit for those folks
who want to modify and extend his Tiny BASIC. The kit includes an
assembled source listing for the IL (Intermediate Language), an IL
assembler written in Tiny BASIC, a detailed description of the virtual
machine implemented by the IL, instructions for incorporating a new IL
into the Tiny BASIC system, and finally some practical hints about
debugging and extending the system. The cost is $10 from Itty Bitty
Computers."
-- DDJ, March 1977.


Flak from Our Readers

"A language design frequency of almost one Tiny BASIC version per month
is certainly impressive. Programmers ought to have realized that God
invented many different languages in Babylon not to enjoy but to punish
mankind, and, after all, he never intended implementing all of them for
the 8080. So why not write (and, if necessary, publish) programs in an
informal, ad hoc language fitting to the problem, not the computer?"
-- Thomas Alexander Matzner, letter to the editor, DDJ, March 1977.

"[You write 'em; we'll publish (maybe).]"
-- editorial reply to the above, DDJ, March 1977.


Ten Years Ago in DDJ

"Why bother with a multi-tasking operating system on a 'personal'
computer? Let's daydream for a moment. Wouldn't it be nice to be able to
start a lengthy listing on our hardcopy device; while that was running,
start an assembly of a large program; and then go about editing the
source for another program from our softcopy terminal? That's exactly
what you can do with a multi-tasking system."
-- Jim Warren, DDJ, January 1977.

"I would like to particularly applaud Dick [Wilcox]'s position regarding
low-cost distribution of software for not-for-profit use. He is
recognizing and adjusting to the realities of the new world of personal
computing in a manner that I feel is fair and reasonable for everyone
concerned."
-- Jim Warren, DDJ, January 1977.



Dr.  Dobb's  Journal,  January  1987 


FORUM 


LETTERS 


Software  Gap 

Dear  DDJ, 

After  reading  Nick  Turner's  Running 
Light  column  in  the  August  issue  con¬ 
cerning  the  perceived  “software 
gap,”  I  felt  as  though  I  had  to  tell  you 
how  I  see  it. 

I  hold  a  Bachelor  of  Science  degree 
in  computer  science  from  the  Uni¬ 
versity  of  Maryland.  I  learned  all 
about  the  "right”  way  to  design  and 
code  programs,  including  everything 
you  say  programmers  don’t  care 
about  anymore.  From  my  own  expe¬ 
rience,  I  find  this  to  have  been  a 
waste  of  time  and  tuition. 

The  biggest  problem  I  have  had  in 
my  "career”  has  been  convincing 
some  of  the  "data  processing 
managers”  how  a  program 
should  be  constructed.  Every 
time  I  try  to  write  some  form 
of  documentation,  I  am  told 
"not  to  waste  time  on  such  use¬ 
less  paperwork.”  I  wish  I 
could  say  this  only  happened 
at  one  or  two  companies,  but  I 
have  been  employed  at  four 
different  companies  over  the 
past  five  years,  and  every  one 
of  them  has  been  the  same.  I 
am  beginning  to  think  the 
only  way  I'll  ever  get  to  be 
what  you  call  a  "professional” 
programmer  is  to  start  my 
own  software  company. 

At  times  I  wonder  if  the 
problem  with  the  manage¬ 
ment  structure  stems  from 
the  fact  that  most  of  the  peo¬ 
ple  promoted  have  business 
backgrounds.  Not  one  of  my 
bosses  has  ever  had  anything 
other  than  a  business  degree, 
and not one knows the first thing about a programming project.
My  current  boss  only  understands 
straight-line  coding  and  sequential 
list  processing  (that  is,  no  doubly 
linked  lists,  no  sparse  matrices,  no 
queues,  and  so  on) — nice  way  to  run  a 
systems  software  development 
group  that  still  uses  a  one-way  line 
editor. 

All  I  want  is  the  chance  to  use  what 
I  learned  in  school  and  to  be  able  to 
do  a  programming  project  correctly. 
It  would  be  such  a  treat. 

Name  withheld  by  request 

Dear  DDJ, 

I  am  writing  in  response  to  Nick 
Turner’s  August  Running  Light  about 
sloppy  programmers.  I  am  a  pro¬ 
grammer/engineer,  and  I  see  a  lot  of 
sloppy  programming.  In  fact,  I  do 
some  sloppy  programming  myself.  I 
think  I  might  have  a  clue  as  to  what  is 
going  on. 

Turner  mentions  the  program¬ 
mers  who  can  "write  entire  operat¬ 
ing  system  kernels  ...  in  one  pass,  in 
assembly  code,  that  run  perfectly  the 
first  time  . . . .”  Well,  I  think  they  are 
the  exception  to  the  rule.  I  do  not, 
however,  doubt  the  premise  that  we 
can all do it; it just takes the rest of us
a little more time and concentration.

So  why  don't  we  take  the  time,  con¬ 
centrate,  and  code  better?  I  propose 
that  the  reason  why  we  don’t  is  the 
reward  system  that  most  of  us  work 
within. My manager seems to be less
concerned that a project works the
first time than he is that it is done.
He  wants  to  "see  something.”  Grant¬ 
ed,  I  work  for  a  defense  contractor 
and  my  manager  wants  to  record 
milestones,  not  finish  projects.  But  I 
think  this  attitude  is  more  prevalent 
than  most  of  us  think. 

Why, then, do this attitude and
reward system not affect the hot performers?
I think it is because they, at
some  level,  do  not  respond  to  the 
same  reward  system  as  the  majority. 
Every  organization  has  at  least  one 
programmer  who  walks  to  the  beat 
of  a  different  drummer.  This  is  not  to 
say  that  all  individuals  who  fit  this 
description  are  great  programmers, 
but  most  of  the  great  programmers  I 
know  (or  know  of)  are  of  this  type. 
On  the  other  hand,  this  tends  to  make 
them  more  difficult  to  manage,  par¬ 
tially  because  they  do  not  respond  to 
the  reward  system  that  reflects  the 
views  of  management. 

So  what  can  be  done?  Well,  we 
probably  won't  change  the 
managers  or  their  ideas  of 
how  things  should  be  done. 
From  my  own  experience,  I 
find  that  if  they  want  to  see 
something  I  have  two  choices. 
First,  I  can  slop  it  together  and 
debug  it  later  (when  I  have  less 
time).  Second,  I  can  "stub”  it 
off,  perhaps  only  coding  the 
user  interface  or  some  visual 
part  of  the  program.  This  lets 
the  manager  see  something, 
but  the  code  he  sees  is  good 
code.  Later,  I  go  back  and,  in¬ 
stead  of  debugging,  I  write  (for 
the  first  time)  the  code  that  I 
stubbed  off  in  the  first  place. 
This  seems  like  the  way  to  go, 
but  it  is  sometimes  difficult  to 
determine  when  to  stop  cod¬ 
ing  and  start  stubbing. 

So  I  haven't  really  offered  a 
solution.  Just  some  ideas  as  to 
what  I  think  the  probable 
cause  is  and  what  I  conceive  I 
should be doing to change the situation.
I will be watching with interest
to see what other readers have
to  say  about  this  problem. 

Name  withheld  by  request 

Avoid  Woe  when 
Upgrading  MS-DOS 

Dear  DDJ, 

Allen  Holub's  "A  Tale  of  Woe”  (C 
Chest,  September  1986)  sparked  some 
vivid  memories  of  upgrading  from 
MS-DOS  2.11  to  MS-DOS  3.10.  MS-DOS  2.11 
and earlier would search for an available
cluster from the beginning of the
FAT each time. MS-DOS 3.10 is more efficient
because it allocates from
wherever  it  left  off.  This  caused  me 
much  grief  in  trying  to  change  the 
attributes  and  delete  and  replace  the 
two  system  files  Allen  mentioned, 
IBMBIO.COM  and  IBMDOS.COM  for  the 
PC  and  IO.SYS  and  MSDOS.SYS  for  ge¬ 
neric  MS-DOS.  These  two  files  must  re¬ 
side  in  the  first  clusters  of  the  disk. 
The  first  data  cluster  is  cluster  2.  Ad¬ 
ditionally,  they  must  be  contiguous. 
You  can  delete  and  replace  these  two 
files  by  following  a  few  simple  rules. 
The  total  number  of  clusters  must  be 
no  more  than  the  original  unless 
there  are  some  unallocated  clusters 
just  beyond  these  two  files.  Under  MS- 
DOS  3.10,  you  can  unhide  and  delete 
these  files  and  then  reboot  from  a 
floppy.  This  will  restore  the  FAT 
pointer  to  start  searching  at  the  be¬ 
ginning  of  the  disk  to  modify.  You 
may  then  copy  the  system  files, 
which  will  now  be  the  first  n  clus¬ 
ters,  and  they  will  be  contiguous. 
Now  the  disk  will  be  bootable.  It 
helps  to  be  able  to  track  through  the 
FAT  and  directories  if  there  is  a  prob¬ 
lem  or  if  more  space  is  needed.  So 
there  may  be  no  need  to  reformat. 
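
The allocation behavior described in this letter can be illustrated with a toy model. The sketch below is not MS-DOS code: the 12-cluster disk, the owner labels, and the function names `allocate`, `release`, `is_bootable`, and `replace_system_files` are all hypothetical. It assumes only what the letter states, namely that the system files must occupy contiguous clusters beginning at cluster 2.

```python
# Toy model of FAT cluster allocation (illustrative only, not MS-DOS code).
# fat[c] holds the owner of cluster c, or None if the cluster is free.
# Entries 0 and 1 are reserved, so the first data cluster is 2.

FIRST_DATA_CLUSTER = 2


def scan_order(fat, start):
    """Data clusters in the order they are examined, wrapping around once."""
    data = list(range(FIRST_DATA_CLUSTER, len(fat)))
    pivot = data.index(start)
    return data[pivot:] + data[:pivot]


def allocate(fat, owner, count, start=FIRST_DATA_CLUSTER):
    """Give `owner` the first `count` free clusters found scanning from `start`."""
    free = [c for c in scan_order(fat, start) if fat[c] is None][:count]
    if len(free) < count:
        raise RuntimeError("disk full")
    for c in free:
        fat[c] = owner
    return free


def release(fat, owner):
    """Free every cluster that belongs to `owner` (a deleted file)."""
    for c, who in enumerate(fat):
        if who == owner:
            fat[c] = None


def is_bootable(clusters):
    """System files must be contiguous and start at the first data cluster."""
    expected = list(range(FIRST_DATA_CLUSTER, FIRST_DATA_CLUSTER + len(clusters)))
    return clusters == expected


def replace_system_files(search_start):
    """Delete a 3-cluster system area, then write a new one from `search_start`."""
    fat = ["reserved", "reserved"] + [None] * 12   # data clusters 2..13
    allocate(fat, "old-system", 3)                 # clusters 2, 3, 4
    allocate(fat, "user-files", 4)                 # clusters 5, 6, 7, 8
    release(fat, "old-system")
    return allocate(fat, "new-system", 3, start=search_start)


if __name__ == "__main__":
    # Search resumes where the last allocation stopped (3.10 as described above).
    resumed = replace_system_files(search_start=9)
    print(resumed, is_bootable(resumed))
    # Search restarts at the beginning (2.11, or 3.10 after rebooting from a floppy).
    restarted = replace_system_files(search_start=FIRST_DATA_CLUSTER)
    print(restarted, is_bootable(restarted))
```

In this model the only difference between the two runs is where the free-cluster search begins, which is the effect the letter attributes to rebooting from a floppy.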

Max  G.  Heffler 

Landmark  Graphics 
Corp. 

1011  Hwy.  6  S,  #120 

Houston,  TX  77077 

OS-9  Continued 

Dear  DDJ, 

I  was  dismayed  to  see  the  thrashing 
Heitzso  gave  to  the  OS-9  operating  sys¬ 
tem  in  his  October  letter.  I  disagree 
with  both  his  specific  examples  and 
his  general  conclusions  (please  par¬ 
don  the  assumption  of  gender). 

To start with, OS-9 cannot compete
with Unix on disk speed; Unix keeps
part  of  its  file  structure  in  memory 
whereas  OS-9  always  keeps  its  sector 
allocation  bit  map  on  the  disk  up  to 
date.  There  is  a  clear  speed  advantage 
to  accessing  information  in  memory 
rather  than  on  disk,  but  you  pay  a 
price  in  vulnerability.  In  this  case, 
Microware  chose  to  make  the  file 
structure robust and corruption-proof.

Heitzso illustrates the "real problems"
he has had with OS-9 by describing
his difficulty in using the
tsleep() function. After reading his
letter I wrote a simple C program to
test the tsleep() function, and I was
unable to make it malfunction for
any number of ticks I specified, over
a range of 1 to several thousand. In
every case the timing was +/- one
tick, just as specified.

The tsleep() function accepts a single
parameter indicating the number
of  ticks  to  sleep.  Most  OS-9  systems  use 
a  tick  granularity  of  1/100  second, 
which  also  happened  to  be  the  length 
of  time  Heitzso  wished  the  task  to 
sleep.  I  suspect  that  he  requested  that 
the  task  sleep  for  one  tick;  however, 
the  documentation  clearly  states  that 
a  tick  parameter  of  1  causes  the  call¬ 
ing  task  to  give  up  its  present  time 
slice.  At  the  expiration  of  the  time 
slice,  the  task  will  be  put  back  on  the 
active  process  queue,  where  it  will 
compete  for  CPU  time  with  other  pro¬ 
cesses  that  are  ready  to  run.  If  the  call¬ 
ing  task  gives  up  its  time  slice  near  the 
end  of  a  quantum  and  there  are  no 
other  executable  processes,  it  is  quite 
possible that tsleep() will return after
an  interval  as  short  as  1/3,000  second. 
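
The arithmetic behind that short interval can be sketched with a deliberately simplified model. The code below is an assumption made for illustration and is not OS-9's scheduler: it supposes that a sleeping task is made runnable on the n-th clock tick after its call, with ticks arriving every 1/100 second as the letter says is typical. The names `wake_time_us` and `slept_us` are hypothetical.

```python
# Simplified model of tick-granular sleeping (an assumption for illustration,
# not OS-9's scheduler). Times are integer microseconds to avoid rounding noise.

TICK_US = 10_000  # 1/100 second, the granularity cited for most OS-9 systems


def wake_time_us(call_us, ticks):
    """Time of the `ticks`-th clock tick that occurs strictly after `call_us`."""
    return (call_us // TICK_US + ticks) * TICK_US


def slept_us(call_us, ticks):
    """How long the caller was actually asleep under this model."""
    return wake_time_us(call_us, ticks) - call_us


if __name__ == "__main__":
    # A call made 9,667 microseconds into a tick, asking for one tick,
    # wakes at the next tick boundary, about 1/3,000 second later.
    for call_us in (0, 5_000, 9_667):
        print(call_us, slept_us(call_us, 1), slept_us(call_us, 100))
```

Under this model a one-tick request made late in a tick returns almost immediately, while a 100-tick request is never off by more than one tick, which is consistent with both observations in the letter.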

Heitzso  also  describes  how  the  OS-9 
Format  utility  has  a  bug  that  prevents 
the  user  from  specifying  a  cluster  size 
greater  than  one  sector.  In  reality,  the 
documentation  for  the  Format  utility 
mentions  that  at  present  only  a  cluster 
size  of  one  is  supported! 

I  have  yet  to  uncover  a  bug  in  the 
operating  system.  Microware's  latest 
product  discrepancy  report  lists  a 
single  bug  in  the  operating  system 
components,  and  it  is  triggered  by  an 
obscure  condition  in  a  little-used  sys¬ 
tem  call. 

I disagree with Heitzso's conclusion
that Microware's customer support
is poor. I have found Microware
to be quite reasonable in the dealings
I have had with it. Microware provides
bug lists and work-arounds to
those  who  request  them,  and  it  offers 
a  special  telephone  hot  line  for  pro¬ 
fessional  software  developers.  One  of 
my  gripes  is  that  the  hot  line  is  too 
expensive,  but  I  have  heard  that  this 
may  change.  I  hope  so. 

During  a  time  when  many  experts 
in  the  computing  field  were  touting 
the  use  of  high-level  languages  as  the 
best  way  to  create  an  operating  sys¬ 
tem,  Microware  quietly  crafted  an  el¬ 
egant,  modular,  and  extensible  gem  in 
assembly  language.  I  look  forward  to 
seeing  OS-9  dominate  the  68000  mar¬ 
ket  as  more  people  recognize  its 
merits. 

Kurt  Liebezeit 
Ordinate  Systems 
505  W.  Springfield 
Champaign,  IL  61802 

We didn't do our homework when we
published Heitzso's letter. This issue
contains  a  look  at  the  OS-9  operating 
system  by  Brian  Capouch — eds. 

Correction 

Dear  DDJ, 

The  Microsoft-supplied  correction  to 
Version  4  of  the  Microsoft  Macro  As¬ 
sembler  that  Ray  Duncan  gives  on 
page  96  of  the  September  1986  issue 
of  DDJ  is  incorrect.  The  "correction" 
as  published  causes  MASM  to  go  off 
into  never-never  land.  The  error  in 
the listing on page 96 is a typographical
one. The byte entered into address
xxxx:72D4 should be E9 instead
of 39. This error appears in the string
of 34 bytes that are entered starting at
xxxx:72B8.

Robert  C.  F.  Bartels 
P.O.  Box  2240 
Ann  Arbor,  MI  48106 

DDJ 




FORUM 


VIEWPOINT 


Logic and PROLOG

by Dick Butrick

The touted virtue of PROLOG is that it
provides a basis for programming in
logic (hence its name). This suggests
that logically correct descriptions or
axiomatizations of a body of knowledge
can be transcribed into PROLOG
and that the appropriate deductions
could then be drawn by PROLOG's inference
engine. The idea of systematization
of knowledge by way of postulates
embedded in a deductive system
has proven to be a powerful one since
the time of Euclid. With the development
of predicate logic by Frege and
Russell, the idea received new impetus
in this century. Many areas of
mathematical and scientific investigation
have axiomatic foundations, set
theory being a prime example. The
expansion and codification of knowledge
by the deductive-axiomatic
method, as the central methodology
of knowledge, has its critics. But what
has delayed its use outside theory construction
itself is that, for practical
real-time applications, hand-deduction
from a knowledge base is too
slow. What PROLOG promises is speed
of  deduction.  You  could  simply  query 
a  knowledge  base  (axioms,  postulates), 
and  PROLOG  would  make  deductions 
with  computational  speed — giving 
tremendous  impetus  to  the  deductive- 
axiomatic  method  as  the  central 
methodology  of  knowledge  by  re¬ 
moving  the  practical  barrier  to  its  use. 

Dick Butrick, Ohio University, Athens,
OH 45701. Dick is a professor in the
Computer Science Department.

Unfortunately, the match between
formal logic and PROLOG is tenuous at
best.  In  fact,  from  a  strictly  logical 
point  of  view,  PROLOG  is  inconsistent. 
Because  any  inconsistent  deductive 
system  is  complete  (if  you  can  derive  a 
contradiction,  you  can  derive  any¬ 
thing),  PROLOG  is  theoretically  com¬ 
plete.  In  implementation,  however, 
PROLOG  is  not  just  inconsistent,  it  is  in¬ 
complete.  Time  considerations  and 
stack  space  are  not  considerations  in 
pure  logic,  but  these  realities  render 
PROLOG  radically  incomplete. 

Of  course,  there  are  PROLOGS  and 
PROLOGS.  Specific  reference  here  is  to 
PLS  PROLOG.  However,  the  points 
raised  apply  just  as  well  to  Borland 
PROLOG  and  indeed  to  any  nonpure, 
expanded,  DEC-20-type  PROLOG 
(Quintus  PROLOG,  CPROLOG,  Poplog, 
and  so  on). 

Typically,  in  testing  out  PROLOG  as 
a  deductive-axiomatic  shell,  a  logi¬ 
cian  might  enter  postulates  for  the 
transitivity  of  R: 

(x)  (y)  (z)  (Rxy  &  Ryz  ->  Rxz) 

In  the  Simple  syntax  of  PLS,  this 
becomes: 

R(x  z)  if  R(x  y)  and  R(y  z) 

Given the R-facts R(a b) and R(b c), it
seems reasonable to query the system
with is(R(a c))? PROLOG promptly
responds with "no more space."
Stack  overflow  has  occurred. 

This  is  not  apt  to  impress  the  logi¬ 
cian,  or  for  that  matter  the  layman, 
with  the  deductive  power  of  PROLOG. 
At  this  point  the  dismayed  logician 
might  enter  the  PROLOG  equivalent  of 
{p <-> q, p} and query the system
for q. The PROLOG equivalent is:

p if q
q if p
p

And  the  query  is  is(q).  “No  more 
space”  is  again  the  reply. 

At  this  point  the  logician  might 
wonder  if  the  inference  engine  is  out 
of  gas.  In  fact  the  system  is  in  a  goal- 
reduction  loop.  It  reasons  thus:  To  get 
q,  first  get  p;  to  get  p,  first  get  q;  to  get 
q,  first  get  p;  and  so  forth.  It  never  gets 
beyond the first two rules. In fact,
{p if p, p} along with the query for p will
put  the  system  into  an  infinite  loop. 
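
The goal-reduction loop described above can be reproduced with a very small prover. The sketch below is not PLS or Borland PROLOG; it is a minimal depth-first goal reducer for propositional clauses, written for illustration. The class `NoMoreSpace`, the functions `prove` and `ask`, the depth limit of 50 (standing in for a finite stack), and the reordered clause list `FACT_FIRST` are all assumptions of this sketch.

```python
# Depth-first goal reduction over propositional clauses, in the style the
# article describes. A clause is (head, [subgoals]); a fact has no subgoals.
# The depth limit stands in for a finite stack; its value is arbitrary.


class NoMoreSpace(Exception):
    """Raised when goal reduction nests deeper than the allowed limit."""


def prove(goal, program, depth=0, limit=50):
    if depth > limit:
        raise NoMoreSpace(goal)
    for head, body in program:            # clauses are tried in textual order
        if head == goal and all(prove(g, program, depth + 1, limit) for g in body):
            return True
    return False


# The article's postulate set {p <-> q, p}, in the order given.
AS_PRINTED = [("p", ["q"]), ("q", ["p"]), ("p", [])]

# The same clauses with the fact listed first (a reordering chosen for this sketch).
FACT_FIRST = [("p", []), ("p", ["q"]), ("q", ["p"])]


def ask(goal, program):
    try:
        return "yes" if prove(goal, program) else "no"
    except NoMoreSpace:
        return "no more space"


if __name__ == "__main__":
    print(ask("q", AS_PRINTED))   # reduces q to p to q ... until the limit is hit
    print(ask("q", FACT_FIRST))   # reaches the fact p before either rule for p
```

In this sketch the outcome depends on clause order alone: the two lists contain the same clauses, yet only the second lets the search reach the fact before the rules send it around the loop.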

This  might  seem  a  simple  problem 
to  solve,  which  of  course  it  is  for  spe¬ 
cific  cases.  In  general,  however,  it 
can  be  shown  to  be  unsolvable.  A 
loop-detection  procedure  for  a  goal- 
reduction  theorem  prover  is  equiva¬ 
lent  to  the  halting  problem  shown  by 
Alan  Turing  in  the  30s  to  be  unsolv¬ 
able.  Essentially,  a  loop-detection 
monitor  requires  more  logic  than  the 
goal  reduction  it  monitors. 

To  make  matters  worse,  the  “no 
space  left”  response  can  be  triggered 
without  setting  up  a  goal-reduction 
loop: {H(x y) if H(y x), H(a a), H(b b)}
along with the query is(H(a b)) causes
stack overflow. In Borland PROLOG,
{H(x y) if H(y x), H(a b)} along with the
query  for  H(b  a)  does  the  trick.  If  you 
cannot  enter  the  postulates  declaring 
the  commutativity  of  a  relation,  then 
you  might  question  whether  PROLOG 
belongs  in  the  remedial  logic  class  for 
absolute  dummies. 

Examples  such  as  the  foregoing  are 
legion,  but  they  merely  demonstrate 
the  radical  incompleteness  of  PRO¬ 
LOG.  What  is  even  more  insidious  is 
PROLOG'S  inconsistency.  The  treach¬ 
erous  dimension  to  PROLOG  is  not  so 
much  what  it  can’t  do  as  what  it  can. 
Consider  the  following  deduction, 
based  on  the  deduction  rule  known 
to  logicians  as  mirabile  dictu: 

p  if  not  q 

q  /  Hence  not  p 

Querying this little postulate set with
is(not p) yields the astounding answer
"yes." Not only that, the system
will  make  further  deductions  on  the 
basis  of  this  fallacious  deduction. 
Adding  the  postulate  r  if  not  p  and 
querying  the  system  for  r  yields  the 
answer  "yes.”  Again,  such  examples 
are  legion. 
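
The behavior described above is what results when "not" is implemented as failure to prove. The sketch below is a minimal illustration and not the code of any actual PROLOG system: the function `prove`, the tuple form `("not", g)`, and the truth-table check `not_p_follows_classically` are assumptions of this sketch. It reads "p if not q" classically as "not q implies p."

```python
from itertools import product


def prove(goal, program):
    """Depth-first proof search in which ("not", g) succeeds when g cannot be proved."""
    if isinstance(goal, tuple):           # a negated goal, written ("not", g)
        _, inner = goal
        return not prove(inner, program)
    return any(
        head == goal and all(prove(g, program) for g in body)
        for head, body in program
    )


# The article's postulate set: "p if not q" and the fact q.
POSTULATES = [("p", [("not", "q")]), ("q", [])]


def not_p_follows_classically():
    """True only if every truth assignment satisfying the premises makes p false."""
    for p, q in product([False, True], repeat=2):
        premises_hold = (q or p) and q    # (not q -> p) is equivalent to (q or p)
        if premises_hold and p:
            return False                  # countermodel: premises true, yet p is true
    return True


if __name__ == "__main__":
    print(prove(("not", "p"), POSTULATES))            # the prover answers yes
    print(not_p_follows_classically())                # classical logic does not agree
    extended = POSTULATES + [("r", [("not", "p")])]
    print(prove("r", extended))                       # a further step built on it
```

The truth-table check finds the assignment in which p and q are both true, so "not p" is not a classical consequence of the postulates even though the failure-based prover reports it.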

Borland  refers  to  PROLOG'S  treat¬ 
ment  of  negation  as  "novel.”  The  justi¬ 
fication  for  this  novel  treatment  of  ne¬ 
gation  is,  according  to  Borland,  the 
hidden  assumption  of  the  law  of  the 
excluded  middle  automatically  made 
by  PROLOG.  Thus  for  the  one-place 
predicate  h,  PROLOG  automatically  as¬ 
sumes  h(y)  or  not  h(y).  The  latter  is 
(continued  on  page  138) 




ARTICLES 


680XX Computers: Where Are They Going?


Since  the  beginning  of 
the  microprocessor 
industry,  two  groups 
of  programmers  have 
emerged — a  result  of  the  di¬ 
vergence  between  those 
who  liked  the  8080  and  those 
who  liked  the  6502.  The  8080 
had  dedicated  I/O  instruc¬ 
tions  that  used  a  separate  address  space,  and  it  had  several 
fairly  specialized  internal  registers.  Its  instruction  set  was 
designed  to  be  powerful  and  specific.  To  learn  it  pro¬ 
grammers  had  to  memorize  its  specialized  instructions 
and  address  modes.  Conversely,  the  6502's  instructions 
were  more  general  purpose  but  less  powerful.  The  in¬ 
structions  were  easier  to  learn,  but  it  took  more  of  them  to 
accomplish  the  same  tasks.  Its  I/O  was  memory-mapped, 
making  input  and  output  identical  (from  the  processor’s 
point  of  view)  to  normal  memory  addressing.  Memory- 
mapped  I/O  also  reduced  the  number  of  specialized  in¬ 
structions  that  had  to  be  learned.  The  two  camps  began  to 
diverge  as  early  as  1974,  and  by  the  early  80s,  those  pro¬ 
grammers  who  were  most  fervent  acquired  the  nick¬ 
names  "eighters”  and  "sixers.” 
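
The difference between the two I/O styles can be sketched with two toy machines. This is a schematic illustration and not an emulation of either processor: the class names, the port number, and the device address are arbitrary values invented for the sketch.

```python
# Two toy machines that write one byte to a console device. The port number
# and the device address are arbitrary values chosen for this sketch.

CONSOLE_PORT = 0x01        # hypothetical I/O port (separate address space)
CONSOLE_ADDRESS = 0xD012   # hypothetical memory address claimed by the device


class Console:
    def __init__(self):
        self.received = []

    def write(self, value):
        self.received.append(value)


class PortMappedMachine:
    """8080 style: memory and I/O ports are separate spaces with separate instructions."""

    def __init__(self, console):
        self.memory = bytearray(0x10000)
        self.ports = {CONSOLE_PORT: console}

    def store(self, address, value):      # ordinary memory write
        self.memory[address] = value

    def out(self, port, value):           # dedicated I/O instruction
        self.ports[port].write(value)


class MemoryMappedMachine:
    """6502 style: one address space; a store to a device address reaches the device."""

    def __init__(self, console):
        self.memory = bytearray(0x10000)
        self.devices = {CONSOLE_ADDRESS: console}

    def store(self, address, value):      # the only write operation there is
        device = self.devices.get(address)
        if device is not None:
            device.write(value)
        else:
            self.memory[address] = value


if __name__ == "__main__":
    a, b = Console(), Console()
    PortMappedMachine(a).out(CONSOLE_PORT, 0x41)
    MemoryMappedMachine(b).store(CONSOLE_ADDRESS, 0x41)
    print(a.received, b.received)
```

The memory-mapped machine needs no `out` operation at all, which is the sense in which memory-mapped I/O reduces the number of specialized instructions to learn.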

When  the  Z80  was  introduced,  the  eighters  suddenly 
had  a  much  more  powerful  tool.  The  sixers  had  the  6800, 
and  then  the  6809,  but  there  weren't  many  viable  ma¬ 
chines  that  used  those  chips.  Ardent  sixers  had  to  be  con¬ 
tent  with  their  Apple  IIs  and  Commodore  Pets  and  64s. 
Certainly,  some  really  magnificent  work  came  out  on  the 
Apple  II  during  the  early  80s,  showing  that  the  Volks¬ 
wagen  of  personal  computers  was  a  lot  more  powerful 
than had been thought. Toward the end of this period,
some manufacturers tried to get the best of both worlds:
for example, the OSI Challenger III, a strange machine
with a Z80, a 6502, and a 6800 all in the same box.

Nick Turner, 501 Galveston Dr., Redwood City, CA 94063.
Nick is a DDJ editor.

For  several  years,  it 
seemed  likely  that  the  Intel 
line  of  CPU  chips  would 
eventually  take  over  the 
market.  Many  devoted 
sixers  shuddered  when  the  IBM  PC  turned  out  to  have  an 
eighter  chip  as  its  heart. 

It  wasn’t  until  the  68000  appeared  that  there  was  a  real¬ 
ly  powerful  sixer  machine.  What  led  to  the  development 
of  the  68000?  Motorola’s  engineers  perceived  the  need  for 
a  much  more  general  instruction  set  so  that  the  chip  de¬ 
sign  could  be  cleaner  and  easier  to  mask.  They  wanted  a 
CPU  that  would  be  upward  compatible  with  future,  more 
powerful  chips,  without  the  need  for  expensive  "modes” 
that  hamper  functionality  and  take  up  space  on  the  chip. 
For  the  most  part,  they  seem  to  have  succeeded.  Though 
the  68000  does  have  some  disadvantages  and  departures 
from  a  truly  general-purpose  design  (such  as  the  inability 
to  store  data  using  PC-relative  addressing),  it's  a  far  cry 
from  the  restrictive  modes  and  specializations  of  the 
high-end  eighter  chips.  Motorola  took  an  enormous  gam¬ 
ble  in  introducing  what  was  projected  to  be  a  whole  se¬ 
ries  of  CPUs  with  a  completely  new  instruction  set.  Low- 
level  programs  had  to  be  rewritten  for  the  68000.  There 
was  (and  still  is)  a  lot  of  momentum  in  the  Intel  line  of 
chips.  Has  Motorola's  gamble  paid  off? 

Today's 680xx Line

To  me  the  most  valuable  feature  of  the  68000  line,  aside 
from  the  philosophy  behind  it,  is  the  enormous  range  of 
speed  and  power  available.  With  few  or  no  changes  to 
your software, you can move up from the 68008, which is
roughly comparable to a fast Z80 in power, all the way to
a 20-MHz 68020, which easily outperforms a small VAX.

Further,  the  line  includes  a  wide  and  constantly  grow¬ 
ing  selection  of  support  chips.  Most  of  the  support  chips 
are  currently  made  by  Motorola,  but  a  growing  number 
are  being  designed  by  other  companies  expressly  for  the 
68000  line  (an  indication  of  the  health  of  the  68000  stand¬ 
ard).  The  most  important  support  chips  are  the  68851 
MMU  and  the  68881  FPU,  both  of  which  are  true  coproces¬ 
sors  when  used  with  the  68020  (or  later)  CPU. 

Perhaps  as  a  result  of  the  ease  of  designing  with  the 
68000,  a  new  flock  of  68000-based  computers  has  ap¬ 
peared  in  the  last  three  years.  Table  1,  page  18,  shows  a 
sampling  of  the  range  of  power  and  speed  in  the  line.  Of 
course,  there  are  the  relatively  low-price  personal  sys¬ 
tems,  most  of  which  have  graphics-oriented  interfaces. 
But  you  also  have  high-end  workstations,  such  as  the  Sun 
systems  and  the  Apollo  Domain  network,  and  a  host  of 
other  high-end  systems,  including  some  powerful  image 
processing  equipment.  The  VME  bus,  originally  designed 
around  the  68000  line,  has  rapidly  become  a  worldwide 
standard  for  rugged,  powerful,  modular  computer  sys¬ 
tems.  It's  supported  by  an  international  coalition  of  com¬ 
panies,  the  VME  International  Trade  Association  (VITA), 
and  is  one  of  the  most  carefully  defined  system  specifica¬ 
tions  I’ve  encountered. 

Some  of  the  most  interesting  systems  are  the  most  re¬ 
cent.  For  example,  the  Mustang-020,  a  68020-based  sys¬ 
tem,  has  been  available  since  last  spring.  It’s  amazingly 
powerful  for  its  price,  and  it  shows  great  promise  as  a 
general-purpose  industrial  machine.  One  of  its  chief  as¬ 
sets  is  its  size — in  a  box  no  larger  than  an  IBM  PC,  it  pro¬ 
vides  more  power  than  do  many  minicomputers.  Anoth¬ 
er  system  to  watch  is  the  Quantum  QL,  which  comes  from 
Sinclair  in  England.  According  to  some  preliminary  liter¬ 
ature,  the  entire  computer  is  contained  within  a  box  the 
size  of  an  IBM  PC  keyboard,  and  it  runs  a  multitasking 
operating  system  called  QDOS. 


Software  Issues 

Because of its generalized and relatively
simple design, the 68000 works
well with multitasking operating systems;
for example, Unix runs well on the 68020.
Another interesting operating system that has recently been
increasing  in  popularity  is  OS-9.  Originally  designed  for 
the  6809,  OS-9  is  fast,  small,  and  most  important,  highly 
accessible.  (See  Brian  Capouch’s  article  on  OS-9  on  page  30.) 

The  68000  line  also  lends  itself  well  to  graphics-oriented 
interfaces  because  of  its  efficient  handling  of  large  memo¬ 
ry  spaces  and  its  ability  to  easily  accommodate  large 
memory-mapped I/O spaces. The Macintosh operating
system,  for  example,  has  set  a  new  user  interface  stand¬ 
ard  for  personal  computers.  Although  the  Mac  OS  (espe¬ 
cially  in  its  original  incarnation)  contained  some  major 
flaws,  particularly  in  the  area  of  disk  organization,  it  has 
been  much  imitated,  and  most  of  the  flaws  have  been 
corrected  with  the  release  of  Apple’s  Hierarchical  File 
System.  The  Amiga  and  Atari  ST  both  have  similar  "desk¬ 
top”  screens,  icons,  and  mice. 

In  the  area  of  languages,  the  most  prominent  has  been 
C.  Table  2,  page  18,  illustrates  some  typical  execution 
times  of  programs  compiled  in  C  on  68000-based  ma¬ 
chines.  The  benchmarks  in  the  table  were  chosen  to  illus¬ 
trate  a  variety  of  operations  roughly  representative  of 
actual programs. C is in some ways perfectly matched to
the 68000 line: the regularity and generality of the instruction
set make a language such as C almost a natural.
Because  of  its  relatively  low-level  nature,  C  translates 
readily  into  well-defined  groups  of  machine  instructions. 


The  Future 

The  68000  line  is  the  most  serious  competition  for  the  Intel 
line,  and  the  machines  that  use  68000s  are  varied  and 
colorful.  But  will  the  68000  line  continue  to  grow  and 
diversify? Will the industry support new members by
creating products that incorporate them? I think so, for
several  reasons. 

First, Motorola seems to have latched firmly onto the idea
that  object-code  compatibility  is  a  must  in  the  68000  line. 
So  far,  each  of  the  new  CPUs  has  been  almost  completely 
compatible  with  its  predecessors.  The  only  exceptions 
come  when  you're  writing  time-slicing  OS  code  or  you 
have  to  deal  with  interrupts.  Typically,  that  sort  of  code  is 
fairly  easy  to  upgrade,  and  the  actual  applications  (if 
they're  written  intelligently)  seldom  require  any 
changes. 

Second,  you  can  expect  continued  growth  and  diversity 
in  the  68000  line.  Some  preliminary  information  on  the 
68030  processor  has  just  landed  on  my  desk.  It  will  have 
not  only  the  instruction  cache  of  the  68020  but  also  a  data 
cache,  a  memory  manager,  and  much  more.  The  result 
will  be  a  CPU  that  has  the  68020’s  speed  in  tight  loops 
(because  of  the  instruction  cache)  and  that  also  can  at 
times  execute  completely  on-chip  for  extended  periods 
(because  frequently  accessed  data  will  be  in  the  cache). 
The  rumored  68040  may  use  a  64-  or  128-bit  data  bus  for 
reads  (although  it  will  still  use  a  32-bit  bus  for  writes).  It 
may contain an even larger RAM cache as well, plus a
more powerful memory manager. In the peripherals department,
the 68882 will be a more powerful, pin-compatible
version of the 68881 floating-point coprocessor
that,  because  of  internal  multiprocessing,  will  significant¬ 
ly  outstrip  the  performance  of  its  predecessor. 

Finally,  there  are  rumors  of  many  other  support  chips 
floating  around,  along  with  some  really  fun  rumors  about 
68100  or  68200  CPU  chips.  For  example,  how  about  a  chip 
containing  four  68030  equivalents,  all  executing  simulta¬ 
neously  from  dual,  dynamically  allocated  16K  instruction 
and  data  caches?  Of  course,  that's  nothing  more  than  a 
nasty  rumor.  There’s  probably  no  truth  to  it  whatsoever. 

A  Dream  Machine? 

Will  future  developments  live  up  to  the  expectations  of 
software  developers?  So  far  the  68000  line  has  done  well  in 
a  market  that  might  otherwise  have  been  completely  dom¬ 
inated  by  Intel.  The  80386  is  strong  competition,  but  the 
68030  sounds  like  a  programmer's  dream.  Unless  some 
company  takes  over  the  sixer  marketplace  within  the  next 
few  years,  it  looks  like  the  68000  and  the  more  powerful 
related  chips  will  prevail  as  the  premier  sixer  CPUs. 

DDJ 

Vote for your favorite feature/article.

Circle  Reader  Service  No.  2. 


Type          Maker       Model               Base prices      Memory (Mbytes)  CPU        Clock (MHz)  OS              Open design  Intro date

Portable      Sinclair    QL                                   .5               68008      8?           QDOS            No           1986

Macalikes     Atari       ST 520              $800             .5-1             68000      8            TOS             No           7/85
              Commodore   AMIGA 1000          $1,000-$3,000    .25-4            68000      7.16         AmigaDOS        Yes          10/85
              Apple       Macintosh Enhanced  $1,700           .25-4            68000      7.8          Mac OS          No           4/86
                          Macintosh Plus      $2,200           1-4              68000      7.8          Mac OS          No           1/86
                          "Jonathon"          ?                2-?              68020?     16?          Mac OS? Unix?   Yes          3/87?

Workhorses    Data-Comp   Mustang-020         $4,000           2                68020      12-16        OS-9            No           3/86
              Various     VME bus                              .5-16            (various)  8-20         (various)       Yes

Workstations  Apollo      Domain              $9,900-$70,000   2-16             68020      12-20        Unix            Yes          2/86
              Sun         Sun II              $19,900          1-4              68010      10           Unix 4.2        Yes          11/83
                          Sun III             $19,900          4-16             68020      16.7         Unix 4.2        Yes          9/85

Table 1: Comparison of representative 680xx systems


                             Benchmarks:
System     Compiler        Looptst (500)  Pointer (1500)  Fibtest (18)  Sieve (140)

Amiga      Lattice         44.3           37.5            48.8          81.7
           Manx Aztec 16   29.4           33.3            35.1          64.3
           Manx Aztec 32   38.4           35.1            40.1          80.7

Atari ST   Alcyon          25.6           28.4            31.2          56.5
           Lattice         30.7           30.7            35.0          62.3
           Mark Williams   30.7           30.0            37.2          60.1
           Megamax         25.6           35.3            33.3          53.9

Macintosh  Lightspeed      37.4           42.3            45.4          78.9

IBM PC     Microsoft C     25.8           30.9            37.1          60.1

Table 2: Representative C compiler benchmarks (in seconds)




ARTICLES 


A Mini Forth for the 68000

by  G.  Yates  Fletcher 


My exposure to Forth has been limited to the proselytizing of a few
Forth fanatics, articles in various publications, and a fairly careful
reading of Leo Brodie's excellent book Starting Forth (Englewood Cliffs,
N.J.: Prentice-Hall, 1981). One of the attractions of the language for
me is its fundamental simplicity. The interpreter, threader, and kernel
of basic supporting words are small enough that (at least in theory) a
single programmer can create a working system with a moderate amount of
effort. I often toyed with the notion of doing it myself just for the
learning experience (read that fun), but I never got to the point of
making a serious attempt until last fall, when I was assigned to teach
for the umpteenth time our sophomore-level course in computer
organization and assembly language. I was casting about for some way to
generate enthusiasm, which although not a prerequisite for teaching
seems to make the process more enjoyable for all concerned. It struck me
that building a Forth system might make good programming exercises for
my students. Thus was FLINT born.

I thought, what the heck, I'll write a "no frills" Forth and give the
students the executable code (so they can play with it and see how it's
supposed to work) and the source code minus the modules they will be
required to write. By the time I've taught them enough about
assembly-language programming to do the job and enough about Forth that
they understand what is required, the semester should be winding down.
Amazingly enough, everything went pretty much according to plan (albeit
with considerably more sweat than I anticipated). As you might have
guessed by now, the instructor probably learned a lot more than the
students, but that's one of the reasons why I took the job.

G. Yates Fletcher, Dept. of Computer Science, North Carolina State
University, P.O. Box 8206, Raleigh, NC 27695-8206. Mr. Fletcher is an
assistant professor of computer science.

The product of this labor is not a standard Forth. As the venture was
educational/recreational and not commercial, I never bothered to find
out what the standards were or even look at the code for a real Forth.
Thus I have chosen the acronym FLINT, for Forth-like interpreter and
threader, to describe the system. FLINT was basically reverse-engineered
from Brodie's description of the language, so the implementation is
probably a blend of novelty and naivete. Nevertheless, I feel that it is
more than a toy, as I have used it to write a turtle-graphics program
for my terminal; a full-screen editor (for Forth screens, of course);
and a version of a standard prime-number sieve benchmark (Byte, January
1983) that runs in a little more than 20 seconds on an 8-MHz 68000,
making it faster than any of the microcomputer Forths listed there.

I can conceive several levels at which this program may be useful. For
those not familiar with Forth, it might serve as an introduction. It is
no substitute for a good book such as Brodie's, but it could make a
worthwhile companion. Those approaching Forth from the outside as
another language are often put off or puzzled by its many
idiosyncrasies. The inside approach to Forth as a program gives a
complementary perspective that resolves many of these mysteries and
makes the language much more palatable. For do-it-yourselfers this
program is evidence that novices can indeed produce something workable.
They can read enough of the description and documentation to get a good
idea of what needs to be done and blast off on their own, referring to
the code perhaps when stuck or merely to confirm that their own way of
doing things is better. Forth programmers who don't have a version
running on their own 68000 machine might be able to revise and polish it
up enough so that it resembles some usable standard. Forth cognoscenti
might be interested to know that FLINT produces what I have since
learned is subroutine threaded code (which seems to be particularly
appropriate for the 68000) as opposed to the more usual indirect
threaded code.

Overall  Structure 

FLINT was written with several goals in mind: to provide students with
an example of a well-structured, real-world, assembly-language program;
to illustrate the utility and power of the 68000 instruction set and
addressing modes; and to test the theory that Forth is more naturally
understood as a program rather than as a language.

I've made a fairly serious attempt to structure and comment the code
properly. Because everything is in there somewhere, much can be learned
(at least in theory) by reading the program carefully. In fact, several
of my students have intimated (very discreetly) that they found the code
much less confusing than my explanations of it. The 68000 assembly code
comprises a minimal kernel of less than 500 lines and is arranged as
shown in Table 1, below.

FLINT requires some basic BIOS support from the host in the form of
macros that users must tailor for their own machines and resident system
software. My system is a Sage II (now Stride), which has two floppy
drives, 512K RAM, and 64K PROM and runs on an 8-MHz 68000. The PROM
contains all the necessary BIOS support as well as a monitor/debugger
that furnished an excellent environment for developing and running
FLINT. Macro definitions for my system are in Listing One, page 52.

Interpreter  and  Dictionary 

The token is the basic unit processed by the interpreter. Tokens are not
obtained directly from the terminal but are taken from an 80-character
line buffer. After prompting the user, routine LINE (Listing Two, page
52) fills the line buffer from the terminal and terminates upon
receiving a carriage return character. TOKEN takes its input from the
buffer and recognizes the blank as a token delimiter. The carriage
return embedded in the buffer (the one that terminated LINE) is seen as
a legal token whose execution returns control to LINE. In support of its
buffer-filling function, LINE recognizes and handles backspaces.
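The division of labor between LINE and TOKEN can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not a transcription of Listing Two; the function names `fill_line` and `tokens`, and the decision to ignore input past the end of the buffer, are assumptions.

```python
BACKSPACE = "\b"
CR = "\r"
LINE_SIZE = 80  # the 80-character line buffer described in the text


def fill_line(keys):
    """Model of LINE: collect keystrokes until a carriage return arrives."""
    buf = []
    for ch in keys:
        if ch == BACKSPACE:
            if buf:
                buf.pop()              # erase the previous character
        elif ch == CR:
            buf.append(CR)             # the CR stays in the buffer as a token
            break
        elif len(buf) < LINE_SIZE - 1:
            buf.append(ch)             # keep one slot free for the final CR
    return "".join(buf)


def tokens(line):
    """Model of TOKEN: blank-delimited tokens, with CR as a token of its own."""
    spaced = line.replace(CR, " " + CR + " ")
    return [word for word in spaced.split(" ") if word]
```

For example, typing `12 DUPP`, a backspace, then ` +` and a carriage return leaves `12 DUP +` plus the carriage return in the buffer, and the tokenizer then yields `12`, `DUP`, `+`, and the carriage return as the final token.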

The interpreter is composed of several routines whose operation forms an
instruction cycle fed by the user interactively. The instructions are
small modules of code called words whose actions are normally directed
at data (or pointers to the data) residing on a parameter stack. The
words are identified by short alphanumeric strings (tokens) that are
symbols or English words that usually describe or are related to the
action they perform. The words are arranged in a linked list in which
each node contains the identifying token, a link pointer, and the code
defining the word's action. This linked list is called the dictionary.

A dictionary entry starts with the identifier, which is a 4-byte value.
It consists of the length and the first three characters of the name of
the word being identified. For example, if the word were EXECUTE, the
identifier would contain a 7 and then the characters EXE.

The next field is the link pointer, which is a 2-byte field containing
the address of the previous word's identifier. Thus a dictionary search
always starts with the most recently defined word and works backward.
For larger systems you might want to make the link pointer a 4-byte
value or perhaps make it a relative value instead of an absolute
address.

After  the  link  pointer  is  the  code 
field,  which  contains  the  machine 
code  that  runs  when  the  word  is 
executed. 
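The entry layout just described (4-byte identifier, 2-byte link, then code) can be modeled with a byte array. The sketch below is a Python illustration rather than FLINT's own code; padding short names with blanks and using address 0 as the end-of-list marker are both assumptions.

```python
import struct


def identifier(name):
    """4-byte identifier: a length byte plus the first three characters."""
    head = name[:3].ljust(3).encode("ascii")   # blank padding is an assumption
    return bytes([len(name)]) + head


def add_entry(mem, latest, name, code):
    """Append identifier, 2-byte link, and code; return the entry's address."""
    addr = len(mem)
    mem += identifier(name)
    mem += struct.pack(">H", latest)           # big-endian, as on the 68000
    mem += code
    return addr


def search(mem, latest, name):
    """Follow the links from the newest entry back; return the code address."""
    want = identifier(name)
    addr = latest
    while addr != 0:                           # 0 as end marker is an assumption
        if mem[addr:addr + 4] == want:
            return addr + 6                    # code follows the 6-byte header
        (addr,) = struct.unpack_from(">H", mem, addr + 4)
    return None
```

One consequence of the format is visible here: two names with the same length and the same first three characters (EXECUTE and a hypothetical EXECUTO, say) produce identical identifiers, so the search cannot tell them apart.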

The outer interpreter is a loop that waits for and accepts input tokens,
searches the dictionary until a match is found, and extracts the address
of the corresponding code. The address can be sent to the inner
interpreter for direct execution. If a token is not in the dictionary,
it is assumed to represent a number in the current base, and its value
is extracted and placed on the stack. If this attempt fails, an error is
assumed and the WHAZZAT token is invoked.
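The decision sequence of the outer interpreter (dictionary first, then number, then WHAZZAT) might be sketched as follows. This Python model uses a plain dict in place of the linked dictionary and raises an exception where FLINT would invoke WHAZZAT; both choices are assumptions made for brevity.

```python
def whazzat(token):
    """Stand-in for WHAZZAT; raising an exception is an assumption."""
    raise ValueError("WHAZZAT: " + token)


def interpret(line, dictionary, stack, base=10):
    """Process each blank-delimited token of a line in execute mode."""
    for token in line.split():
        code = dictionary.get(token)
        if code is not None:
            code(stack)                        # found: execute the word
            continue
        try:
            stack.append(int(token, base))     # not found: try it as a number
        except ValueError:
            whazzat(token)                     # neither a word nor a number
    return stack
```

With a dictionary containing only `+`, the line `2 3 +` leaves 5 on the stack, and `FF` is accepted as 255 when the base is 16.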

Words can be defined as well as executed. The interpreter has an
alternate compile mode (the standard mode is execute mode). The
interpreter's primary task remains that of extracting the code address
of an input token. When in execute mode, a JSR to this address is
executed as before. But when the interpreter is in compile mode, a JSR
to the code address is written into the dictionary. Execution is
deferred until the word being defined is invoked in execute mode. Any
number encountered in compile mode is handled by generating code in the
dictionary that will push the value onto the stack at execution time.
These compiled numbers are called literals. Additional support is
provided for a submode in which words can be defined directly in machine
code.
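The difference between the two modes can be seen in a toy model in which a Python callable stands in for a code address and a list of callables stands in for a sequence of compiled JSR instructions. The class and method names here are hypothetical, and the explicit `begin` and `end` methods take the place of the words that open and close a definition.

```python
class Flint:
    """Toy model of execute mode versus compile mode."""

    def __init__(self):
        self.stack = []
        self.words = {}          # name -> callable (stands in for a code address)
        self.compiling = None    # list of compiled steps while a definition is open
        self.name = None

    def begin(self, name):
        self.name, self.compiling = name, []

    def end(self):
        steps, self.compiling = self.compiling, None

        def run():
            for step in steps:   # each step models one compiled JSR
                step()

        self.words[self.name] = run

    def handle(self, token):
        code = self.words.get(token)
        if code is not None:
            if self.compiling is None:
                code()                           # execute mode: call it now
            else:
                self.compiling.append(code)      # compile mode: record the call
            return
        value = int(token)
        if self.compiling is None:
            self.stack.append(value)
        else:                                    # a literal: push at run time
            self.compiling.append(lambda v=value: self.stack.append(v))
```

Feeding `3 4 +` to `handle` between `begin("SEVEN")` and `end()` leaves the stack untouched; only a later `handle("SEVEN")` pushes 7, which is the deferred execution the text describes.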

How  FLINT  Works 

For a concrete example of how FLINT works, let's look at the activity
that occurs as the first word in the inner shell, CONSTANT, is defined.
Envision using this word interactively as follows:

10  CONSTANT  TEN 

or 

12  CONSTANT  DOZEN 

CONSTANT will thus be invoked to define words that are constants. If,
for example, you invoke the word TEN, you will expect it to push the
number 10 onto the stack; if you invoke DOZEN, you expect 12 to be
pushed; and so on. Let's call TEN and DOZEN instances of CONSTANT. Thus
when CONSTANT is invoked, it will take the value of the instance from
the stack and the name of the instance from the input stream.

Now look at the definition of CONSTANT to see how this activity is to be
directed. The defining string is

: CONSTANT TOKEN HEADER LITERAL CODE 3AFC 4E75 ;

The outer interpreter, operating in execute mode, picks up the token
":", finds a match in the dictionary, and proceeds to execute it.
Examining the code for ":", you see that a subroutine call (JSR) to
TOKEN will be executed. The action of TOKEN, of course, is to pull in
the next token from the input stream, and in this case it will be the
token CONSTANT. The next JSR is to HEADER, whose action is to produce a
dictionary header for the newly captured token CONSTANT. The remaining
activity initiated by the definition will produce its code field.

Outer interpreter
Line-buffer input routines: PROMPT, LINE
Words supporting execute mode: TOKEN, SEARCH, NUMBER, EXECUTE, WHAZZAT
Words supporting compile mode: COMPILE, HEADER, IMMEDIATE, CODE, ", ", LITERAL
System variables: BASE, CBLOCK, EDBUF, DICT
Words supporting disk I/O: LOAD, GO, SAVE, INTERACTIVE
Words supporting terminal output: TYPE, ".S"
Miscellaneous words: "(", QUIT, LOGOFF

Table 1: Arrangement of 68000 assembly code in kernel

The last activity of ":" is to set the compile flag, FCOLON, and return
to the outer interpreter. The interpreter pulls the next token, TOKEN,
from the input stream and extracts its code address. Because the
interpreter is now in compile mode, a JSR to TOKEN will be written in
the dictionary, that is, into the code field of CONSTANT. This means
that the first activity of CONSTANT, when it is itself executed (called
execution-time behavior), will be (surprise, surprise) to pull in the
name of the instance from the input stream. In like manner a JSR to
HEADER will become the next part of the code for CONSTANT, so now the
execution-time behavior of CONSTANT has been defined up to the point
where a header for the instance has been created. It is now time to
define the execution-time behavior of CONSTANT that will define the
execution-time behavior of its instances. Thus you must direct CONSTANT
to take the value of its instance from the stack and write code in the
dictionary that upon execution will place this value on the stack. This
is an exact description of what LITERAL does, so its inclusion in the
definition solves the problem neatly.

The one activity of CONSTANT that has not yet been coded is closing the
definition of its instance. At this point in the definition process, the
outer interpreter, in compile mode, picks up the token CODE. A careful
examination of the header for CODE shows that its listed token length is
132, which is 128 more than it should be. This is no accident. It is in
fact the manner in which words are tagged as immediate, meaning that
they are to be executed even when the interpreter is in compile mode.
The necessity for such words should be obvious because otherwise, for
instance, there would be no good way to terminate a definition. (When
TOKEN picks up an immediate word, it sets the immediate flag, FIMMED, to
let the interpreter know that the word is to be executed.) CODE sets the
system base to 16 (hex); sets the code flag, FCODE; and returns to the
interpreter.
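The arithmetic behind the 132 is simply 128 + 4: the name CODE has four characters, and 128 is the top bit of a byte. A minimal Python sketch of the test follows; treating the tag as a bit mask (rather than a numeric comparison) is an assumption about how FLINT checks it, and the function names are hypothetical.

```python
IMMEDIATE_BIT = 0x80   # 128, the amount added to CODE's length byte


def split_length_byte(byte):
    """Return (name_length, is_immediate) for a header's length byte."""
    return byte & 0x7F, bool(byte & IMMEDIATE_BIT)


def should_execute(compiling, is_immediate):
    """Immediate words run even when the interpreter is in compile mode."""
    return (not compiling) or is_immediate
```

For CODE, `split_length_byte(132)` gives a length of 4 with the immediate tag set, so `should_execute` is true regardless of mode.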


When the tokens 3AFC and 4E75 are picked up, they are not found in the
dictionary. Thus each one is sent in turn to NUMBER, which extracts its
proper hex value and places it on the stack, from where the interpreter
(now in code mode) copies it into the dictionary. The code 3AFC 4E75
translates as MOVE.W #04E75H,(A5)+. Thus the final activity in the
run-time behavior of CONSTANT is to copy 04E75H into the dictionary
definition of the instance. Because 4E75 is the code for RTS, this
activity will effectively close the definition of the instance. All that
remains is to close the definition of CONSTANT. This is the job of ";",
which resets the system base to 10, clears the compile and code flags,
and writes an RTS into the dictionary.
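The translation of 3AFC can be checked by splitting the opcode word into the bit fields of the 68000 MOVE instruction. The Python helper below is a hypothetical checker, not part of FLINT; it relies only on the standard MOVE encoding (size in bits 13-12, destination register and mode in bits 11-6, source mode and register in bits 5-0).

```python
MOVE_SIZES = {1: "B", 3: "W", 2: "L"}   # size codes used by MOVE


def decode_move(opcode):
    """Split a 68000 MOVE opcode word into its fields."""
    size = MOVE_SIZES[(opcode >> 12) & 0x3]
    dst_reg = (opcode >> 9) & 0x7
    dst_mode = (opcode >> 6) & 0x7
    src_mode = (opcode >> 3) & 0x7
    src_reg = opcode & 0x7
    return size, dst_mode, dst_reg, src_mode, src_reg


# 3AFC: word size, destination mode 3 register 5, source mode 7 register 4
FIELDS = decode_move(0x3AFC)
```

Destination mode 3 is address register indirect with postincrement, so mode 3 with register 5 is (A5)+; source mode 7 with register 4 is immediate data, which is supplied by the following word, 4E75. That is exactly MOVE.W #$4E75,(A5)+.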

Well, there you have it: just a simple little definition. Readers new to
Forth (if there are any who have made it this far) probably feel that
the claims for its simplicity are highly exaggerated, and even those
with some familiarity may find themselves a bit glazed over. The bad
news is that the operation of the FLINT compiler is often a fairly
complicated activity. The good news is that it seldom gets any more
complicated than the example given here. Complexity is after all a
relative thing. Would anyone care to write a step-by-step explanation of
the operation of a Pascal compiler on a segment of code that invokes
most of its major machinery?

For beginners the definition of CONSTANT is as much a puzzle as it is an
example. Understanding it requires clear thinking and a careful reading
of the code involved. This is in fact one of the major reasons for
discussing it. FLINT's interpreter/compiler is put to work at three
levels. At one level the word CONSTANT is being compiled. To understand
the activity at this level, however, you must see through to the level
at which CONSTANT will be executed. This level itself must be understood
in terms of the activity that will occur at the next level, that is,
when a particular instance of the word is executed. The activity at all
of these levels is mediated by the same interpreter. The code defining
its basic structure occupies half a page, and many of the supporting
words (for example, EXECUTE, COMPILE, HEADER, CODE, and LITERAL) are
only two or three instructions long. Finally, if you stand back from the
example a little, you see that an entire data type has been created by a
nine-word statement. I feel that anyone who understands the simplicity
underlying this example has a firm grasp of FLINT and should have no
real trouble mastering Forth.

The  Inner  Shell 

Once the embryonic FLINT system is running, it is ready for a big meal
of nourishing words that will give it more size and power. These inner
shell words support:

•  definition and manipulation of constants, variables, and arrays

•  stack manipulation and stack arithmetic

•  structured control for branching and looping

Listing Three, page 58, contains the inner shell words. Many of these
words are defined directly in machine code (with the assembler mnemonics
given as comments), so you might ask why they are not placed in the
kernel and assembled as part of the basic system. My reasons for not
doing so are primarily pedagogical. I wanted to maintain as much as
possible the purity of the kernel to underscore its elegance and
simplicity. In addition, the words, most of which are very short, seem
to me to be more readable in Forth format than they are in
assembly-language format. This representation, even allowing for
comments, is much more compact because the work of building the headers
is left to the system.

Structured control in Forth is directed by immediate words, placed in
colon definitions to effect the compilation of appropriate conditional
branching instructions. The flow of control is conditioned by values on
the stack that act as flags. A zero value means false, and any nonzero
value means true. As an example let's define the word ODD, which takes a
number from the stack and prints an asterisk if it is odd:

:  ODD  2  MOD  IF  42  EMIT  THEN  ; 

The action of IF when ODD is executed is to pop a value (assumed by IF
to be a flag) from the stack and test its value. If the value is false,
execution continues at the word following THEN in the definition;
otherwise, execution continues with the word following IF. In the above
definition, the phrase 2 MOD simply furnishes the proper flag for IF to
"eat." The actions of IF and THEN when ODD is compiled (their
compile-time behavior) must ensure that they have the desired effect
when ODD is executed (execution-time behavior). Thus the definitions of
IF and THEN will define their compile-time behavior, but they must be
understood in terms of their execution-time behavior as well. An
important point to remember here is that these words, as well as other
control words, are not to be invoked directly; they act within
definitions to achieve their effects.

Another  control  word,  ELSE,  can  be 
used  optionally  with  IF  and  THEN.  If, 
for  example,  you  wish  to  redefine 
the  word  ODD  so  that  its  action  is  to 
print  the  value  of  an  odd  number  or 
else  drop  the  number,  you  might  try 

:  ODD  DUP  2  MOD  IF  .  ELSE  \  THEN  ; 

The execution-time behavior of ELSE is, as you might suspect, to
transfer control to the word following THEN. A subtle point here is that
now IF must transfer control to the word following ELSE rather than to
the one following THEN when ELSE is present.
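One common way to get this compile-time behavior is to leave each forward branch unresolved and patch its target when the matching ELSE or THEN arrives. The Python sketch below illustrates that general technique on a toy instruction list; it is not FLINT's implementation, and the operation names (`branch_if_false`, `branch`, `literal`, `call`) are invented for the illustration.

```python
def compile_words(tokens):
    """Compile tokens into [op, arg] cells, resolving IF/ELSE/THEN."""
    code, pending = [], []           # pending: indexes of unresolved branches
    for tok in tokens:
        if tok == "IF":
            pending.append(len(code))
            code.append(["branch_if_false", None])
        elif tok == "ELSE":
            if_at = pending.pop()
            pending.append(len(code))
            code.append(["branch", None])
            code[if_at][1] = len(code)           # IF now lands just after ELSE
        elif tok == "THEN":
            code[pending.pop()][1] = len(code)   # resolve to the word after THEN
        elif tok.isdigit():
            code.append(["literal", int(tok)])
        else:
            code.append(["call", tok])
    return code


def mod(stack, out):
    b, a = stack.pop(), stack.pop()
    stack.append(a % b)


WORDS = {
    "DUP": lambda stack, out: stack.append(stack[-1]),
    "MOD": mod,
    ".": lambda stack, out: out.append(stack.pop()),   # "print" into a list
    "\\": lambda stack, out: stack.pop(),              # drop, as in the example
}


def run(code, stack):
    """Execute compiled cells; return the list of printed values."""
    out, pc = [], 0
    while pc < len(code):
        op, arg = code[pc]
        pc += 1
        if op == "branch_if_false":
            if stack.pop() == 0:
                pc = arg
        elif op == "branch":
            pc = arg
        elif op == "literal":
            stack.append(arg)
        else:
            WORDS[arg](stack, out)
    return out


ODD = compile_words("DUP 2 MOD IF . ELSE \\ THEN".split())
```

Running `ODD` with 7 on the stack prints 7, and with 8 it prints nothing and drops the number. Because unresolved branches are kept on a stack, nested IF structures resolve in the right order without any extra bookkeeping.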

FLINT provides two structured loop control mechanisms: DO ... UNTIL and
DO ... WHILE ... LOOP. The execution-time behavior of UNTIL is to eat a
flag and pass control back to the word following DO if its value is
true. Otherwise the loop completes when control passes to the word
following UNTIL. WHILE, on the other hand, exits to the word following
LOOP if the flag it eats is false. LOOP always passes control to the
word following DO. (Note that FLINT's usage of these words is different
from standard Forth's; they seem to make a little more sense to me this
way.)
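Because these loop words behave differently from their standard Forth namesakes, it may help to restate the semantics in a conventional language. The two Python functions below are a paraphrase of the paragraph above, with the loop bodies passed in as callables that operate on the stack; the function names are hypothetical.

```python
def do_until(body, stack):
    """DO body UNTIL: repeat for as long as the flag left by body is true."""
    while True:
        body(stack)
        if stack.pop() == 0:     # false flag: fall through past UNTIL
            break


def do_while_loop(test, body, stack):
    """DO test WHILE body LOOP: exit when the flag left by test is false."""
    while True:
        test(stack)
        if stack.pop() == 0:     # false flag: continue after LOOP
            break
        body(stack)              # LOOP always jumps back to DO
```

In both forms a zero flag ends the loop; the difference is only whether the test comes at the bottom (UNTIL) or in the middle (WHILE).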

Nowhere, I feel, are the elegance and power of Forth's programming
paradigm more in evidence than in the definition of the lexicon of words
supporting FLINT's structured control. The words ?>, BRA>, BEQ>, and
BNE> form the raw material for the tests and branches, and the words
MARK, SPLIT, and JOIN provide the tools for assembling them. Each of the
control words is then built economically by a single defining word or
phrase, and yet when they are used any of the resulting control
structures can be nested to arbitrary depth within any of the others!




There remains the problem of entering these definitions interactively.
In theory this is no worse than entering them into an assembly-language
code file, except of course that all is lost when the system is turned
off or (as is more likely) crashed by the novice user. My own approach
to the problem was to try to find a way to save an image of the
augmented system. As it turns out, all that is necessary for saving the
state of the system is to preserve the contents of the internal
dictionary pointer, register A5, which points to the next vacancy in the
dictionary. (This happens to be the area used by TOKEN as temporary
storage for the words that it extracts from the line buffer.) The word
LOGOFF accomplishes this task by saving A5 in the system variable
BUFPNT, from which it is loaded each time FLINT runs. For this solution
to work, you must have the means (via a monitor/debugger, for instance)
to save and reload the augmented system in place of the original.
Lacking such a tool, you should probably just bite the bullet and build
the inner shell words into the source in the manner of the kernel words
so that they will be assembled directly in the dictionary. This is
tedious work, but it is not difficult if you understand the dictionary
structure.

Ultimately, of course, it would be nice to dispense with these primitive
methods and be able to create and edit disk-resident files of FLINT
source text. In fact, the words SAVE, LOAD, and GO are included in the
kernel system to support this very activity. The word GO directs the
outer interpreter to take its tokens from a 1K reserved area known as
the disk/edit buffer instead of from the line buffer. After the outer
interpreter has exhausted the contents of this buffer, it will encounter
the token INTERACTIVE, which has been placed in memory just beyond the
buffer area. Execution of this word reinitiates the normal interactive
input sequence and redirects the outer interpreter to take tokens from
the line buffer. LOAD and SAVE allow the contents of this area to be
read from or written to an absolute disk block called the current block,
which is specified as the value of the system variable CBLOCK. A
description of these words raises the obvious question of how usable
text gets into the buffer (or on disk) in the first place.

My own solution was to use the expanded vocabulary of the inner shell to
write a simple editor. Once this editor was built interactively, it
could be used to place the text for the inner shell (as well as itself)
in the edit buffer for eventual storage on disk via SAVE. These blocks
of text are called screens. Careful examination of FLINT's
initialization procedure will show that it loads and executes screen #80
(the initialized value of CBLOCK). This block and the succeeding ones
are where the text for the inner shell and the editor have been placed.
The word -> (defined on screen #80) directs the interpreter to increment
CBLOCK and load and execute the next text screen. It allows an entire
string of screens to be executed without interactive direction. By this
device the inner shell and editor are effectively booted by FLINT
whenever it is invoked. The final result is a fairly mature system that
contains most of the basic features of a true Forth and enough
legitimate tools to be quickly expanded further.
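The interplay of GO, the INTERACTIVE sentinel, and -> can be pictured with a small simulation. In this Python sketch a dict of strings stands in for the disk, the sentinel is appended to each loaded screen to model the token placed just past the buffer, and the function name `run_screens` is hypothetical; none of it is taken from the FLINT source.

```python
def run_screens(disk, first_block, execute):
    """Feed tokens from chained screens until INTERACTIVE is reached."""
    cblock = first_block
    source = disk[cblock].split() + ["INTERACTIVE"]   # sentinel past the buffer
    i = 0
    while True:
        token = source[i]
        i += 1
        if token == "INTERACTIVE":
            return cblock      # the real system would now read the line buffer
        if token == "->":
            cblock += 1        # advance CBLOCK, then load and run the next screen
            source = disk[cblock].split() + ["INTERACTIVE"]
            i = 0
        else:
            execute(token)
```

With screens 80 and 81 each ending in `->` and screen 82 ending without it, the tokens of all three screens are executed in order and control returns to interactive input after screen 82.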

You can characterize the basic outline of FLINT's construction as a
bootstrap procedure, which is really just an accelerated and
miniaturized version of the conventional process whereby low-level tools
are used to build higher-level ones. The assembler is used to build an
interactive interpreter/compiler (the FLINT kernel), which is used to
incorporate a more sophisticated set of tools (the inner shell). These
are then used to build an editor that allows the inner shell, the
editor, and all future extensions to become permanent parts of the
system.

The  Aftermath 

What, you might ask, did the teacher learn from all this? Well, for one
thing, I relearned the value of ignorance. I sailed pretty far into the
project on the momentum of my initial enthusiasm. It lasted, in fact,
until everything was pretty much complete and working. Unfortunately,
the job is not done when the program works. Cleaning up little messes,
pruning, tuning, documenting, testing, and tracking down the subtle bugs
(in short, the real work) was equally time-consuming but much less
rewarding. Determination and good old-fashioned stubbornness had to
finish what enthusiasm had begun. If I had fully realized in advance the
amount of extra work involved, I would probably never have started. The
moral here is that underestimating the magnitude of a task is often a
necessary condition for attempting it. The work is still not finished,
of course. It never is. The stack is in the wrong place, the realization
of the CODE submode is imperfect, and so on, and so on. Any program is
only an approximation of what it should be.

As far as Forth itself is concerned, there are several things to be
said. I have acquired a terrific amount of respect for the ingenuity of
Forth's inventor, Charles H. Moore. Work of genius is a term properly
applied to such an engagingly simple yet immensely practical conception.
I have not yet become, however, one of Forth's true believers.
Programming is not my profession (it's actually more of a hobby), so I
don't have enough real knowledge or experience to make any credible
claims for its superiority or inferiority to other programming
languages. I'll leave that issue to the people who make their money
writing programs. My opinion is that Forth's programming paradigm of
building the language to fit the problem allows (and indeed requires) a
much more flexible and imaginative approach to programming problems than
is called for with many more traditional languages.

The extra degree of freedom actually seems to require more
self-discipline on the part of the programmer to use it wisely, but it
makes problem solving more fun. I find that my programs tend to become
compositions that I judge on aesthetic as well as functional grounds.


Availability 

All the source code for articles in this issue is available on a single
disk. To order, send $14.95 to Dr. Dobb's Journal, 501 Galveston Dr.,
Redwood City, CA 94063 or call (415) 366-3600 ext. 216. Please specify
the issue number and disk format (MS-DOS, Macintosh, Kaypro).

DDJ 

(Listings  begin  on  page  52.) 



ARTICLES 


The OS-9 Operating System

by Brian Capouch

The OS-9 operating system is a modular, multiprogramming, multitasking
operating system (OS) that runs on a large variety of Motorola
microprocessors. Designed to be conceptually similar to Unix, it
provides programmers with a good environment for the same reasons that
Unix does: it uses a hierarchical file structure, allows command-line
invocation of concurrent processes, has a similar implementation of
pipes and filters, and permits device-independent I/O. The differences
are more important than the similarities, though.

Mean and Lean

OS-9 differs from Unix in several significant respects. It is, to turn a
phrase, "meaner and leaner" than Unix. It occupies 12K of address space
in its smallest 6809 incarnations and a little more than 48K in its
full-blown 68020 form. It is more dynamically changeable than Unix,
allowing users to add I/O devices and update system modules without
rebooting a running system. It was also designed specifically to
accommodate ROMed software easily and, for a variety of reasons, has
proven suitable for real-time and process-control applications. These
and other features have assured OS-9 a niche encompassing a wide variety
of markets. This article gives an overview of the design of OS-9,
particularly focusing on those aspects that make it special. I've
included a pair of small application programs that are intended to give
you the flavor of the OS-9 programming environment.

Brian Capouch, R.R. 2 Box 151, Monon, IN 47959-9229. Brian is a
self-taught computer scientist who teaches at St. Joseph's College in
Rensselaer, Ind.

Origins 

OS-9 had its genesis as part of a contract project between its
designer, Microware Systems Corp., and Motorola, during the time that
the hardware design of the 6809 processor was being finalized. The
original concept of the project was to provide a modern, structured
version of the BASIC programming language that would take advantage
of the features of the 6809. The resultant language, called Basic09,
incorporates several features that were at the forefront of language
design; I'll review it briefly with examples later.

As the language development proceeded, Microware, with an eye toward
the developing academic and commercial use of Unix, began developing
an operating environment to complement Basic09's structure and
modularity. Thus OS-9 was born. The language and the operating system
appeared together in 1981, a few months after the 6809 came into
production. As the design for the 68000 was begun shortly thereafter,
Microware began a port of OS-9 for that processor.

As the 68000 and its successors have appeared in a multitude of
applications, OS-9 has been ported into hundreds of designs. OS-9
currently runs on machines ranging from the 6809-based Tandy Color
Computer to the GMX Micro-20, a 68020-based system that provides 2
megabytes of RAM, SASI and floppy-disk interfaces, serial and
parallel I/O ports, and a 68881 math coprocessor on a card that
mounts on a 5¼-inch disk drive. This machine has a performance that
outstrips a variety of machines, including the VAX 11/780.

Named  Modules 

Structurally speaking, perhaps the most significant feature of OS-9
is its extreme level of modularity (see Figure 1, page 31). The
underlying model of the OS-9 address space is a dynamic collection of
named modules. When an executable module comes to life as a process,
it is associated with a separate area of memory that is used to store
its data. All code exists in what are known as memory modules,
whether located in primary or secondary store. A memory module,
generally, is a segment of code sandwiched between a header and a
cyclic redundancy checksum. The header contains information about the
module that is used by the OS both during the time that the module is
resident in memory and during transfers to and from secondary storage
devices. The checksum is used to verify the integrity of the module
during transfers to and from secondary store.
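
The header-code-checksum sandwich can be sketched in a few lines of Python. This is only an illustration of the idea: the three-field header, the `build_module` and `verify_module` names, and the use of `zlib.crc32` are assumptions made for this sketch, not the actual OS-9 module format or CRC algorithm.

```python
import struct
import zlib

SYNC = 0x4AFC          # marker word, used here only to recognize a module
HEADER = ">HIH"        # hypothetical header: sync word, total size, name length


def build_module(name: str, body: bytes) -> bytes:
    """Pack header + name + body, then append a checksum over all of it."""
    name_bytes = name.encode("ascii")
    total = struct.calcsize(HEADER) + len(name_bytes) + len(body) + 4
    image = struct.pack(HEADER, SYNC, total, len(name_bytes)) + name_bytes + body
    return image + struct.pack(">I", zlib.crc32(image))


def verify_module(image: bytes) -> bool:
    """Return True if the sync word, size field, and checksum all agree."""
    if len(image) < struct.calcsize(HEADER) + 4:
        return False
    sync, total, _ = struct.unpack_from(HEADER, image)
    (stored,) = struct.unpack(">I", image[-4:])
    return sync == SYNC and total == len(image) and stored == zlib.crc32(image[:-4])


module = build_module("demo", b"\x4e\x75")
# Flip one bit in the last byte of the body to simulate a bad transfer.
damaged = module[:-5] + bytes([module[-5] ^ 1]) + module[-4:]
```

A loader built this way can reject `damaged` after a faulty transfer while accepting `module`, which is the role the article assigns to the checksum.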

Modules that contain object code must consist of position-independent
code and are shared automatically between various users of the
system. This avoids having copies of the same object code occupying
multiple sections of memory. In such cases, the OS assigns each user
of the module to a different area of memory for data storage.
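
A rough Python model of that sharing, assuming nothing about the real OS-9 data structures: the `Module`, `Process`, and `link_count` names are invented for the sketch. One code object serves every process, and each process gets its own data area.

```python
class Module:
    """One copy of reentrant code, shared by every process that uses it."""

    def __init__(self, name, entry):
        self.name = name
        self.entry = entry      # the code keeps no state of its own
        self.link_count = 0     # how many processes currently use it


class Process:
    """A running instance: shared code plus a private data area."""

    def __init__(self, module, data_size):
        module.link_count += 1
        self.module = module
        self.data = bytearray(data_size)

    def run(self, value):
        return self.module.entry(self.data, value)


def counter_entry(data, value):
    # All state lives in the caller's data area, never in the code.
    data[0] = (data[0] + value) % 256
    return data[0]


counter = Module("counter", counter_entry)
p1 = Process(counter, 16)
p2 = Process(counter, 16)
```

Because `counter_entry` touches only the data area it is handed, `p1` and `p2` can interleave calls without disturbing each other.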




OS-9  OPERATING  SYSTEM 

(continued  from  page  31) 

module  interfaces  to  a  tick  generator 
for  time-slice  timing. 

At the next layer outward from the basic system modules lie the file
manager modules. Each of these is designed to interface the I/O data
stream into and out of the processor and a particular class of
similar devices. Because the conditioning is performed outside the
kernel, all data appears to the CPU as equivalent byte streams. The
module RBF interfaces to random-block-oriented devices, such as
disks. SCF handles character streams that are bound for both parallel
and serial I/O devices, and PIPEMAN coordinates data streams between
concurrently running processes.

Each file manager conditions its data stream and passes it on to a
device driver module. A device driver must exist for each different
type of hardware to which the system is interfaced. A driver, for
hardware such as a disk controller or intelligent interface
processor, may control a large number of devices, and it is this
level of the system that hides hardware dependencies from the lower
levels of the OS.

At the highest level of abstraction are the device descriptor
modules. The system needs one of these for each individual I/O
device. They contain device-dependent parameters, such as port
addresses, disk blocking factors, control characters, and so on.

The OS-9 unified I/O system is dynamically configurable, making it
quite different from Unix. Devices can be added to and removed from
running systems on the fly, and through changing device descriptors,
different devices can be added to the same port and referenced
interchangeably.
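
The layering of file manager, device driver, and device descriptor can be pictured with a small Python sketch. Everything named here (`attach`, `detach`, the device names `/t1` and `/p1`, the port address, the end-of-line parameter) is hypothetical; the sketch shows only the division of labor, not OS-9's real interfaces.

```python
class Driver:
    """Talks to one type of hardware; here it just records bytes per port."""

    def __init__(self):
        self.ports = {}

    def write(self, port, data):
        self.ports.setdefault(port, bytearray()).extend(data)


class CharFileManager:
    """Conditions a character stream before the driver sees it."""

    def write(self, descriptor, text):
        data = text.replace("\n", descriptor["eol"]).encode("ascii")
        descriptor["driver"].write(descriptor["port"], data)


devices = {}  # descriptor table: device name -> device-dependent parameters


def attach(name, manager, driver, port, eol):
    """Add a device to the running system."""
    devices[name] = {"manager": manager, "driver": driver, "port": port, "eol": eol}


def detach(name):
    """Remove a device from the running system."""
    del devices[name]


def write(name, text):
    descriptor = devices[name]
    descriptor["manager"].write(descriptor, text)


serial = Driver()
scf = CharFileManager()
# Two descriptors share one port but carry different parameters.
attach("/t1", scf, serial, 0xE000, "\r")
attach("/p1", scf, serial, 0xE000, "\r\n")
write("/t1", "hi\n")
write("/p1", "hi\n")
```

The application-level `write` never mentions hardware; swapping a descriptor changes behavior on the same port without touching the driver or the manager.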

The 68xxx versions of OS-9 contain a complete math subroutines
library that handles a wide variety of floating-point,
extended-precision integer, and transcendental math functions. This
package is called via the Trap instruction. Programs that use the
library can be used without change on systems that implement hardware
coprocessors by changing the trap handler associated with the call.
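
The benefit of calling the math package through a trap can be sketched as a lookup table of handlers. The table, the `trap` function, and both square-root routines below are stand-ins invented for illustration; they do not reproduce the OS-9 trap mechanism itself.

```python
import math


def software_sqrt(x):
    """Newton's method, standing in for a software math routine."""
    if x == 0:
        return 0.0
    guess = float(x)
    for _ in range(60):
        guess = 0.5 * (guess + x / guess)
    return guess


def coprocessor_sqrt(x):
    """Stands in for a routine that hands the work to a hardware coprocessor."""
    return math.sqrt(x)


# The "trap vector": application code reaches math routines only through this.
trap_handlers = {"math.sqrt": software_sqrt}


def trap(name, *args):
    return trap_handlers[name](*args)


def hypotenuse(a, b):
    # Application code knows the trap name, not who services it.
    return trap("math.sqrt", a * a + b * b)


before = hypotenuse(3, 4)
trap_handlers["math.sqrt"] = coprocessor_sqrt   # install the hardware handler
after = hypotenuse(3, 4)
```

`hypotenuse` is unchanged across the swap, which is the property the article describes for programs moved to systems with a coprocessor.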

The  User  Interface 

The OS-9 user communicates with the OS through a user interface
called shell. Shell is similar to Unix shells, although it differs in
enough respects to wreak havoc with the neuronal patterns of folks
who try to time share between the two different OSs. Many of the
utility commands go by what Microware thinks are saner names than
their Unix equivalents, and utilities that depend on file structure
and process specifics are of necessity organized differently.

One notable difference between the two shells is a set of OS-9
control characters that speed up line editing at the shell level. For
instance, in other environments I have found myself constantly
wishing for a "repeat line" control character (commonly assigned to
Control-A) that provides the function "reenter the contents of the
line input buffer up to the most recent carriage return entered."
High-speed interaction with any processor leads to a large number of
typographical mistakes, such as misspelling directory, file, or
command names. The repeat-line character allows for quick reentry of
the offending line, then backspacing to the point of error without
having to retype the preceding characters. Although this may seem a
picayune advantage to some, it can become addictive. Other control
characters allow canceling the line, pausing the screen display,
interrupting and canceling the currently running program, and so on.
The characters that these functions map to are defined in the device
descriptor and can be changed with the utility programs tmode and
xmode. OS-9 also supports Unix-style type-ahead on serial input
devices.
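
As a rough model of the repeat-line feature, the Python sketch below replays a sequence of keystrokes against the previously entered line. The backspace code and the rule that the repeat key fills in the rest of the previous line are assumptions made for this sketch; the real behavior is defined by the OS-9 terminal handling and the device descriptor.

```python
REPEAT = "\x01"      # Control-A, the assignment mentioned in the article
BACKSPACE = "\x08"   # assumed backspace code for this sketch


def edit_line(keys, previous):
    """Apply keystrokes to an empty buffer; REPEAT recalls the previous line."""
    buffer = []
    for key in keys:
        if key == REPEAT:
            # Re-enter whatever part of the previous line is not yet typed.
            buffer.extend(previous[len(buffer):])
        elif key == BACKSPACE:
            if buffer:
                buffer.pop()
        else:
            buffer.append(key)
    return "".join(buffer)


# The user mistyped a directory name, then fixes it with five keystrokes
# after the repeat key instead of retyping the whole command.
fixed = edit_line(REPEAT + BACKSPACE * 3 + "MDS", "dir /d0/CDMS")
```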

Development  Tools 

OS-9 programmers can choose from a variety of programming languages
and functional processors. The OS-9 C compiler is Unix compatible
down to the standard library level, and most of the system calls
accessible from C have been assigned names that correspond to their
Unix equivalents. One OS-9 hacker reported developing a reasonably
complex program that included several low-level I/O calls using OS-9
C on his Radio Shack Color Computer. He then ported the program to a
VAX 11/780, where it compiled and ran without a single change.

Besides C, there also exist compilers for Pascal, FORTRAN-77, COBOL,
and at least four different versions of the BASIC language. These
include Basic09, the language for which OS-9 was originally
developed. Basic09 is an interesting and unusual language, and the
example programs presented later demonstrate some of its features.
Besides language processors, there is the usual plethora of
functional processors, such as text editors, formatters, and so on.
The standard OS-9 assembler is a relocatable macro assembler that
allows management of complex assembly-language code in a variety of
libraries that can be assembled separately.

More  Differences 

A  few  sundry  points  also  underscore 
the  differences  between  OS-9  and 


Microware OS-9/68000 Resident Macro Assembler V1.6 86/08/18 14:02
syscall.a    OS-9 Assembly constants

* Syscall Routine - From Microware Manual
          use      <oskdefs.d>

          org      0
Return    do.l     1
Length1   do.l     1
Param2    do.l     1
Length2   do.l     1

          psect    SysCall,(Sbrtn<<8)!Objct,(ReEnt<<8)!1,0,0,SysCall

SysCall   cmpi.l   #2,d0                    check parameter count
          bne.s    ParamErr                 branch if error
          cmpi.l   #4,Length1(a7)           is first parameter integer?
          bne.s    ParamErr                 branch if not
          cmpi.l   #52,Length2(a7)          52 bytes of registers?
          blo.s    ParamErr                 branch if not

          move.w   #Modllen/2,d2            number of words for dbra
          lea      Model+Modllen(pc),a0     get address of model
          bra.s    SysC02                   branch into loop
SysC01    move.w   -(a0),-(a7)              move a word
SysC02    dbra     d2,SysC01                continue if not done
          move.l   d1,a0                    point to function code
          move.w   2(a0),2(a7)              set function code
* Get the registers
          movea.l  Param2+Modllen(a7),a5    get address of parameter
          movem.l  (a5)+,d0-d7/a0-a4        get register
          jsr      (a7)                     call function
          movem.l  d0-d7/a0-a4,-(a5)        copy register set
          lea.l    Modllen(a7),a7           clear stack

          rts
ParamErr  move.w   #E$Param,d1              get error code
          ori      #Carry,ccr               set carry

          rts
Model     os9      F$Fork                   model system call to put on stack

          rts
Modllen   equ      *-Model

          ends

Errors: 00000
Memory used: 19k
Elapsed time: 2 second(s)

Code  Example  1:  Syscall,  an  OS-9  resident  macro  assembler 




Unix. First, the OS-9 kernel is written in assembly language instead
of in C. Although this restricts the OS to a small set of machines,
it results in faster, more compact code. OS-9 also uses a different
scheduling algorithm, one that results in noticeably faster
throughput than most Unix implementations. Because of its modular
memory management and dynamic configurability, OS-9 lends itself
better to low-level control applications and real-time processing. On
the other hand, OS-9 does not swap programs into and out of primary
store. Although this speeds up throughput considerably, it also means
that once the total available RAM in a system is occupied, no more
jobs can be run. OS-9 has no limit on the number of concurrently
executing processes: one user reports spawning more than 600 of them
on his 68020 system before running out of memory. OS-9 users also
report that the substantially smaller memory requirements and faster
speed of OS-9 suit it for many applications in which Unix is too
large and slow.

Basic09

The example programs I have included with this article are designed
to illustrate some interesting facets of OS-9. The Basic09
programming language is a highly modular implementation of BASIC that
is not only compatible with most sophisticated versions of that
language but also contains many structures that are seen in Pascal.
It is an interpretive compiler that attempts to balance the better
points of both translation processes. The language contains both an
integral editor and debugger. Source code is compiled, a line at a
time as it is entered, into an intermediate code. Syntax errors are
reported to the user as each line is entered and can be immediately
corrected using the editor. Programs consist of any number of named
procedures in which a variety of data types can be declared and built
up into named Pascal-record-like structures. Parameters can be passed
between procedures by either value or reference, and TRACE, PAUSE,
and STATE statements allow for interactive debugging.

The example programs illustrate some specific features: first,
Basic09 programs can call and pass parameters to and from programs
written in assembly language or compiled by other compilers. The
program Syscall (Code Example 1, page 34) is a 68000 program that
allows a Basic09 program to perform system calls from within a
procedure simply by passing a 68000 register image and an OS function
code. Because it is reasonably closely tied to the calling syntax for
OS-9 assembly-language system calls and outside the scope of this
article, I have chosen not to discuss it here and simply treat it as
a black box. Procedure ShowCall (Code Example 2, below) illustrates
how Syscall would be invoked in a Basic09 procedure. In this case, I
simply use the Syscall routine as a substitute for the high-level
Basic09 OPEN command. The system call, which opens the file for read,
passes back a path number in the d0 register, which is then assigned
to the Basic09 variable Path.
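
The "register image" passed to Syscall is just a block of thirteen 32-bit values (d0-d7 and a0-a4), which is where the 52-byte check in Code Example 1 comes from. The Python sketch below packs such an image and repeats the three parameter checks. The function names and the sample address are hypothetical, and the sketch performs no system call.

```python
import struct

REGISTER_NAMES = ["d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7",
                  "a0", "a1", "a2", "a3", "a4"]
IMAGE_FORMAT = ">13l"   # thirteen big-endian 32-bit integers, as on a 68000


def pack_registers(**regs):
    """Build a register image; registers not named are set to zero."""
    unknown = set(regs) - set(REGISTER_NAMES)
    if unknown:
        raise KeyError(f"unknown register(s): {sorted(unknown)}")
    return struct.pack(IMAGE_FORMAT, *(regs.get(n, 0) for n in REGISTER_NAMES))


def unpack_registers(image):
    """Turn a register image back into a name -> value mapping."""
    return dict(zip(REGISTER_NAMES, struct.unpack(IMAGE_FORMAT, image)))


def parameters_ok(sizes):
    """Mirror Syscall's checks: two parameters, a 4-byte function code,
    and at least 52 bytes of register image."""
    return len(sizes) == 2 and sizes[0] == 4 and sizes[1] >= 52


# d0 = 1 requests read access; a0 would hold the address of the file name.
image = pack_registers(d0=1, a0=0x2000)
```

A structure with only twelve integer fields would be 48 bytes long and would fail the size check, so the register type in the calling procedure has to cover all thirteen registers.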

The second example program, FormLetter (Code Example 3, page 36),
illustrates a Basic09 procedure that adds a front end onto a standard
text formatter. In order to personalize a form letter that I was
preparing for a basketball team I coach, I wanted to incorporate
names, addresses, and greetings from a demographic file into the body
of a standard letter. My local implementation of the formatter does
not permit reading from outboard files into the standard data stream.
I chose to put together tools that I already had at hand rather than
do a "Swiss army knife" modification to the text formatter. (At this
point, please realize that I am well aware of the case for making a
permanent addition to the formatter but would have lost an example
for this article.)

The technique I chose was to create a pair of named pipes, one of
which receives a stream of data headed for the formatter, the other
one receiving its output. I then connected those pipes to file paths
within the Basic09 procedure via calls to the OS-9 shell. It was
possible thereby to integrate my demographics with the body of my
letter, sending the merged data as standard input to the formatter.
In this example I send the output of the formatter to a file, but I
could have just as easily invoked yet another pipeline to have it
processed directly by my system's print spooler. (Basic09 commands to
do so are enclosed as comments within the program.) These concepts
can be extended in vastly more complex ways, all the while yielding
the benefits of totally modular construction and interactive program
development. The interactivity of Basic09 has left me consistently
choosing it rather than C when portability is not a requisite, and I
have found development to be much faster because of the interactive
compilation, editing, and debugging.

PROCEDURE ShowCall
(* Procedure to demonstrate System Call from Basic09
(* Define complex data type to represent registers
(* 13 integers = 52 bytes, the size Syscall checks for
TYPE registers=d0,d1,d2,d3,d4,d5,d6,d7,a0,a1,a2,a3,a4:INTEGER
(* Declare necessary variables
DIM Path:INTEGER
DIM Line:STRING[128]
DIM IOPEN:INTEGER
DIM Regs:registers
DIM FileName:STRING[48]
(* Initialize Variables
(* IOPEN is OS-9 System Call I$OPEN: Open path to a file
IOPEN=$84
(* This is the file we'll be reading from
FileName="BOpen"
(* Set up register image for system call
Regs.d0=1
Regs.a0=ADDR(FileName)
RUN Syscall(IOPEN,Regs)
(* Assign returned path number to Basic09 variable
Path=Regs.d0
(* Read and print file contents
WHILE NOT(EOF(#Path)) DO
  READ #Path,Line
  PRINT Line
ENDWHILE
(* Use Basic09 interface to close file
CLOSE #Path
END

Code Example 2: Procedure to invoke Syscall
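
The same front-end idea, merging per-recipient data with a letter body and streaming it through an unmodified filter, can be sketched in Python. This is an analogy rather than a translation of FormLetter: it uses anonymous pipes created by `subprocess` instead of OS-9 named pipes, and the "formatter" is a stand-in child process that merely upper-cases its input. The record fields and function names are invented for the sketch.

```python
import subprocess
import sys

# Stand-in "formatter": a child Python process that upper-cases its input.
FORMATTER = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]


def format_letter(record, body):
    """Merge one record with the letter body and pipe it through the filter."""
    merged = "\n".join([record["name"], record["address"], "",
                        record["greeting"], "", body]) + "\n"
    result = subprocess.run(FORMATTER, input=merged, capture_output=True,
                            text=True, check=True)
    return result.stdout


def form_letters(records, body):
    """Run the formatter once per recipient, as FormLetter's loop does."""
    return [format_letter(record, body) for record in records]


letters = form_letters(
    [{"name": "Pat", "address": "1 Main St", "greeting": "Dear Pat,"}],
    "See you at practice.",
)
```

The design point carries over directly: the formatter is never modified, and the front end only decides what goes into its standard input and where its standard output ends up.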

Conclusions 

I am tempted to call OS-9 the "poor man's Unix," even though I have
to marvel at the power of its most sophisticated implementations,
which are on a par with (or superior to) Unix running on a VAX in
terms of speed and memory utilization. The concept of named,
automatically located memory modules is an innovation that extends
its capabilities into process control and real-time applications that
are unsuitable for Unix, and its user interface and overall
organization enable those familiar with Unix to adapt to its
operation quickly. Its growing user base indicates that what was
considered for many years an underground classic is now emerging from
the shadows, and it should provide an interesting and productive
environment for programmers of machines based on Motorola processors
for years to come.

I would like to thank GMX Corp. for providing me with one of its
Micro-20 systems for evaluating that implementation of OS-9.

Availability 

All  the  source  code  for  articles  in  this 
issue  is  available  on  a  single  disk.  To 
order,  send  $14.95  to  Dr.  Dobb's  Jour¬ 
nal,  501  Galveston  Dr.,  Redwood  City, 
CA  94063  or  call  (415)  366-3600  ext. 
216.  Please  specify  the  issue  number 
and  format  (MS-DOS,  Macintosh, 
Kaypro). 


DDJ 


Vote  for  your  favorite  feature/article. 
Circle Reader Service No. 4.


PROCEDURE FormLetter
(* FormLetter: Add a front-end to K & P text formatter
(* Written by Brian Capouch, 7/86
(* Define complex type to hold demographic info
TYPE NameRec=Name,Add1,Add2,Greeting:STRING[80]
(* Declare variable storage
DIM Record:NameRec
DIM NameFile,BodyFile:STRING[48]
DIM Includer:STRING[10]
DIM NamPath,SplPipe,InPipe,OutPipe:INTEGER
DIM Counter:INTEGER
DIM Line:STRING[256]
DIM Index:INTEGER
(* Trap out EOF coming down pipeline
ON ERROR GOTO 100
(* Initialize variables
NameFile="Demographics"
Includer=".us Body"
(* Set up files
OPEN #NamPath,NameFile:READ
GET #NamPath,Counter
(* Set up output path to file
CREATE #SplPipe,"LettersOut":WRITE
(* Send each set to formatter, with preface and suffix
FOR Index=1 TO Counter
  (* Set up pipes to text formatter
  (* This procedure "owns" this named pipe
  CREATE #InPipe,"/pipe/in":WRITE
  (* Now couple this pipe to text formatter
  SHELL "runb tformat </pipe/in >/pipe/out&"
  (* Open path to output pipe from formatter
  OPEN #OutPipe,"/pipe/out":READ
  (* Next two lines commented out
  (* They would send output to spooler instead of a file
  (* SHELL "spl -nj </pipe/SplIn"
  (* OPEN #SplPipe,"/pipe/SplIn":write
  (* Now process data
  GET #NamPath,Record
  RUN PrintHeading(InPipe,"7/24/86")
  WRITE #InPipe,Record.Name
  WRITE #InPipe,Record.Add1
  WRITE #InPipe,Record.Add2
  WRITE #InPipe
  WRITE #InPipe,Record.Greeting
  WRITE #InPipe
  (* Send some commands to the formatter
  WRITE #InPipe,".fi"
  WRITE #InPipe,Includer
  WRITE #InPipe,".bp"
  CLOSE #InPipe
  (* Get output from formatter, send to spooler
  WHILE NOT(EOF(#OutPipe)) DO
    READ #OutPipe,Line
    WRITE #SplPipe,Line
  ENDWHILE
10 (* We return here after each EOF trap
  CLOSE #OutPipe
NEXT Index
CLOSE #NamPath
CLOSE #SplPipe
END
100 (* EOF on pipe generates OS-9 Error #211
ErrNo=ERR
IF ErrNo<>211 THEN
  PRINT Beep;
  PRINT "System Error ->No. "+STR$(ErrNo)
  END
ELSE
  GOTO 10
ENDIF
END

Code  Example  3:  Procedure  to  add  front  end  to  text  formatter 




ARTICLES 


Macintosh Buttons and Amiga Gadgets


When you put aside emotional reactions to the Macintosh and Amiga
computers to take a more objective look at the two machines, it
appears that, at least in concept, they are more alike than
different. In addition to similarities in user interface standard,
both use the 68000 microprocessor. Both have operating systems that
are partially in ROM and partially on disk. Programmers who wish to
adhere to the standard user interface make use of a set of system
routines to create and manage windows, menus, graphics, and text.

The functions that the Macintosh and Amiga system routines provide
are not equivalent, however. The Macintosh, for example, provides
text handling routines not found on the Amiga. By the same token, the
Amiga libraries contain animation routines not found on the
Macintosh. The disparity between the functions provided by system
routines presents challenges for programmers, especially if they are
attempting to adapt software from one machine to the other.

This article explores the details of the standard Macintosh and Amiga
user interfaces, examines the system routines programmers use to
create those interfaces, and discusses those system routines through
a pair of sample 68000 assembly-language programs that support a
portion of the standard user interfaces.


Jan L. Harrington, 4002 Stearns Hill Rd., Waltham, MA 02154. Jan is
an assistant professor in the Computer Science Department at Bentley
College. She is the author of Macintosh Assembly Language: An
Introduction.


by  Jan  L.  Harrington 


The  Mac's  system 
routines  support  its 
user  interface  more 
completely. 


User  Interface  Standards 

The Macintosh standard user interface has caused many people to
redefine what they mean when they say software is easy to use. You
expect to be able to run a Macintosh program by double-clicking its
icon on the Macintosh desktop. You expect to find at least Apple,
File, and Edit menus present, and you expect those menus to behave in
a consistent manner (they should "pull down" to expose the menu
items, some of which may be associated with a keyboard equivalent).
You expect to be able to use scroll bars to move throughout a
document and to be able to simulate a click on an OK button by
pressing the Return key. The Macintosh user interface standard is
clearly defined in Chapter 2 of Inside Macintosh,1 the extensive
technical documentation for the machine.

The Amiga also supports a mouse-driven interface. Amiga applications
that have icons can be run by double-clicking the icon in the
Workbench window with the left mouse button. Amiga menus also pull
down, the result of right mouse button action. Menu items can be
associated with keyboard equivalents, and Amiga software has appeared
with scroll bars.

The Amiga interface standard, however, is not as strictly defined as
is the Macintosh standard. Suggestions for the interface do appear in
the Intuition2 manual, the part of the technical documentation that
describes the routines supporting the windowing environment.

The window is central to both the Macintosh and Amiga. You can move
windows about the screen (that is, within the same plane), move them
relative to other windows on the screen (that is, from front to
back), size them, and close them. All of these functions are
initiated by mouse action on some portion of the window. Window
movement within the plane is controlled by the drag region, that part
of the window's title bar not taken up by the window title or other
graphics. A click of the mouse button on a box at the far left of the
title bar will close the window (the Macintosh calls it a "GoAway
box," the Amiga a "CloseWindow gadget"). The Amiga window title bar
also contains, at the far right, gadgets that move the window from
front to back ("depth arrangement gadgets"); Macintosh windows move
back one layer when another window is activated and brought to the
front. In both machines, windows are sized with the overlapping boxes
that appear in the lower-right corner of the window (the Macintosh
calls it a "grow icon," the Amiga calls it a "sizing gadget").
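
Both systems have to decide which part of a window a mouse click landed on. The Python sketch below shows that classification for the regions common to both machines. The pixel sizes, the `Window` fields, and the region names are assumptions for the sketch, it ignores the Amiga's depth arrangement gadgets, and it does not reproduce either machine's system routines.

```python
from dataclasses import dataclass

TITLE_HEIGHT = 16   # assumed height of the title bar, in pixels
BOX = 16            # assumed side of the close box and the size box


@dataclass
class Window:
    left: int
    top: int
    width: int
    height: int


def hit_test(win, x, y):
    """Classify a mouse click given in screen coordinates."""
    dx, dy = x - win.left, y - win.top
    if not (0 <= dx < win.width and 0 <= dy < win.height):
        return "outside"
    if dy < TITLE_HEIGHT:
        # Far left of the title bar closes; the rest of it drags.
        return "close" if dx < BOX else "drag"
    if dx >= win.width - BOX and dy >= win.height - BOX:
        return "size"           # lower-right corner
    return "content"


w = Window(left=100, top=50, width=200, height=120)
```

An event loop would call `hit_test` on each mouse-down and then start a drag, a resize, or a close accordingly.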

Both the Macintosh and Amiga collect information and give warnings
using special windows. The Macintosh uses "dialog boxes" to collect
information and "alerts" to give warnings. For example, a dialog box
will appear to accept the name under which a file should be saved,
and an alert box appears if you attempt to close a document window
without saving its contents. The Amiga uses "requesters" to collect
information and "alerts" to let you know that something catastrophic
has occurred. Requesters are analogous to dialog boxes, but whereas
Macintosh alerts signal that a potential for harm exists, Amiga
alerts appear only after it's too late to recover. Amiga alerts are
generally equivalent to Macintosh system alerts (generated by 68000
system errors). Usually you close requesters, dialog boxes, and
Macintosh alerts by clicking on a small rectangle containing a
message such as OK or Cancel. (The Amiga calls the small rectangles
"gadgets"; the Macintosh calls them "buttons," which are a type of
"control.")

Menus, too, are essential to both the Amiga and Macintosh user
interfaces. Macintosh menu titles are visible at all times across the
top of the screen in the "menu bar." The Amiga "menu strip," which
also appears across the top of the screen, is visible only when you
press the right mouse button.

Menus on both computers pull down; the menu choices appear in a box
below the menu title as you drag the mouse pointer over them. Amiga
menu items may also have submenus, which appear when the mouse
pointer is dragged over the menu item. In either machine, menu items
can be associated with keyboard equivalents: a single character that,
when pressed in conjunction with a special modifier key (the command
key on the Mac, the solid A key on the Amiga), simulates the
selection of a menu item with the mouse.

Both computers maintain menu definitions as linked lists of data
structures containing data needed to draw the menus. At any one time,
the Macintosh supports only one menu list. You can make changes by
inserting and removing menus from the list. The Amiga logically
attaches menu lists to windows rather than to the screen, making it
possible for many menu lists to be present in memory at any given
time. Which menus appear when you depress the right mouse button
therefore depends on which window is active at the time.
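
The two arrangements, one menu list for the whole screen versus one menu list per window, can be contrasted in a short Python sketch that also shows a keyboard-equivalent lookup. All class and function names here are invented for illustration, and plain Python lists stand in for the linked lists the article mentions; neither machine's actual system routines are reproduced.

```python
class Menu:
    def __init__(self, title, items):
        self.title = title
        self.items = items      # list of (item name, key equivalent or None)


class MacStyleScreen:
    """One menu list for the whole screen; menus are inserted and removed."""

    def __init__(self):
        self.menu_list = []

    def insert_menu(self, menu):
        self.menu_list.append(menu)

    def delete_menu(self, title):
        self.menu_list = [m for m in self.menu_list if m.title != title]

    def visible_titles(self):
        return [m.title for m in self.menu_list]


class AmigaStyleWindow:
    """Each window carries its own menu strip."""

    def __init__(self, name, menu_strip):
        self.name = name
        self.menu_strip = menu_strip


def titles_on_right_button(active_window):
    """The strip shown depends on which window is active."""
    return [m.title for m in active_window.menu_strip]


def find_key_equivalent(menus, key):
    """Return (menu title, item name) for a modifier-key shortcut, or None."""
    for menu in menus:
        for item, equivalent in menu.items:
            if equivalent == key:
                return menu.title, item
    return None


screen = MacStyleScreen()
for title in ("Apple", "File", "Edit"):
    screen.insert_menu(Menu(title, []))

editor = AmigaStyleWindow("editor", [Menu("Project", [("Save", "S")]),
                                     Menu("Edit", [("Cut", "X")])])
viewer = AmigaStyleWindow("viewer", [Menu("Project", [("Quit", "Q")])])
```

Under the per-window scheme the same keystroke can mean different things, or nothing at all, depending on the active window.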

A Macintosh program that adheres to the standard user interface has
at least three menus. The Apple menu appears at the far left of the
menu bar; its title appears as an apple icon (an apple with a bite
taken out of it). The Apple menu supports the Macintosh desk
accessories. The second menu from the left is the File menu. The File
menu opens, closes, saves, and prints files and exits from the
program to the operating system. The third menu, the Edit menu,
handles the text editing functions: Cut, Copy, Paste, Clear, and
sometimes the Undo function as well.




Amiga programs that follow the interface suggestions made in the
Intuition manual provide at least two menus in each menu strip. The
leftmost menu is the Project menu, which is analogous to the Macintosh
File menu: it manages opening, closing, saving, and printing of files
as well as exiting from the program to the operating system. The second
menu from the left is an Edit menu, which provides access to the same
editing functions as the Mac's Edit menu.

Text editing standards are also an important part of both the Macintosh
and Amiga user interfaces. Macintosh programs that support text entry
of any kind, including even the entry of a single line into a dialog
box, support cut, copy, and paste operations from a standard Edit menu.
The editing functions are also present in programs that do no text
editing because desk accessories use cut, copy, and paste even if an
application does not. Amiga programs that require text entry (most
notably word processors) support the same text editing functions as
Macintosh programs do, but the absence of desk accessories means that
Edit menus do not appear in programs that are not concerned with text.

Implementation 

To explore the support the Macintosh and Amiga provide for their
standard user interfaces, I wrote two programs in 68000 assembly
language, one for each machine (see Listings One and Two, pages 64 and
69, for the Macintosh program and Listing Three, page 69, for the Amiga
program). The programs are more or less equivalent in function. Each
opens a window for text entry (the Amiga program also opens a custom
screen) and creates menus (three for the Macintosh, two for the Amiga).
Each supports the entry of text from the keyboard. Both programs
terminate if you either select Quit from the appropriate menu or click
the mouse pointer on the box that closes the window. Both also make
extensive use of constants and data structure offsets from the include
files supplied with the Macintosh 68000 Development System and the
Amiga Macro Assembler, respectively.

The Macintosh system routines are invoked through the 68000's trap
mechanism (all system calls are assembled to begin with %1010, which is
then trapped by the microprocessor). The trap mechanism retrieves the
actual location of the routine from a jump table, which is loaded from
ROM to RAM at system start-up. Calls to system routines are performed
with trap macros, all of which begin with an underbar (for example,
_GetRMenu). The trap macros themselves are defined in the Macintosh's
include files. Macintosh system routines are organized into "managers,"
which must be initialized before the routines are called. Lines 5-14 of
Listing One initialize all the Macintosh managers.
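The %1010 prefix described above can be illustrated with a small sketch
in C. It does nothing more than test whether the top four bits of a
16-bit instruction word match that pattern; the function name is
hypothetical and does not come from the listings or from any Macintosh
include file.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when the top four bits of a 16-bit
   68000 instruction word are %1010, the pattern the article says every
   Macintosh system call is assembled to begin with. */
bool is_a_line_trap(uint16_t opword)
{
    return (opword & 0xF000u) == 0xA000u;
}
```

Under this test, any word from $A000 to $AFFF counts as a trap word,
while an ordinary instruction such as RTS ($4E75) does not.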

Amiga system routines are contained in libraries. All libraries except
exec must be opened before they can be used. This includes the
Intuition library, which is opened at the top of Listing Three with a
call to the system routine OpenLibrary (lines 44-51). Each library has
a base address. Exec's base is fixed and assigned to the constant
__AbsExecBase; all other library bases are returned by OpenLibrary.
System routines are called as subroutines whose starting addresses are
relative to the library base, which is placed in register A6 before the
call. Assuming that the correct address is in A6, calls to Amiga system
routines are handled by the macro callsys (lines 4-6).

Dr. Dobb's Journal, January 1987

MAC BUTTONS, AMIGA GADGETS

The functional differences between the two programs are the result of
the system routines available on the two computers. The Macintosh
program, for example, has a menu the Amiga program does not. The Apple
menu is provided for the Macintosh's desk accessories. Implementation
of the desk accessories is handled completely by a series of calls to
system routines. The text in the Macintosh window will word wrap
because the routines that manage the text editing environment
automatically handle that function. The editing operations (cut, copy,
and paste) have also been implemented because each is handled by a call
to a single system routine.

The Amiga does not support desk accessories and so has no equivalent to
the Macintosh's Apple menu. The Amiga also does not provide system
routines for doing word wrap or the standard editing functions.
Therefore, text entry in the Amiga window is exactly like using a
typewriter. On the other hand, the Amiga window can be sized, moved
about the screen, and moved in front of or behind other windows. These
functions are handled automatically by the Amiga's operating system;
they do not need to be included in an application's code. Performing
the same operations on the Macintosh requires including additional
program code (calls to system routines).

The Amiga program is more than twice as long as the Macintosh program.
There are two major reasons for this. First, the Macintosh can obtain
parameters for creating data structures (for example, windows and
menus) from a resource file. A resource file contains templates that
describe the contents (for example, items to appear in a menu),
location (for example, initial coordinates of a window), and
characteristics (for example, menu items that should initially be
disabled) of structures an application will use. The Macintosh
program's resource file (Listing Two) contains templates for the three
menus and one window. Second, although both the Macintosh and the Amiga
maintain data about program objects in linked lists of data structures,
Macintosh system routines perform most of the structure initialization
and list management, whereas the Amiga leaves those functions to the
programmer. The amount and type of programming required to achieve the
same program function is therefore rather different on the two
machines. You can see examples of these differences in the way the two
sample programs create menus and windows, trap events, and do text I/O.

Creating  Menus 

The Macintosh can create a menu by calling the system routine
GetRMenu. GetRMenu reads the template from the resource file, allocates
space in RAM for the menu record, and loads the data into the
appropriate locations in that menu record. Menu items are stored in a
linked list. The routine requires one parameter (the resource file ID
that identifies the menu) and space on the stack for a handle to the
menu record. After the routine is called, an application pulls the
handle from the top of the stack, where it has been placed by GetRMenu.
Initializing a menu record therefore requires four lines of code. The
template for the Macintosh program's File menu, for example, appears in
lines 5-14 of Listing Two. The menu structures are actually initialized
in lines 29-32 of Listing One.

Initializing a menu data structure for an Amiga program is considerably
more complex. Assuming that the menu items are to be text rather than
graphics, the text must first be loaded into Intuition text structures.
Each Intuition text structure must be included in a menu item
structure. The menu item structures are then assembled into a linked
list whose head is incorporated into the menu data structure.
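The nesting just described (text structure inside item, items chained
into a list, list head inside the menu) can be sketched in C. The
structures below are simplified, hypothetical stand-ins invented for
this illustration; the real Intuition structures carry many more fields
(positions, flags, and so on), and the article's own program does this
work in assembly language.

```c
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the Intuition structures. */
struct SketchText { const char *string; };

struct SketchItem {
    struct SketchItem *next;   /* next item in the menu, NULL at the end */
    struct SketchText *text;   /* text structure this item displays */
};

struct SketchMenu {
    struct SketchMenu *next;   /* next menu in the same menu strip */
    const char *title;
    struct SketchItem *first;  /* head of the linked list of items */
};

/* Attach each text structure to its item, chain the items together,
   and store the head of the chain in the menu. */
void link_items(struct SketchMenu *menu, struct SketchItem items[],
                struct SketchText texts[], size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        items[i].text = &texts[i];
        items[i].next = (i + 1 < count) ? &items[i + 1] : NULL;
    }
    menu->first = (count > 0) ? &items[0] : NULL;
}

/* Walk the list the way a drawing routine would, counting the items. */
size_t count_items(const struct SketchMenu *menu)
{
    size_t n = 0;
    const struct SketchItem *it;
    for (it = menu->first; it != NULL; it = it->next)
        n++;
    return n;
}
```

The point of the sketch is that every pointer is stored by the
program's own code; a single wrong `next` pointer silently truncates or
corrupts the menu.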

In the program in Listing Three, the Intuition text structures are
initialized in the subroutine SetText (lines 257-265), which is invoked
by the macro passtext (lines 10-14). For each menu item, the text must
be included in the program as a constant. The program must also
allocate space for the Intuition text structure (see lines 354-393 for
the data structures associated with the Amiga program's menus). The
Intuition text structures for the Project menu, for example, are
handled in lines 92-98.

Once the Intuition text structures have been initialized, the menu
items are initialized and chained into a linked list. The actual
initialization is performed by the subroutine SetItem (lines 266-277),
which is invoked by the macro passitem (lines 15-22). It is up to the
programmer to allocate storage for each menu item data structure and to
load the list pointers correctly. The Project menu's menu items are
initialized and linked in lines 99-111.

The final step in the process is the initialization of the menu data
structure. All the menus that will be present in a single menu strip
are maintained in a linked list. Therefore, the initialization must
include setting a pointer to the next menu in the list. The programmer
must also determine coordinates for the physical position of the menu
in the menu strip. The Amiga program sets up the Project menu in lines
112-123. After the Project menu has been completed, the entire process
is repeated for the Edit menu.

The Macintosh requires the use of a separate system routine to insert
menus into the linked list of menus that will appear in the menu bar at
any given time. InsertMenu handles the insertion of a menu into the
menu list. A programmer supplies either the handle of a menu after
which this menu should be inserted or a parameter indicating that this
menu should be last (that is, rightmost). Lines 33-35 of Listing One,
for example, insert the File menu into the Macintosh's menu list.

Merely inserting a menu into the menu list does not, on either machine,
display the menu. The Macintosh routine DrawMenuBar (see line 43 in
Listing One) is equivalent in function to the Amiga routine
SetMenuStrip (lines 149-152 in Listing Three). The Macintosh routine
actually displays the menus currently in the menu list on the menu bar;
the Amiga routine attaches the menu list to a window so that, when the
window is active and the right mouse button is pressed, the menu strip
appears.

There are both advantages and drawbacks to the Amiga's way of handling
menus. The major drawback is the burden the Amiga places on the
programmer. Establishing Amiga menus requires a great deal of work to
initialize data structures and a great deal of care to ensure that
pointers are stored properly. The Macintosh, on the other hand,
isolates the programmer from dealing directly with the data structures
and from list management. The trade-off is flexibility. Macintosh menu
items are restricted to text (though a limited number of icons can
appear with them), whereas Amiga menu items can be graphics (the
Intuition text structures can be replaced with graphics data
structures). A Macintosh programmer has control only over the relative
placement of a menu in the menu bar; an Amiga programmer can determine
exactly where a menu should appear.

Creating  Windows 

Macintosh windows can also be defined by templates in resource files
(see lines 24-29 in Listing Two). An application can then create the
window by pushing space for a window pointer on the stack, pushing
three parameters, and calling GetNewWindow (see lines 44-49 in Listing
One). The programmer must provide storage for the window record and for
its pointer (lines 195-196).

Because the Amiga doesn't support resource files, an Amiga program must
load a window data structure explicitly before calling the OpenWindow
system routine, which actually creates the window (lines 69-91 in
Listing Three). If the window is to appear in a custom screen rather
than the default Workbench screen, the program must also first
initialize a screen data structure and call OpenScreen (lines 52-68).
Note that although it doesn't matter to the Macintosh whether windows
or menus are created first, it does matter to the Amiga. Amiga menus
are attached to windows, not to the screen, and therefore a menu strip
is useless unless a window has previously been created to which it can
be attached.

Event  Trapping 

Event trapping on the Macintosh and Amiga is similar in principle but
somewhat different in detail. The general idea is to tell the system
which events are of importance and then to enter a wait state until a
desired event occurs. Once an event has been recorded, a program must
identify which type of event has been posted and take action based on
that particular event.

The Macintosh posts events to a system event queue. Events of interest
to an application program (that is, those that aren't handled
automatically by the system) are passed on to the program. An
application program checks its queue repeatedly with the routine
GetNextEvent (lines 62-65 in Listing One) to determine if an event has
been posted. If GetNextEvent returns a Boolean result of FALSE, then
the program simply branches to the top of the event loop (line 59) to
check again. When an event has been posted, information about the event
is stored in an event record. The program can then use information from
that record to identify the exact type of event (see lines 69-74). In
some cases, events not of interest to the application can appear in the
event queue. When that occurs, the program simply ignores the event
(line 75). When a desired event is identified, however, Macintosh
programs generally branch to submodules, each of which processes a
single event type. When the event has been handled, the program returns
to the event loop to idle until another event is posted to the event
queue.
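The shape of such a polling loop can be sketched in C. Everything here
is a hypothetical model built for illustration: the queue is an
ordinary array, the event codes are invented, and the loop is given a
fixed number of polls so that it terminates, whereas a real program
keeps polling until the user quits.

```c
#include <stdbool.h>
#include <stddef.h>

/* Invented event codes; these are not the Macintosh's own constants. */
enum { EV_MOUSE_DOWN = 1, EV_KEY_DOWN = 2, EV_OTHER = 3 };

struct SketchEvent { int what; };

struct SketchQueue {
    const struct SketchEvent *events;  /* pending events */
    size_t count;                      /* how many were posted */
    size_t next;                       /* index of the next one to hand out */
};

struct SketchCounts { int mouse, key, ignored; };

/* Stand-in for the polling call: returns false when nothing is posted. */
bool sketch_get_next_event(struct SketchQueue *q, struct SketchEvent *out)
{
    if (q->next >= q->count)
        return false;
    *out = q->events[q->next++];
    return true;
}

/* Poll, identify the event type, dispatch, and return to the top. */
struct SketchCounts run_event_loop(struct SketchQueue *q, int polls)
{
    struct SketchCounts c = { 0, 0, 0 };
    struct SketchEvent ev;
    while (polls-- > 0) {
        if (!sketch_get_next_event(q, &ev))
            continue;                    /* nothing posted: check again */
        switch (ev.what) {
        case EV_MOUSE_DOWN: c.mouse++;   break;  /* would branch to a handler */
        case EV_KEY_DOWN:   c.key++;     break;  /* would branch to a handler */
        default:            c.ignored++; break;  /* not of interest: ignore */
        }
    }
    return c;
}
```

The counters stand in for the per-event submodules; the part that
matters is that the program itself does the idling.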


The Amiga reports events via message ports, which must be initialized
before they can be used. The OpenWindow routine creates an Intuition
message port, but ports must be created explicitly for the console
device, which will be used for text I/O. The subroutine CreatePort
(lines 278-307 in Listing Three) allocates a signal bit for a new port,
allocates memory for the port's data structure, initializes a task
control block for the port, and adds the port to the linked list of
current message ports.

Amiga programs do not need a program loop to idle while waiting for an
event. Instead, they can use the system routine Wait, which idles until
a desired event occurs. Wait must be supplied with the signal bits
assigned to each of the input ports that should be monitored for
events. In the sample Amiga program, that includes the signal bit for
the Intuition message port and the signal bit for the console read port
(see lines 184-193 in Listing Three).
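Combining the two signal bits can be sketched in C as follows. The
sketch assumes that each port's signal is identified by a bit number
from 0 to 31 and that the bits are combined into a single 32-bit mask;
the function names and the bit numbers used in any call are
hypothetical, and no Amiga system routine is called.

```c
#include <stdint.h>

/* Build the mask for a Wait-style call from two signal bit numbers
   (each assumed to lie in the range 0..31). */
uint32_t make_wait_mask(unsigned intuition_bit, unsigned console_bit)
{
    return ((uint32_t)1 << intuition_bit) | ((uint32_t)1 << console_bit);
}

/* After the wait returns, test whether a given port's bit is set in
   the mask of signals that arrived. */
int port_signalled(uint32_t received, unsigned bit)
{
    return (received & ((uint32_t)1 << bit)) != 0;
}
```

A program would test the returned mask once per port and service
whichever ports were signalled, since more than one bit can be set at
the same time.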

Unlike the Macintosh, the Amiga doesn't report all types of events
automatically. In line 77, the system is instructed to report only two
of the types of events that may occur when this window is active: a
click in the window close gadget and a selection from a menu.
Therefore, any event detected by Wait should be an event useful to the
program.

If a Macintosh program identifies a selection from a menu (a
"mouse-down" event in the menu bar), the program is faced with the
problem of identifying which item from which menu has been selected. A
single system routine, MenuSelect (lines 125-127 in Listing One),
returns both the menu's resource ID and the number of the menu item.
MenuSelect uses a field from the event record as input: the coordinates
of the mouse pointer when the selection was made. The menu number,
returned in the high-order word of the long-word result, is then
isolated from the menu item, which is returned in the low-order word
(lines 128-130). Finally, the menu number can be used to identify the
menu in which the selection was made.
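Splitting the long-word result into its two halves can be sketched in
C. The helper names are hypothetical and the sketch does not call the
Macintosh routine itself; it only shows the word arithmetic described
above.

```c
#include <stdint.h>

/* High-order word of the long-word result: the menu's ID. */
uint16_t menu_id_of(uint32_t result)
{
    return (uint16_t)(result >> 16);
}

/* Low-order word of the long-word result: the item number. */
uint16_t menu_item_of(uint32_t result)
{
    return (uint16_t)(result & 0xFFFFu);
}
```

For a made-up result of $00020003, the menu ID is 2 and the item number
is 3.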

An Amiga program must also identify which menu and which item have been
selected. The Amiga, though, reports only that an Intuition event has
occurred. The program must retrieve the input message with the system
routine GetMsg (lines 205-208 in Listing Three) to determine exactly
which of the Intuition events requested in the window data structure
has been detected. Once the type of event has been recovered (line
211), it can be compared against the desired types of events (lines
212-215) in the same way as the Macintosh program identifies events.
The menu number and menu item (and the menu subitem, if applicable) are
stored in a 16-bit field in the message data structure. The menu number
is in bits 0-4 and can be isolated by masking off the 11 high-order
bits (line 222). The menu item is in bits 5-10 (see lines 225-226).
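The bit layout just described can be unpacked with shifts and masks, as
in the C sketch below. The function names are hypothetical. The text
places the menu number in bits 0-4 and the item in bits 5-10; putting
the subitem in the remaining bits 11-15 is an assumption of this
sketch, since the text says only that the subitem shares the same
16-bit field.

```c
#include <stdint.h>

/* Bits 0-4: menu number (mask off the 11 high-order bits). */
unsigned menu_number(uint16_t code)
{
    return code & 0x1Fu;
}

/* Bits 5-10: menu item number. */
unsigned item_number(uint16_t code)
{
    return (code >> 5) & 0x3Fu;
}

/* Bits 11-15: subitem number (an assumption; see the text above). */
unsigned sub_number(uint16_t code)
{
    return (code >> 11) & 0x1Fu;
}
```

For a made-up code of $1841, the three helpers return menu 1, item 2,
and subitem 3.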


The sample Macintosh program supports activities from all three menus.
The Apple menu's desk accessories are handled by system routines.
GetItem (lines 140-143 in Listing One) retrieves the name of the desk
accessory that has been selected. OpenDeskAcc (lines 144-147) actually
opens the desk accessory. Once the desk accessory has been opened,
subsequent events in it are posted to the event queue and detected as
mouse-button down events in a system window (lines 104-105). In that
case, the system routine SystemClick (lines 111-113) processes the
event without further intervention from the application program. For
the purposes of the sample program, only the Quit item is actually
trapped from the File menu (lines 176-177); all other options simply
return to the event loop. The CloseAndQuit routine (lines 187-191)
frees the space used for text storage (TEDispose in lines 187-188),
closes the window (CloseWindow in lines 189-190), and then returns to
the Finder. The Edit menu, which is fully implemented with system
routines, is discussed later in this article in the context of text
editing.

The Amiga program also handles only the Quit item from its Project
menu. Its CloseAndQuit routine (lines 229-240 in Listing Three) first
removes any unprocessed console events with the system macro ABORTIO
(this macro is found in the Amiga include files), then closes the
console device (CloseDevice in lines 231-233), closes the window
(CloseWindow in lines 234-236), and finally closes the custom screen
(CloseScreen in lines 237-239) before returning to the operating
system. The Edit menu is not implemented because the Amiga does not
have system routines for text editing.

Text  Editing 

The environment for text editing is very different on the Macintosh and
Amiga. Other writers have glossed over the Macintosh's text editing
abilities (for example, see the September 1986 issue of Byte magazine,
note 3), but it is in this area that the difference between the Mac and
Amiga system routines is glaringly apparent. The Macintosh's system
routines for text editing are simple and elegant. The Amiga has nothing
comparable; Amiga text I/O relies on low-level device communication.

To do text I/O, a Macintosh program allocates a text edit record with
the system routine TENew (see lines 52-56 in Listing One). The text
edit record is associated with a window, though the text stored in the
text edit record can be much larger than what can be seen in the window
at any given time. A call to TEActivate (lines 57-58) makes the text
edit window active and draws the straight-line cursor as the text entry
point. Repeated calls to TEIdle (line 61), usually within the program's
event loop, ensure that the cursor blinks regularly.

When a text edit window is active, Macintosh programs generally assume
that key-down events not associated with the command key represent text
to be both displayed on the screen and stored in the text edit record.
Whenever such an event is detected, the key pressed is stored in the
event record. TEKey (lines 79-81 in Listing One) displays that
character at the current cursor position, adjusting word wrap as
necessary, and inserts the character into the text edit record in RAM.

Mouse-down events in active text edit windows signal that the user
wishes to either change the position of the straight-line cursor or
identify a block of text for editing activities. The Macintosh refers
to a block of text or the straight-line cursor as the "selection
range." Setting the selection range requires calls to two system
routines. GlobalToLocal (lines 115-116) is a graphics routine that
takes the point where the mouse-down event occurred, which is expressed
in global screen coordinates, and translates it to the text edit
window's local coordinate system. The transformed coordinates can then
be passed to TEClick (lines 117-123), which actually sets the selection
range.

The text editing functions listed in the Macintosh's standard Edit menu
are each available as a single system routine that bases its operation
on the current selection range. TECut (lines 156-159) deletes the
current selection range from the screen, adjusting word wrap and the
text edit record. The deleted text is stored in RAM in an area known as
the "Clipboard." Cut, and any other operations that write to the
Clipboard, erase the previous contents of the Clipboard. TECopy (lines
161-165) writes the current selection range to the Clipboard but does
not affect the screen or text edit record. TEPaste (lines 166-169)
inserts whatever it finds on the Clipboard at the current selection
point (either at the straight-line cursor or after the selection
range). Both the screen display and text edit record are adjusted.
TEClear (lines 171-175) deletes the current selection range without
affecting the Clipboard. These functions also provide "intelligent cut
and paste," automatically adjusting spacing between words when text is
deleted, cut, or pasted.

There are two major ways to do text I/O on the Amiga: either via the
console device, which processes keystrokes before passing them on to
the system, or via the RAW device, which transmits unprocessed key
codes. For simple text I/O, the console device is easier to use because
it automatically manages special keys such as the backspace. To use the
console device, an Amiga program must, as discussed earlier, allocate
two message ports, one for input and one for output. These message
ports are then incorporated into standard I/O request blocks, the data
structures that are actually used for I/O. The subroutine CreateStdIO
(lines 308-319, Listing Three) allocates memory for a standard I/O data
structure and initializes the data structure with the address of its
message port. Finally, the console device can be opened by the
subroutine OpenConsole (lines 320-330). Note that the call to the
system routine OpenDevice (line 327) returns a pointer to the device
data structure in a field of one of the standard I/O request blocks. In
other words, the device is linked to the I/O request blocks rather than
the I/O request blocks being linked to the device. If the console is
opened successfully, a solid block cursor appears in the upper
left-hand corner of the window to which the console is attached. The
cursor does not blink.

The system routine SendIO, which appears in the subroutine QueueRead
(lines 332-337), issues a request for console input. SendIO uses data
from the input standard I/O request block, including the type of
operation to perform (line 332), the place where input should be stored
(line 333), and the number of bytes to input (line 334).

The subroutine ConPutChar (lines 338-343) handles console output. Using
the system routine DoIO (line 342), it displays a single character at
the current cursor position and moves the cursor to the right. If the
cursor is pushed past the right edge of the window, it drops to the
leftmost position on the line below. ConPutChar does not do word wrap,
nor does it store the text displayed in main memory. Subsequent calls
to QueueRead reuse the same storage space for input characters. As
mentioned before, the Amiga also has no system routines for text
editing. Therefore, although the backspace key can erase whatever
character is displayed to the left of the cursor, no other editing is
possible. To implement the standard text editing functions, programmers
must write their own routines to do word wrap, manipulate the selection
range, store text in main memory (or on disk if applicable), manage a
clipboard, and do the editing operations.
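To give a sense of the smallest of those do-it-yourself routines, here
is a minimal sketch of greedy word wrap in C. It is an assumption-laden
illustration and not code from either listing: the function name is
hypothetical, it works on a complete string in memory and not on
keystrokes as they arrive, and it leaves words longer than the line
width unbroken.

```c
#include <stddef.h>

/* Copy src into dst, turning the last space on a line into a newline
   whenever the line would otherwise exceed `width` columns.
   dst must have room for strlen(src) + 1 bytes.
   Returns the number of lines in the result. */
size_t wrap_text(const char *src, char *dst, size_t width)
{
    size_t col = 0;         /* characters on the current output line */
    size_t last_space = 0;  /* index of the last space on this line */
    int have_space = 0;     /* nonzero once this line contains a space */
    size_t lines = 1;
    size_t i;

    for (i = 0; src[i] != '\0'; i++) {
        dst[i] = src[i];
        if (src[i] == '\n') {           /* explicit line break in the input */
            col = 0;
            have_space = 0;
            lines++;
            continue;
        }
        if (src[i] == ' ') {
            last_space = i;
            have_space = 1;
        }
        col++;
        if (col > width && have_space) {
            dst[last_space] = '\n';     /* break at the last space */
            col = i - last_space;       /* characters carried to the new line */
            have_space = 0;
            lines++;
        }
    }
    dst[i] = '\0';
    return lines;
}
```

A real editor would also have to redraw the affected lines and keep the
wrapped text in step with the cursor and selection range, which is
where most of the extra work lies.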


The  Bottom  Line 

It might be unfair to base an evaluation of the system routines of the
Macintosh and Amiga simply on the subset of the routines designed to
manipulate the user interface. On the other hand, the programming
strategies used to implement the standard user interfaces are similar
to those required for other system operations. In general, the
Macintosh routines isolate programmers from low-level tasks such as
list manipulation and initialization of data structures (File Manager
parameter blocks are notable exceptions). Although this reduces the
burden placed on the programmer, it can decrease the programmer's
flexibility. The Macintosh routines are also more complete in terms of
their support for the documented user interface. The effect is again to
reduce the burden placed on the programmer.

On the Amiga, the Intuition library provides routines for the standard
user interface. Although support for screens, windows, menus, and fonts
is available, there is a great gap in terms of text editing. In other
words, the Amiga does not provide system routines to fully implement
its own standard interface recommendations. As someone who writes more
programs that rely on text manipulation than on graphics, I believe
that this is a serious deficiency. It is true that the Amiga performs
some functions "automatically" for which a Macintosh program must
include code (for example, moving and sizing windows). Nonetheless, the
Macintosh does include system routines to handle those functions.

Notes 

1. Caroline Rose et al., Inside Macintosh (Reading, Mass.:
Addison-Wesley, 1985).

2. Robert J. Mical and Susan Deyl, Intuition: The Amiga User Interface
(Commodore-Amiga Inc., 1985).

3. Adam Brooks Webber, "Amiga vs. Macintosh," Byte (September 1986):
249-257.

DDJ 

(Listings  begin  on  page  64.) 





68000  MINI  FORTH 

Listing  One  (Text  begins  on  page  22.) 


;  Listing  One 
;  FLINT'S  system  macros 


; MACRO definition file for FLINT as implemented on the
; SAGE II microcomputer.

        .MACRO  XSTOP           ; determines the environment (if any)
                                ; into which the exit is made
        .WORD   4EF9H           ; exit to PROM monitor/debugger
        .WORD   00FEH
        .WORD   0030H
        .ENDM

        .MACRO  TGET            ; gets a character from the terminal
                                ; and puts its ASCII value in the
                                ; lower byte of D0
        .WORD   4EB9H           ; PROM BIOS call
        .WORD   00FEH
        .WORD   0008H
        .ENDM

        .MACRO  TPUT            ; sends the character whose ASCII value
                                ; is in the lower byte of D0 to the
                                ; terminal
        .WORD   4EB9H           ; PROM BIOS call
        .WORD   00FEH
        .WORD   0014H
        .ENDM

        .MACRO  GETBLOCK        ; loads 1024 bytes from the disk block
                                ; numbered by the value in CBLOCK into
                                ; a buffer whose address is DISKBUF
        MOVE.L  A4,-(A7)        ; save A4
        MOVE.W  CBLOCK,-(A7)    ; push block #
        LEA     DISKBUF,A0
        MOVE.L  A0,-(A7)        ; push buffer address
        MOVE.L  #1024,-(A7)     ; push length of transfer (1024 bytes)
        MOVE.W  #1,-(A7)        ; push drive #
        .WORD   4EB9H           ; PROM BIOS call for diskread
        .WORD   00FEH
        .WORD   0028H
        MOVEA.L (A7)+,A4        ; recover A4
        .ENDM

        .MACRO  SAVBLOCK        ; saves the contents of a 1024 byte
                                ; buffer whose address is DISKBUF onto
                                ; the disk block numbered by the value
                                ; in CBLOCK
        MOVE.L  A4,-(A7)        ; save A4
        MOVE.W  CBLOCK,-(A7)    ; push block #
        LEA     DISKBUF,A0
        MOVE.L  A0,-(A7)        ; push buffer address
        MOVE.L  #1024,-(A7)     ; push length of transfer (1024 bytes)
        MOVE.W  #1,-(A7)        ; push drive #
        .WORD   4EB9H           ; PROM BIOS call for diskwrite
        .WORD   00FEH
        .WORD   002CH
        MOVEA.L (A7)+,A4        ; recover A4
        .ENDM

End  Listing  One 
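
The three .WORD lines inside each macro are a hand-assembled 68000 instruction. $4EB9 is the opcode for JSR with an absolute long address, $4EF9 is JMP with an absolute long address, and the two words that follow are the high and low halves of the target. The Python sketch below is not part of the original listing. The function name decode_call is hypothetical, and "diskread" and "diskwrite" are labels borrowed from the listing's comments. It only shows how the three words combine into one call.

```python
# Opcodes for the two absolute-long control transfers used by the macros.
OPCODES = {0x4EB9: "JSR", 0x4EF9: "JMP"}


def decode_call(words):
    """Decode a 3-word sequence: (opcode, address high word, address low word)."""
    opcode, high, low = words
    if opcode not in OPCODES:
        raise ValueError(f"unexpected opcode {opcode:04X}")
    target = (high << 16) | low          # 32-bit absolute address
    return OPCODES[opcode], target


if __name__ == "__main__":
    # Word triples copied from the macros in Listing One.
    macros = {
        "XSTOP": (0x4EF9, 0x00FE, 0x0030),
        "TGET": (0x4EB9, 0x00FE, 0x0008),
        "TPUT": (0x4EB9, 0x00FE, 0x0014),
        "diskread": (0x4EB9, 0x00FE, 0x0028),
        "diskwrite": (0x4EB9, 0x00FE, 0x002C),
    }
    for name, words in macros.items():
        mnemonic, target = decode_call(words)
        print(f"{name:10s} {mnemonic} ${target:08X}")
```

Read this way, XSTOP is the only macro that jumps rather than calls, which fits its role as a one-way exit to the PROM monitor.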


Listing  Two 


; Listing Two
; The FLINT Interpreter

        .NOLIST
        .NOSYMTABLE
        .NOMACROLIST
        .NOPATCHLIST
        .INCLUDE MACROS.TEXT
        .LIST

; FLINT (Forth-Like INterpreter and Threader)
;
; We note the effect that the execution of a word has on the stack by a
; conventional shorthand.  Before and after lists of the relevant stack
; parameters are given in "rightmost is topmost" order separated by
; "-->".  An address is denoted by "a", an integer by "n", and a flag by
; "fl".  Parameters in parentheses may or may not be present.
;
; word      action                                      stack effects
;
; TOKEN     gets token from input buffer
; SEARCH    searches dictionary for current token       a --> (a),fl
;           pushes its code address and true flag
;           or else false flag
; EXECUTE   executes routine whose code address
;           is on stack
; NUMBER    determines if a token whose address is      a --> (n),fl
;           popped from stack represents a number
;           if so pushes value and true flag or else
;           pushes false flag




; WHAZZAT   sends ? to terminal
; HEADER    makes a dictionary header for the next
;           token in the input stream (the name of
;           a word being defined).
; :         calls HEADER and then sets the compile
;           mode (colon) flag
; COMPILE   writes code for JSR in the dictionary       a -->
;           and calls "," (which furnishes the
;           operand for JSR)
; CODE      an "immediate" word (executed even in
;           compile mode) which sets the base to hex
;           and sets the code submode flag
; ,         takes a number from the stack and writes    n -->
;           it in the dictionary directly
; LITERAL   takes a number from the stack and           n -->
;           generates code which when executed will
;           push the number back on the stack
; ;         closes the current definition by writing
;           an RTS in the dictionary and clearing the
;           compile and code flags.
; PROMPT    sends prompt (ok) to terminal
; LINE      gets a line from the input device and
;           places it in the line buffer
;
; control structure of the prompt and input code
;
;   PROMPT
;   LINE
;   outer interpreter loop
;     TOKEN
;     SEARCH
;     if found
;        EXECUTE STKCHK   (execute mode)
;        or
;        COMPILE          (compile mode)
;     else
;        NUMBER
;        if fail
;           WHAZZAT
;        else
;           COMMA         (code mode)
;           or
;           LITERAL       (compile mode)
;           or
;           return        (execute mode)
;
; register usage
;
;   A4   line buffer pointer
;   A5   dictionary pointer          D0   I/O port
;   A6   parameter stack pointer     D1   "scratch
;   A7   return stack pointer        A0    registers"

        .ABSOLUTE
        .PROC   FLINT
        .ORG    01000H
        BRA     START

                                        ; set up
TERMBUF .BLOCK  84,32                   ; system buffers.
DISKBUF .BLOCK  1024,32
        .ASCII  "INTERACTIVE "
        .BYTE   13
        .BYTE   32

RTNSTK  .BLOCK  80,0                    ; system stacks.
PARMSTK .BLOCK  256,0
UNDRFLW .WORD   0

DICT    .WORD   LAST                    ; system variables.
CBLOCK  .WORD   0050H
BASE    .WORD   10

FCOLON  .BYTE   0                       ; system flags.
FIMMED  .BYTE   0
FCODE   .BYTE   0
FNEG    .BYTE   0
FINT    .BYTE   0
        .ALIGN  2

BLANK   .BLOCK  4,32                    ; etc.
ZERO    .BLOCK  3,32
        .BYTE   48

START   LEA     RTNSTK+80,A7            ; initialize pointers
        LEA     PARMSTK+256,A6
        MOVEA.W BUFPNT,A5
        CLR.L   FCOLON                  ; initialize flags
        CLR.B   FINT
        JSR     LOAD                    ; load boot screen
RESTART JSR     WHICHBUF                ; select input buffer
MAIN    JSR     TOKEN                   ; get next token
        MOVE.W  DICT,-(A6)              ; push dictionary pointer
        JSR     SEARCH                  ; look for token
        TST.W   (A6)+                   ; if found then
        BEQ     TSTNUM
        BCLR    #7,FIMMED               ; if immediate flag off then


        BNE     GODO
        TST.B   FCOLON                  ; if in compile mode
        BEQ     GODO
        JSR     COMPILE                 ; compile
        BRA     MAIN
                                        ; else
GODO    JSR     EXECUTE                 ; execute word
        JSR     STKCHK                  ; and check underflow
        BRA     MAIN
                                        ; else
TSTNUM  MOVE.W  A5,-(A6)                ; push token buffer address
        JSR     NUMBER                  ; is token a number ??
        TST.W   (A6)+                   ; if not
        BNE     TSTCODE
        JSR     WHAZZAT                 ; ? whazzat ?
        BRA     MAIN
                                        ; else
TSTCODE TST.B   FCODE                   ; if code flag on
        BEQ     TSTLIT
        JSR     COMMA                   ; write code in dictionary
        BRA     MAIN
                                        ; else
TSTLIT  TST.B   FCOLON                  ; if in compile mode
        BEQ     MAIN
        JSR     LITERAL                 ; compile number as literal
        BRA     MAIN

WHICHBUF TST.B  FINT
        BNE     GOLINE                  ; if not in interactive mode
        LEA     DISKBUF,A4              ; get input from disk buffer
        RTS
                                        ; else
GOLINE  JSR     LINE                    ; fill line buffer from terminal
        RTS

LINE    JSR     PROMPT                  ; prompt
        MOVEQ   #76,D1
        LEA     TERMBUF,A4
CLEANUP MOVE.L  BLANK,0(A4,D1)          ; clear line buffer
        SUBQ.B  #4,D1
        BGE     CLEANUP
        CLR.L   D1                      ; clear character count
INCHAR  TGET                            ; get next character and
        CMPI.B  #13,D0                  ; do while not CR
        BEQ     EXIT
        CMPI.B  #8,D0                   ; if character is backspace
        BNE     INBUF
        JSR     RUBOUT                  ; rubout previous char
        BRA     INCHAR
                                        ; else
INBUF   MOVE.B  D0,0(A4,D1)             ; copy it into buffer
        ADDQ.B  #1,D1                   ; increment count
        TPUT                            ; echo to terminal
        BRA     INCHAR
EXIT    MOVE.B  D0,1(A4,D1)             ; imbed CR in buffer
        RTS

RUBOUT  TST.B   D1                      ; if count > 0 then
        BLE     RRET
        TPUT                            ; echo backspace
        SUBQ.B  #1,D1                   ; decrement count
        MOVEQ   #32,D0                  ; erase previous character...
        MOVE.B  D0,0(A4,D1)             ; in buffer and ...
        TPUT
        MOVEQ   #8,D0                   ; on terminal
        TPUT
RRET    RTS                             ; return

STKCHK  LEA     UNDRFLW,A0
        CMPA.W  A0,A6                   ; if top is below bottom
        BLE     OKSTK
        JSR     STKU                    ; underflow exception
OKSTK   RTS

NULLWORD .BLOCK 6,0

; DICTIONARY

        .BYTE   4
        .ASCII  "CRL"                   ; CRLF
        .WORD   NULLWORD
CRLF    MOVEQ   #13,D0                  ; send CR
        TPUT
        MOVEQ   #10,D0                  ; send LF
        TPUT
        RTS                             ; return

        .BYTE   6
        .ASCII  "PRO"                   ; PROMPT
        .WORD   CRLF-6
PROMPT  JSR     CRLF                    ; send CR LF
        MOVEQ   #111,D0                 ; send "o"
        TPUT
        MOVEQ   #107,D0                 ; send "k"
        TPUT
        MOVEQ   #32,D0                  ; send space
        TPUT
        RTS                             ; return

        .BYTE   7
        .ASCII  "WHA"                   ; WHAZZAT
        .WORD   PROMPT-6
WHAZZAT JSR     CRLF                    ; send CR LF
        MOVE.W  A5,-(A6)                ; push token address
        ADDI.W  #1,(A6)
        MOVE.B  (A5),-(A6)              ; push character count
        CLR.B   -(A6)                   ; pad it to "word" length
        JSR     TYPE                    ; type offending token
        BRA     RSET
STKU    JSR     CRLF                    ; (underflow entry point)
        MOVEQ   #42,D0
        TPUT                            ; send *



RSET    MOVEQ   #63,D0
        TPUT                            ; send ?
        LEA     PARMSTK+256,A6          ; reset stack pointer
        CLR.L   FCOLON                  ; initialize flags
        TAS     FINT                    ; set interactive mode
        JSR     CR                      ; get new line
        RTS

        .BYTE   5
        .ASCII  "TOK"                   ; TOKEN
        .WORD   WHAZZAT-6
TOKEN   CLR.L   (A5)                    ; clear token buffer
        CLR.L   D1                      ; clear count
GETCHAR MOVE.B  (A4)+,D0                ; get character until space
        CMPI.B  #32,D0
        BEQ     EXITGET
        MOVE.B  D0,1(A5,D1)             ; place in token buffer
        ADDQ.B  #1,D1                   ; increment count
        BRA     GETCHAR
EXITGET MOVE.B  D1,(A5)                 ; put count in 1st byte of buffer
        BEQ     GETCHAR                 ; if count is 0 repeat
        RTS                             ; return

        .BYTE   129
        .BYTE   13                      ; (CR)
        .WORD   0
        .WORD   TOKEN-6
CR      LEA     RTNSTK+80,A7            ; reset system stack
        JMP     RESTART                 ; restart

        .BYTE   6
        .ASCII  "SEA"                   ; SEARCH
        .WORD   CR-6
SEARCH  MOVE.L  (A5),D1                 ; put token "stem" in D1
        MOVEA.W A6,A0                   ; use A0 as search pointer
COMPARE TST.W   (A0)                    ; DO
        MOVEA.W (A0),A0                 ; get address of next word
        BEQ     NOFIND                  ; if nullword, exit NOFIND
        CMP.L   (A0),D1                 ; compare word to candidate
        BEQ     FIND                    ; if found, exit FIND
        BCHG    #31,D1                  ; set precedence bit
        CMP.L   (A0),D1                 ; compare to "immediate" candidate
        BEQ     FINDIMM                 ; if found, exit FINDIMM
        BCHG    #31,D1                  ; reset precedence bit
        LEA     4(A0),A0                ; get link address
        BRA     COMPARE                 ; LOOP
FINDIMM TAS     FIMMED                  ; set immediate flag
FIND    LEA     6(A0),A0                ; get code address
        MOVE.W  A0,(A6)                 ; push it
        MOVE.W  #-1,-(A6)               ; push success flag
        RTS
NOFIND  MOVE.W  A0,(A6)                 ; push fail flag
        RTS

        .BYTE   6
        .ASCII  "NUM"                   ; NUMBER
        .WORD   SEARCH-6
NUMBER  CLR.L   D2                      ; clear conversion register
        MOVEA.W (A6)+,A0                ; get token address
        MOVE.B  (A0)+,D1                ; get digit count
                                        ; DO
NXTDIG  MOVE.B  (A0)+,D0                ; get next digit
        SUBI.B  #48,D0                  ; strip ASCII prefix
        BLT     FAIL                    ; if digit too small, FAIL
        CMP.W   #10,D0                  ; if digit > 9
        BLT     CMPBASE
        SUBI.B  #7,D0                   ; adjust for "odd" values
        CMP.W   #10,D0                  ; and test again
        BLT     FAIL
CMPBASE CMP.W   BASE,D0                 ; if base < digit
        BGE     FAIL                    ; FAIL
        MULU    BASE,D2                 ; multiply current value by base
        SWAP    D2
        TST.W   D2                      ; if overflow
        BNE     FAIL                    ; FAIL
        SWAP    D2
        ADD.W   D0,D2                   ; add current digit
        BCS     FAIL                    ; if overflow, FAIL
        SUBQ.B  #1,D1                   ; decrement count
        BNE     NXTDIG                  ; UNTIL no digits remain
        MOVE.W  D2,-(A6)                ; push number
        MOVE.W  #-1,-(A6)               ; push success flag
        RTS
FAIL    CLR.W   -(A6)                   ; push fail flag
        RTS

        .BYTE   7
        .ASCII  "EXE"                   ; EXECUTE
        .WORD   NUMBER-6
EXECUTE MOVEA.W (A6)+,A0                ; pop code address
        JSR     (A0)                    ; execute
        RTS

        .BYTE   7
        .ASCII  "COM"                   ; COMPILE
        .WORD   EXECUTE-6
COMPILE MOVE.W  #04EB8H,(A5)+           ; compile "JSR"
        JSR     COMMA                   ; compile code address
        RTS

        .BYTE   6
        .ASCII  "HEA"                   ; HEADER
        .WORD   COMPILE-6
HEADER  MOVE.W  DICT,4(A5)              ; link header to dictionary
        MOVE.W  A5,DICT                 ; update DICT
        LEA     6(A5),A5                ; move pointer to code field
        RTS

        .BYTE   9
        .ASCII  "IMM"                   ; IMMEDIATE
        .WORD   HEADER-6



IMMED   MOVEA.W DICT,A0                 ; get address of most recent word
        TAS     (A0)                    ; set precedence bit
        RTS

        .BYTE   1
        .ASCII  ":"
        .WORD   0
        .WORD   IMMED-6
COLON   JSR     TOKEN                   ; get token
        JSR     HEADER                  ; make header
        TAS     FCOLON                  ; set colon flag
        RTS

        .BYTE   132
        .ASCII  "COD"
        .WORD   COLON-6
CODE    MOVE.W  #16,BASE                ; change BASE to hex
        TAS     FCODE                   ; set code flag
        RTS

        .BYTE   129
        .ASCII  ";"
        .WORD   0
        .WORD   CODE-6
SEMI    MOVE.W  #10,BASE                ; change BASE to decimal
        CLR.B   FCOLON                  ; clear colon flag
        CLR.B   FCODE                   ; clear code flag
        MOVE.W  #04E75H,(A5)+           ; compile "RTS"
        RTS

        .BYTE   1
        .ASCII  ","
        .WORD   0
        .WORD   SEMI-6
COMMA   MOVE.W  (A6)+,(A5)+             ; pop number to dictionary
        RTS

        .BYTE   7
        .ASCII  "LIT"
        .WORD   COMMA-6
LITERAL MOVE.W  #03D3CH,(A5)+           ; compile literal code
        JSR     COMMA                   ; compile constant
        RTS

        .BYTE   11
        .ASCII  "INT"                   ; INTERACTIVE
        .WORD   LITERAL-6
INTRACTV TAS    FINT                    ; set interactive mode
        RTS

        .BYTE   4
        .ASCII  "BAS"
        .WORD   INTRACTV-6
BASECODE LEA    BASE,A0
        MOVE.W  A0,-(A6)                ; push BASE address
        RTS

        .BYTE   6
        .ASCII  "CBL"
        .WORD   BASECODE-6
CBLCODE LEA     CBLOCK,A0
        MOVE.W  A0,-(A6)                ; push CBLOCK address
        RTS

        .BYTE   5
        .ASCII  "EDB"
        .WORD   CBLCODE-6
EDBCODE LEA     DISKBUF,A0              ; get edit buffer address
        MOVE.W  A0,-(A6)                ; push it
        RTS

        .BYTE   4
        .ASCII  "DIC"
        .WORD   EDBCODE-6
DICTCODE LEA    DICT,A0                 ; get dictionary address
        MOVE.W  A0,-(A6)                ; push it
        RTS

        .BYTE   4
        .ASCII  "LOA"
        .WORD   DICTCODE-6
LOAD    GETBLOCK                        ; system dependent macro
        RTS


.BYTE 

.ASCII 

.WORD 

SAVBLOCK 

RTS 


leave  interactive  mode 
restart  input  sequence 


;  system  dependent  macro 


        .BYTE   4
        .ASCII  "TYP"
        .WORD   SAVE-6
TYPE    MOVE.W  (A6)+,D1                ; get character count
        SUBQ.B  #1,D1
        MOVEA.W (A6)+,A0                ; get buffer address
                                        ; DO
PUT     MOVE.B  (A0)+,D0                ; send buffer character
                                        ; to terminal
                                        ; UNTIL exhausted



        .BYTE   1
        .ASCII  "."
        .WORD   0
        .WORD   TYPE-6
PRINT   MOVEQ   #13,D0                  ; send CR
        TPUT
        MOVEQ   #10,D0
        TPUT
        MOVEA.W A5,A0                   ; get buffer pointer
        MOVE.L  BLANK,(A0)+
        MOVE.L  BLANK,(A0)+             ; zero out buffer
        MOVE.L  BLANK,(A0)+
        MOVE.L  ZERO,(A0)+
        MOVE.W  (A6)+,D2                ; pop number
        BGE     DLOOP                   ; if negative
        NEG.W   D2                      ; make positive
        TAS     FNEG                    ; set negative flag
DLOOP   ANDI.L  #65535,D2               ; while quotient not 0 do
        BEQ     TSTMINUS                ; clear remainder
DIV     DIVS    BASE,D2                 ; divide by base
        MOVE.L  D2,D0                   ; (do dirty work in D0)
        SWAP    D0                      ; get remainder
        CMPI.B  #10,D0
        BLT     PREFIX                  ; convert to digit
        ADDQ.B  #7,D0
PREFIX  ADDI.B  #48,D0                  ; make ASCII prefix
        MOVE.B  D0,-(A0)                ; place digit in buffer
        BRA     DLOOP
TSTMINUS BCLR   #7,FNEG                 ; test and clear negative flag
        BEQ     PRNT                    ; if set
        MOVE.B  #45,-(A0)               ; put minus sign in buffer
PRNT    MOVE.W  A5,-(A6)                ; push buffer address
        MOVE.W  #16,-(A6)               ; push buffer length
        JSR     TYPE                    ; type number
        RTS

        .BYTE   2
        .ASCII  ".S"                    ; .S
        .BYTE   0
        .WORD   PRINT-6
SPRINT  LEA     UNDRFLW,A1              ; get address of bottom
        MOVEA.W A6,A2
BOTTOM  CMPA.W  A1,A2                   ; while above bottom do
        BEQ     DONE
        MOVE.W  (A2)+,-(A6)             ; push next number
        JSR     PRINT                   ; print it
        BRA     BOTTOM
DONE    RTS

        .BYTE   129
        .ASCII  "("                     ; "("
        .WORD   0
        .WORD   SPRINT-6
CMMNT   MOVE.B  (A4)+,D0                ; get character from input buffer
        CMPI.B  #41,D0                  ; until ")"
        BNE     CMMNT
        RTS

        .BYTE   4
        .ASCII  "QUI"                   ; QUIT
        .WORD   CMMNT-6
QUIT    XSTOP                           ; return to monitor
        RTS

LAST    .BYTE   6
        .ASCII  "LOG"                   ; LOGOFF
        .WORD   QUIT-6
        LEA     BUFPNT,A0               ; save dictionary pointer
        MOVE.W  A5,(A0)
        RTS

BUFPNT  .WORD   BUFFER
BUFFER  .WORD

End  Listing  Two 
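
Two details of Listing Two are easier to follow outside assembly: the 4-byte "stem" that SEARCH compares, and the digit checks that NUMBER makes. The Python sketch below is an illustration only, not part of FLINT. The names make_stem and number are hypothetical, and returning None stands in for the listing's branch to FAIL. One assumption: an empty token simply yields 0 here, whereas TOKEN in the listing never returns a zero-length token.

```python
def make_stem(name, immediate=False):
    """Pack a word name into the 4-byte stem used in the dictionary headers:
    a length byte followed by the first three characters, zero padded.
    Bit 7 of the length byte is the precedence ("immediate") bit."""
    length = len(name) | (0x80 if immediate else 0)
    chars = name[:3].encode("ascii").ljust(3, b"\x00")
    return bytes([length]) + chars


def number(token, base=10):
    """Mirror the checks in NUMBER: return the unsigned 16-bit value of
    `token` in `base`, or None where the listing branches to FAIL."""
    value = 0
    for ch in token:
        digit = ord(ch) - 48             # strip ASCII prefix
        if digit < 0:
            return None                  # character below "0"
        if digit >= 10:
            digit -= 7                   # map "A", "B", ... onto 10, 11, ...
            if digit < 10:
                return None              # ":" through "@" are not digits
        if digit >= base:
            return None                  # digit too large for this base
        value = value * base + digit
        if value > 0xFFFF:
            return None                  # result does not fit in 16 bits
    return value


if __name__ == "__main__":
    print(make_stem("CRLF"))             # matches .BYTE 4 / .ASCII "CRL"
    print(make_stem("CODE", immediate=True))
    print(number("4E75", 16), number("65536"))
```

Because only the length and the first three characters are stored, two names of equal length that share their first three letters would collide in this scheme.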

Listing  Three 

( Listing Three )
( "inner shell" words for FLINT )

( The "->" symbol when used in a comment signifies that the )
( instruction corresponding to the preceding assembler )
( mnemonic will be written in the dictionary at execution time. )

: CONSTANT  ( n --   creates a constant )
    TOKEN HEADER LITERAL CODE 3AFC 4E75 ( rts -> ) ;

: CREATE  ( --   creates the header and code body for a variable )
    TOKEN HEADER CODE
    2AFC 41FA 0006  ( lea 6[a5],a0 -> )
    3AFC 3D08       ( move.w a0,-[a6] -> )
    3AFC 4E75       ( rts -> ) ;

: ALLOT  ( n --   used after CREATE to allocate space for a )
         (        variable or an array )
    CODE DADE ( ADDA.W [A6]+,A5 ) ;

: VARIABLE  ( --   creates a variable )
    CREATE 2 ALLOT ;

: @  ( a -- n   "fetch" - replaces an address with its value )
    CODE 305E ( MOVEA.W [A6]+,A0 )
         3D10 ( MOVE.W [A0],-[A6] ) ;



: !  ( n a -- _   stores a word length value in the address )
    CODE 305E ( MOVEA.W [A6]+,A0 )
         309E ( MOVE.W [A6]+,[A0] ) ;

: !BYTE  ( n a -- _   stores a byte length value in the address )
    CODE 305E ( MOVEA.W [A6]+,A0 )
         4A1E ( TST.B [A6]+ )
         109E ( MOVE.B [A6]+,[A0] ) ;

: HEX  ( --   changes the system base to 16 )
    16 BASE ! ;

: DECIMAL  ( --   changes the system base to 10 )
    10 BASE ! ;

: SWAP  ( n1 n2 -- n2 n1 )
    CODE 221E ( MOVE.L [A6]+,D1 )
         4841 ( SWAP D1 )
         2D01 ( MOVE.L D1,-[A6] ) ;

: DUP  ( n -- n n )
    CODE 3D16 ( MOVE.W [A6],-[A6] ) ;

: ?  ( n --   tests the top value, drops it, and sets CCs )
    CODE 4A5E ( TST.W [A6]+ ) ;

: \  ( n --   synonym for "?" used to emphasize the drop )
    CODE 4A5E ( TST.W [A6]+ ) ;

: OVER  ( n1 n2 -- n1 n2 n1 )
    CODE 3D2E 0002 ( MOVE.W 2[A6],-[A6] ) ;

: 2DUP  ( n1 n2 -- n1 n2 n1 n2 )
    OVER OVER ;

: >R  ( n --   removes a value from the parameter stack )
      (        and places it on the return stack )
    CODE 221F ( MOVE.L [A7]+,D1 )
         3F1E ( MOVE.W [A6]+,-[A7] )
         2F01 ( MOVE.L D1,-[A7] ) ;

: <R  ( -- n   removes a value from the return stack )
      (        and places it on the parameter stack )
    CODE 221F ( MOVE.L [A7]+,D1 )
         3D1F ( MOVE.W [A7]+,-[A6] )
         2F01 ( MOVE.L D1,-[A7] ) ;

: ROT  ( n1 n2 n3 -- n2 n3 n1 )
    >R SWAP <R SWAP ;

: +  ( n1 n2 -- n1+n2 )
    CODE 321E ( MOVE.W [A6]+,D1 )
         D356 ( ADD.W D1,[A6] ) ;

: ~  ( n -- -n )
    CODE 4456 ( NEG.W [A6] ) ;

: -  ( n1 n2 -- n1-n2 )

: *  ( n1 n2 -- n1*n2 )
    CODE 321E ( MOVE.W [A6]+,D1 )
         C3DE ( MULS [A6]+,D1 )
         3D01 ( MOVE.W D1,-[A6] ) ;

: /MOD  ( n1 n2 -- n1/n2 n1 mod n2 )
    CODE 321E ( MOVE.W [A6]+,D1 )
         341E ( MOVE.W [A6]+,D2 )
         48C2 ( EXT.L D2 )
         85C1 ( DIVS D1,D2 )
         3D02 ( MOVE.W D2,-[A6] )
         4842 ( SWAP D2 )
         3D02 ( MOVE.W D2,-[A6] ) ;

: /  ( n1 n2 -- n1/n2 )
    /MOD \ ;

: MOD  ( n1 n2 -- n1 mod n2 )
    /MOD SWAP \ ;

: 0<  ( n -- f   returns a true flag if n < 0 )
    CODE 4A56      ( TST.W [A6] )
         6D04      ( BLT 4[PC] )
         4256      ( CLR.W [A6] )
         4E75      ( RTS )
         3CBC FFFF ( MOVE.W #-1,[A6] ) ;

: 0>  ( n -- f   returns a true flag if n > 0 )
    ~ 0< ;

: <  ( n1 n2 -- f   returns a true flag if next < top )
    - 0< ;

: >  ( n1 n2 -- f   returns a true flag if next > top )
    - 0> ;

: CGET  ( -- n   gets a character from the terminal and )
        (        places its ASCII value on the stack )
    CODE 4EB9 00FE 0008 ( TGET )
         3D00           ( MOVE.W D0,-[A6] ) ;

: EMIT  ( n --   takes an ASCII value from the stack and )
        (        sends it to the terminal )
    CODE 301E           ( MOVE.W [A6]+,D0 )
         4EB9 00FE 0014 ( TPUT ) ;

: CLEAR  ( --   erases the screen )
    26 EMIT ;

:        ( --   increments CBLOCK; loads and executes )
         (      the new block [allows chaining] )



    CBLOCK @ 2 + CBLOCK ! LOAD GO ;

: ?>  ( -- )
    CODE 3AFC 4A5E ( tst.w [a6]+ -> ) ;

: BRA>  ( -- )
    CODE 3AFC 6000 ( bra -> ) ;

: BEQ>  ( -- )
    CODE 3AFC 6700 ( beq -> ) ;

: BNE>  ( -- )
    CODE 3AFC 6600 ( bne -> ) ;

: MARK  ( -- a   pushes the contents of the dictionary pointer )
    CODE 3D0D ( MOVE.W A5,-[A6] ) ;

: SPLIT  ( -- a )
    ?> BEQ> MARK 2 ALLOT ;

: JOIN  ( -- )
    DUP MARK SWAP - SWAP ! ;

: IF  ( :  _ -- a    x f -- _ )
    SPLIT ; IMMEDIATE

: THEN  ( :  a -- _    x -- )
    JOIN ; IMMEDIATE

: ELSE  ( :  a1 -- a2    x -- )
    BRA> MARK 2 ALLOT SWAP JOIN ; IMMEDIATE

: DO  ( :  -- a    x -- )
    MARK ; IMMEDIATE

: UNTIL  ( :  a --    x f -- )
    ?> BNE> MARK - , ; IMMEDIATE

: WHILE  ( :  _ -- a    x f -- _ )
    SPLIT ; IMMEDIATE

: LOOP  ( :  a1 a2 --    x -- )
    BRA> SWAP MARK - , JOIN ; IMMEDIATE

: '  ( -- a   pushes the address of the token which follows ' )
    TOKEN DICT @ SEARCH IF ELSE WHAZZAT THEN ;

: FORGET  ( --   erases the dictionary entries for the token following )
          (      FORGET as well as all tokens which succeed it in the )
          (      dictionary )
    ' DUP 2 - @ DICT ! 6 - CODE 3A5E ( MOVEA.W [A6]+,A5 ) ;

End  Listings 
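
The control-structure words at the end of Listing Three rely on back-patching. SPLIT compiles a conditional branch with an empty 16-bit displacement and leaves the address of that gap on the stack, and JOIN later stores into the gap the distance from it to the current dictionary pointer. The Python sketch below models only that mechanism. The Dictionary class and its method names are hypothetical, and the reserved cell is written as 0, whereas 2 ALLOT in FLINT merely skips over it.

```python
class Dictionary:
    """A toy dictionary: a list of 16-bit cells; addresses are byte offsets."""

    def __init__(self):
        self.cells = []

    def mark(self):
        """MARK: byte address of the next free cell."""
        return 2 * len(self.cells)

    def comma(self, word):
        """",": append one 16-bit cell."""
        self.cells.append(word & 0xFFFF)

    def split(self):
        """SPLIT: compile TST.W (A6)+ and BEQ, reserve one cell for the
        16-bit displacement, and return the address of that cell."""
        self.comma(0x4A5E)               # tst.w (a6)+
        self.comma(0x6700)               # beq, 16-bit displacement follows
        hole = self.mark()
        self.comma(0)                    # stands in for "2 ALLOT"
        return hole

    def join(self, hole):
        """JOIN: store (current address - hole address) into the hole."""
        self.cells[hole // 2] = (self.mark() - hole) & 0xFFFF


if __name__ == "__main__":
    d = Dictionary()
    hole = d.split()                     # what IF does at compile time
    d.comma(0x4E71)                      # two NOPs as the conditional body
    d.comma(0x4E71)
    d.join(hole)                         # what THEN does at compile time
    print([f"{c:04X}" for c in d.cells])
```

The displacement of a 68000 branch is measured from the address of the displacement word itself, which is why JOIN can use the simple difference between the two addresses.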



MAC  AND  AMIGA 

Listing  One  (Text  begins  on  page  40.) 

  1         Include  MacTraps.D
  2         Include  ToolEqu.D
  3         Include  SysEqu.D
  4         Include  QuickEqu.D
  5         PEA      -4(A5)                  ;initialize managers
  6         _InitGraf
  7         _InitFonts
  8         MOVE.L   #$0000FFFF,D0
  9         _FlushEvents
 10         _InitWindows
 11         _InitMenus
 12         CLR.L    -(SP)
 13         _InitDialogs
 14         _InitCursor
 15         CLR      -(SP)
 16         PEA      'DrD.Rsrc'
 17         _OpenResFile                     ;open resource file
 18         MOVE     (SP)+,D0                ;discard unused result
 19         CLR.L    -(SP)                   ;space for handle
 20         MOVE     #1,-(SP)                ;menu ID
 21         _GetRMenu                        ;get Apple menu template
 22         MOVE.L   (SP)+,AppleHandle(A5)   ;retrieve & store handle
 23         MOVE.L   AppleHandle(A5),-(SP)   ;put handle back on stack
 24         MOVE.L   #'DRVR',-(SP)           ;res type for desk accs
 25         _AddResMenu                      ;get desk accessories
 26         MOVE.L   AppleHandle(A5),-(SP)
 27         CLR      -(SP)                   ;put menu after all others
 28         _InsertMenu                      ;put menu in menu list
 29         CLR.L    -(SP)                   ;repeat for other menus
 30         MOVE     #2,-(SP)
 31         _GetRMenu
 32         MOVE.L   (SP)+,FileHandle(A5)
 33         MOVE.L   FileHandle(A5),-(SP)
 34         CLR      -(SP)
 35         _InsertMenu
 36         CLR.L    -(SP)
 37         MOVE     #3,-(SP)
 38         _GetRMenu
 39         MOVE.L   (SP)+,EditHandle(A5)
 40         MOVE.L   EditHandle(A5),-(SP)
 41         CLR      -(SP)
 42         _InsertMenu
 43         _DrawMenuBar
;----- Open the window with a text edit record -----
 44         CLR.L    -(SP)                   ;space for window pointer
 45         MOVE     #1,-(SP)                ;window ID
 46         PEA      WindowStorage(A5)       ;window storage
 47         MOVE.L   #-1,-(SP)               ;put window in front
 48         _GetNewWindow
 49         MOVE.L   (SP)+,WindowPtr(A5)
 50         MOVE.L   WindowPtr(A5),-(SP)
 51         _SetPort                         ;makes window current grafport
 52         CLR.L    -(SP)                   ;place for text handle
 53         PEA      DestRect                ;destination rectangle
 54         PEA      ViewRect                ;view rectangle
 55         _TENew                           ;establish text edit record
 56         MOVE.L   (SP)+,TextHandle(A5)    ;get handle
 57         MOVE.L   TextHandle(A5),-(SP)
 58         _TEActivate                      ;make text edit record active
;----- Event loop begins here -----
Event
 59         _SystemTask                      ;update desk accessories
 60         MOVE.L   TextHandle(A5),-(SP)
 61         _TEIdle                          ;make cursor blink
 62         CLR      -(SP)                   ;space for boolean result
 63         MOVE     #-1,-(SP)               ;mask to select all events
 64         PEA      EventRecord(A5)         ;pointer to event record
 65         _GetNextEvent                    ;get event from queue
 66         MOVE     (SP)+,D0                ;retrieve boolean result
 67         BEQ      Event                   ;no event
 68         MOVE     EventRecord(A5),D0      ;get event type


 69         CMP      #mButDwnEvt,D0          ;mouse event?
 70         BEQ      MouseEvent
 71         CMP      #keyDwnEvt,D0           ;key pressed?
 72         BEQ      KeyEvent
 73         CMP      #upDatEvt,D0
 74         BEQ      Update                  ;refresh?
 75         BRA      Event                   ;no desired event posted
KeyEvent
 76         MOVE     EventRecord+evtMeta(A5),D0
 77         BTST     #cmdKey,D0              ;command key pressed?
 78         BNE      KeyboardEquivalent
 79         MOVE     EventRecord+evtMessage+2(A5),-(SP)  ;character pressed
 80         MOVE.L   TextHandle(A5),-(SP)
 81         _TEKey                           ;insert character
 82         BRA      Event
KeyboardEquivalent
 83         CLR.L    -(SP)                   ;place for menu ID & item number
 84         MOVE     EventRecord+evtMessage+2(A5),-(SP)  ;character
 85         _MenuKey                         ;which menu?
 86         BRA      Selections
Update
 87         MOVE.L   WindowPtr(A5),-(SP)
 88         _BeginUpdate
 89         MOVE.L   WindowPtr(A5),-(SP)
 90         _SetPort
 91         PEA      ViewRect
 92         MOVE.L   TextHandle(A5),-(SP)
 93         _TEUpdate
 94         MOVE.L   WindowPtr(A5),-(SP)
 95         _EndUpdate
 96         BRA      Event
;----- Handle mouse down events -----
MouseEvent
 97         CLR      -(SP)                   ;space for "what" result
 98         MOVE.L   EventRecord+evtMouse(A5),-(SP)  ;place where event occurred
 99         PEA      WhichWindowPtr(A5)      ;window affected goes here
100         _FindWindow                      ;get exact location of event
101         MOVE     (SP)+,D0                ;recover result
102         CMP      #inMenuBar,D0           ;in menu bar?
103         BEQ      MenuBar
104         CMP      #inSysWindow,D0         ;in desk accessory?
105         BEQ      SysEvent
106         CMP      #inContent,D0           ;in text edit area?
107         BEQ      ApplWindow
108         CMP      #inGoAway,D0            ;in close box?
109         BEQ      GoAwayBox
110         BRA      Event                   ;not an event this program handles
;----- Handle events in system windows -----
SysEvent
111         PEA      EventRecord(A5)
112         MOVE.L   WhichWindowPtr(A5),-(SP)  ;window posting event
113         _SystemClick                     ;let system handle it
114         BRA      Event
;----- Handle events in content area of window -----
ApplWindow
115         PEA      EventRecord+evtMouse(A5)  ;event location
116         _GlobalToLocal                   ;convert coordinates
117         MOVE.L   EventRecord+evtMouse(A5),-(SP)  ;coordinates now local
118         MOVE     EventRecord+evtMeta(A5),D0
119         BTST.L   #shiftKey,D0            ;extended selection?
120         SNE      D0
121         MOVE.B   D0,-(SP)                ;extend or not extend
122         MOVE.L   TextHandle(A5),-(SP)
123         _TEClick                         ;set selection range
124         BRA      Event



;----- Handle events in menu bar -----
MenuBar
125         CLR.L    -(SP)                   ;space for menu ID & item
126         MOVE.L   EventRecord+evtMouse(A5),-(SP)  ;place where event occurred
127         _MenuSelect                      ;get menu ID & item
Selections
128         MOVE.L   (SP)+,D7                ;recover result
129         MOVE     D7,D6                   ;D6 now has menu item
130         SWAP     D7                      ;low-order word has menu ID
131         CLR      -(SP)                   ;selects all menus
132         _HiLiteMenu                      ;remove highlighting
133         CMP      #1,D7                   ;apple menu?
134         BEQ      AppleMenu
135         CMP      #2,D7                   ;file menu?
136         BEQ      FileMenu
137         CMP      #3,D7                   ;edit menu?
138         BEQ      EditMenu
139         BRA      Event
;----- Handle desk accessories -----
AppleMenu
140         MOVE.L   AppleHandle(A5),-(SP)
141         MOVE     D6,-(SP)                ;menu
142         PEA      DeskAccName(A5)         ;space for desk accessory name
143         _GetItem
144         CLR      -(SP)                   ;space for reference number
145         PEA      DeskAccName(A5)         ;desk accessory name
146         _OpenDeskAcc                     ;open the desk accessory
147         MOVE     (SP)+,D0                ;discard result
148         BRA      Event
;----- Handle editing -----
EditMenu
149         SUBQ     #1,D6                   ;adjust item selected for SysEdit
150         CLR      -(SP)                   ;space for result
151         MOVE     D6,-(SP)                ;adjusted item number
152         _SysEdit
153         MOVE     (SP)+,D0                ;get result
154         BNE      Event                   ;system handled edit
155         ADDQ     #1,D6                   ;restore item number
156         CMP      #3,D6                   ;cut?
157         BNE      EditMenu2
158         MOVE.L   TextHandle(A5),-(SP)
159         _TECut
160         BRA      Event
EditMenu2
161         CMP      #4,D6
162         BNE      EditMenu3
163         MOVE.L   TextHandle(A5),-(SP)
164         _TECopy
165         BRA      Event
EditMenu3
166         CMP      #5,D6
167         BNE      EditMenu4
168         MOVE.L   TextHandle(A5),-(SP)
169         _TEPaste
170         BRA      Event
EditMenu4
171         CMP      #6,D6
172         BNE      Event
173         MOVE.L   TextHandle(A5),-(SP)
174         _TEDelete
175         BRA      Event
FileMenu
176         CMP      #2,D6                   ;close selected?
177         BEQ      CloseAndQuit
178         CMP      #4,D6
179         BEQ      CloseAndQuit            ;quit selected
;!!!!!all other file menu options are not implemented in this program!!!!
180         BRA      Event
;----- Close the window and quit -----
GoAwayBox



181         CLR.B    -(SP)                   ;space for boolean result
182         MOVE.L   WhichWindowPtr(A5),-(SP)  ;window pointer
183         MOVE.L   EventRecord+evtMouse(A5),-(SP)  ;point of event
184         _TrackGoAway                     ;monitor GoAway box
185         MOVE.B   (SP)+,D0                ;get result
186         BEQ      Event                   ;don't close
CloseAndQuit
187         MOVE.L   TextHandle(A5),-(SP)
188         _TEDispose                       ;close text edit record
189         MOVE.L   WindowPtr(A5),-(SP)
190         _CloseWindow                     ;close the window
191         RTS                              ;return to Finder
192 AppleHandle     DS.L  1
193 FileHandle      DS.L  1
194 EditHandle      DS.L  1
195 WindowPtr       DS.L  1
196 WindowStorage   DS    WindowSize
197 TextHandle      DS.L  1
198 ViewRect        DC    3,3,300,490
199 DestRect        DC    3,3,300,490
200 EventRecord     DS.B  16
201 WhichWindowPtr  DS.L  1
202 DeskAccName     DS    16

End  Listing  One 
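
The Selections code in Listing One depends on the way the Menu Manager packs its answer: MenuSelect and MenuKey return one long word with the menu ID in the high-order word and the item number in the low-order word, and a menu ID of 0 means nothing was chosen. The Python sketch below restates that unpacking. The function names and the MENUS table are hypothetical and simply reuse the resource IDs 1, 2 and 3 from the listing.

```python
# Menu resource IDs as used in the listing (Apple, File, Edit).
MENUS = {1: "Apple", 2: "File", 3: "Edit"}


def split_menu_result(result):
    """Split a 32-bit menu result into (menu_id, item)."""
    menu_id = (result >> 16) & 0xFFFF    # high-order word
    item = result & 0xFFFF               # low-order word
    return menu_id, item


def dispatch(result):
    """Return (menu name, item) for a known menu, or None otherwise,
    much as the listing falls through to BRA Event."""
    menu_id, item = split_menu_result(result)
    if menu_id not in MENUS:
        return None
    return MENUS[menu_id], item


if __name__ == "__main__":
    print(dispatch(0x00030004))          # Edit menu, item 4
    print(dispatch(0))                   # nothing selected
```

In the assembly the same split costs one MOVE and one SWAP, because SWAP exchanges the two halves of a data register.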

Listing  Two 

  1  DrD.Rsrc
  2  TYPE MENU             ;menu templates follow
  3  ,1                    ;resource ID
  4  \14                   ;will create Apple icon for title
  5  ,2                    ;resource ID
  6  File                  ;menu title
  7  New/N                 ;all the rest are menu items
  8  Open/O
  9  Close/W
 10  Save As...
 11  Save/S
 12  Page Setup...
 13  Print/P
 14  Quit/Q
 15  ,3                    ;resource ID
 16  Edit                  ;menu title
 17  Undo/Z
 18  (-                    ;straight line - disabled
 19  Cut/X
 20  Copy/C
 21  Paste/V
 22  Clear
 23  TYPE WIND             ;window templates follow
 24  ,1                    ;resource ID
 25  Dr. Dobb's Journal    ;window title
 26  50 10 310 502         ;initial coordinates
 27  Visible GoAway        ;make window visible, draw GoAway box
 28  0                     ;window type (standard document window)
 29  0                     ;optional reference number

End  Listing  Two 

Listing  Three 

  1          include  "exec/types.i"
  2          include  "exec/exec.i"
  3          include  "intuition/intuition.i"
  4 callsys  macro
  5          CALLLIB  _LVO\1                  ;calls a system routine
  6          endm
  7 xlib     macro
  8          xref     _LVO\1                  ;for library routines
  9          endm


passtext macro
        lea     \1,A0                   ;pointer to Intuition text structure
        lea     \2,A1                   ;pointer to text
        jsr     SetText                 ;initializes text structure
        endm

passitem macro
        lea     \1,A0                   ;pointer to menu item structure
        move.l  \2,A1                   ;pointer to next menu item in list
        lea     \3,A2                   ;pointer to text structure
        move.b  \4,D0                   ;keyboard equivalent
        move    \5,D1                   ;offset from top of menu item box
        jsr     SetItem                 ;initializes menu item structure
        endm

        xlib    AllocSignal             ;external refs for all system
        xlib    AllocMem                ;routines that the program will call
        xlib    FreeSignal
        xlib    AddPort
        xlib    NewList
        xlib    FindTask
        xlib    OpenLibrary
        xlib    OpenWindow
        xlib    SetMenuStrip
        xlib    OpenDevice
        xlib    DoIO
        xlib    SendIO
        xlib    Wait
        xlib    GetMsg
        xlib    ReplyMsg
        xlib    CloseDevice
        xlib    CloseWindow
        xlib    CloseScreen
        xref    _AbsExecBase            ;exec's base is fixed

FrontPen equ    0
BackPen  equ    1                       ;for rendering window and text

; ------ Open Intuition library ------
        move.l  _AbsExecBase,A6
        lea     IntName,A1              ;name of library to open
        move.l  #0,D0
        callsys OpenLibrary
        bne     Continue
        rts                             ;unsuccessful opening ends program
Continue clr.l  IntBase
        move.l  D0,IntBase              ;save base of Intuition library

        lea     TheScreen,A0            ;pointer to screen data structure
        move    #0,ns_LeftEdge(A0)      ;coordinates of screen
        move    #0,ns_TopEdge(A0)
        move    #320,ns_Width(A0)
        move    #200,ns_Height(A0)
        move    #2,ns_Depth(A0)         ;graphics depth
        move.b  #0,ns_DetailPen(A0)     ;color for details
        move.b  #1,ns_BlockPen(A0)      ;color for rest of drawing
        move    #0,ns_ViewModes(A0)
        move    #CUSTOMSCREEN,ns_Type(A0)
        move.l  #0,ns_Font(A0)          ;use default font
        lea     ScreenTitle,A1
        move.l  A1,ns_DefaultTitle(A0)
        move.l  #0,ns_Gadgets(A0)       ;no special gadgets attached
        move.l  IntBase,A6
        callsys OpenScreen
        move.l  D0,ScreenPtr            ;results almost always come back in D0

        lea     TheWindow,A0            ;pointer to window data structure
        move    #20,nw_LeftEdge(A0)     ;initial coordinates
        move    #20,nw_TopEdge(A0)
        move.b  #0,nw_DetailPen(A0)     ;color for characters
        move.b  #1,nw_BlockPen(A0)      ;color for rest of drawing
        lea     WindowTitle,A1
        move.l  A1,nw_Title(A0)
        move.l  #WINDOWCLOSE+SMART_REFRESH+ACTIVATE+WINDOWDRAG+WINDOWSIZING+WINDOWDEPTH,nw_Flags(A0)
                                        ;system gadgets, etc.
        move.l  #CLOSEWINDOW+MENUPICK,nw_IDCMPFlags(A0)
                                        ;events to be reported
        move    #CUSTOMSCREEN,nw_Type(A0)
        move.l  #0,nw_FirstGadget(A0)   ;no special gadgets attached
        move.l  #0,nw_CheckMark(A0)     ;not using checked menu items
        move    #150,nw_Height(A0)      ;initial
        move    #280,nw_Width(A0)       ;initial
        move    #100,nw_MinWidth(A0)    ;since window can be sized
        move    #25,nw_MinHeight(A0)
        move    #640,nw_MaxWidth(A0)
        move    #200,nw_MaxHeight(A0)
        move.l  ScreenPtr,nw_Screen(A0)
        move.l  IntBase,A6


        callsys OpenWindow
        lea     WindowPtr,A0
        move.l  D0,(A0)

        passtext ProjText1,Proj1        ;First must initialize all
        passtext ProjText2,Proj2        ;Intuition text structures.
        passtext ProjText3,Proj3
        passtext ProjText4,Proj4
        passtext ProjText5,Proj5
        passtext ProjText6,Proj6
        passtext ProjText7,Proj7
        lea     ProjItem2,A3            ;Then must include the text
        passitem ProjItem1,A3,ProjText1,#'N',#0
        lea     ProjItem3,A3            ;in menu item structures.
        passitem ProjItem2,A3,ProjText2,#'O',#9
        lea     ProjItem4,A3
        passitem ProjItem3,A3,ProjText3,#'S',#18
        lea     ProjItem5,A3
        passitem ProjItem4,A3,ProjText4,#'A',#27
        lea     ProjItem6,A3
        passitem ProjItem5,A3,ProjText5,#'P',#36
        lea     ProjItem7,A3
        passitem ProjItem6,A3,ProjText6,#'R',#45
        passitem ProjItem7,#0,ProjText7,#'Q',#54
        lea     ProjMenu,A0             ;Finally, must initialize the
        lea     EditMenu,A1             ;menu structure itself.
        move.l  A1,mu_NextMenu(A0)      ;pointer to next menu in list
        move    #0,mu_LeftEdge(A0)      ;place for title in menu strip
        move    #0,mu_TopEdge(A0)       ;ignored
        move    #0,mu_Height(A0)        ;ignored
        move    #100,mu_Width(A0)
        move    #MENUENABLED,mu_Flags(A0)       ;menu is enabled
        lea     ProjName,A1
        move.l  A1,mu_MenuName(A0)
        lea     ProjItem1,A1
        move.l  A1,mu_FirstItem(A0)     ;head of menu item list

        passtext EditText1,Edit1        ;Now, repeat process for
        passtext EditText2,Edit2        ;the second menu.
        passtext EditText3,Edit3
        passtext EditText4,Edit4
        passtext EditText5,Edit5
        lea     EditItem2,A3
        passitem EditItem1,A3,EditText1,#'Z',#0
        lea     EditItem3,A3
        passitem EditItem2,A3,EditText2,#'X',#9
        lea     EditItem4,A3
        passitem EditItem3,A3,EditText3,#'C',#18
        lea     EditItem5,A3
        passitem EditItem4,A3,EditText4,#'V',#27
        passitem EditItem5,#0,EditText5,#'D',#36
        lea     EditMenu,A0
        move.l  #0,mu_NextMenu(A0)      ;end of the list
        move    #101,mu_LeftEdge(A0)
        move    #0,mu_TopEdge(A0)
        move    #75,mu_Width(A0)
        move    #0,mu_Height(A0)
        move    #MENUENABLED,mu_Flags(A0)
        lea     EditName,A1
        move.l  A1,mu_MenuName(A0)
        lea     EditItem1,A1
        move.l  A1,mu_FirstItem(A0)
        move.l  IntBase,A6
        move.l  WindowPtr,A0            ;window in question
        lea     ProjMenu,A1             ;first menu in strip
        callsys SetMenuStrip            ;attach menu strip to window

        lea     WritePort,A3            ;storage for pointer to write port
        move.l  #0,A5                   ;unnamed ports - first in list
        jsr     CreatePort              ;initialize the port
        move.l  WritePort,A3            ;write port pointer
        lea     WriteMsg,A5             ;storage for pointer to IO block
        jsr     CreateStdIO             ;initialize IO block
        lea     ReadPort,A3             ;Repeat for read port.
        lea     ReadName,A5             ;has name - not first in list
        jsr     CreatePort
        move.l  ReadPort,A3
        lea     ReadMsg,A5
        jsr     CreateStdIO

        move.l  WriteMsg,A3             ;output IO request block
        move.l  ReadMsg,A5              ;input IO request block


        move.l  WindowPtr,A4            ;window for this console
        jsr     OpenConsole
        cmp     #0,D0
        beq     GoOn
        rts                             ;unsuccessful opening ends program
GoOn    move.l  WindowPtr,A0
        move.l  wd_UserPort(A0),A0      ;message port for Intuition
        move.b  MP_SIGBIT(A0),D0        ;Intuition signal bit
        lea     IntSigBit,A0
        move.b  D0,(A0)                 ;save it
        move.l  ReadPort,A0
        move.b  MP_SIGBIT(A0),D0        ;console signal bit
        lea     ConSigBit,A0
        move.b  D0,(A0)                 ;save it

        move.l  ReadMsg,A1              ;console IO request block
        lea     letter,A4               ;place to put character read
        jsr     QueueRead               ;queue up a message

; ------ Wait for Intuition or console event ------
Event   clr.l   D1
        move.b  IntSigBit,D1
        clr.l   D0
        bset.l  D1,D0                   ;sys will look for Intuition event
        clr.l   D1
        move.b  ConSigBit,D1
        bset.l  D1,D0                   ;system looks for console evt, too
        move.l  _AbsExecBase,A6
        callsys Wait                    ;wait for event to occur
        clr.l   D1                      ;Note - now a bit in D0 is set
        move.b  IntSigBit,D2            ;to correspond to signal causing
        bset.l  D2,D1                   ;event.
        cmp.l   D1,D0                   ;Intuition event?
        beq     IntuitionEvent
        clr.l   D1
        move.b  ConSigBit,D2
        bset.l  D2,D1
        cmp.l   D1,D0                   ;console event?
        beq     ConsoleEvent
        bra     Event                   ;fail-safe trap - should never get here

IntuitionEvent
        move.l  WindowPtr,A0
        move.l  wd_UserPort(A0),A0      ;Intuition's message port
        move.l  _AbsExecBase,A6
        callsys GetMsg                  ;retrieve the input message
        beq     Event                   ;no message present
        move.l  D0,A1                   ;GetMsg returns address of message in D0
        move.l  im_Class(A1),D0         ;type of event
        cmp     #CLOSEWINDOW,D0         ;was window close box clicked?
        beq     CloseAndQuit
        cmp     #MENUPICK,D0            ;menu choice made?
        beq     MenuEvent
DoneWithEvent
        move.l  _AbsExecBase,A6
        callsys ReplyMsg                ;remove message so it can be reused
        bra     Event

MenuEvent
        move    im_Code(A1),D0          ;menu and menu item number
        beq     DoneWithEvent           ;user backed out before choosing
        move    D0,D1                   ;save the code
        and     #%0000000000011111,D0   ;get menu number
        cmp     #0,D0                   ;project menu?
        bne     DoneWithEvent           ;Project menu is the only one
                                        ;trapped by this program!!!!
        lsr     #5,D1
        and     #%0000000000111111,D1   ;get menu item number
        cmp     #6,D1                   ;Quit selected?
        bne     DoneWithEvent           ;Quit is the only option
                                        ;implemented by this program!!!!!

CloseAndQuit
        move.l  ReadMsg,A1
        ABORTIO                         ;remove last event from queue
        move.l  ReadMsg,A1
        move.l  _AbsExecBase,A6
        callsys CloseDevice             ;close the console
        move.l  WindowPtr,A0
        move.l  IntBase,A6
        callsys CloseWindow             ;close the window


        move.l  ScreenPtr,A0
        move.l  IntBase,A6
        callsys CloseScreen             ;close the custom screen
        rts                             ;return to DOS

ConsoleEvent
        move.l  ReadMsg,A0
        move.l  _AbsExecBase,A6
        callsys GetMsg                  ;retrieve the message
        move.l  WriteMsg,A1             ;output IO request block
        lea     letter,A4               ;place where character is stored
        jsr     ConPutChar              ;display the character
        move.b  letter,D0
        cmp     #$D,D0                  ;was character a <cr>?
        bne     MoreLetters
        move.b  #$A,(A4)                ;put line feed in letter
        move.l  WriteMsg,A1
        jsr     ConPutChar              ;add a line feed to the <cr>
MoreLetters
        move.l  ReadMsg,A1
        move.l  DevAdd,IO_DEVICE(A1)
        jsr     QueueRead               ;go get another letter
        bra     Event

SetText move.b  #FrontPen,it_FrontPen(A0)       ;colors for drawing
        move.b  #BackPen,it_BackPen(A0)
        move.b  #0,it_DrawMode(A0)
        move    #2,it_LeftEdge(A0)      ;posn rel to container
        move    #1,it_TopEdge(A0)
        move.l  #0,it_ITextFont(A0)     ;use default font
        move.l  A1,it_IText(A0)         ;pointer to the text
        move.l  #0,it_NextText(A0)      ;no link to other txt structs
        rts

SetItem move.l  A1,mi_NextItem(A0)      ;pointer to next item in list
        move    #2,mi_LeftEdge(A0)      ;posn rel to container
        move    D1,mi_TopEdge(A0)
        move    #100,mi_Width(A0)
        move    #9,mi_Height(A0)
        move    #ITEMTEXT+COMMSEQ+ITEMENABLED+HIGHCOMP,mi_Flags(A0)
        move.l  #0,mi_MutualExclude(A0) ;no mutually exclusive items
        move.l  A2,mi_ItemFill(A0)      ;pointer to text structure
        move.b  D0,mi_Command(A0)       ;keyboard equivalent
        move.l  #0,mi_SubItem(A0)       ;no subitems
        move    #0,mi_NextSelect(A0)    ;no associated items
        rts

;NOTE - this subroutine is an assembly language version of the C source
;code provided in the Amiga ROM kernel manual.  It needs error trapping
;after the system calls to be complete.
;Load address of pointer to message port structure in A3.
;Load pointer to name for message port in A5.
CreatePort
        move    #-1,D0                  ;no preference for signal bit
        move.l  _AbsExecBase,A6
        callsys AllocSignal             ;allocate a signal bit for port
        move.b  D0,D7                   ;save signal bit
        clr.l   D1
        bset.l  #16,D1                  ;(clear)
        bset.l  #0,D1                   ;(public) requirements
        move    #MP_SIZE,D0             ;number of bytes needed
        move.l  _AbsExecBase,A6
        callsys AllocMem                ;memory for message port structure
        move.l  D0,(A3)                 ;save pointer to message port
        move.l  D0,A4
        move.l  #0,D0
        move.l  _AbsExecBase,A6
        callsys FindTask                ;initialize task control block
        move.l  A5,LN_NAME(A4)          ;port's name
        move.b  #0,LN_PRI(A4)           ;port's priority
        move.b  #NT_MSGPORT,LN_TYPE(A4) ;type of port
        move.b  #PA_SIGNAL,MP_FLAGS(A4)
        move.b  D7,MP_SIGBIT(A4)        ;signal bit
        move.l  D0,MP_SIGTASK(A4)       ;address of task ctrl block
        cmp.l   #0,A5                   ;is name specified?
        beq     Port2                   ;head of list of ports
        move.l  A4,A1
        move.l  _AbsExecBase,A6
        callsys AddPort                 ;add this port to list


        rts
Port2   lea     MP_MSGLIST(A4),A0
        NEWLIST A0                      ;initialize a new list of ports
        rts

;NOTE - this subroutine is an assembly language version of the C source
;code provided in the Amiga ROM kernel manual.  It needs error trapping
;after the system calls to be complete.
;Load pointer to message port in A3.
;Load address of pointer for standard IO structure in A5.
CreateStdIO
        clr.l   D1
        bset.l  #16,D1
        bset.l  #0,D1
        move.l  #IOSTD_SIZE,D0
        move.l  _AbsExecBase,A6
        callsys AllocMem                ;space for IO request block
        move.l  D0,(A5)                 ;save the pointer
        move.l  D0,A0
        move.b  #NT_MESSAGE,LN_TYPE(A0) ;type of structure
        move.b  #0,LN_PRI(A0)           ;priority
        move.l  A3,MN_REPLYPORT(A0)     ;address of message port
        rts

;NOTE - this subroutine is an assembly language version of the C source
;code provided in the Amiga ROM kernel manual.
;Load pointer to WriteMsg in A3.
;Load pointer to ReadMsg in A5.
;Load pointer to Window in A4.
OpenConsole
        move.l  A4,IO_DATA(A3)          ;pointer to window record
        move    #nw_SIZE,IO_LENGTH(A3)  ;size of window record
        lea     ConDev,A0               ;name of device
        move.l  #0,D0
        move.l  A3,A1
        move.l  #0,D1
        move.l  _AbsExecBase,A6
        callsys OpenDevice
        move.l  IO_DEVICE(A3),IO_DEVICE(A5)     ;save device pointers
        move.l  IO_DEVICE(A3),DevAdd
        move.l  IO_UNIT(A3),IO_UNIT(A5)
        rts

; ------ Subroutine to queue up a read request to the console ------
;NOTE - this subroutine is an assembly language version of the C source
;code provided in the Amiga ROM kernel manual.
;Load pointer to read message in A1.
;Load pointer to storage space for character in A4.
QueueRead
        move    #CMD_READ,IO_COMMAND(A1)        ;type of operation
        move.l  A4,IO_DATA(A1)          ;where data should be placed
        move.l  #1,IO_LENGTH(A1)        ;number of bytes to read
        move.l  _AbsExecBase,A6
        callsys SendIO
        rts

; ------ Subroutine to print a single character in a console window ------
;NOTE - this subroutine is an assembly language version of the C source
;code provided in the Amiga ROM kernel manual.
;Load pointer to write message in A1.
;Load pointer to character to be printed in A4.
ConPutChar
        move    #CMD_WRITE,IO_COMMAND(A1)       ;type of operation
        move.l  A4,IO_DATA(A1)          ;where data will come from
        move.l  #1,IO_LENGTH(A1)        ;number of bytes to output
        move.l  _AbsExecBase,A6
        callsys DoIO
        rts

;**************************** Data Structures *******************************
DevAdd      ds.l    1
TheScreen   ds.b    ns_SIZEOF
TheWindow   ds.b    nw_SIZE
IntBase     ds.l    1
IntName     dc.b    'intuition.library',0
WindowPtr   ds.l    1
WindowTitle dc.b    'Text Window',0
ScreenPtr   ds.l    1
ScreenTitle dc.b    'Dr. Dobbs Journal',0
ConDev      dc.b    'console.device',0,0
;NOTE - extra 0 included to keep addresses even
ProjMenu    ds.b    mu_SIZEOF
ProjName    dc.b    'Project',0



Proj1       dc.b    'New',0
ProjText1   ds.b    it_SIZEOF
ProjItem1   ds.b    mi_SIZEOF
Proj2       dc.b    'Open',0,0
ProjText2   ds.b    it_SIZEOF
ProjItem2   ds.b    mi_SIZEOF
Proj3       dc.b    'Save',0,0
ProjText3   ds.b    it_SIZEOF
ProjItem3   ds.b    mi_SIZEOF
Proj4       dc.b    'Save As',0
ProjText4   ds.b    it_SIZEOF
ProjItem4   ds.b    mi_SIZEOF
Proj5       dc.b    'Print',0
ProjText5   ds.b    it_SIZEOF
ProjItem5   ds.b    mi_SIZEOF
Proj6       dc.b    'Print As',0,0
ProjText6   ds.b    it_SIZEOF
ProjItem6   ds.b    mi_SIZEOF
Proj7       dc.b    'Quit',0,0
ProjText7   ds.b    it_SIZEOF
ProjItem7   ds.b    mi_SIZEOF
EditMenu    ds.b    mu_SIZEOF
EditName    dc.b    'Edit',0,0
Edit1       dc.b    'Undo',0,0
EditText1   ds.b    it_SIZEOF
EditItem1   ds.b    mi_SIZEOF
Edit2       dc.b    'Cut',0
EditText2   ds.b    it_SIZEOF
EditItem2   ds.b    mi_SIZEOF
Edit3       dc.b    'Copy',0,0
EditText3   ds.b    it_SIZEOF
EditItem3   ds.b    mi_SIZEOF
Edit4       dc.b    'Paste',0
EditText4   ds.b    it_SIZEOF
EditItem4   ds.b    mi_SIZEOF
Edit5       dc.b    'Erase',0
EditText5   ds.b    it_SIZEOF
EditItem5   ds.b    mi_SIZEOF
ReadPort    ds.l    1
ReadMsg     ds.l    1
ReadName    dc.b    'Read',0,0
WritePort   ds.l    1
WriteMsg    ds.l    1
IntSigBit   ds.b    1
ConSigBit   ds.b    1
letter      ds.b    1

End  Listings 




32000  CROSS  ASSEMBLER 

Listing  One  (Text  in  December) 


/*  A32000.C - Series 32000 assembler

    850903 rr  fix addr ext, scaled index, acb, cxp    0.10
    850902 rr  add scaled index logic                  0.09
    850828 rr  fix enter, setcfg, lpr/spr, index       0.08
    850809 rr  add equate logic                        0.07
    850730 rr  add binary search in lookup             0.06
    850729 rr  symbol table mods, reglist              0.05

    Still need:
    - register names for lpr/spr
    - linkable modules

    Note: While 68000 is hilo (high bytes at lower memory
    addresses), 32000 is lohi (low bytes at lower memory
    addresses, like the Z80).

    This is a 3-pass assembler; 3 passes to make
    sure that relative branches are computed correctly.  */


#define EOF     -1
#define SYMSIZ  1024            /* symbol table size */

char inpbuf[ 256 ];             /* input buffer */
int inpcnt, inpptr;             /* input counter, pointer */
char word_buffer[ 128 ];        /* buff for current word */
char ambig_buffer[ 128 ];       /* ambiguous refs here */
char listline[ 81 ];            /* line of listing output */
int listop, listep;             /* pointers for list output */
int paren = 0;                  /* used in gchar() */
int brack = 0;
int quote = 0;
int iwparen = 0;                /* used in inword() */
int errors = 0;                 /* count of errors */
char *word;                     /* pointer to current word */
char *ambig[ 10 ];              /* filled in by match */
int ambcnt = 0;                 /* count of pointers in ambig[] */
int pass;                       /* pass = 1, 2 or 3 */
long int asmadr, codadr;        /* assembly addr, code addr */
char filename[ 30 ];
int fasm, fobj;                 /* file numbers */
char objbuf[ 64 ];              /* object byte buffer */
long int objadr;                /* addr of first byte of buf */
int objcnt = 0;                 /* count of bytes in buffer */

struct {                        /* Symbol table */
    char *snam;                 /* symbol name */
    long int sval;              /* value */
} symbol[ SYMSIZ ];
int symcnt;                     /* count of symbols */

char hexchr[ 17 ] = "0123456789abcdef";


/* ---- 32000 opcodes ---- */

/* Note: Shortest form of opcode must be listed first. */

#define MAXOP   149

/* the opcode binary value should be a string of bits,
   e.g. 0111xxxxx000b  the opcode opopt character is used
   to specify special operands, etc. */

/* opopts used here for the 32000 are:
        blank   nothing special
        a       gen
        b       gen short
        c       gen gen
        d       00000 short
        e       gen gen reg
        f       reglist save/enter
        g       reglist restore/exit
        h       00000 gen (sfsr)
        i       inss/exts
        j       movs/skps/cmps
        k       setcfg
        l       procreg, gen for lpr/spr
        m       index (operand order)
        n       ret/rett - postbyte
        o       movm
        p       cxp (disp after instruction) */

struct {
    char *onam;                 /* opcode name */
    int ocnt;                   /* operand count, negative if PC-rel */
    char *obin;                 /* opcode binary value */
    char oopt;                  /* opcode opopt char */
} opcode[ MAXOP ] = {

/* Format 1 ops (16) */
    "bsr",     -1, "02h",  ' ',
    "ret",      1, "12h",  'n',
    "cxp",      1, "22h",  'p',
    "rxp",      1, "32h",  'n',
    "rett",     1, "42h",  'n',
    "reti",     0, "52h",  ' ',
    "save",     1, "62h",  'f',
    "restore",  1, "72h",  'g',
    "enter",    2, "82h",  'f',
    "exit",     1, "92h",  'g',
    "nop",      0, "0a2h", ' ',
    "wait",     0, "0b2h", ' ',
    "dia",      0, "0c2h", ' ',
    "flag",     0, "0d2h", ' ',
    "svc",      0, "0e2h", ' ',
    "bpt",      0, "0f2h", ' ',

/* Conditional branches (15) */
    "beq",     -1, "0ah",  'b',
    "bne",     -1, "1ah",  'b',
    "bcs",     -1, "2ah",  'b',
    "bcc",     -1, "3ah",  'b',
    "bhi",     -1, "4ah",  'b',
    "bls",     -1, "5ah",  'b',
    "bgt",     -1, "6ah",  'b',
    "ble",     -1, "7ah",  'b',
    "bfs",     -1, "8ah",  'b',
    "bfc",     -1, "9ah",  'b',
    "blo",     -1, "0aah", 'b',
    "bhs",     -1, "0bah", 'b',
    "blt",     -1, "0cah", 'b',
    "bge",     -1, "0dah", 'b',
    "br",      -1, "0eah", 'b',

/* Format 2 ops (7) */
    "addq?",    2, "xxxxxxxxx00011iib", 'e',
    "cmpq?",    2, "xxxxxxxxx00111iib", 'e',
    "spr?",     2, "xxxxxxxxx01011iib", 'l',
    "lpr?",     2, "xxxxxxxxx11011iib", 'l',
    "seq?",     1, "xxxxx000001111iib", 'a',
    "sne?",     1, "xxxxx000101111iib", 'a',
    "scs?",     1, "xxxxx001001111iib", 'a',
    "scc?",     1, "xxxxx001101111iib", 'a',
    "shi?",     1, "xxxxx010001111iib", 'a',
    "sls?",     1, "xxxxx010101111iib", 'a',
    "sgt?",     1, "xxxxx011001111iib", 'a',
    "sle?",     1, "xxxxx011101111iib", 'a',
    "sfs?",     1, "xxxxx100001111iib", 'a',
    "sfc?",     1, "xxxxx100101111iib", 'a',
    "slo?",     1, "xxxxx101001111iib", 'a',
    "shs?",     1, "xxxxx101101111iib", 'a',
    "slt?",     1, "xxxxx110001111iib", 'a',
    "sge?",     1, "xxxxx110101111iib", 'a',
    "st?",      1, "xxxxx111001111iib", 'a',
    "sf?",      1, "xxxxx111101111iib", 'a',

/* The acb instruction 3rd operand is a relative jump */

    "acb?",
    "movq?",

    "xxxxxxxxx10011iib",
    "xxxxxxxxx10111iib",


/* Format 3 instructions (7) */
    "cxpd",     1, "xxxxx00001111111b", 'a',
    "bicpsr?",  1, "xxxxx001011111iib", 'a',
    "jump",     1, "xxxxx01001111111b", 'a',
    "bispsr?",  1, "xxxxx011011111iib", 'a',
    "adjsp?",   1, "xxxxx101011111iib", 'a',
    "jsr",      1, "xxxxx11001111111b", 'a',
    "case?",    1, "xxxxx111011111iib", 'a',

/* Format 11 ops (16) -
   moved here so wildcards won't interfere */

    "addf",
    "addl",
    "movf",
    "movl",
    "cmpf",
    "cmpl",
    "subf",
    "subl",
    "negf",
    "negl",
    "divf",
    "divl",
    "mulf",
    "mull",
    "absf",
    "absl",

    "xxxxxxxxxx00000110111110b",
    "xxxxxxxxxx00000010111110b",
    "xxxxxxxxxx00010110111110b",
    "xxxxxxxxxx00010010111110b",
    "xxxxxxxxxx00100110111110b",
    "xxxxxxxxxx00100010111110b",
    "xxxxxxxxxx01000110111110b",
    "xxxxxxxxxx01000010111110b",
    "xxxxxxxxxx01010110111110b",
    "xxxxxxxxxx01010010111110b",
    "xxxxxxxxxx10000110111110b",
    "xxxxxxxxxx10000010111110b",
    "xxxxxxxxxx11000110111110b",
    "xxxxxxxxxx11000010111110b",
    "xxxxxxxxxx11010110111110b",
    "xxxxxxxxxx11010010111110b",


/* Format 4 instructions (12) */
    "add?",     2, "xxxxxxxxxx0000iib", 'c',
    "cmp?",     2, "xxxxxxxxxx0001iib", 'c',
    "bic?",     2, "xxxxxxxxxx0010iib", 'c',
    "addc?",    2, "xxxxxxxxxx0100iib", 'c',
    "mov?",     2, "xxxxxxxxxx0101iib", 'c',
    "or?",      2, "xxxxxxxxxx0110iib", 'c',
    "sub?",     2, "xxxxxxxxxx1000iib", 'c',
    "addr",     2, "xxxxxxxxxx100111b", 'c',
    "lxpd",     2, "xxxxxxxxxx100111b", 'c',
    "and?",     2, "xxxxxxxxxx1010iib", 'c',
    "subc?",    2, "xxxxxxxxxx1100iib", 'c',
    "tbit?",    2, "xxxxxxxxxx1101iib", 'c',
    "xor?",     2, "xxxxxxxxxx1110iib", 'c',

/* Format 5 instructions (4) */
    "movst",    1, "00000xxx100000ii00001110b", 'j',
    "movs?",    1, "00000xxx000000ii00001110b", 'j',
    "cmpst",    1, "00000xxx100001ii00001110b", 'j',
    "cmps?",    1, "00000xxx000001ii00001110b", 'j',
    "skpst",    1, "00000xxx100011ii00001110b", 'j',
    "skps?",    1, "00000xxx000011ii00001110b", 'j',
    "setcfg",   1, "00000xxxx000101100001110b", 'k',

/* Format 6 ops (14) */
    "rot?",     2, "xxxxxxxxxx0000ii01001110b", 'c',
    "ash?",     2, "xxxxxxxxxx0001ii01001110b", 'c',
    "cbit?",    2, "xxxxxxxxxx0010ii01001110b", 'c',
    "cbiti?",   2, "xxxxxxxxxx0011ii01001110b", 'c',
    "lsh?",     2, "xxxxxxxxxx0101ii01001110b", 'c',
    "sbit?",    2, "xxxxxxxxxx0110ii01001110b", 'c',
    "sbiti?",   2, "xxxxxxxxxx0111ii01001110b", 'c',
    "neg?",     2, "xxxxxxxxxx1000ii01001110b", 'c',
    "not?",     2, "xxxxxxxxxx1001ii01001110b", 'c',
    "subp?",    2, "xxxxxxxxxx1011ii01001110b", 'c',
    "abs?",     2, "xxxxxxxxxx1100ii01001110b", 'c',
    "com?",     2, "xxxxxxxxxx1101ii01001110b", 'c',
    "ibit?",    2, "xxxxxxxxxx1110ii01001110b", 'c',
    "addp?",    2, "xxxxxxxxxx1111ii01001110b", 'c',

/* Format 7 ops (15) */
    "movm?",    3, "xxxxxxxxxx0000ii11001110b",
    "cmpm?",    2, "xxxxxxxxxx0001ii11001110b",
    "inss?",    4, "xxxxxxxxxx0010ii11001110b",
    "exts?",    4, "xxxxxxxxxx0011ii11001110b",
    "movxbw?",  2, "xxxxxxxxxx0100ii11001110b",
    "movzbw?",  2, "xxxxxxxxxx0101ii11001110b",
    "movz?d",   2, "xxxxxxxxxx0110ii11001110b",
    "movx?d",   2, "xxxxxxxxxx0111ii11001110b",
    "mul?",     2, "xxxxxxxxxx1000ii11001110b",
    "mei?",     2, "xxxxxxxxxx1001ii11001110b",
    "dei?",     2, "xxxxxxxxxx1011ii11001110b",
    "quo?",     2, "xxxxxxxxxx1100ii11001110b",
    "rem?",     2, "xxxxxxxxxx1101ii11001110b",
    "mod?",     2, "xxxxxxxxxx1110ii11001110b",
    "div?",     2, "xxxxxxxxxx1111ii11001110b",

/* Format 8 ops (8) */

    "ext?",
    "cvtp",
    "ins?",
    "check?",
    "index?",
    "ffs?",
    "movsu?",
    "movus?",

    "xxxxxxxxxxxxx0ii00101110b",
    "xxxxxxxxxxxxx01101101110b",
    "xxxxxxxxxxxxx0ii10101110b",
    "xxxxxxxxxxxxx0ii11101110b",
    "xxxxxxxxxxxxx1ii00101110b",
    "xxxxxxxxxx0001ii01101110b",
    "xxxxxxxxxx0011ii10101110b",
    "xxxxxxxxxx0111ii10101110b",

/* Format 9 ops (12) */

    "movlf",
    "movfl",
    "mov?f",
    "mov?l",
    "lfsr",
    "sfsr",
    "roundf?",
    "roundl?",
    "truncf?",
    "truncl?",
    "floorf?",
    "floorl?",

    "xxxxxxxxxx0101ii00111110b",
    "xxxxxxxxxx0111ii00111110b",
    "xxxxxxxxxx0001ii00111110b",
    "xxxxxxxxxx0000ii00111110b",
    "xxxxx0000000111100111110b",
    "00000xxxxx11011100111110b",
    "xxxxxxxxxx1001ii00111110b",
    "xxxxxxxxxx1000ii00111110b",
    "xxxxxxxxxx1011ii00111110b",
    "xxxxxxxxxx1010ii00111110b",
    "xxxxxxxxxx1111ii00111110b",
    "xxxxxxxxxx1110ii00111110b",

/* Format 14 instructions (4) */
    "rdval",    1, "xxxxxxxxx000001100011110b", 'a',
    "wrval",    1, "xxxxxxxxx000011100011110b", 'a',
    "lmr",      2, "xxxxxxxxx000101100011110b", 'e',
    "smr",      2, "xxxxxxxxx000111100011110b", 'e',

/* Address Mode Table */

#define MAXAM   42

struct {
    char *mstr;                 /* mode match string */
    char *gstr;                 /* output string to insert (gen) */
    int mcnt;                   /* count of ambigs to be put into
                                   extension bytes */
    char mopt;                  /* mode option */
} admode[ MAXAM ] = {

/* Scaled index modes */
    "*[r?:b]",  "11100",
    "*[r?:w]",  "11101",
    "*[r?:d]",  "11110",
    "*[r?:q]",  "11111",

/* Simple register modes */
    /* main registers */
    "r0",  "00000",  0,
    "r1",  "00001",  0,
    "r2",  "00010",  0,
    "r3",  "00011",  0,
    "r4",  "00100",  0,
    "r5",  "00101",  0,
    "r6",  "00110",  0,
    "r7",  "00111",  0,
    /* floating point */
    "f0",  "00000",  0,
    "f1",  "00001",  0,
    "f2",  "00010",  0,




    "f3",  "00011",  0,
    "f4",  "00100",  0,
    "f5",  "00101",  0,
    "f6",  "00110",  0,
    "f7",  "00111",  0,

/* Indexed addressing modes */
    "*(r0)",     "01000",  1,
    "*(r1)",     "01001",  1,
    "*(r2)",     "01010",  1,
    "*(r3)",     "01011",  1,
    "*(r4)",     "01100",  1,
    "*(r5)",     "01101",  1,
    "*(r6)",     "01110",  1,
    "*(r7)",     "01111",  1,
    "*(*(fp))",  "10000",  2,           /* frame ptr */
    "*(*(sp))",  "10001",  2,           /* stack mem */
    "*(*(sb))",  "10010",  2,           /* static mem */
    "#*",        "10100",  1,           /* immediate */
                 "10101",  1,           /* absolute */
    "ext(*)+*",  "10110",  2,           /* external */
    "tos",       "10111",  0,           /* top of stack */
    "*(fp)",     "11000",  1,           /* frame mem */
    "*(sp)",     "11001",  1,           /* stack mem */
    "*(sb)",     "11010",  1,           /* static mem */
                 "11011",  1,           /* program mem */

/* indexed */
    "[*]",       'l',                   /* register list */

/* catch-all */
    "*",  "",  1,  'w'                  /* fits no pattern */
};

/* ---- MAIN PROGRAM ---- */

main( argc, argv )
int argc;
char *argv[];
{
    int i;

    puts( "\nA32000 v0.10" );
    if( argc < 2 ) {
        puts( "\n?No file name specified" );
        exit( 1 );
    }
    symcnt = 0;

    for( pass = 1; pass <= 3; ++pass ) {
        makename( argv[ 1 ], ".s" );
        fasm = fopen( filename, "r" );
        if( fasm == 0 ) {
            puts( "\n?Unable to open source file" );
            exit( 1 );
        }
        if( pass == 3 ) {
            makename( argv[ 1 ], ".hex" );
            fobj = fopen( filename, "w" );
            if( ! fobj ) {
                puts( "\n?No directory space" );
                exit( 1 );
            }
        }
        puts( "\nPass " );
        putchar( pass + '0' );

        asmadr = 0;
        codadr = 0;
        if( pass == 3 ) {
            objflush();
            listnl();
        }
        inpload();

        while( gword() ) {
            if( match( word, "end" )) break;

            /* Each word is processed by the following nested if
               statement, which attempts to identify what it is.
               Note that any successful identification stops the
               process of the statement. */

            if( ! islabel( word ))
                if( ! ispseudo( word ))
                    if( ! isopcode( word ))
                        if( ! isequate( word ))
                            error( '?', word );
        }
        fclose( fasm );

        /* Sort symbols after pass 1. */

        if( pass == 1 ) sortsyms();

        if( pass == 3 ) {
            objflush();
            putc( ':', fobj );          /* write eof record */
            for( i = 0; i < 10; ++i ) putc( '0', fobj );
            putc( '\n', fobj );
            fclose( fobj );
        }
    }
    listpr();
    puts( "\n\n" );
    dumpsyms();

    if( errors )
        puts( "\n---- Fix errors and reassemble ----" );
}


/* Construct a filename from two strings. */

makename( p, q )
char *p, *q;
{
    char *r;

    r = &filename[ 0 ];
    while( *p ) *r++ = *p++;
    while( *q ) *r++ = *q++;
    *r = '\0';
}


/* Check to see if the word is a label, and if it is, add
   its value to the symbol table */

int islabel( w )
char *w;
{
    while( *w ) ++w;
    if( *--w != ':' ) return 0;
    *w = '\0';   /* take off the colon */
    addsymbol( word, codadr );
    return 1;
}


/* Check the word to see if it is a pseudo-op. */

int ispseudo( w )
char *w;
{
    long int getarg(), temp;

    if( match( w, "org" )) {
        asmadr = getarg();
        codadr = asmadr;
        if( pass == 3 ) objflush();
        return 1;
    }

    if( match( w, "db" )) {                 /* Note: Allow msgs? */
        temp = getarg();                    /* get argument */
        objout( temp & 0xFF );              /* output byte */
        return 1;
    }

    if( match( w, "dw" )) {
        temp = getarg();                    /* get argument */
        objout( temp & 0xFF );              /* output lsb */
        objout(( temp >> 8 ) & 0xFF );      /* output msb */
        return 1;
    }

    if( match( w, "dd" )) {
        temp = getarg();                    /* get argument */
        objout( temp & 0xFF );              /* output lsb */
        objout(( temp >> 8 ) & 0xFF );
        objout(( temp >> 16 ) & 0xFF );
        objout(( temp >> 24 ) & 0xFF );     /* output msb */
        return 1;
    }

    if( match( w, "even" ) && ( codadr & 1 )) {
        objout( 0 );   /* send 1 byte to go to word bndry */
        return 1;
    }

    return 0;
}

/* Check to see if the word is an opcode, and if it is,
   get any operands required and generate code. */

int isopcode( w )
char *w;
{
    long int value(), bitbin(), decbin(), o, ocodadr;
    char opbuf[ 33 ], bytbuf[ 33 ], extbuf[ 128 ];
    char opopt, modopt, opsiz, opcnt;
    int i, j, k, l;
    char *p, *q, *cpystr(), *regbits();

    /* postbytes & scaled indexes */
    int opexbt[ 4 ], opexct;
    int adexct, adexln[ 8 ];
    char *adexpt[ 8 ], *eoadex;   /* addressing extensions */

    ocodadr = codadr;        /* save addr of begin of instr */
    opexct = 0;              /* no postbytes as yet */
    adexct = 0;              /* no extensions as yet */
    eoadex = &extbuf[ 0 ];   /* point to begin of extbuf */

    for( i = 0; i < MAXOP; ++i )
      if( match( w, opcode[ i ].onam )) {
        opopt = opcode[ i ].oopt;
        if( opopt == 'x' ) {
            error( 'x', w );   /* unimplemented instruction */
            return 1;
        }

        p = cpystr( opcode[ i ].obin, &opbuf[ 0 ] );

        /* see if length modifier */
        if( ambcnt > 0 ) {
            p = &opbuf[ 0 ];
            opsiz = *ambig[ 0 ];
            while( *p && *p != 'l' ) ++p;
            if( ! *p ) error( 'l', w );
            else {
                switch( opsiz ) {
                case 'b' : *p++ = '0';
                           *p++ = '0';
                           break;
                case 'w' : *p++ = '0';
                           *p++ = '1';
                           break;
                case 'd' : *p++ = '1';
                           *p++ = '1';
                }
            }
        }

        /* now parse operands */

        /* get count of operands.
           Take abs value (neg = PC-relative) */
        opcnt = opcode[ i ].ocnt;
        if( opcnt < 0 ) opcnt = 0 - opcnt;

        p = &opbuf[ 0 ];   /* modified parts start at beg. */

        for( j = 0; j < opcnt; ++j ) {
            gword();   /* get operand */

            /* find addr mode */
            k = 0;
            while(( k < MAXAM )
               && ! match( word, admode[ k ].mstr )) ++k;
            modopt = admode[ k ].mopt;

            /* move bit string into place */
            q = admode[ k ].gstr;

            /* if opopt h, sfsr, skip 5 bits */
            if( opopt == 'h' ) p += 5;

            /* for most opopts, move the bits in */
            if(( opopt == ' ' )
               || ( opopt == 'a' )
               || ( opopt == 'c' )
               || ( opopt == 'e' && j == 1 )
               || ( opopt == 'h' )
               || ( opopt == 'i' && j < 2 )
               || ( opopt == 'l' && j == 1 )
               || ( opopt == 'm' && j > 0 && j < 3 )
               || ( opopt == 'o' && j < 2 ))
                while( *q ) *p++ = *q++;

            /* Double the effort for scaled index mode, create an
               extension postbyte opexbt[] with basemode as upper
               5 bits, reg as lower 3 bits. */

            if( admode[ k ].mopt == 's' ) {
                l = ( *ambig[ 1 ] ) & 7;
                q = cpystr( ambig[ 0 ], &bytbuf[ 0 ] );

                /* find basemode */
                k = 0;
                while(( k < MAXAM )
                   && ! match( &bytbuf[ 0 ],
                               admode[ k ].mstr )) ++k;
                modopt = admode[ k ].mopt;

                /* move bit string into postbyte, use bitbin, because
                   value() destroys ambig[] array which we still need. */
                q = cpystr( admode[ k ].gstr,
                            &bytbuf[ 0 ] );
                --q;   /* back up to null */
                *q++ = '0' + (( l >> 2 ) & 1 );
                *q++ = '0' + (( l >> 1 ) & 1 );
                *q++ = '0' + ( l & 1 );
                *q = '\0';
                opexbt[ opexct++ ] = bitbin( &bytbuf[ 0 ] );
                q = admode[ k ].gstr;
            }

            /* funny handling of reg: index operation (opopt 'm') */

            if( opopt == 'm' && j == 0 ) {
                p = &opbuf[ 10 ];   /* off to reg */
                q += 2;             /* skip 0 bits */
                while( *q ) *p++ = *q++;
                p = &opbuf[ 0 ];    /* reset */
            }

            /* move ambigs into extension bytes, set length to
               variable (0). For some addressing modes, the
               extensions go in in reverse order */

            if( modopt == 'r' )
                for( l = admode[ k ].mcnt - 1; l >= 0; --l ) {
                    adexpt[ adexct ] = eoadex;
                    adexln[ adexct ] = 0;
                    ++adexct;
                    eoadex = cpystr( ambig[ l ], eoadex );
                }
            else for( l = 0; l < admode[ k ].mcnt; ++l ) {
                    adexpt[ adexct ] = eoadex;
                    adexln[ adexct ] = 0;
                    ++adexct;
                    eoadex = cpystr( ambig[ l ], eoadex );
                }


            /* special logic for register list for "enter", "save",
               "restore", "exit" */

            if(( j == 0 && opopt == 'f' ) || opopt == 'g' ) {
                adexpt[ adexct ] = eoadex;
                adexln[ adexct ] = 1;
                ++adexct;
                eoadex = regbits( word, eoadex, opopt );
            }

            /* shorten extension to 1 byte for enter, return */

            if(( j == 1 && opopt == 'f' ) || opopt == 'n' ) {
                if( adexct > 0 ) adexln[ adexct - 1 ] = 1;
                else error( 'e', w );
            }

            /* opopts 'e' or 'd': immed data becomes 4 bit value */

            if(( j == 0 && opopt == 'e' ) || opopt == 'd' ) {
                if( ! adexct ) error( 'e', w );
                else {
                    l = value( adexpt[ --adexct ] );
                    p = &opbuf[ 5 ];
                    *p++ = '0' + (( l >> 3 ) & 1 );
                    *p++ = '0' + (( l >> 2 ) & 1 );
                    *p++ = '0' + (( l >> 1 ) & 1 );
                    *p = '0' + ( l & 1 );
                    p = &opbuf[ 0 ];
                }
            }

            /* opopt 'i': combine last two ambigs into one postbyte.
               Put it in bytbuf and set length to 1 */

            if( opopt == 'i' && j == 3 ) {
                if( adexct < 2 ) error( 'e', w );
                else {
                    l = ( value( adexpt[ --adexct ] ) - 1 )
                        & 31;
                    l += ( value( adexpt[ --adexct ] )
                        & 7 ) << 5;
                    bytbuf[ 0 ] = '0' + (( l / 100 ) % 10 );
                    bytbuf[ 1 ] = '0' + (( l / 10 ) % 10 );
                    bytbuf[ 2 ] = '0' + ( l % 10 );
                    bytbuf[ 3 ] = '\0';
                    adexpt[ adexct ] = &bytbuf[ 0 ];
                    adexln[ adexct++ ] = 1;
                }
            }


            /* opopt 'j': uwb bits for movs, cmps, skps */

            if( opopt == 'j' ) {
                p += 5;
                uwbbits( word, p );
                adexct = 0;   /* in case no paren */
            }

            /* opopt 'k': config bits for setcfg */

            if( opopt == 'k' ) {
                p += 5;
                cfgbits( word, p );
            }

            /* opopt 'l': operand 1 becomes 4 bit value */

            if( opopt == 'l' && j == 0 ) {
                adexct = 0;   /* unjunk extensions */
                l = -1;
                if( strcmp( word, "upsr" ) == 0 ) l = 0;
                if( strcmp( word, "fp" ) == 0 ) l = 8;
                if( strcmp( word, "sp" ) == 0 ) l = 9;
                if( strcmp( word, "sb" ) == 0 ) l = 10;
                if( strcmp( word, "psr" ) == 0 ) l = 13;
                if( strcmp( word, "intbase" ) == 0 ) l = 14;
                if( strcmp( word, "mod" ) == 0 ) l = 15;
                if( l == -1 ) error( 'p', word );
                else {
                    p = &opbuf[ 5 ];
                    *p++ = '0' + (( l >> 3 ) & 1 );
                    *p++ = '0' + (( l >> 2 ) & 1 );
                    *p++ = '0' + (( l >> 1 ) & 1 );
                    *p = '0' + ( l & 1 );
                    p = &opbuf[ 0 ];
                }
            }

            /* odd length extension for movm, opopt 'o' */

            if( j == 2 && opopt == 'o' ) {
                if( adexct > 0 ) adexln[ --adexct ] = 1;
                else error( 'e', w );
                l = value( adexpt[ adexct ] ) - 1;
                switch( opsiz ) {
                case 'd' : l *= 4; break;
                case 'w' : l *= 2; break;
                }
                bytbuf[ 0 ] = '0' + (( l / 100 ) % 10 );
                bytbuf[ 1 ] = '0' + (( l / 10 ) % 10 );
                bytbuf[ 2 ] = '0' + ( l % 10 );
                bytbuf[ 3 ] = '\0';
                adexpt[ adexct++ ] = &bytbuf[ 0 ];
            }


        }   /* done operands */

        o = value( &opbuf[ 0 ] );
        l = strlen( opbuf );

        /* Send as many opcode bytes as necessary */
        objout( o % 256 );
        if( l > 9 ) objout(( o / 256 ) % 256 );
        if( l > 17 ) objout( o / 65536 );

        /* Send postbytes for scaled index mode */

        for( l = 0; l < opexct; ++l )
            objout( opexbt[ l ] );

        /* Send addressing extensions.
           adexln = length of extension word in bytes if +.
           If 0, it is a variable-length signed displacement.
           If -1, indicates code-relative */

        /* If opcode ocnt was negative, last address extension is
           code relative. */

        if( opcode[ i ].ocnt < 0 )
            adexln[ adexct - 1 ] = -1;

        /* send extension words */

        for( j = 0; j < adexct; ++j ) {
            o = value( adexpt[ j ] );

            /* if adexln[] negative, operand(s) code-relative.
               Note: on the 32000 you don't correct by adding 2 to
               codadr first */

            if( adexln[ j ] < 0 ) o -= ocodadr;

            /* Compute variable-length signed displacement */

            if( adexln[ j ] <= 0 ) {
                if( o < 63 && o > -64 ) {
                    o = ( o & 0x7F );
                    l = 1;   /* one-byte */
                } else if( o < 8191 && o > -8192 ) {
                    o = ( o & 0x3FFF ) + 0x8000;
                    l = 2;
                } else {
                    o = ( o & 0x3FFFFFFF )
                        + 0xC0000000;
                    l = 4;
                }
            } else l = adexln[ j ];

            /* address extensions are sent in lohi order */

            if( l > 3 ) objout(( o >> 24 ) & 0xFF );
            if( l > 2 ) objout(( o >> 16 ) & 0xFF );
            if( l > 1 ) objout(( o >> 8 ) & 0xFF );
            objout( o % 256 );
        }

        return 1;
      }

    return 0;
}
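
The variable-length displacement rule used when the listing sends address extensions can be tried in isolation. What follows is a minimal sketch in present-day C, not a transcription of the listing: the name `encode_disp` and its parameters are hypothetical, and the sketch assumes the value fits in 30 signed bits. It does not check any processor-specific reserved encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode a signed displacement in the 1-, 2-, or 4-byte form.
   The top bits of the first byte select the length:
   0xxxxxxx = 1 byte, 10xxxxxx = 2 bytes, 11xxxxxx = 4 bytes.
   Bytes are written most significant first. Returns the byte count.
   (encode_disp is a hypothetical helper, not part of the listing.) */
size_t encode_disp(int32_t value, uint8_t out[4])
{
    uint32_t u = (uint32_t)value;

    assert(out != NULL);
    assert(value >= -(1L << 29) && value < (1L << 29));

    if (value >= -64 && value <= 63) {
        out[0] = (uint8_t)(u & 0x7Fu);
        return 1;
    }
    if (value >= -8192 && value <= 8191) {
        u = (u & 0x3FFFu) | 0x8000u;
        out[0] = (uint8_t)(u >> 8);
        out[1] = (uint8_t)(u & 0xFFu);
        return 2;
    }
    u = (u & 0x3FFFFFFFu) | 0xC0000000u;
    out[0] = (uint8_t)(u >> 24);
    out[1] = (uint8_t)((u >> 16) & 0xFFu);
    out[2] = (uint8_t)((u >> 8) & 0xFFu);
    out[3] = (uint8_t)(u & 0xFFu);
    return 4;
}
```

The listing's own range tests are slightly stricter at the boundaries (it compares with `<` and `>` against 63, -64, 8191, and -8192), so a boundary value there falls through to the next longer form. The result is still a valid encoding, only one size larger than necessary.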


/* Special to create extension word for register list */

/* Regbits may look like "r0" or may look like "[r0,r2]"
   or like "[r0-r7]". */

char *regbits( src, dst, flg )
char *src, *dst, flg;
{
    int bits, reg, loreg, hireg;
    bits = 0;

    if( *src == '[' ) ++src;   /* strip parens */
    else error( '[', src );

    while( *src ) {
        if( *src++ != 'r' ) error( 'r', src );
        reg = ( *src++ ) - '0';
        bits = bits | ( 1 << reg );
        if( *src++ == '-' ) {
            loreg = reg;
            if( *src++ != 'r' ) error( 'r', src );
            hireg = ( *src++ ) - '0';
            if( hireg < loreg ) {
                reg = hireg;
                hireg = loreg;
                loreg = reg;
            }   /* swap if out of order */
            for( reg = loreg; reg <= hireg; ++reg )
                bits = bits | ( 1 << reg );
            ++src;   /* skip over the comma */
        }
        if( *src == ']' ) break;
    }

    /* if flg = 'f', save/enter, need to swap bit
       significance. The routine above constructed it in
       reversed order in the first place, because of the
       routine below */

    if( flg == 'f' ) {
        hireg = 0;
        for( reg = 0; reg < 8; ++reg ) {
            hireg = ( hireg << 1 ) + ( bits & 1 );
            bits = ( bits >> 1 );
        }
        bits = hireg;
    }

    /* now create a binary string for the extension.
       Note that bit significance becomes reversed again */

    for( reg = 0; reg < 8; ++reg ) {
        *dst++ = '0' + ( bits & 1 );
        bits = ( bits >> 1 );
    }

    *dst++ = 'b';    /* add b for binary */
    *dst++ = '\0';   /* terminate */
    return dst;
}
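
The register-list handling above combines two ideas: turning a list such as `[r0,r2-r4]` into a bit mask, and reversing the eight bits for one family of instructions. As an illustration only, here is a small sketch in present-day C. The names `reglist_mask` and `reverse8` are hypothetical, and unlike the listing this version rejects malformed input rather than reporting an error and continuing.

```c
#include <assert.h>
#include <stddef.h>

/* Parse "[r0,r2-r4]" into a mask with bit n set for register rn.
   Returns -1 if the text is not a well-formed list.
   (Hypothetical helper, written for illustration.) */
int reglist_mask(const char *s)
{
    int mask = 0;

    assert(s != NULL);
    if (*s++ != '[') return -1;

    for (;;) {
        int lo, hi, r;

        if (s[0] != 'r' || s[1] < '0' || s[1] > '7') return -1;
        lo = hi = s[1] - '0';
        s += 2;

        if (*s == '-') {                      /* a range such as r2-r4 */
            if (s[1] != 'r' || s[2] < '0' || s[2] > '7') return -1;
            hi = s[2] - '0';
            s += 3;
            if (hi < lo) { r = lo; lo = hi; hi = r; }   /* swap if reversed */
        }

        for (r = lo; r <= hi; ++r) mask |= 1 << r;

        if (*s == ',') { ++s; continue; }
        if (*s == ']' && s[1] == '\0') return mask;
        return -1;
    }
}

/* Reverse the order of the low 8 bits (bit 0 becomes bit 7, and so on). */
unsigned reverse8(unsigned bits)
{
    unsigned out = 0;
    int i;

    for (i = 0; i < 8; ++i) {
        out = (out << 1) | (bits & 1u);
        bits >>= 1;
    }
    return out;
}
```

With these helpers, `reglist_mask("[r0,r2]")` yields 5, and `reverse8(5)` yields 0xA0, which is the same mask read from the other end.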

/* put config bits into instruction */

cfgbits( src, dst )
char *src, *dst;
{
    char b[ 4 ];

    b[ 0 ] = '0';
    b[ 1 ] = '0';
    b[ 2 ] = '0';
    b[ 3 ] = '0';

    if( *src == '[' ) ++src;   /* strip parens */
    else error( '[', src );

    while( *src ) {
        switch( *src++ ) {
        case 'c' : b[ 0 ] = '1';
                   break;
        case 'm' : b[ 1 ] = '1';
                   break;
        case 'f' : b[ 2 ] = '1';
                   break;
        case 'i' : b[ 3 ] = '1';
        }
        if( *src++ == ']' ) break;
    }

    *dst++ = b[ 0 ];
    *dst++ = b[ 1 ];
    *dst++ = b[ 2 ];
    *dst = b[ 3 ];
}

/* put uwb bits into instruction */

uwbbits( src, dst )
char *src, *dst;
{
    char b[ 3 ];

    b[ 0 ] = '0';   /* default = forward */
    b[ 1 ] = '0';   /* default = neither */
    b[ 2 ] = '0';

    while( *src ) {
        switch( *src++ ) {
        case 'b' : b[ 2 ] = '1';   /* backward */
                   break;
        case 'u' : b[ 0 ] = '1';   /* until match */
                   b[ 1 ] = '1';
                   break;
        case 'w' : b[ 0 ] = '0';   /* while match */
                   b[ 1 ] = '1';
        }
    }

    *dst++ = b[ 0 ];
    *dst++ = b[ 1 ];
    *dst = b[ 2 ];
}

/* Check to see if the word begins an equate, and if it
   does, add the symbol to the symbol table. */

int isequate( w )
char *w;
{
    char tempword[ 128 ];
    char *q, *cpystr();
    long int l, getarg();

    q = cpystr( w, &tempword[ 0 ] );

    gword();   /* get next word */

    if( strcmp( word, "equ" ) == 0 ||
        strcmp( word, "=" ) == 0 ) {
        l = getarg();   /* get argument */
        addsymbol( &tempword[ 0 ], l );
        return 1;   /* it was an equate */
    }

    return 0;   /* we lost a word */
}

/* Get an argument value (for use above). */

long int getarg()
{
    long int value();

    gword();   /* get next word */
    return value( word );
}

/* copy string and return new ending address */

char *cpystr( src, dst )
char *src, *dst;
{
    while( *src ) *dst++ = *src++;
    *dst++ = '\0';   /* terminate copied string */
    return dst;      /* return next address */
}

/* Calculate the value of a word. It may be a symbol, a
   constant, or a computed value (must be enclosed in
   parentheses.) */

long int value( w )
char *w;
{
    long int hexbin(), octbin(), bitbin(), decbin(), v;
    int lookup(), i, negate;
    char *q;
    char *wp[ 16 ];
    int wpcnt;

    negate = 0;
    if( *w == '-' ) {   /* Unary negation */
        negate = 1;
        ++w;
    }

    if( strcmp( w, "." ) == 0 )
        return codadr;   /* . = code address */
    if( strcmp( w, ".." ) == 0 )
        return asmadr;   /* .. = assembly address */

    if( isdigit( *w )) {
        if( match( w, "*h" )) v = hexbin( w );
        else if( match( w, "*q" )) v = octbin( w );
        else if( match( w, "*b" ))
            v = bitbin( w );
        else v = decbin( w );
    } else {
        if( *w == '(' ) {   /* - FORMULA - */
            ++w;   /* skip ( */
            q = w;
            while( *q ) ++q;   /* find end of string */
            --q;
            if( *q != ')' ) error( ')', q );
            else *q = '\0';   /* zap ) */

            iwparen = 0;   /* no parens now */
            wpcnt = 0;

            while( 1 ) {   /* find beg of word */
                while( inword( *w )) ++w;
                if( ! *w ) break;
                wp[ wpcnt++ ] = w;   /* ptr to value */
                iwparen = 0;         /* find end of word */
                while( *w && ! inword( *w )) ++w;
                if( ! *w ) break;
                *w++ = '\0';   /* terminate it */
                if( wpcnt == 16 ) {
                    error( 'l', w );   /* too long */
                    break;
                }
            }

            if(( wpcnt % 2 ) == 0 ) {
                error( 'v', w );   /* must be odd */
                --wpcnt;
            }

            v = value( wp[ 0 ] );

            for( i = 1; i < wpcnt; i += 2 ) {
                if( strcmp( wp[ i ], "+" ) == 0 ) {
                    v += value( wp[ i + 1 ] );
                    goto opdone;
                }
                if( strcmp( wp[ i ], "-" ) == 0 ) {
                    v -= value( wp[ i + 1 ] );
                    goto opdone;
                }
                if( strcmp( wp[ i ], "*" ) == 0 ) {
                    v *= value( wp[ i + 1 ] );
                    goto opdone;
                }
                if( strcmp( wp[ i ], "/" ) == 0 ) {
                    v /= value( wp[ i + 1 ] );
                    goto opdone;
                }
                error( 'o', wp[ i ] );   /* unknown op */
        opdone: ;                        /* get around c/80 bug */
            }
        } else {   /* - PLAIN VALUE - */
            i = lookup( w );         /* look up symbol */
            if( i < 0 ) return 0;    /* unknown symbol */
            v = symbol[ i ].sval;    /* return sym value */
        }
    }

    if( negate ) v = 0 - v;
    return v;
}

/* function for value() */

int inword( c )
char c;
{
    if( c == '(' ) ++iwparen;   /* special var for this */
    if( c == ')' ) --iwparen;   /* function */
    if( iwparen ) return 0;
    if( c == ' ' ) return 1;    /* is space */
    return 0;
}

/* - SYMBOL TABLE LOGIC - */

/* add new symbol to symbol table */

addsymbol( p, v )
char *p;
long int v;
{
    char *w, *cpystr(), *alloc();
    int i, lookup();

    i = lookup( p );   /* see if already known */
    if( i < 0 ) {      /* new symbol */
        i = symcnt;
        ++symcnt;      /* count a new symbol */
        symbol[ i ].snam = alloc( strlen( p ) + 1 );
        w = cpystr( p, symbol[ i ].snam );
    }
    symbol[ i ].sval = v;   /* update value in table */
}

/* lookup - returns symbol number or -1 if not found */

int lookup( p )
char *p;
{
    char *w;
    int i, j, k, found;

    found = 0;   /* not found yet */

    /* pass 1 - use linear search */

    if( pass == 1 ) {
        for( i = 0; i < symcnt && ! found; ++i ) {
            w = symbol[ i ].snam;
            found = ( strcmp( p, w ) == 0 );
        }
    } else {

        /* passes 2 and 3 - use binary search */

        j = ( symcnt + 1 ) / 4;   /* step to use */
        i = symcnt / 2;           /* starting point */
        k = j + i;                /* one-step count */

        while( 1 ) {
            w = symbol[ i ].snam;
            found = strcmp( p, w );
            if( found == 0 ) {
                found = 1;
                break;
            } else if( found < 0 ) i -= j;
            else i += j;

            if( i < 0 ) i = 0;
            if( i >= symcnt ) i = symcnt - 1;

            j /= 2;   /* halve step */
            if( j == 0 ) {
                if( k-- == 0 ) {
                    found = 0;   /* not found */
                    break;
                }
                j = 1;
            }
        }
    }

    if( ! found ) {
        if( pass != 1 ) error( 'u', w );
        return -1;
    }
    return i;
}

/* display error code */

error( c, p )
char c;
char *p;
{
    puts( "\n>> - > Error " );
    putchar( c );
    puts( " at " );
    puts( p );
    ++errors;
}

/* sort symbols by shell sort */

sortsyms()
{
    int jump, done, k, l;
    char *n;
    long int v;

    jump = symcnt;   /* set jmp to cnt of elements */
    while( jump > 0 ) {
        jump = jump / 2;
        while( 1 ) {
            done = 1;
            for( k = 0; k < ( symcnt - jump ); ++k ) {
                l = k + jump;
                if( strcmp( symbol[ k ].snam,
                            symbol[ l ].snam ) > 0 ) {
                    n = symbol[ k ].snam;
                    v = symbol[ k ].sval;
                    symbol[ k ].snam = symbol[ l ].snam;
                    symbol[ k ].sval = symbol[ l ].sval;
                    symbol[ l ].snam = n;
                    symbol[ l ].sval = v;
                    done = 0;
                }
            }
            if( done ) break;
        }
    }
}

/* dump symbol table */

dumpsyms()
{
    char *w;
    int i;
    long int v;

    puts( "\nSymbol and Value\n" );

    for( i = 0; i < symcnt; ++i ) {
        puts( symbol[ i ].snam );
        puts( " - " );
        v = symbol[ i ].sval;
        puthex( v >> 24, 0 );
        puthex(( v >> 16 ) & 0xFF, 0 );


        puthex(( v >> 8 ) & 0xFF, 0 );
        puthex( v & 0xFF, 0 );   /* print value */
        putchar( '\n' );
    }
}

/* Match string. If match, returns 1; else returns 0.
   Ambiguous values from the matches are saved and
   pointed to by the array of char pointers ambig[], so
   they can be checked later. */

int match( w1, w2 )
char *w1, *w2;
{
    char c;
    char *next_ambig;

    next_ambig = &ambig_buffer[ 0 ];   /* init ambig buff */
    ambcnt = 0;                        /* ambigs so far */

    while( *w1 ) {
        c = *w2++;
        if( c == '*' ) {
            ambig[ ambcnt++ ] = next_ambig;
            while( *w1 && *w1 != *w2 )
                *next_ambig++ = *w1++;
            if( ! *w1 && *w2 ) return 0;
            *next_ambig++ = '\0';   /* terminate this ambig */
        } else if( c == '?' ) {
            ambig[ ambcnt++ ] = next_ambig;
            *next_ambig++ = *w1++;   /* 1-char ambig */
            *next_ambig++ = '\0';    /* terminate it */
        } else if( c != *w1++ ) return 0;
    }

    return 1;
}

int gword()
{
    char *p, *q;
    char c, gchar();

    p = &word_buffer[ 0 ];
    c = ' ';
    while( isdelim( c )) c = gchar();
    while( ! isdelim( c )) {
        *p++ = tolower( c );
        c = gchar();
    }

    *p = '\0';   /* terminate word */
    word = &word_buffer[ 0 ];
    return 1;
}

/* is the character a delimiter? */

isdelim( c )
char c;
{
    if( paren || quote || brack )
        return 0;   /* not a delim */

    if( c == ' ' || c == ',' || c == ';' || c == '\n'
        || c == '\r' || c == '\t' )
        return 1;
    return 0;
}

/* get next char from source file */

char gchar()
{
    char c, getch();

    c = getch();   /* get char from file */

    if( c == '\'' ) quote = ! quote;
    if( c == '"' ) quote = ! quote;
    if( c == '(' ) ++paren;
    if( c == ')' ) --paren;
    if( c == '[' ) ++brack;
    if( c == ']' ) --brack;

    if( ! quote && ! paren && ! brack ) {
        while( c == ';' ) {   /* ;comment\n */
            while( getch() != '\n' ) ;
            c = getch();
        }
    }

    return c;
}
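
The pattern matcher that drives both the opcode table and the addressing-mode table uses `*` to capture a run of characters and `?` to capture exactly one. The following is a minimal sketch of that idea in present-day C, written for illustration and not a transcription of `match()`. The names `wild_match`, `struct captures`, `MAX_CAPTURES`, and `CAPTURE_LEN` are hypothetical, and this version insists that the whole word and the whole pattern are consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CAPTURES 8
#define CAPTURE_LEN  64

/* Text captured by the wildcards of the most recent match. */
struct captures {
    int  count;
    char text[MAX_CAPTURES][CAPTURE_LEN];
};

/* Match word against pat. '*' captures characters up to the next
   pattern character (non-greedy); '?' captures a single character.
   Returns 1 on a full match, 0 otherwise. */
int wild_match(const char *word, const char *pat, struct captures *cap)
{
    assert(word != NULL && pat != NULL && cap != NULL);
    cap->count = 0;

    while (*pat) {
        char c = *pat++;

        if (c == '*' || c == '?') {
            char  *dst;
            size_t n = 0;

            if (cap->count == MAX_CAPTURES) return 0;
            dst = cap->text[cap->count++];

            if (c == '?') {
                if (!*word) return 0;
                dst[n++] = *word++;
            } else {
                /* stop at the first character equal to the next literal */
                while (*word && *word != *pat) {
                    if (n + 1 < CAPTURE_LEN) dst[n++] = *word;
                    ++word;
                }
            }
            dst[n] = '\0';
        } else if (c != *word++) {
            return 0;
        }
    }
    return *word == '\0';
}
```

Matching the operand `8(4(sb))` against the pattern `*(*(sb))` captures `8` and `4`, which is how a table-driven assembler of this kind can pull the two displacements out of a memory-relative operand without a separate parser for each addressing mode.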

puts( p )
char *p;
{
    while( *p ) putchar( *p++ );
}

/* - source file routines - */

char getch()
{
    while( inpcnt == 0 ) {   /* if input buf empty, */
        listpr();            /* print listing line */
        inpload();           /* reload input buffer */
    }

    --inpcnt;
    return( inpbuf[ inpptr++ ] );
}

inpload()
{
    char c, getc();

    inpcnt = 0;
    inpptr = 0;

    while((( c = getc( fasm )) != '\n' ) && ( c != EOF )) {
        inpbuf[ inpcnt++ ] = c;
        if( listcp < 81 ) listline[ listcp++ ] = c;
    }

    inpbuf[ inpcnt++ ] = '\n';
}

/* - listing file routines - */

listnl()   /* list new line */
{
    int i;

    if( pass != 3 ) return;

    for( i = 0; i < 26; ++i ) listline[ i ] = ' ';
    for( i = 26; i < 81; ++i ) listline[ i ] = '\0';

    listop = 0;   /* flag to cause addr output */
    listcp = 26;
}

lbyt( b )   /* put object byte in list file */
unsigned int b;
{
    char c;

    if( pass != 3 ) return;

    c = (( b / 16 ) % 16 ) + '0';
    if( c > '9' ) c += ( 'a' - ':' );
    listline[ listop++ ] = c;

    c = ( b % 16 ) + '0';
    if( c > '9' ) c += ( 'a' - ':' );
    listline[ listop++ ] = c;

    if( listop > 24 ) listpr();   /* print list line */
}

listpr()   /* print list line */
{
    if( pass != 3 ) return;

    putchar( '\n' );
    puts( listline );
    listnl();
}


/* - object file routines - */

objout( c )
char c;
{
    asmadr++;   /* incr asmadr, codadr. DON'T incr */
    codadr++;   /* objadr, it is addr of 1st byte */

    if( pass != 3 ) return;   /* skip if not last pass */

    objbuf[ objcnt++ ] = c;   /* put new byte in buffer */
    if( objcnt == 32 ) objflush();

    if( listop == 0 ) {   /* print address? */
        lbyt( asmadr / 16777216 );
        lbyt(( asmadr / 65536 ) % 256 );
        lbyt(( asmadr / 256 ) % 256 );
        lbyt( asmadr % 256 );
        listop = 9;
    }

    lbyt( c );   /* send byte to listing too */
}

objflush()
{
    int i, cksum;

    if( pass != 3 ) return;   /* just in case we get here */

    cksum = 0;

    if( objcnt > 0 ) {
        putc( ':', fobj );
        puthex( objcnt, fobj );
        puthex( objadr / 256, fobj );
        puthex( objadr % 256, fobj );
        puthex( 0, fobj );
        cksum =
            objcnt + ( objadr / 256 ) + ( objadr % 256 );
        for( i = 0; i < objcnt; ++i ) {
            puthex( objbuf[ i ], fobj );
            cksum += objbuf[ i ];
        }
        puthex( 0 - cksum, fobj );
        putc( '\n', fobj );
    }

    objadr = asmadr;
    objcnt = 0;
}

puthex( b, c )
int b, c;
{
    int v;

    v = ( b & 0x00F0 ) >> 4;
    if( v > 9 ) v += 'A' - 10; else v += '0';
    putc( v, c );
    v = ( b & 0x000F );
    if( v > 9 ) v += 'A' - 10; else v += '0';
    putc( v, c );
}
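
The object file written by `objflush()` is a series of Intel HEX data records: a colon, a byte count, a 16-bit address, a record type of 00, the data bytes, and a checksum that makes all the bytes of the record sum to zero modulo 256. As a hedged illustration, the sketch below builds one such record in memory using present-day C. The function name `hex_record` and its parameters are hypothetical, and the caller is assumed to supply a buffer of at least `2 * len + 12` characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format one Intel HEX data record (type 00) into out and return
   the checksum byte. out must hold at least 2*len + 12 chars.
   (Hypothetical helper, written for illustration.) */
uint8_t hex_record(uint16_t addr, const uint8_t *data, size_t len, char *out)
{
    unsigned sum;
    uint8_t  check;
    size_t   i;
    int      n;

    assert(data != NULL && out != NULL && len <= 255);

    /* count + address high + address low (+ record type 0) */
    sum = (unsigned)len + (unsigned)(addr >> 8) + (unsigned)(addr & 0xFF);
    n = sprintf(out, ":%02X%04X00", (unsigned)len, (unsigned)addr);

    for (i = 0; i < len; ++i) {
        n += sprintf(out + n, "%02X", (unsigned)data[i]);
        sum += data[i];
    }

    check = (uint8_t)(0u - sum);      /* two's complement of the sum */
    sprintf(out + n, "%02X", (unsigned)check);
    return check;
}
```

For two data bytes 01 and 02 at address 0100 hex, the byte sum is 2 + 1 + 0 + 1 + 2 = 6, so the checksum is FA and the record reads `:020100000102FA`.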

long int hexbin( p )
char *p;
{
    long int v;
    v = 0;

    while( *p ) {
        if( isdigit( *p )) v = ( 16 * v ) + *p++ - '0';
        else if( *p >= 'a' && *p <= 'f' )
            v = ( 16 * v ) + *p++ - 'a' + 10;
        else ++p;
    }

    return v;
}

long int octbin( p )
char *p;
{
    long int v;
    v = 0;

    while( *p ) {
        if( *p >= '0' && *p <= '7' )
            v = ( 8 * v ) + *p++ - '0';
        else ++p;
    }

    return v;
}

long int bitbin( p )
char *p;
{
    long int v;
    v = 0;

    while( *p ) {
        if( *p == '0' || *p == '1' )
            v = ( 2 * v ) + *p++ - '0';
        else ++p;
    }

    return v;
}

long int decbin( p )
char *p;
{
    long int v;
    v = 0;

    while( *p ) {
        if( isdigit( *p )) v = ( 10 * v ) + *p++ - '0';
        else ++p;
    }

    return v;
}

#include "stdlib.c"

End  Listing 
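
The assembler's numeric constants are distinguished by a trailing letter: `h` for hexadecimal, `q` for octal, `b` for binary, and no suffix for decimal. The sketch below shows that convention in a single function in present-day C. It is an illustration under stated assumptions, not the listing's code: the name `parse_const` is hypothetical, input is assumed to be lowercase (as `gword()` produces) and to start with a digit, and characters that are not valid digits in the chosen base are skipped, in the spirit of the listing's converters.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Convert a constant such as "0ffh", "17q", "101b", or "42".
   (Hypothetical helper, written for illustration.) */
long parse_const(const char *s)
{
    size_t len, i;
    int    base = 10;
    long   v = 0;

    assert(s != NULL && isdigit((unsigned char)s[0]));
    len = strlen(s);

    switch (s[len - 1]) {             /* the suffix selects the base */
    case 'h': base = 16; --len; break;
    case 'q': base = 8;  --len; break;
    case 'b': base = 2;  --len; break;
    }

    for (i = 0; i < len; ++i) {
        int c = (unsigned char)s[i];
        int d;

        if (isdigit(c))                 d = c - '0';
        else if (c >= 'a' && c <= 'f')  d = c - 'a' + 10;
        else                            d = base;   /* not a digit at all */

        if (d >= base) continue;        /* skip anything out of range */
        v = v * base + d;
    }
    return v;
}
```

Because the suffix is examined first, a constant like `1bh` is read as hexadecimal 1B (27) rather than as a binary number, which is why hexadecimal constants in this style conventionally begin with a decimal digit and end in `h`.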




C  CHEST 


Listing One (Text begins on page 104.)


#include <stdio.h>
#include <fcntl.h>
#include <getargs.h>


/* EXEPRINT.C   Either print or modify the exe file header:
 *
 *  exe file        Print the contents of a file's EXE header
 *
 *  exe -mN file    Modify the exe header so that N bytes of memory
 *                  are allocated for the combined bss/stack/heap
 *                  area. If N is smaller than the required minimum
 *                  (bss + stack size) then it's rounded up. The
 *                  largest permitted value of N is 65,535. Use
 *                  -m1 for the minimum possible heap.
 *
 *  exe -sN file    Modify the exe header so that N bytes of stack
 *                  are used. If necessary, increase the bss/stack/heap
 *                  size to accommodate the new stack. (The
 *                  bss/stack/heap won't be made smaller, however).
 */
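
The `-mN` behavior described above comes down to two small calculations: converting a byte count to 16-byte paragraphs, rounding up, and then refusing to go below the header's current minimum. A minimal sketch of just that arithmetic follows. The names `bytes_to_paras` and `new_bss_max` are hypothetical and are not taken from the listing.

```c
#include <assert.h>

#define PARAGRAPH 16u   /* bytes per 8086 paragraph */

/* Number of paragraphs needed to hold nbytes, rounding up. */
unsigned bytes_to_paras(unsigned nbytes)
{
    return nbytes / PARAGRAPH + (nbytes % PARAGRAPH != 0);
}

/* New maximum allocation (in paragraphs) for a request of
   requested_bytes, never smaller than the current minimum.
   (Hypothetical helper, written for illustration.) */
unsigned new_bss_max(unsigned requested_bytes, unsigned bss_min)
{
    unsigned want;

    assert(requested_bytes <= 65535u);   /* largest N the tool accepts */
    want = bytes_to_paras(requested_bytes);
    return want < bss_min ? bss_min : want;
}
```

A request of 1,000 bytes needs 63 paragraphs (62 full paragraphs plus one for the remaining 8 bytes), while a request of a single byte against a minimum of 10 paragraphs simply yields 10, which is why `-m1` gives the smallest heap the program can run with.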


typedef unsigned short word;   /* 2-byte unsigned number */

typedef struct
{
    word signature;
    word image_len;     /* Length of load module image % 512 */
    word file_size;     /* File size in 512-byte units */
    word num_reloc;     /* Number of relocation table items */
    word header_size;   /* Size of the header in paragraphs */
    word bss_min;       /* min size of data area above program */
    word bss_max;       /* max size of data area above program */
    word stack_disp;    /* displacement in para. of stack seg. */
    word init_sp;       /* Initial SP register contents */
    word checksum;      /* Checksum for file */
    word init_ip;       /* Initial IP register contents (PC) */
    word code_disp;     /* displacement in para. to code seg. */
    word first_reloc;   /* displacement (bytes) to 1st reloc item */
    word overlay;       /* overlay number. */
}
EXE_HEADER;

static int Hsize = 0, Ssize = 0;

ARG Argtab[] =
{
    { 'm', INTEGER, &Hsize, "Set minimum heap size to <num>" },
    { 's', INTEGER, &Ssize, "Set stack size to <num>" }
};


#define TSIZE (sizeof(Argtab)/sizeof(ARG))

usage()
{
    fprintf( stderr, "exe [-ms[<num>]] file\n" );
    exit( 1 );
}

main( argc, argv )
char **argv;
{
    EXE_HEADER h;
    int fd;
    unsigned numpara, ostack, odata;

    argc = getargs( argc, argv, Argtab, TSIZE, usage );
    if( argc != 2 )
        fatal_err( "exe: exactly one file name required\n" );

    if( (fd = open( argv[1], O_RDWR | O_BINARY )) == -1 )
        fatal_err( "Can't open %s\n", argv[1] );

    if( read( fd, (char *) &h, sizeof(h) ) != sizeof(h) )
        fatal_err( "Can't read %s\n", argv[1] );

    if( Hsize )
    {
        /* 1) numpara = the number of paragraphs required to hold the
         *    specified number of bytes.
         * 2) h.bss_max, the maximum heap size, gets either the
         *    current minimum or the specified size,
         *    whichever is larger.
         * 3) write out the modified header.
         */

        numpara   = Hsize/16 + (Hsize % 16 != 0);                 /* 1 */
        h.bss_max = (numpara < h.bss_min) ? h.bss_min : numpara;  /* 2 */
        lseek( fd, 0L, 0 );                                       /* 3 */
        write( fd, (char *) &h, sizeof(h) );
    }


    if( Ssize )
    {
        /* 1) ostack  = number of paragraphs in original stack
         * 2) odata   = number of paragraphs of data.
         * 3) numpara = number of paragraphs in new stack.
         * 4) modify stack size.
         * 5) Adjust the size of the stack+data area as appropriate.
         * 6) write the modified header out to the file.
         */

        ostack    = h.init_sp/16 + (h.init_sp % 16 != 0);   /* 1 */
        odata     = h.bss_max - ostack;                     /* 2 */
        numpara   = Ssize/16 + (Ssize % 16 != 0);           /* 3 */
        h.init_sp = Ssize;                                  /* 4 */
        h.bss_min = odata + numpara;                        /* 5 */

        if( h.bss_min > h.bss_max )
            h.bss_max = h.bss_min;

        lseek( fd, 0L, 0 );                                 /* 6 */
        write( fd, (char *) &h, sizeof(h) );
    }

    print_hdr( &h );
    close( fd );
}


print_hdr( h )
EXE_HEADER  *h;
{
    printf("%6d (0x%04x): ", h->signature, h->signature );
    printf("Signature (marks this as a valid .exe file)\n");

    printf("%6d (0x%04x): ", h->image_len, h->image_len );
    printf("Length of image mod 512\n" );

    printf("%6d (0x%04x): ", h->file_size, h->file_size );
    printf("File size (512-byte blocks) including header\n");

    printf("%6d (0x%04x): ", h->num_reloc, h->num_reloc );
    printf("Number of relocation table entries\n");

    printf("%6u (0x%04x): ", h->header_size, h->header_size );
    printf("Size of the header (paragraphs) = %lu bytes\n",
                            (unsigned long) h->header_size * 16 );

    printf("%6u (0x%04x): ", h->bss_min, h->bss_min );
    printf("Min. memory above program (paragraphs) = %lu bytes\n",
                            (unsigned long) h->bss_min * 16 );

    printf("%6u (0x%04x): ", h->bss_max, h->bss_max );
    printf("Max. memory above program (paragraphs) = %lu bytes\n",
                            (unsigned long) h->bss_max * 16 );

    printf("%6d (0x%04x): ", h->stack_disp, h->stack_disp );
    printf("Displacement (paragraphs) of stack within load module\n");

    printf("%6d (0x%04x): ", h->init_sp, h->init_sp );
    printf("Initial value of the SP register (= the stack size)\n");

    printf("%6d (0x%04x): ", h->checksum, h->checksum );
    printf("Checksum for file\n");

    printf("%6d (0x%04x): ", h->init_ip, h->init_ip );
    printf("Initial value of the PC (IP) register\n");

    printf("%6d (0x%04x): ", h->code_disp, h->code_disp );
    printf("Displacement (paragraphs) of code seg. within load module\n");

    printf("%6d (0x%04x): ", h->first_reloc, h->first_reloc );
    printf("Displacement (bytes) to first relocation item in module\n");

    printf("%6d (0x%04x): ", h->overlay, h->overlay );
    printf("Overlay number.\n");
}


End  Listing  One 


Listing  Two 


#include <stdio.h>
#include <stdarg.h>

fatal_err( fmt )
char    *fmt;
{
    /* Print an error message to stderr and */
    /* then exit.                           */

    va_list  args;
    va_start( args, fmt );
    vfprintf( stderr, fmt, args );
    exit( 1 );
}


End  Listing  Two 




Listing  Three 


#include <stdio.h>

/* NRTRAV.C: A non-recursive binary */
/* tree traversal routine that uses */
/* the link-inversion method.       */

typedef struct n
{
    int        tag;
    struct n   *left;
    struct n   *right;
    char       *key;
}
NODE;

/*----------------------------------------------------------------*/

#define print(nodep)    printf( "%s ", (nodep)->key );

/*----------------------------------------------------------------*/

descend_left( pres, prev )
NODE    **pres, **prev;
{
    /* Descend left till we can't  */
    /* go any farther, reversing   */
    /* links.                      */

    register NODE  *next;

    while( next = (*pres)->left )
    {
        (*pres)->left = *prev;
        *prev = *pres;
        *pres = next;
    }
}

/*----------------------------------------------------------------*/

descend_right( pres, prev )
NODE    **pres, **prev;
{
    /* Descend right one node,      */
    /* reversing links.             */
    /* Return 0 if we couldn't go.  */

    register NODE  *next;

    if( !(next = (*pres)->right) )
        return 0;

    (*pres)->tag   = 1;
    (*pres)->right = *prev;
    *prev = (*pres);
    *pres = next;
    return 1;
}

/*----------------------------------------------------------------*/

trav( pres )
NODE    *pres;
{
    NODE  *prev = NULL, *next;

    do
    {
        descend_left( &pres, &prev );
        print( pres );

    } while( descend_right( &pres, &prev ) );

    while( prev )
    {
        if( prev->tag == 0 )            /* go up from a     */
        {                               /* left child       */
            next       = prev->left;
            prev->left = pres;
            pres       = prev;
            prev       = next;

            while( 1 )                  /* and back down    */
            {
                print( pres );

                if( !descend_right( &pres, &prev ) )
                    break;

                descend_left( &pres, &prev );
            }
        }
        else                            /* go up from a     */
        {                               /* right child      */
            next        = prev->right;
            prev->tag   = 0;
            prev->right = pres;
            pres        = prev;
            prev        = next;
        }
    }
}

End  Listings 




COLUMNS 


C  CHEST 


Shrinking .EXE File Images

by Allen Holub

It's been pointed out to me that the shell occupies much more memory
at run time than is actually needed. Fortunately, the problem is easy
to fix without having to recompile, and the techniques used are
applicable to all .EXE files.

The problem has to do with how .EXE files are loaded into memory by
MS-DOS and with how memory is used by malloc() and free(). (See this
month's Flotsam and Jetsam, page 108, for a description of memory
organization within a C program.)

The MS-DOS loader reads the text and data segments from the disk and
then allocates all remaining memory for the bss segment, stack, and
heap, even if the program is a small-model program that couldn't
possibly use all that memory. When an .EXE file is loaded, DOS can
reduce this default to an amount of memory specified in a header found
at the beginning of the file (in the first 14 words). This amount has
to be large enough to accommodate the entire bss and stack segments.
The heap, however, doesn't need to be allocated at load time because
more memory is requested from DOS if the heap isn't large enough when
malloc() is called, thereby increasing the size of the run-time image.

The .EXE file header is initialized by the linker so that all
available memory is allocated to the current program. Most small-model
programs will then reduce this amount to a 64K combined
data/bss/stack/heap area as part of the boot process. Unfortunately,
64K is always allocated, whether or not you need it. The shell, in its
released form, has this 64K data space allocated to it, even though it
only needs about 3K for static data and another 3K for the heap.
Consequently it takes up almost 90K of memory rather than the 50K or
so that's actually needed.

The automatic assignment of 64K can be circumvented by having the
linker put a more reasonable number into the .EXE file header (by
using the /CP or /STACK command-line options). It's not always
convenient to relink, however, especially if you don't have the
original source or object modules. Fortunately, the size of the
run-time image can be reduced by doing nothing more than changing a
couple of numbers in the .EXE file header.

The Microsoft C compiler comes with a nifty little program called
exemod that does just that: it messes around with the .EXE file header
to change the default run-time size of the program. Unfortunately the
Microsoft version is needlessly difficult to use (requiring you to
specify stack sizes in hex bytes and heap sizes in decimal
paragraphs), and, of course, if you don't have the compiler, you don't
have exemod either. An easier-to-use version of exemod is in Listing
One (page 100).

The program (called exe) can be used in one of three ways (shown in
Table 1, page 107, along with a sample output). If no command-line
switches are present, then exe just prints the contents of the header.
I'll look at this header in greater depth in a moment. The -mN flag is
used to change the default data area size (the combined sizes of the
stack, heap, and bss areas) to N bytes. If N is too small (less than
the combined bss and stack sizes), then it's rounded up to the
minimum. You can use -m1 to get the smallest possible run-time image,
though the image will grow larger if the program ever calls malloc().
The -sN switch increases or decreases the stack size to N bytes. If
necessary, the run-time image will be made larger to accommodate a
larger stack. The image isn't made smaller when you reduce the stack
size. You can run exe twice, however, reducing the stack size the
first time and then reducing the total image size the second time. For
example:

    exe -s1024 file.exe
    exe -m1 file.exe

reduces a file's stack to 1,024 bytes and then eliminates the space
allocated to the heap. Be careful about reducing the stack of a
Microsoft-compiled program to less than 1K; I always seem to get a
stack overflow error message when I do this. God knows what all that
stack space is used for; my own part of the program isn't using it.

Note that the largest N that can be associated with either switch is
65,535. The only reason for this limitation is that I've used
getargs() to process command-line arguments and getargs() can't handle
long-size arguments very easily. If you want larger images, replace
the getargs() call on line 68 with your own command-line processing
routine.

The .EXE file header is defined by the structure on lines 22-39,
reproduced in Code Example 1, page 107. The signature is a unique
number used to identify this file as an .EXE file. The file_size,
header_size, and image_len fields are used to determine the size of
the load module (the combined text and initialized data areas). In
particular, the load image requires

    ((file_size * 512) - (header_size * 16)) + image_len

bytes. In Table 1, this comes to

    ((22 * 512) - (32 * 16)) + 124

or 10,876 bytes.
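The arithmetic above can be checked with a few lines of C. This is only a sketch: the function names are made up for illustration and do not appear in Listing One. The first function is the formula exactly as printed. The second is an assumption about the format rather than anything stated in the column: other descriptions of the .EXE header treat image_len as the number of bytes actually used in the last 512-byte block, which gives a smaller figure.

```c
typedef unsigned short word;   /* 2-byte unsigned number, as in the column */

/* Load-module size using the formula exactly as printed in the column. */
unsigned long load_size_as_printed(word file_size, word header_size,
                                   word image_len)
{
    return ((unsigned long)file_size * 512UL
            - (unsigned long)header_size * 16UL)
           + image_len;
}

/* Assumption, not from the column: image_len is the count of bytes used
 * in the last 512-byte block (0 meaning the last block is full). */
unsigned long load_size_last_block(word file_size, word header_size,
                                   word image_len)
{
    unsigned long total = (unsigned long)file_size * 512UL;

    if (image_len != 0)
        total -= 512UL - image_len;    /* drop the unused tail of the block */

    return total - (unsigned long)header_size * 16UL;
}
```

With the Table 1 values (22, 32, 124) the first function reproduces the 10,876 bytes quoted above.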

Several of the fields are used for patching up a few instructions that
the linker can't patch. There are num_reloc of these items (three in
exe.exe) organized as a linked list with the first node in the list at
offset first_reloc from the beginning of the load module (see Note 1).

The bss_min and bss_max fields are used to allocate the combined heap,
stack, and bss space. The initialized data, because it's stored on the
disk, is considered to be part of the load module, so its size isn't
duplicated here. Bss_min is the minimum amount of required memory in
paragraphs (16-byte chunks). It's the combined bss and stack sizes.
Bss_max determines the maximum amount of allocated memory (also in
paragraphs), so if it's larger than bss_min, the difference between
the two numbers is the amount of heap that can be allocated before DOS
has to be called. The default values of bss_min and bss_max for
exe.exe are shown in Table 1 (196 and 65,535, respectively). This
means that the program requires a minimum of 196 paragraphs (3,136
bytes) for the combined bss/stack area and will use all the rest of
memory for the heap. The smallest-possible image can be created by
setting bss_max to bss_min.
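The byte-to-paragraph conversion and the "round up to the minimum" rule can be sketched on their own. The function names here are hypothetical; the expressions mirror the ones used for the heap size in Listing One, but this is an illustration and not a replacement for that code.

```c
typedef unsigned short word;   /* 2-byte unsigned number, as in the column */

/* Number of 16-byte paragraphs needed to hold nbytes, rounding up. */
word bytes_to_paragraphs(unsigned long nbytes)
{
    return (word)(nbytes / 16 + (nbytes % 16 != 0));
}

/* New bss_max for a requested size in bytes. A request smaller than
 * bss_min (the combined bss and stack sizes) is raised to bss_min. */
word clamp_bss_max(unsigned long requested_bytes, word bss_min)
{
    word numpara = bytes_to_paragraphs(requested_bytes);

    return (numpara < bss_min) ? bss_min : numpara;
}
```

For the exe.exe header in Table 1 (bss_min of 196 paragraphs), a request of 1 byte comes back as 196 paragraphs, which is the smallest-possible image described above.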

The init_sp and stack_disp fields are used to set up the stack size.
Init_sp is both the stack size and the initial value of the SP
register (the SS register points at the bottom of the stack area).
Stack_disp is used to locate the bottom of the stack. It is added to
the initial contents of the CS register to initialize the SS register
when DOS loads the program. Note that the stack size is included in
the bss_min and bss_max figures, so these will have to be modified if
init_sp is made larger.

All these transformations are done by the code in Listing One. The
file is opened on line 74, the .EXE header is read on line 77, the
bss_min and bss_max fields are modified on lines 81-97, and the stack
variables are modified on lines 99-122. The .EXE header is written
back out on both lines 95-96 and 120-121 (you have to seek back to the
start of the file before writing). Finally, the header contents are
printed by print_hdr(), called on line 124.

The fatal_err() subroutine is given in Listing Two, page 102. It is
used just like printf() is used. It writes a message to stderr and
then exits to the operating system. Note that I've used the ANSI (as
compared to Unix) conventions for subroutines with a variable number
of arguments. Va_list and va_start are macros defined in stdarg.h,
supplied with the compiler. If your compiler doesn't support these,
substitute calls to fprintf() and then exit() for the fatal_err()
calls. I'll talk more about subroutines with a variable number of
arguments in a future column.
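For readers with a compiler that follows the finished ANSI standard, the same routine might look like the sketch below. It is not the code from Listing Two: the prototype with an ellipsis, the va_end() call, and the format_err() helper are additions made here for illustration, and format_err() relies on vsnprintf(), which arrived later with C99.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not in the column: build the message in a
 * caller-supplied buffer so the formatting can be tried without exiting. */
int format_err(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int     n;

    va_start(args, fmt);            /* fmt is the last named parameter */
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return n;
}

/* Print a printf-style message to stderr, then exit to the operating
 * system with a nonzero status. */
void fatal_err(const char *fmt, ...)
{
    va_list args;

    va_start(args, fmt);
    vfprintf(stderr, fmt, args);
    va_end(args);
    exit(1);
}
```

A call such as fatal_err("Can't open %s\n", name) then behaves like the calls in Listing One.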

Erratum

The nonrecursive binary-tree traversal routine presented in July has a
serious bug in the algorithm. It couldn't handle the case of a leaf
that had a right, but no left, descendant. Listing Three, page 103, is
another version of the routine that seems to work correctly. The basic
process is still the same (descend the tree reversing pointers so you
can go back up again, setting a tag bit just before going right), but
the code has been shuffled around a bit. Note that in this version I'm
keeping a tag field in the structure rather than setting the high bit
of the first character of the key string. Look back in the July C
Chest if you need a more detailed explanation of what's going on.

Availability

All the code from this month is available on CompuServe in DL1 (type
ddjforum). The getargs() subroutine, used but not printed this month,
was originally published in the May 1985 C Chest. The version used
here has a fifth argument not present in the original version. If
you're using the earlier version, just omit the extra argument. The
current version of getargs is available both on CompuServe and as part
of the /util program disk distributed by DDJ (see ad, page 124).

All the source code for articles in this issue is available on a
single disk. To order, send $14.95 to Dr. Dobb's Journal, 501
Galveston Dr., Redwood City, CA 94063 or call (415) 366-3600 ext. 216.
Please specify the issue number and disk format (MS-DOS, Macintosh,
Kaypro).


exe -mN file
    Modify the maximum memory used to N bytes. If this number is
    smaller than the combined bss and stack sizes, then it's rounded
    up. Use -m1 for the smallest possible load module. The maximum
    value of N is 65,535.

exe -sN file
    Modify the stack size to be N bytes.

exe file
    Just print out the contents of the .EXE header. The command line
    exe exe.exe generated the following:

     23117 (0x5a4d): Signature (marks this as a valid .EXE file)
       124 (0x007c): Length of image mod 512
        22 (0x0016): File size (512-byte blocks) including header
         3 (0x0003): Number of relocation table entries
        32 (0x0020): Size of the header (paragraphs) = 512 bytes
       196 (0x00c4): Min. memory above program (paragraphs) = 3,136 bytes
     65535 (0xffff): Max. memory above program (paragraphs) = 1,048,560 bytes
       715 (0x02cb): Displacement (paragraphs) of stack within load module
      2048 (0x0800): Initial value of the SP register (= the stack size)
    -14653 (0xc6c3): Checksum for file
      2008 (0x07d8): Initial value of the PC (IP) register
         0 (0x0000): Displacement (paragraphs) of code seg. within load module
        30 (0x001e): Displacement (bytes) to first relocation item in module
         0 (0x0000): Overlay number

Table 1: Using exe


typedef unsigned short word;    /* 2-byte unsigned number */

typedef struct
{
    word  signature;    /* Marks this as a valid .EXE file            */
    word  image_len;    /* Length of image mod 512                    */
    word  file_size;    /* File size (512-byte blocks) incl. header   */
    word  num_reloc;    /* Number of relocation table entries         */
    word  header_size;  /* Size of the header (paragraphs)            */
    word  bss_min;      /* Min. memory above program (paragraphs)     */
    word  bss_max;      /* Max. memory above program (paragraphs)     */
    word  stack_disp;   /* Displacement (paragraphs) of stack         */
    word  init_sp;      /* Initial value of SP (= the stack size)     */
    word  checksum;     /* Checksum for file                          */
    word  init_ip;      /* Initial value of the PC (IP) register      */
    word  code_disp;    /* Displacement (paragraphs) of code seg.     */
    word  first_reloc;  /* Displacement (bytes) to first reloc item   */
    word  overlay;      /* Overlay number                             */
}
EXE_HEADER;

Code Example 1: The .EXE file header

Note

1. For more information about how the relocations are processed
consult Chapter 10 of IBM Corp.'s DOS Technical Reference (Boca Raton,
Fla.: IBM Corp., 1985). Much better descriptions of relocatable object
module formats in general are in Steven Armbrust and Ted Forgeron's
".OBJ Lessons," PC Tech Journal 3:10 (October 1985), 63-81, and Intel
Corp.'s 8086 Relocatable Object Module Formats (Santa Clara, Calif.:
Intel Corp., 1981), order number 121748-001.


(Listings  begin  on  page  100.) 



Flotsam and Jetsam

Memory Organization in a C Program

Most C programs are segmented into five parts: the code area (called
the text segment); the initialized data space (called the data
segment); the uninitialized data space (or bss segment); the stack
space (or stack segment); and the area of memory used by malloc(),
called the heap.

The stack is used for subroutine calling, in the normal way, but it's
also used to store local automatic variables. When a subroutine is
called, it subtracts a constant from the stack pointer to make room on
the stack for its own local variables and then accesses those
variables indirectly through either the stack pointer or a special
register called the frame pointer.

Variables at fixed addresses (globals and local statics) are in either
the data or bss segments, depending on whether they are initialized by
your program. Variables can be initialized explicitly (with an equal
sign as part of the declaration) or implicitly. An example of the
latter is a string constant, such as a format string passed to a
printf() call. Here the compiler automatically allocates and
implicitly initializes an area of memory to hold the string, and that
memory is put into the data segment (rather than the bss segment).

Usually, both the text and data segments (code and initialized data
areas) are stored on the disk together. When the program is loaded,
the variables in the data segment are loaded directly into their
correct place in memory. There's no code generated to initialize
static data; the data that is read in from the disk has the correct
initial value. This explains why a static local variable has its
initial value the first time a subroutine is called but on subsequent
calls the variable contains the same value that it had at the end of
the previous call. It's read in having the initial value, but once you
change it, it stays changed.
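The behavior of a static local is easy to see in a few lines. The sketch below uses made-up names, and the segment named in each comment is the conventional placement described above; the C language itself only promises the initial values, not where the compiler puts the variables.

```c
/* Explicitly initialized: conventionally placed in the data segment,
 * so the value 100 is already in the image read from disk. */
int start_value = 100;

/* Not initialized by the program: conventionally placed in bss and
 * set to zero before main() is called. */
int call_count;

/* Returns 1 on the first call, 2 on the second, and so on. */
int next_id(void)
{
    static int id = 1;   /* loaded with its initial value, never reset */

    call_count++;
    return id++;         /* the change persists into the next call */
}
```

No assignment to id is executed on entry to next_id(); the 1 is simply there when the program starts, which is why the second call sees the value left behind by the first.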

The remaining three memory areas (bss, stack, and heap) are created as
part of the loading process. After the loader has transferred the text
and data segments from the disk into memory, it allocates enough
additional memory above the data segment to contain the other three
segments. The loader usually sets up the stack pointer to point into
the stack. The program itself (or more correctly, the root or start-up
module) initializes the entire bss segment with 0s and then calls your
main() subroutine.

The usual order of segments, going from low to high memory, is:

    text   (code)
    data   (initialized data)
    bss    (uninitialized data)
    stack  (local variables)
    heap   (used by malloc())

However, the stack and heap are often reversed. The text and data
areas are always adjacent because they're read from the disk as a
single unit.


COLUMNS

STRUCTURED PROGRAMMING

Naming Names

by Michael Ham

Forth has been designed by programmers who were using it, and so
Forth's design is responsive to programmers' needs in small ways as
well as large. Other languages don't seem to be quite so
programmer-friendly. For example, I was surprised to read in an
article about Modula-2 (by this column's own Namir Shammas) the
complaint that the underscore character was not allowed in names. How
did Modula-2's designer conclude that programmers are helped by
disallowing some characters in names? In my heart of hearts, I suspect
that the rule was for the benefit of the compiler writers, not the
compiler users, and exemplifies fitting the task to the program rather
than the other way around. This type of programming focuses on what is
easy now (for the program writer), not what is easy over the life of
the program (for the program users).

Service workers must always fight the tilt toward serving themselves
before the clients of their profession. College administrators who
bemoan the loss of serenity when students return to the campus, shelf
stockers whose tempers flare when customers disorganize displays by
buying items from them, and programmers who have had it up to here
with figuring how to help the endlessly confused user, all should
remind themselves of the point of the enterprise.

Programmers using a particular language pray that its developers kept
in mind that they were writing for programmers and made their first
objective easing the programmer's life, not their own task. The
programmer users hope that the language developers, weary of
considering all the ins and outs of implementation, did not finally
throw in the towel and say, "This will be good for you, really. You'll
like not being able to use underscores. Anyway, you'll get used to
it."

© 1986 by Michael Ham. All rights reserved.

In Forth, any character can be used in a name (well, almost any).
Blank doesn't work because it is the name delimiter, and carriage
return doesn't work because it marks the end of the line. Generally
speaking, however, Forth is not picky about the characters you want to
use.

Some standard usages have evolved in which some characters represent a
class of tasks. A word beginning with a period normally displays
information: .DATE, for example, can be assumed to display the date,
.NAME a name, .S the stack (abbreviated because so often used),
.FILENAME a file name, and so on. The greater-than symbol is often
used for "to," to indicate movement (>R puts a character from the data
stack onto the return stack, R> takes it off the return stack and
returns it to the data stack) or transformation (S>D converts a
single-precision number to a double-precision equivalent, >JULIAN
converts a date to a Julian date). The symbol ? denotes a Boolean
flag, and in a program in which names are carefully chosen, its
meaning is obvious. For example, STOP? would leave a flag true,
meaning stop, and ?STOP would consume a flag true, causing a stop.

Code Example 1, below, shows a tiny tool YES? that collects a
yes/anything response and leaves a true flag if the user answers yes.
The word suggests a yes response (hence the name) by displaying Y as
the default answer. YES? uses CAP to capitalize any lowercase input
before checking whether it was a Y.

Some of these name patterns come from conventions, but conventions are
more successful when they recognize and reinforce usage than when they
attempt to create it. Rushing the process or trying to fence it in
with rules does not lead to better results more quickly but merely
frustrates and confuses the evolutionary movement. Forth gives the
programmer complete freedom in naming, and the conventions for naming
emerge gradually.

Other languages give the compiler writer the authority to decide the
sorts of names that would be good for programmers (or, possibly, good
for compiler writers) with the result that some characters fall beyond
the pale that pens the programmer. "You want underscores in the name?
That sort of thing isn't done in Modula-2. Don't be perverse." That is
the mark of bluestockings, ready to ease their life by adding
difficulties to yours. Fight back. Use Forth.

    : CAP   ( c - C )   DUP 96 > OVER 123 < AND IF BL - THEN ;

    : YES?  ( -- f )   ASCII Y EMIT 8 EMIT ( backspace )
       KEY CAP DUP ASCII Y = SWAP 13 = ( cr? ) OR DUP
       IF ASCII Y ELSE ASCII N THEN EMIT SPACE ;

Code Example 1: Two tiny tools

Some programming languages build in conventions through tactics such
as precedence rules. One popular language has upward of 20 precedence
rules. It's too many. The doctor in John Barth's novel End of the Road
(New York: Avon Books, 1964) suggests to Jacob Horner that three will
suffice: "'If the alternatives are side by side, choose the one on the
left; if they're consecutive in time, choose the earlier. If neither
of these applies, choose the alternative whose name begins with the
earlier letter of the alphabet. These are the principles of
Sinistrality, Antecedence, and Alphabetical Priority.'"

Of course, it is always legitimate to propose rules. Ideas can be
stimulated through discussion, but their acceptance should ultimately
be based upon experience, not fiat. To demonstrate that I am willing
to entertain rule proposals, I offer the following suggestions for
naming conventions for the arithmetic operators.

The names of the arithmetic operators are perhaps inescapably
pedestrian. Forth requires a variety of names because data are not
typed and thus the operators are. Operators come in several flavors:
single precision, double precision, quad precision, and mixed
precision (operations in which the two operands are of different
precision, typically one being single precision and the other double
precision). Happily, all the mixed-precision operators can be defined
to take the lesser precision operand on top of the stack, the greater
precision second on the stack.

To simplify the discussion, let's follow FORTH Inc.'s lead and call
single-precision numbers singles and double-precision numbers doubles.
There are no mixed-precision numbers, of course, only mixed-precision
operations (with a single and a double or a double and a quad as
arguments).

The precision of the result of an operation is another question.
Normally sums and differences are assumed to have the same precision
as the operands that produced them, though that is not logically
necessary: a sum of two singles could, for example, be a double. In
multiplication you more commonly will want to allow the result to be
of a higher precision than the factors (the product of two singles
being a double, for example). And with division you might be content
for the quotient to drop back a notch: double divided by single with
the quotient a single. Note, however, that it seems best for the
remainder to be accepted as a double even if the quotient is taken as
a single.
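The same choices show up in any language with fixed-width integers. The C sketch below is only an analogy: it assumes a 16-bit single and a 32-bit double, which the column does not state, and the function and type names are invented for illustration. It also uses C's division, which truncates toward zero, so the results for negative operands need not match a particular Forth's.

```c
#include <stdint.h>

/* Two singles in, a double out: the product cannot overflow. */
int32_t mul_singles_to_double(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* A double divided by a single, leaving a single quotient and a double
 * remainder. The caller must ensure the quotient fits in 16 bits. */
typedef struct {
    int16_t quot;
    int32_t rem;
} mixed_divmod;

mixed_divmod div_double_by_single(int32_t dividend, int16_t divisor)
{
    mixed_divmod r;

    r.quot = (int16_t)(dividend / divisor);
    r.rem  = dividend % divisor;
    return r;
}
```

For instance, 300 times 300 is 90,000, which no 16-bit single can hold, while 100,000 divided by 7 leaves a quotient of 14,285 that fits comfortably in one.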

The sign lurks as the high bit of the binary representation of the
number, but unsigned numbers use that bit in its numeric meaning.
Addition and subtraction do not need to distinguish: the programmer
can choose how to interpret the high bit when the result is displayed.
But in other operations it can make a difference: is it 10 compared to
65,535 (10 is less) or 10 compared to -1 (10 is greater)? If an
operation treats the high bit as number rather than as sign, the
operation is called unsigned.
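The 10-versus-65,535 example can be reproduced in C with 16-bit cells. This is a sketch with invented function names, and it assumes the usual two's-complement conversion when a uint16_t is reinterpreted as an int16_t (implementation-defined in older C standards, but the behavior of essentially every compiler).

```c
#include <stdint.h>

/* Treat the high bit as part of the number: 0xFFFF means 65,535. */
int less_unsigned(uint16_t a, uint16_t b)
{
    return a < b;
}

/* Treat the high bit as the sign: 0xFFFF means -1. */
int less_signed(uint16_t a, uint16_t b)
{
    return (int16_t)a < (int16_t)b;
}

/* Addition produces the same 16 bits under either interpretation. */
uint16_t add_cells(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}
```

Comparing 10 with 0xFFFF gives "less" from the unsigned version and "not less" from the signed one, while adding them yields 9 either way: 10 + 65,535 wraps around to 9, and 10 + (-1) is 9.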

Ideally, the Forth names for the operators could offer the programmer
some reliable signposts through this maze of options: single, mixed,
double, unsigned, signed, incoming, outgoing. The current crop of
names was not designed to offer this kind of help. An alternative
scheme is suggested in Tables 1 and 2, below.

Table  1  contains  a  list  of  prefixes 
for  the  arithmetic  operators,  based 
on  the  precision  of  the  operands. 
When  both  operands  are  signed,  sin¬ 


gle-precision  numbers,  the  operators 
are  unadorned.  Otherwise,  the  oper¬ 
ator  names  include  information  that 
describes  the  nature  of  the  operands. 

Table 2 contains a list of suffixes. Just as the prefix describes the
nature of the operands (the input), the suffix describes the nature of
the result (the output). Again, a garden-variety operator that produces
the natural result (for example, an operation on single-precision
numbers that produces a single-precision result or on doubles that
produces a double) is unadorned. This convention assumes that the
"natural" result for a mixed-precision operation has the higher of the
precisions of the two operands. For example, the natural result of a
single-double mixed operator is a double, and thus no suffix is used in
that case. If a single-double operator produces a single result, then
it is named with a suffix S to show that the result is single
precision.

These names group double-precision operators under D and
mixed-precision operators under M. The affixes allow you to decipher
the special qualities of an arbitrary operator and also make it easy to
remember the operation names. Table 3 shows a variety of operator names
that test this scheme.

Mixed precision normally is an issue only with the input because the
output is normally only one number. The /MOD operators, however,
produce two numbers as output: a quotient and a remainder. Would it be
reasonable to see mixed precision here, with the quotient being one
precision and the remainder another?

In integer division, neither the quotient nor the remainder can exceed,
in magnitude, the larger of the dividend and the divisor, so if the
dividend and divisor are both single precision, both quotient and
remainder will be single precision. The single-precision /MOD thus
leaves single-precision results, and the issue does not arise.

With D/MOD, however, the situation is different. Dividing a double by a
double might well produce a single. The remainder, however, could well
be a double. Thus, you might want the operation D/MODM, two doubles
producing a mixed result: a single-precision quotient and a
double-precision remainder. Mirabile dictu, the single is again on top
of the stack, the double beneath. (Perhaps single numbers, weighing
less than doubles, naturally float to the top of the stack.) You can
use M as a suffix to indicate that the mixed precision is on the output
side rather than the input:

D*/MODM: three signed doubles, with quad-precision intermediate result,
producing a mixed-precision result: single quotient on top of stack and
double remainder beneath.

Q*/MODM: three unsigned quads, with octuple-precision intermediate
result, producing a quad remainder (second on the stack) and a double
quotient (top of stack).

The example Q*/MODM is a grotesquerie that you would probably never
encounter. It serves merely to illustrate that even novel operations
can be deciphered easily with this scheme.

Operands             Sign Assumption      Prefix
-------------------  -------------------  ------
Single-precision     signed operands      none
                     unsigned operands    U
Double-precision     signed operands      D
                     unsigned operands    DU
Quad-precision       signed operands      Q
                     unsigned operands    QU
Mixed-precision
  Single-double      signed operands      M
                     unsigned operands    MU
  Double-quad        signed operands      MD
                     unsigned operands    MDU

Table 1: Prefixes for arithmetic operators

Operands             Result               Suffix
-------------------  -------------------  ------
Single-precision     single precision     none
                     double precision     D
Double-precision     single precision     S
                     double precision     none
                     quad precision       Q
Mixed-precision
  Single-double      single precision     S
                     double precision     none
  Double-quad        double precision     D
                     quad precision       none

Table 2: Suffixes for arithmetic operators

*D        Two signed single-precision factors producing a double
          product.
U*D       Two unsigned single-precision factors producing a double
          product.
MU*       Mixed-precision factors (single and double), unsigned, with
          double product. By convention, the single-precision factor is
          on top of the stack, the double under it.
MDU*      Mixed-precision factors (double and quad), unsigned, with quad
          product.
MU>       Mixed-precision unsigned compare. The sense of the comparison
          is double > single, both considered as unsigned.
D*/       Factors and result are all doubles, with quad-precision
          intermediate product. (The */ operator always takes the
          intermediate product to the next higher level of precision.
          Because operators in the */ family have three operands, it
          seems best to avoid mixed precision on the input side.)
D*        Factors and product all signed doubles.
M*        Single on top of stack, double beneath, with double-precision
          product, all signed.
D/        Two signed doubles divided, producing double as quotient.
D/S       Two signed doubles divided, producing signed single as
          quotient.
M/S       Double divided by single with single quotient, all signed.
D/MOD     Two doubles divided, producing a double quotient and a double
          remainder.

Table 3: Examples of operator names
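
The deciphering is regular enough to mechanize. The sketch below is an illustration in Python, not part of the proposal: it reads a prefix from Table 1 and a suffix from Table 2 around an arithmetic core. The function name decode, the tuple it returns, and the restriction to the arithmetic cores listed in the pattern are assumptions made here.

```python
import re

# Prefix -> (operand precision, signed?), following Table 1.
PREFIXES = {
    "":    ("single", True),         "U":   ("single", False),
    "D":   ("double", True),         "DU":  ("double", False),
    "Q":   ("quad", True),           "QU":  ("quad", False),
    "M":   ("single-double", True),  "MU":  ("single-double", False),
    "MD":  ("double-quad", True),    "MDU": ("double-quad", False),
}
# Result when no suffix is present: the higher operand precision.
NATURAL = {"single": "single", "double": "double", "quad": "quad",
           "single-double": "double", "double-quad": "quad"}
# Suffix -> result precision, following Table 2 (M = mixed output).
SUFFIXES = {"S": "single", "D": "double", "Q": "quad", "M": "mixed"}

NAME = re.compile(r"^(MDU|MD|MU|M|QU|Q|DU|D|U)?"     # prefix
                  r"(\*/MOD|/MOD|\*/|\*|/|\+|-)"      # arithmetic core
                  r"(S|D|Q|M)?$")                     # suffix

def decode(name):
    """Return (operands, signed, result) for an operator name."""
    match = NAME.match(name)
    if match is None:
        raise ValueError("not an operator name in this scheme: " + name)
    prefix, _core, suffix = match.groups()
    operands, signed = PREFIXES[prefix or ""]
    result = SUFFIXES[suffix] if suffix else NATURAL[operands]
    return operands, signed, result

for name in ("*D", "MU*", "D/S", "M/S", "D*/MODM"):
    print(name, decode(name))
```

Because the prefix sits before the core and the suffix after it, *D and D* decode differently with no special cases, which is the point of the scheme.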

If you accept that all M operators require the single number on top of
the stack and the double beneath, you can define the following
operators that might be useful in mixed-precision situations. But
beware of the syndrome of maniacal completeness, in which you define
fistfuls of operators to complete a set of logical possibilities, even
if those operators are seldom or never used. I suggest that these words
be defined only in applications doing a lot of mixed-precision
calculation. In that setting, their descriptive names make the code
more readable and thus justify their existence.

MSWAP  ( d n -- n d )
MOVER  ( d n -- d n d )
MNIP   ( d n -- n )
MTUCK  ( d n -- n d n )
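
The stack effects can be checked with a small model. This is a sketch in Python, not Forth source: the stack is a list with the top at the end, a single occupies one cell, and a double is assumed to occupy two cells with the low cell beneath the high cell. The helper _take is invented for the illustration.

```python
def _take(stack):
    """Pop the single n from the top and the two cells of d beneath it."""
    n = stack.pop()
    d_hi = stack.pop()
    d_lo = stack.pop()
    return [d_lo, d_hi], n

def mswap(stack):                 # ( d n -- n d )
    d, n = _take(stack)
    stack.extend([n] + d)

def mover(stack):                 # ( d n -- d n d )
    d, n = _take(stack)
    stack.extend(d + [n] + d)

def mnip(stack):                  # ( d n -- n )
    _d, n = _take(stack)
    stack.append(n)

def mtuck(stack):                 # ( d n -- n d n )
    d, n = _take(stack)
    stack.extend([n] + d + [n])

stack = ["d-lo", "d-hi", "n"]
mtuck(stack)
print(stack)                      # ['n', 'd-lo', 'd-hi', 'n']
```

Each word moves the double as a unit, so the two cells of d never change their relative order.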

Some of the names I have suggested for the arithmetic operators don't
match names in the 83 Standard. I see no problem in this; the names I
propose could be adopted and the older names kept as synonyms. Forth
programmers can be encouraged to shift to the new names by having a
tiny speed penalty associated with the older names (for example, by
defining the older name as an alias of the newer name). Any speed
penalty, however slight, is more than enough to make most Forth
programmers switch to the faster name.

The  Toolbox 

This perhaps seems like a lot of attention lavished on names,
particularly for a publication whose title includes the phrase
"software tools." But names are an important part of a programming
tool. In a well-designed hand tool, the grip gets serious attention as
the primary ergonomic interface, which plays a major part in
determining the effectiveness of the tool. The programmer uses names
and verbal constructs to manipulate the power of the computer. If those
names and constructs fit well the habits of the mind, the task is done
that much more easily.

I have proposed new operators as well as new names. I believe that any
Forth package should include as a standard component a complete set of
double-precision operators, with quad-precision operators available as
an extension for 16-bit Forths. Double precision is often needed when
working with large numbers in which round-off errors must be minimized,
as in accounting applications. Sometimes (in 16-bit Forths) even double
precision does not offer enough range and quad precision must be used.

These are the double-precision operators I believe should be present:

D+ D- D* *D D/MOD D*/MOD D> D= D<
2SWAP 2OVER 2DROP D2DROP 2DUP D2DUP 2ROT
2, 2@ 2!
2CONSTANT 2VARIABLE

With these at hand, programmers can easily construct other
double-precision operators that might be needed: D/, D*/, D0=, D0>, and
so forth. I suggest D2DROP instead of 4DROP and D2DUP instead of 4DUP
because the former show the intention with more clarity and less mental
arithmetic.

MMSForth, published by Miller Microcomputer Services, provides a
different (and complete) solution to the need for operators of higher
precision. Instead of offering optional quad precision, octuple
precision, and so on, MMSForth generalizes the idea of integer
precision.

The Utilities option for Version 2.4 of MMSForth includes an optional
extension called N-LEN#. The N-LEN# operators parallel the usual number
and stack operators and use the same names except that # is included as
an identifier. All the operators (#+, #DUP, #OVER, <##, ##S, #*/MOD,
and so on) work by reference to the value of #PREC, which the user
sets.

#PREC specifies the number of cells to be used in the arithmetic
operations. Setting #PREC to 1 produces the normal single-precision
operators, and setting #PREC to 2 produces the double-precision
operators. But you can set it to arbitrarily high values to allow
integer arithmetic of arbitrary precision.
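
The idea of one set of operators governed by a user-set cell count can be sketched as follows. This is an illustration in Python of the concept only, not of how MMSForth implements N-LEN#; the class name CellArithmetic and its methods are invented here, and the width is emulated by masking Python's unbounded integers.

```python
CELL_BITS = 16            # 2-byte cells, as on the 16-bit systems discussed

class CellArithmetic:
    """Unsigned integer arithmetic on values that are `prec` cells wide."""

    def __init__(self, prec):
        if prec < 1:
            raise ValueError("precision must be at least one cell")
        self.prec = prec                          # plays the role of #PREC
        self.mask = (1 << (CELL_BITS * prec)) - 1

    def add(self, a, b):
        return (a + b) & self.mask                # wraps at the chosen width

    def mul(self, a, b):
        return (a * b) & self.mask

    def cells(self, value):
        """Split a value into its cells, least significant first."""
        value &= self.mask
        return [(value >> (CELL_BITS * i)) & 0xFFFF for i in range(self.prec)]

single = CellArithmetic(1)
double = CellArithmetic(2)
print(single.add(65535, 1))      # 0: the sum wraps in one cell
print(double.add(65535, 1))      # 65536: two cells hold it
print(double.cells(65536))       # [0, 1]
```

The same add serves every width; only the precision setting changes, which is the generalization the text describes.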

Do not think these operations are sluggish. Test routines included with
the package time the computation of Fibonacci numbers and factorials. I
computed and printed the 277th Fibonacci number in 1.08 seconds and the
number 46! (46 factorial) in 1.50 seconds (both on an IBM PC with the
NEC V20 chip instead of the 8088). The number of (2-byte) cells of
precision specified in these test routines was 50 for the Fibonacci
test and 540 for the factorial test. For most uses, a precision of 540
cells seems more than ample. Allowing users to specify the number of
cells of precision they need is clearly a better solution than
hand-tailoring operators of various precision.

It should be noted that Version 2.4 of MMSForth, which runs on the IBM
PC and on the Radio Shack Model 4 and equivalents, is a native-mode
Forth, incompatible with the normal operating systems on those
machines. When MMSForth is used, it monopolizes the machine and its
resources.

Names,  Names,  Names 

One problem Forth programmers face on large projects or when working as
a programming team is the number of names that are generated. John
James has called this "the name explosion." Because Forth programs are
best written as a collection of useful tools (short definitions with
general utility), the situation is particularly acute. Compared to
procedural languages, Forth systems have more names (shortness of
definition) and it is more important to know them (general utility).

Forth programmers generally agree also on the importance of finding the
"right" name. The criteria for rightness vary, with one major division
between those who prefer playful names and those of more serious mien.
Both parties agree, however, that the best names accurately and
immediately convey the idea of the word's function. Both parties prefer
short names to long. And both find it difficult to get precisely the
right name when names must be assigned continually.



The first problem is getting a good name. But then, once good (or even
merely tolerable) names have been arrived at, they must also be
remembered somehow and (in a team situation) communicated. The Forth
dictionary is not really a dictionary for humans: the names are not
arranged alphabetically. Some versions of Forth provide words such as
LOCATE or VIEW that work reasonably well, but only if you can remember
the name to begin with (does anyone have a LOCATE and VIEW that work
with wildcards?) and if the documentation (comments and shadow screens)
is understandable and up-to-date or, failing that, the source code is
readable.

The difficulty of using new words fluently is the same as the
difficulty of speaking or writing a foreign language. Having to look up
every word in a dictionary is insufferably slow. In a
single-programmer shop, the programmer gradually learns his or her own
language and becomes fluent in the tools he or she has created. But
what is to be done in a multiperson shop, with each programmer creating
several names a day? Are there regular meetings wherein the programmers
present their words to each other? Do they pass around a list of their
creations for others to learn and use? Do they maintain an on-line
encyclopedia? When a new programmer joins the group, how is that person
trained in the local language? How long does it take a programmer new
to the group to become fluent in the special words that are in use?

I am in the lone-programmer category, but I would be interested in
hearing how multiperson shops handle the problem of names and the
problem of promulgating the general-purpose tools the programmers
create. If you have found a working solution to this problem, do share
it.

I also would be interested in finding out how you lone wolves keep
track of your own tools. Do you sort your tools into files, each file
being a toolkit for a particular purpose? Do you use precompiled
overlays as toolkits? Do you keep all your words and their use in your
head, or do you maintain some kind of written reference book: a
dictionary, or thesaurus, or encyclopedia? Let us in on your secrets.

Fragility  as  Strength 

Once there was a contest to define Forth in 25 words or less. My
definition was "Forth is like the Tao: it is a Way, and is realized
when followed. Its fragility is its strength, its simplicity is its
direction." I want to talk about the seeming oxymoron in this
definition: Forth's fragility being its strength.

Forth has no training wheels. If you tip over, you fall: the stack
explodes, the system crashes, whatever. The design decision in creating
Forth was to remove safeguards to enhance performance. For programmers
accustomed to bulletproof compilers, this approach seems foolhardy. Why
not have as much protection as possible?

Protection of course imposes performance penalties, but perhaps even
more important is the degradation of the feedback. In high-performance
machines, the flip side of responsiveness is sensitivity. The more the
machine gives control to the operator, the more responsibility the
operator must accept. The advantage of the operator taking control is
that the operator becomes more directly connected to what is happening.
This connection amplifies awareness and allows the mind and the tool to
merge, providing the immediacy of feedback that more closely connects
thought and action. The intimacy and control of such a connection is
almost addictive, which is why people who have learned to work with
such tools are so reluctant to abandon them. Racing-car drivers don't
enjoy spending the day behind the wheel of a station wagon.

Robert Berkey first pointed out to me how the Forth stack, leaving the
arguments nakedly exposed, also lets the programmer see what is going
on. Errors surface immediately (that's the fragility) and, being
discovered, are then corrected (that's the strength). Merely because
Forth is fragile for the programmer does not mean that the application
programs are fragile. Indeed, the very degree to which errors will out
during development makes the final product that much more robust.


Fragility often accompanies flexibility. The more options the machine
or language offers, the more ways it can be used against itself
(fragility), but the greater diversity of needs it can address and the
more quickly it can be modified (strength). A mechanical example is the
Gossamer Condor, a successful human-powered aircraft. A key design
decision was not to attempt to make it an unbreakable machine but to
make it as simple as possible, with everything visible and accessible.
Let it break, as long as it is easy to fix. That simplicity also made
it flexible in the sense that it was easy to modify, and in fact the
Gossamer Condor's success was based upon a process of iterative
development familiar to Forth programmers. (The best account of the
development of the Gossamer Condor and its sibling the Gossamer
Albatross is the book Gossamer Odyssey by Morton Grosser [Boston:
Houghton Mifflin, 1981].)

Flag byte   Meaning
---------   -------
1000 0000   No more data in this 512-byte block.

0xxx xxxx   Data field consists of as many bytes as specified by the
            number in the low bit positions (and thus a maximum of 127
            bytes). Though unimportant for decoding, it is worth noting
            that the data bytes contain no duplicates.

1xxx xxxx   Data field consists of a single byte (the next byte after
            this flag), which is to be replicated as many times as the
            number in the low bit positions (and thus a maximum of 127
            replications).

Table 4: Flag bit structure

( Work areas )
CREATE OUTAREA 20000 ALLOT      ( will contain uncoded image )
OUTAREA 20000 ERASE             ( size depends on application )
CREATE INAREA 512 ALLOT         ( work area for input blocks )

( Pointers )
VARIABLE INBYTE                 ( current byte in work area )
VARIABLE OUTBYTE                ( current byte in output area )

( INCR is not defined in this listing; the definition below is assumed )
: INCR  ( adr -- )  1 SWAP +! ;

: INPOINT   ( -- adr )  INBYTE @ INAREA + ;     ( next source byte )
: OUTPOINT  ( -- adr )  OUTBYTE @ OUTAREA + ;   ( next target byte )

( Flag manipulation )
( These use the encoding flags )
: NEXTFLAG    ( -- f )    INPOINT C@ ;   ( puts flag on the stack )
: BLOCKEND?   ( f -- f )  128 = ;        ( end of input block )
: REPLICATE?  ( f -- f )  128 AND ;      ( replicate next byte )
: CHARCOUNT   ( f -- n )  127 AND ;      ( # of replications or )
                                         ( # of bytes to move )

( Replication )
: REPLICATE  ( f -- )       ( replicates based on count in flag )
   INBYTE INCR                       ( move past the flag )
   INPOINT C@                        ( char to replicate )
   OUTPOINT                          ( destination address )
   ROT                               ( bring flag to top )
   CHARCOUNT OVER + SWAP             ( indices = address range )
   DO DUP I C! OUTBYTE INCR LOOP     ( replicating & counting )
   DROP                              ( character replicated )
   INBYTE INCR ;                     ( to next flag location )

( Move )
: MOVECHUNK  ( f -- )       ( moves # of chars specified in flag )
   CHARCOUNT DUP INBYTE INCR
   INPOINT OUTPOINT ROT CMOVE        ( move characters )
   DUP INBYTE +! OUTBYTE +! ;        ( update pointers )

( Actual decoding of block )
: BLOCKWORK  ( decompress the run-length encoding )
   BEGIN NEXTFLAG DUP BLOCKEND? NOT OUTBYTE @ 20000 < AND
   WHILE DUP REPLICATE? IF REPLICATE ELSE MOVECHUNK THEN
   REPEAT DROP ( flag ) ;

Code Example 2: Decoding run-length encoded data

In  this  spirit,  tools  for  developers 
typically  lack  the  safeguards  that 
programmers  provide  in  application 
programs:  DROP,  for  instance,  doesn't 
check  stack  depth  before  trying  to 
drop.  Such  a  check  would  slow  it 
down  too  much.  The  programmer  is 
responsible  for  making  sure  that  the 
program  will  always  have  something 
on  the  stack  when  DROP  is  used. 

On  the  other  hand,  some  safeguards 
don’t  cost  much.  Paul  Simon  pointed 
out  in  a  letter  that  the  defining  word 
FOR,  which  appeared  in  this  column 
in  July  1986,  could  include  an  error 
check  with  no  speed  penalty. 

In its final version, FOR expects two numbers on the stack and will
crash the system if it is executed with an empty stack. This behavior,
perfectly acceptable when FOR was mine alone, becomes arguable when FOR
is promulgated as a tool for general use. It is easy to provide some
protection. The simplest approach is to include at the beginning of
FOR's definition (right after CREATE) this phrase:

DEPTH  2  <  ABORT"  Need  both  array 
type  and  number  of  slots" 

The speed of the words defined by FOR is unaffected by this additional
check. On the whole, putting in this bit of protection seems
reasonable. Moreover, as Forth is an open-architecture language, those
who don't want to spend the memory space on the error message can
remove the check. After all, they might reason, if FOR fails, it is
during development, when the developer can immediately correct the
condition. At run time (when the end-user is running the application
program), it is not FOR but the words defined by FOR that are used, and
they, of course, will work fine.
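
The general point, that a check made when a word is defined costs nothing when the defined word runs, can be sketched outside Forth. FOR itself is not reproduced here, so the Python factory define_array below is hypothetical, and treating the "array type" as a cell size in bytes is an assumption made only for this illustration.

```python
def define_array(cell_size, slots):
    """Hypothetical defining function: build an offset calculator.

    The argument check runs once, at definition time. The function that
    is returned contains no checks, so its speed is unaffected.
    """
    if cell_size is None or slots is None:
        raise ValueError("Need both array type and number of slots")
    storage_size = cell_size * slots

    def offset(index):
        # Run-time path: no validation, just the arithmetic.
        return index * cell_size

    offset.storage_size = storage_size
    return offset

words = define_array(2, 10)      # ten 2-byte slots
print(words(3))                  # 6: byte offset of slot 3
print(words.storage_size)        # 20
```

A missing argument is reported when define_array is called, during development, while the returned function carries no trace of the check.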

Though I tested FOR for suitability as a tool for other programmers (as
a tiny application program), I never recognized the problem of what
happens when the stack is empty. My oversight occurred because I fell
into the vulgar error of testing to show that the routine works instead
of viewing as a failure any test that fails to find a bug.

Glenford J. Myers observes in The Art of Software Testing (New York:
John Wiley & Sons, 1979) that the primary difference between successful
and unsuccessful test efforts is that single, critical definition: a
successful test is one that finds a bug; a test that finds no bug is a
failure. And Gerald Weinberg's enjoyable book The Psychology of
Computer Programming (New York: Van Nostrand-Reinhold Co., 1971) points
out that a programmer trying to find errors in his or her own work is
unlikely to be successful, which is why independent testing is so
important.

Run-Length  Decoding 

I close the column with a brief discussion of run-length decoding. One
way of compressing data is run-length encoding. There may be varieties
of this technique, but the one I ran across was as follows.

The  data  are  stored  in  512-byte 
blocks,  coded  in  variable-length  data 
fields.  Each  data  field  has  as  the  first 
byte  a  flag  that  determines  the  type 
of  field  and  the  length  of  the  field. 
The  flag’s  bit  structure  determines  its 
meaning  (see  Table  4,  page  115). 

The words in Code Example 2, page 115, decode such encoded data. In
this particular application I knew that the decoded data would not
exceed a length of 20,000 bytes. Each coded block is read in turn into
the 512-byte input work area and is then decoded into the next
available area of the output work area.

The pointers INPOINT and OUTPOINT keep track of where you are in the
two areas. The flag-manipulation words take care of all flag
interpretation. REPLICATE replicates, and MOVECHUNK moves a chunk of
data. BLOCKWORK decodes the block and is used within a loop that reads
each of the 512-byte input blocks into the input work area in turn.
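
For readers who want to try the format away from a Forth system, here is a minimal sketch of the same decoding in Python. It follows the flag layout of Table 4; the function name decode_block, the use of a bytearray for the output area, and the extra guard against running off the end of a short block are assumptions of this sketch, not part of the Forth listing.

```python
END_OF_BLOCK = 0x80      # 1000 0000: no more data in this block
REPLICATE_BIT = 0x80     # high bit set: replicate the next byte
COUNT_MASK = 0x7F        # low seven bits: count (at most 127)

def decode_block(block, out, limit=20000):
    """Decode one input block, appending to the bytearray `out`.

    As in BLOCKWORK, the output limit is tested before each field.
    """
    pos = 0
    while pos < len(block) and len(out) < limit:
        flag = block[pos]
        if flag == END_OF_BLOCK:
            break
        count = flag & COUNT_MASK
        pos += 1                                     # move past the flag
        if flag & REPLICATE_BIT:
            out.extend(block[pos:pos + 1] * count)   # repeat one byte
            pos += 1
        else:
            out.extend(block[pos:pos + count])       # copy a literal run
            pos += count
    return out

encoded = bytes([0x03]) + b"abc" + bytes([0x84]) + b"z" + bytes([0x80])
print(bytes(decode_block(encoded, bytearray())))     # b'abczzzz'
```

Calling decode_block once per input block with the same bytearray mirrors the outer loop described above: each block is decoded into the next available part of the output.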

DDJ 




COLUMNS 


THE RIGHT TO ASSEMBLE

A New Project Is Born

by Nick Turner

I'd like to design a versatile, easy-to-use interpreted language, using
occasional essays in this space to stimulate my own creative juices and
get feedback from you. My approach to this project will be
experimental, and the entire interpreter will be written in 680xx
assembly language. Why? Because I love 680xx assembly language, and I
like to noodle around looking for really efficient ways to do stuff. As
my interpreted language comes together, I want to know what you think
of it. If the ideas expressed here get your juices flowing, send me a
letter. If you send me interesting enough letters, I'll include them
here. I'd like this to be both an educational project for interpreter
designers and a general discussion of data handling in assembly
language.

Why an Interpreted Language?

The nicest thing about an interpreted language is that it can be very
interactive, and if it's extensible and fast enough, it can be a real
joy to work with. Forth is an example of the kind of language I'm
talking about. I like Forth a lot. But the problem with Forth is that
it's too weird: I find I have to think backward to use it effectively,
and I'd like to be able to think in my most efficient way, which is
forward. Thus, because I've never seen a true interpreted language that
satisfies me, I want to design my own, with your feedback to help guide
me.

Juggling Numbers

In this first essay I want to talk about math; specifically, numeric
formats. This is intended both as an introduction to numeric
representation (for those who may not have a lot of low-level practice)
and as a source of inspiration for experienced assembly-language
programmers. I'll start things off with a summary of some of the
various numeric formats that have been used on computer systems. This
will be a general description; more detailed stuff will come later on.
I hope to make most of these formats available in the final
interpreter.

Simple  Integers 

The simplest approach to computer math is to use integers. Though
integer (INT) math may initially seem rather limited, a surprising
amount of complex calculation can be done with integers alone. On
typical computer systems, there are usually three kinds of integers:
bytes, words (two bytes), and long words (two words). Sometimes you
might need extra-precision integers of eight or more bytes. I propose
at least two kinds of integers for my interpreter: word size (INT) and
long word size (LINT). Both would be signed values, with negative
numbers expressed in two's-complement form. Will I need double-long,
8-byte integers (DINT)?

Simple math with integers is straightforward. The biggest advantage of
INT math is speed. Overflow and underflow are typically the most
important error conditions. The biggest practical disadvantage of
integer math is the inability to represent fractional values directly.
Fractions can be represented by multiplying all the numbers in the
system by some constant, but it requires extra time and programming.
Besides, if the constant multiplier is a power of 2, you've just
invented the next category: fixed-point numbers.


Fixed-Point  Numbers 

A typical fixed-point (FIX) representation allocates a number of bits
for the integer portion of a value and an equal number of bits for the
fractional portion. For example, you might use a 4-byte long word in
which the high-order word is the integer and the low-order word is the
fraction. Some systems use larger FIX formats with a whole long word
for each portion, and a few systems have unequal distributions of bits.
In such cases, it's usually the fractional portion that has fewer bits.
I propose one FIX format for my interpreter (mostly for speed in
calculations involving fractions). My FIX could be two long words: one
for the integer and one for the fraction. The high bit of the integer
portion would be reserved for the sign, and the rest would be an
unsigned value. (This simplifies output of ASCII translations of the
number.)
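
A small sketch may help readers new to fixed point. It uses the 4-byte example above (16 integer bits, 16 fraction bits) rather than the proposed two-long-word format, is written in Python rather than 680xx assembly, and relies on Python's ordinary signed integers instead of a separate sign bit; the function names are invented for the illustration.

```python
FRACTION_BITS = 16
ONE = 1 << FRACTION_BITS          # the integer that represents 1.0

def to_fix(x):
    """Convert a float to 16.16 fixed point (rounded to nearest)."""
    return round(x * ONE)

def from_fix(f):
    """Convert a 16.16 fixed-point integer back to a float."""
    return f / ONE

def fix_add(a, b):
    return a + b                      # plain integer addition

def fix_mul(a, b):
    return (a * b) >> FRACTION_BITS   # drop the extra 16 fraction bits

a = to_fix(1.5)                   # 0x18000
b = to_fix(2.25)                  # 0x24000
print(hex(a), hex(b))
print(from_fix(fix_add(a, b)))    # 3.75
print(from_fix(fix_mul(a, b)))    # 3.375
```

Addition is ordinary integer addition, which is where the speed comes from; only multiplication (and division) need a shift to put the binary point back.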

FIX has the advantage of being able to deal with fractions, but it
still has the problem of limited precision, especially for small
numbers. From here there are two directions in which to go. Which path
a system takes depends on what the numbers will be used for. If the
ability to represent really huge or minuscule values is more important
than vastly precise representations, then floating point is probably
best. On the other hand, if incredibly high precision is necessary, you
might choose what I call extended representation.

Floating-Point  Numbers 

By far the most frequent choice in typical systems is a floating-point
representation (FLOAT), in which the value is divided into two
subvalues: the exponent and the mantissa. The exponent represents the
logarithm in base 2 of a number by which the mantissa is to be
multiplied to create the actual value stored. For example, if the
exponent is 4 and the mantissa is 3, then the value might be 3 times 2
to the fourth power, or 3 times 16, or 48. In actual practice, the
mantissa is almost always treated as a fraction. In the above case, the
exponent would be 6 and the mantissa would be 0.11 (binary), which is 3
(or 11 binary) with the binary point moved two places to the left; the
exponent grows by 2 to compensate. Note that the exponent really
represents nothing more than the number of times the mantissa must be
shifted to create the actual value. If the exponent is negative, you
shift the mantissa to the right. If it's positive, you shift it left.
The mantissa usually also has a sign bit, which governs the sign of the
entire value.
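
The fraction-and-exponent view of 48 can be checked directly. Python's standard math.frexp returns a mantissa in the range 0.5 up to (but not including) 1 together with a power-of-2 exponent, which happens to match the convention used in the example; Python is used here purely for illustration.

```python
import math

mantissa, exponent = math.frexp(48.0)
print(mantissa, exponent)              # 0.75 6   (0.75 is 0.11 binary)
print(mantissa * 2 ** exponent)        # 48.0
print(math.ldexp(0.75, 6))             # 48.0

# A negative exponent shifts the other way: 0.1875 = 0.75 * 2**-2.
print(math.frexp(0.1875))              # (0.75, -2)
```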

Now here's the tricky part about floating point: most FLOAT
representations nowadays have something called a "hidden 1 bit." This
means that the high-order bit of the mantissa, which is always a 1 bit
in a properly normalized FLOAT value, is "overlaid" by the sign bit of
the mantissa or is omitted altogether. The cost of this 1-bit saving is
that the missing bit must be recreated every time a calculation is done.
For systems with a hardware assist, such as the MC68881 floating-point
math chip, this is trivial. Another tricky point is that the exponent is
usually represented as an "augmented" (biased) value; this means you
must first subtract a certain number from it in order to get the actual
exponent. The augment number is chosen at or near the midpoint of the
field's range, so that an exponent of zero is represented as a bit field
with only the high bit set (or, in IEEE-style formats, as the pattern
one less than that, with every bit but the high bit set). The result is
that the exponent can be treated as a simple unsigned value, simplifying
many calculations.
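
To make the hidden bit and the augmented exponent concrete, here is a minimal sketch that takes apart a 32-bit IEEE 754 single-precision value. It assumes the IEEE layout (1 sign bit, 8 exponent bits with an augment of 127, 23 stored mantissa bits) and handles normalized values only; the function name `decode_single` is made up for this example. Note that IEEE treats the restored mantissa as 1.xxx rather than 0.1xxx, so 48 comes out as 1.1 (binary) with an exponent of 5 rather than 0.11 with an exponent of 6.

```python
import struct

def decode_single(x):
    """Split a normalized IEEE 754 single into sign, true exponent, mantissa."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    stored_exponent = (bits >> 23) & 0xFF    # the augmented (biased) field
    fraction = bits & 0x7FFFFF               # the 23 mantissa bits actually stored
    true_exponent = stored_exponent - 127    # subtract the augment number
    mantissa = (1 << 23) | fraction          # recreate the hidden 1 bit
    return sign, true_exponent, mantissa

sign, exp, man = decode_single(48.0)
# The mantissa is a 24-bit integer whose radix point sits 23 places from the right.
value = (-1) ** sign * man * 2.0 ** (exp - 23)
print(sign, exp, hex(man), value)
```

Because the stored exponent is a plain unsigned field, two positive values can be compared by comparing their raw bit patterns as integers, which is one of the simplifications the augmented form buys.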

For  my  interpreter,  I  propose  the 
FLOAT  formats  used  by  the  MC68881 
chip — specifically,  the  single,  double, 
and  extended  representations,  which 
I  will  call  FLOAT1,  FLOAT2,  and 
FLOATX  for  my  language.  The  reason 
is  simple:  I’d  like  to  use  the  68881  chip 
eventually. 

A  Weird  Extended  Hybrid 

The last approach to numeric representation, and one that I've not seen
used very much, is sort of a weird hybrid between floating point and
fixed point. I call it extended representation (EXT), and it's the only
numeric format in my proposed system that uses variable-length fields.
The basic concept is simple: a number is represented in full precision
as a large field of 2-byte words, with a given number of words
representing the integer portion (except the highest order bit, which is
the sign bit) and the remainder representing the fractional part. Of
course, there must also be a field somewhere that contains some clue as
to where the radix point is (the radix point separates the integer from
the fraction). It's also important to have a value that says how long
the whole thing is.
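
One possible reading of this layout, sketched in Python purely for illustration: the class name `Ext`, its field names, and the choice to count the radix position in whole words are all assumptions of the sketch, not details given in the text. The list length plays the role of the "how long" field, and the top bit of the first word is the sign.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import List

WORD_BITS = 16  # the text specifies 2-byte words

@dataclass
class Ext:
    int_words: int    # radix point position: how many leading words are integer
    words: List[int]  # 16-bit words, most significant first

    def to_fraction(self) -> Fraction:
        """Return the exact value as a rational number."""
        negative = bool(self.words[0] >> (WORD_BITS - 1))
        raw = 0
        for word in self.words:
            raw = (raw << WORD_BITS) | (word & 0xFFFF)
        total_bits = WORD_BITS * len(self.words)
        magnitude = raw & ~(1 << (total_bits - 1))   # clear the sign bit
        frac_bits = WORD_BITS * (len(self.words) - self.int_words)
        value = Fraction(magnitude, 1 << frac_bits)
        return -value if negative else value

# 3.25 with one integer word and one fraction word (0x4000 / 65536 = 0.25)
print(Ext(1, [0x0003, 0x4000]).to_fraction())   # 13/4
# The same magnitude with the sign bit set
print(Ext(1, [0x8003, 0x4000]).to_fraction())   # -13/4
```

Adding more fraction words extends the precision without touching the integer part, which is the property that makes the format attractive for arbitrarily precise results.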

The  EXT  format  has  certain  advan¬ 
tages  for  a  limited  set  of  problems. 
For  instance,  I’ve  always  wanted  to 
be  able  to  compute  various  irrational 
values  to  an  arbitrarily  high  preci¬ 
sion.  My  EXT  format  can  do  this,  but 
problems  arise.  For  example,  as  soon 
as  you  attempt  to  calculate  a  tran¬ 
scendental  function,  you  run  into 
precision  vs.  time  trade-offs:  if  you 
use  the  traditional  polynomial  ap¬ 
proximation  method,  your  polyno¬ 
mial  factors  will  limit  the  precision  of 
the  result,  which  must  then  be 
chopped  accordingly.  On  the  other 
hand,  if  you  use  the  full  Taylor  (or 
similar)  series  to  compute  the  tran¬ 
scendental  result,  you  may  end  up 
spending  an  inordinate  amount  of 
time  to  get  the  desired  accuracy.  I’m 
very  interested  in  feedback  on  this  is¬ 
sue;  I  have  by  no  means  reached  a 
satisfying  resolution. 
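
The time side of that trade-off is easy to see in a small experiment. The sketch below sums the Taylor series for e to the x using exact rational arithmetic and reports how many terms were needed; it is written in Python for illustration only, the function name `exp_taylor` is hypothetical, and the stopping rule (stop when the next term drops below the tolerance) is a simplification that is reasonable for small positive x but is not a rigorous error bound.

```python
from fractions import Fraction
import math

def exp_taylor(x, digits):
    """Sum the Taylor series for e**x until terms fall below 10**-digits.

    Returns the exact rational partial sum and the number of terms added.
    """
    tolerance = Fraction(1, 10 ** digits)
    term = Fraction(1)       # the n = 0 term, x**0 / 0!
    total = Fraction(0)
    n = 0
    while abs(term) > tolerance:
        total += term
        n += 1
        term = term * x / n  # next term: x**n / n!
    return total, n

approx10, terms10 = exp_taylor(Fraction(1), 10)
approx20, terms20 = exp_taylor(Fraction(1), 20)
print(terms10, terms20)      # more digits require more terms
print(float(approx10))
```

Each extra term also makes the rational operands longer, so the cost per term grows along with the number of terms.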

Do  You  Want  More? 

If  there's  a  good  response  to  this  essay, 
I’ll  continue  the  story.  Future  topics 
might  include  a  detailed  expansion  on 
each of the numeric formats described here, with listings of working
math routines and a discussion of the "housekeeping" information
surrounding the number formats: how does the system know what kind of
number it's dealing with and how does it keep track of all the variables?
I’d  also  like  to  discuss  the  actual  syntax 
and  interface  of  the  language — but 
first  I'd  like  to  see  your  blue-sky  sug¬ 
gestions.  What  would  your  ideal  in¬ 
terpreted  language  look  like?  Write  to 
me  care  of  DDJ. 

DDJ 



FORUM 


DDJ  ONLINE 


PROLOG and the Future of AI

The  following  is  an  excerpt  from  a 
real-time  conference  held  by  Borland 
International  on  CompuServe  on  July 
26,  1986.  A  complete  transcript  of  this 
three-hour  on-line  conference  on  AI 
and  PROLOG  can  be  found  in  DL6  of  the 
Borland  SIG  on  CompuServe  ( <GO 
BOR100>  KEYWORDS.CONFERENCE). 

Larry  Kraft,  SYSOP  of  Borland  SIG: 

Our panel of featured "speakers" today includes Borland's president,
Philippe Kahn; assistant professor Mark Chignell of USC; and Mike
Swaine, editor-in-chief of Dr. Dobb's Journal of Software Tools. The first
part  of  this  conference  will  consist  of 
a  panel  discussion  centered  on  nu¬ 
merous  questions  that  were  submit¬ 
ted  in  advance.  Our  first  panelist  to 
speak  will  be  Mark  Chignell. 

Mark:  I’ll  start  with  a  little  informa¬ 
tion  on  my  background  in  AI  and 
PROLOG.  I  am  an  assistant  professor  in 
the  Department  of  Industrial  and  Sys¬ 
tems  Engineering  at  the  University  of 
Southern  California.  I  have  a  Ph.D.  in 
psychology  and  an  M.S.  in  industrial 
and  systems  engineering.  At  USC  I  be¬ 
came  interested  in  PROLOG  as  a  prac¬ 
tical  implementation  language  for  AI 
applications  in  engineering.  My  cur¬ 
rent  research  is  concerned  with  the 
development  of  human-computer  in¬ 
terfaces  in  engineering  design  and 
on-line  information  retrieval. 

Here's the first question I'm going to answer: What is artificial
intelligence? People usually point to smart computer programs and say,
"That's AI." In the early days of AI, 1956-1970, AI was thought of as a
process of domain-independent, general-purpose reasoning. More recently,
people have focused on domain-specific knowledge and the kind of
heuristic reasoning that experts use.
Perhaps  the  main  unifying  feature  of 
all  AI  applications  is  the  element  of 
machine  reasoning.  In  vision,  for  in¬ 
stance,  the  program  is  reasoning 
about  how  to  update  its  model  of  the 
visual  environment  based  on  the  sen¬ 
sory  data.  In  planning,  the  program  is 
reasoning  about  how  to  act  on  its 
model  of  the  task  environment  so 
that  a  set  of  goals  can  be  achieved  and 
so  on.  (See  P.  McCorduck,  Machines 
Who  Think,  for  a  historical  introduc¬ 
tion  to  the  issues  faced  by  AI.) 

Philippe:  Well,  to  me  AI  is  what 
hasn't  been  done  yet — once  it’s  been 
written,  it's  called  programs! 

Mark:  Question:  What  are  “true”  AI 
applications?  Perhaps  the  thing  that 
distinguishes  AI  from  other  applica¬ 
tions  is  the  need  for  symbolic  reason¬ 
ing.  In  statistics,  for  instance,  it 
wouldn't  make  much  sense  to  use  an 
AI  program  to  find  the  straight  line 
that  had  the  best  least-squares  fit  to  a 
set  of  data.  That  task  is  already  done 
well  by  numerical  algorithms.  In  ma¬ 
chine  chess,  brute-force  methods 
based  on  grinding  through  possible 
move  sequences  do  yield  fairly  good 
results,  but  the  problem  of  combina¬ 
torial  explosion  of  search  possibilities 
has  led  to  an  examination  of  how  hu¬ 
man  masters  perform  the  task  and 
has  led  to  attempts  to  incorporate 
their  knowledge  representations  and 
heuristics  into  chess  programs. 

Philippe: Well, that is one way of typing things. ... On the other hand,
I think that five years ago people would have called a resident, beeping
spelling checker AI.

Mark:  I  guess  we  have  a  difference  of 
opinion  here.  I  think  it  should  be  pos¬ 
sible  to  characterize  AI  independent¬ 
ly  from  the  current  status  of  technol¬ 
ogy.  We  should  be  moving  toward  a 
definition  of  intelligence  that  covers 
both  humans  and  machines. 

Question: How can you tell if a program is intelligent? In today's
climate, there is a tendency to assign
the  label  AI  rather  liberally.  Deciding 
on  whether  a  program  is  intelligent 
is  a  particular  case  of  the  general 
problem  of  recognizing  intelligent 
behavior.  One  method  for  establish¬ 
ing  the  intelligence  of  a  program  is  a 
type  of  Turing  test.  If  the  perfor¬ 
mances  of  a  program  and  of  a  person 
on  a  task  that  requires  intelligence 
are  virtually  indistinguishable,  then 
we  assume  the  program  is  intelligent. 

Larry:  Thank  you  very  much,  Mark. 
Mike,  you’re  up  next.  Go  ahead, 
please. 

Mike:  I'm  Mike  Swaine,  editor-in- 
chief  of  Dr.  Dobb's  Journal  of  Soft¬ 
ware  Tools.  My  background  includes 
graduate  study  in  both  human  cogni¬ 
tion  and  artificial  intelligence  and 
three  years  reporting  on  AI  and  new 
technologies  as  a  senior  editor  for 
InfoWorld.  I  am  coauthor  of  Fire  in 
the  Valley,  a  history  of  the  personal 
computer, and creator of the fictional puzzle-detective Mr. Usasi.

Question:  What  is  declarative  pro¬ 
gramming,  and  why  would  you 
want  to  use  this  type  of  program¬ 
ming?  Declarative  programming 
stresses  static  aspects  of  knowledge: 
facts  about  the  world  and  rules  about 
how  the  facts  are  connected.  It  con¬ 
centrates  on  representing  these  facts 
and  rules,  and  it  deliberately  sub¬ 
merges  all  procedural  details.  These 
procedural  details  are  nothing  less 
than  the  entire  control  structure  of 
the  program — that  is,  what  state¬ 
ment  gets  executed  next  or,  in  more 
conceptual  terms,  how  to  use  these 
static  facts  and  rules  to  answer  ques¬ 
tions,  solve  problems,  or  derive  new 
facts  and  rules. 

PROLOG,  for  example,  uses  the 
model  of  first-order  predicate  logic  to 
represent  the  facts  and  rules  about 
some  domain  of  knowledge — such  as 
U.S.  geography — and  submerges  the 
procedural  details  in  an  inference  en¬ 
gine,  a  mechanism  that  automatical¬ 
ly  makes  the  necessary  deductions 
from  the  facts  and  rules.  To  oversim¬ 
plify,  using  PROLOG  means  pouring 
facts  and  rules  into  the  system,  asking 
questions, and letting the system derive the answers from the
information you have supplied. You declare;
it  deduces.  To  the  extent  that  declara¬ 
tive  programming  actually  sub¬ 
merges  the  procedural  details,  it 
achieves  one  of  the  goals  of  what  is 
called  fifth-generation  language  de¬ 
sign:  it  allows  the  programmer  to  fo¬ 
cus  on  the  problem  rather  than  on 
the  program.  In  implementing  a  geo¬ 
graphical  database,  for  example,  you 
can  concentrate  on  facts  about  U.S. 
geography  rather  than  on  details  of 
database  design. 
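
As a rough illustration of the "you declare; it deduces" idea (this sketch is not from the conference, and it uses Python rather than PROLOG), the facts below are plain data and the single rule is stated once; answering a question is just a lookup in what the rule derives. The relation names `capital_of`, `state_in`, and `located_in` are invented for the example.

```python
# Facts: pure data about U.S. geography, with no control flow attached.
capital_of = {("Sacramento", "California"), ("Austin", "Texas")}
state_in = {("California", "USA"), ("Texas", "USA")}

def located_in(capitals, states):
    """Rule: a city is in a country if it is the capital of a state there."""
    return {(city, country)
            for (city, state) in capitals
            for (state2, country) in states
            if state == state2}

derived = located_in(capital_of, state_in)

def ask(city, country):
    """Query the derived facts."""
    return (city, country) in derived

print(ask("Austin", "USA"))      # True
print(ask("Austin", "Canada"))   # False
```

A real PROLOG system goes further than this: the search strategy lives in the inference engine, so adding a new fact or rule requires no change to any query code.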

Question:  What  are  the  advantages 
and  disadvantages  of  declarative  vs. 
procedural  programming?  The 
choice  of  a  declarative  or  a  procedur¬ 
al  approach  to  solving  a  particular 
problem  can  depend  on  what  kind  of 
knowledge  about  the  problem  is 
most  accessible.  If  you  can  gather  the 
important  facts  and  rules  about  the 
problem  domain  easily,  then  you 
should  consider  a  declarative  ap¬ 
proach.  If  it's  easier  to  specify  the 
steps  or  techniques  for  solving  the 
problem,  then  you  should  consider  a 
procedural approach. Another consideration is consistency vs.
efficiency. A declarative, first-order, predicate-logic-based approach
can be trusted to be consistent; you won't get false conclusions from
true premises. But by giving up control over
the  way  the  program  searches  for  so¬ 
lutions,  you  give  up  the  option  of 
fine-tuning  the  code  for  efficiency.  A 
procedural  approach  lets  you  specify 
how  to  solve  the  problem  efficiently 
but  at  the  cost  of  introducing  com¬ 
plexities  that  make  it  harder  to  trust 
the  results. 

Question:  Are  PROLOG  and  LISP 
both  declarative?  No  language  is 
strictly  one  or  the  other,  although 
most  programming  languages  are 
mainly  procedural.  LISP  can  be 
thought  of  as  declarative,  but  it's  a 
funny  fit — LISP  wants  to  be  thought 
of  as  functional,  and  it's  old  enough  to 
be  humored.  PROLOG  was  designed  to 
be  used  declaratively.  PROLOG  proba¬ 
bly  gains  in  efficiency  by  not  being 
purely  declarative,  but  it  pays  for  it  in 
inconsistency  because  of  extensions 
to  the  basic  idea  and,  I  think,  to  the 
way  in  which  falsity  is  implemented. 

Philippe: Well, LISP really manipulates functions, and I would call it
procedural, just like its contemporary FORTRAN.

Mike: OK. Next question: Why is PROLOG appropriate for writing expert
systems, and what systems have been written in PROLOG? Last week I
watched an expert systems "knowledge engineer" being grilled by a
roomful of skeptical C programmers. The C programmers all wanted to
know "What do you do that can't be done in C?" The knowledge engineer
had to admit that anything he did could be done in C and that in fact his
company typically ported its products to C for efficiency and
portability. The programmers already knew these things, but it made them
feel good to hear them.

So why use PROLOG for expert systems if you'll eventually rewrite them
in C? Well, that's almost like asking why use a graphics language for
graphics processing. Expert systems logically include certain components
that are built into PROLOG: like an inference engine; like a natural
mechanism for adding to the knowledge base without rewriting the entire
program. In fact, expert systems and PROLOG grew out of the same
motivation: a desire to represent static knowledge in a computer
program. Writing an expert system in PROLOG involves using some powerful
tools. Rewriting it in C means recreating those tools. The latter may
allow opportunities to optimize, but it also distracts attention from
the real task.

As of today, though, PROLOG is not the language of choice for developing
expert systems because of the past lack of a decent PROLOG programming
environment. Of the hundreds of expert systems in nonacademic use in the
U.S., nearly all were developed in some version of LISP, in a
specialized expert-system-development language such as Teknowledge's
S.1, or in a conventional third-generation language such as C. (One
counterexample to keep you awake
nights:  Lockheed  is  using  PROLOG  to 
develop  an  expert  system  for  the  De¬ 
partment  of  Defense  [DoD]  to  analyze 
electronic  intelligence  data  to  deter¬ 
mine  "enemy  intentions.”)  In  Japan, 
PROLOG-based  expert  systems  have 
been  developed  for  (at  least)  medical, 
commercial,  and  engineering  appli¬ 
cations.  The  new  PROLOG  implemen¬ 
tations  coming  to  market  may 
change  this  picture  radically. 

Philippe:  Well,  in  Europe,  where 
PROLOG  was  born,  PROLOG  has  been 
more  widely  used  than  LISP.  Further¬ 
more,  PROLOG  is  newer  and  younger, 
and  it  is  just  now  picking  up  a  lot  of 
momentum. 

Mark:  The  two  other  panelists  classi¬ 
fied  LISP  as  a  procedural  language.  As 
someone  who  has  spent  some  time 
with  the  language,  I  feel  some  duty  to 
defend  it.  LISP  is  really  a  language-de¬ 
velopment  environment  rather  than 
a  single  language.  Once  you  start 
writing  functions,  you  can  create 
your  own  language.  LOGLISP  is  an  ex¬ 
ample  of  a  PROLOG-like  language  in 
LISP — that  is,  declarative  versions  of 
LISP  have  already  been  written.  Ob¬ 
ject-oriented  languages  have  also 
been  written  on  top  of  LISP. 

Mike:  What  does  AI  offer  to  the  aver¬ 
age  programmer  or  user?  I'll  give 
only  one  of  the  answers;  maybe  oth¬ 
ers  will  emerge  from  discussion.  AI  is 
the  domain  of  exploration  of  new 
programming  techniques.  When 
they  cease  to  be  new,  they  cease  to  be 
AI,  but  they  don’t  cease  to  be  useful. 
Also,  I'd  like  to  point  out  that  PROLOG 
may  be  a  good  prototyping  language 
for  anything,  not  just  AI  applications. 

I  have  a  question.  As  editor-in-chief 
of  a  magazine  for  software  develop¬ 
ers,  I  am  interested  in  the  interface — 
in  the  sense  of  the  zone  of  transmis¬ 
sion — between  the  other  two  panel¬ 
ists’  areas  of  expertise.  I'm  curious 
about  developments  in  AI  labs  that 
may  lead  to  commercial  products  in 
the  future.  I  haven't  been  particular¬ 
ly  prescient  about  this  in  the  past. 
Having  written  a  simple  expert  sys¬ 
tem  in  graduate  school,  I  understood 
the  principles.  I  had  been  following 
AI work closely when Teknowledge was founded and knew the credentials
of its founders; nevertheless, I
did  not  foresee  the  current  success  of 
expert  systems.  Expert  system  com¬ 
panies  are  beloved  of  investment 
capitalists.  Teknowledge  was  one  of 
the  few  sales  winners  in  a  recent  San 
Jose  Mercury  News  summary  of  the 
sales  slump  in  Silicon  Valley.  I'd  like 
to  do  better  the  next  time.  I’d  like  to 
be  able  to  see  the  next  area  of  com¬ 
mercial  development  and  practical 
application  of  laboratory  develop¬ 
ments  in  AI — the  next  Big  Thing. 
There  is  a  tantalizing  suggestion  of 
what  that  might  be.  .  .  . 

Mark:  That  is  a  very  good  point, 
Mike.  I  think  it  is  often  hard  for  re¬ 
searchers  to  predict  what  is  going  to 
fly  in  the  marketplace.  If  I  were  to 
make  a  bet,  I  would  say  that  in  the 
near term we may see a revolution in retrieval and utilization of
information/knowledge using AI-based front ends.

Larry:  OK.  Philippe's  turn  for  ques¬ 
tions.  Philippe,  I  believe  you  have 
some  opening  comments?  Go  ahead. 

Philippe:  First:  My  updated  biogra¬ 
phy:  Failed  musician,  mathemati¬ 
cian,  relatively  artificially  intelligent, 
self-appointed  "the  software  indus¬ 
try’s  resident  court  jester”!  Pops  up 
unexpectedly  anywhere. 

Larry:  Now,  I  have  the  questions  for 
Philippe.  What  plans  are  there  to  tie 
PROLOG  to  conventional  databases?  Is 
this  a  growing  area  of  AI  technology? 

Philippe:  I  don’t  know  whether  it’s  a 
growing  area,  but  it  should  be  very 
useful.  The  biggest  problem  people 
have  with  large  databases,  or  in  the 
AI  world  "knowledge  bases,”  is  reen¬ 
tering  data.  The  best  thing  is  to  be 
able  to  read  and  write  "usual”  data¬ 
base  files. 

Larry:  Next  question:  Is  PROLOG  a 
general  programming  language  that 
can be used for a wide variety of programming applications or is it
specifically database-oriented?

Philippe: Well, it is inference-oriented, if anything. With good
extensions, PROLOG can let you do different
general  things,  but  as  with  any  tool, 
you  need  to  use  the  right  tool  for  the 
right  job.  If  you  use  a  hammer  when 
you  were  supposed  to  use  a  saw,  you 
might  get  into  trouble! 




Mark:  You  know,  there  are  close  to 
two  billion  documents  on-line  in  the 
world  today.  This  is  a  huge  amount  of 
information,  but  it's  not  really 
knowledge  until  you  can  distill  the 
essential  meaning.  Perhaps  the  next 
big  AI  industry  will  be  the  replace¬ 
ment  of  much  of  the  current  knowl¬ 
edge-engineering  effort  with  what  I 
will  call  a  "knowledge  mining"  ef¬ 
fort,  looking  to  translate  the  current 
backlog  of  electronic  information 
into  usable  knowledge  bases. 

Philippe:  There  is  much  more  in  a  lot 
of  this  information — things  as  simple 
as  typesetting  codes,  tables  of  con¬ 
tents,  indexes,  cross-references,  and 
so  on.  Millions  of  man-hours  of  edit¬ 
ing  have  gone  into  that  stuff.  It  is 
much  more  than  dumb  data,  at  least 
in  many  cases.  You  still  have  to  inter¬ 
pret  it,  but  a  lot  of  the  work  has  al¬ 
ready  been  done.  Take  a  book  such  as 
Roget's  Thesaurus,  for  example.  It  di¬ 
vides  the  world  into  categories,  and 
you  can  thus  define  a  vector  space  in 
a  given  metric  and  talk  of  a  much 
broader  way  to  index  data  semanti¬ 
cally  rather  than  through  a  keyword 
system.  But  as  our  old  friend  Kipling 
said,  "This  is  yet  another  story!” 

DDJ 



FORUM

VIEWPOINT 

(continued  from  page  14) 

indeed  a  principle  of  classical  logic.  As 
such  it  does  not  support  the  former 
invalid  inference.  The  assumption 
PROLOG  makes  is  the  altogether  differ¬ 
ent  assumption  termed  the  closed- 
world  assumption.  PROLOG  automati¬ 
cally  assumes  that  any  postulate  set 
(knowledge  base)  is  complete:  if  it  can¬ 
not  be  derived  that  S,  then  it  must  be 
that  not  S.  Thus  PROLOG  enshrines  the 
fallacy dubbed by Spinoza as the argumentum ad ignorantiam: If it can't be
proven  true,  it's  false,  and  if  it  can't  be 
proven  false,  it's  true.  Of  course  it  fol¬ 
lows  that  a  proposition  that  can’t  be 
proven  true  or  false  is  both  true  and 
false. Indeed, with the proper repositioning of postulates (rules),
PROLOG will answer "yes" and "no" to the same query even though the
postulate set is consistent.
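
The closed-world assumption itself can be shown in miniature. The following Python sketch is only an analogy, not PROLOG: `provable` stands in for the inference engine, `naf_not` for a negation defined as failure to derive, and the facts are invented for the example.

```python
# A tiny knowledge base; nothing here says whether opus flies.
facts = {("bird", "tweety"), ("bird", "opus"), ("flies", "tweety")}

def provable(goal):
    """Stand-in for derivation: here a goal is provable only if it is a fact."""
    return goal in facts

def naf_not(goal):
    """Negation as failure: 'not goal' succeeds whenever goal cannot be derived."""
    return not provable(goal)

# Absence of information is reported as falsehood.
print(naf_not(("flies", "opus")))    # True: "opus does not fly"

# Adding one fact reverses the earlier conclusion.
facts.add(("flies", "opus"))
print(naf_not(("flies", "opus")))    # False
```

The first answer is justified only if the knowledge base really is complete, which is exactly the assumption under dispute.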

The  point  is  that  if  a  system  is  not 
complete  (there  are  such  complete 
systems — real  closed  fields  being  an 
example),  then  the  assumption  of 
completeness  made  by  PROLOG  (built 
into  its  definition  of  negation)  is  false 
and will lead to fallacious and contradictory inferences. The
justification offered for PROLOG's treatment of negation is that not
means not known or not derivable. But this lame attempt at justification
doesn't hold up. Neither the epistemic nor the apodictic concept obeys
DeMorgan's laws, whereas the truth-functional not in PROLOG does.

It  is  simply  not  safe  to  use  not  unless 
it  is  pinned  down  to  a  range  (for  exam¬ 
ple,  with  the  use  of  ON).  Otherwise 
the  negation  logic  needed  should  be 
provided  by  the  programmer.  Negat¬ 
ed  sentences  can  be  treated  as  units — 
for  example,  use  of  not-L  instead  of 
not  L.  The  relationship  between  L  and 
not-L  and  other  negation  relationships 
must  be  spelled  out  by  the  program¬ 
mer.  The  programmer  must  use  (ei¬ 
ther  not-L  or  not-K)  instead  of  not(L 
and  K)  and  so  forth.  PROLOG  never  ac¬ 
tually  transforms  any  of  the  rules  in 
the  knowledge  base,  which  means 
that  the  programmer  can  provide  the 
negation  logic  needed. 

Texts and manuals for PROLOG should be up front about PROLOG's
limitations. It is not a full predicate logic in any direct sense. What
PROLOG is, is a negation-restricted, expanded logic of definition with
marvelous recursive powers. Properly billed, the foregoing facts about
PROLOG's inconsistency and radical incompleteness merely become
irrelevant considerations based on a confusion about what PROLOG is
supposed to be. Consider the very first problem with the introduction of
a postulate declaring the transitivity of R. Considered as a definition,
the postulate violates the canons of definition by attempting to define
R nonrecursively in terms of itself. PROLOG can easily handle the
introduction of a transitive relation when defined recursively. The
transitive closure TR of a relation R is defined:

TR(x  y)  if  R(x  y) 

TR(x  z)  if  R(x  y)  and  TR(y  z) 
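
The two rules above can be read as a recipe for building TR from R. The Python sketch below applies them repeatedly until nothing new appears; this is a bottom-up fixpoint computation, which is a different evaluation strategy from the top-down search a PROLOG system would use, and the function name `transitive_closure` is simply a label for the example.

```python
def transitive_closure(r):
    """Compute TR from a relation R given as a set of (x, y) pairs."""
    tr = set(r)                                   # TR(x y) if R(x y)
    while True:
        # TR(x z) if R(x y) and TR(y z)
        new = {(x, z)
               for (x, y) in r
               for (y2, z) in tr
               if y == y2} - tr
        if not new:                               # nothing left to derive
            return tr
        tr |= new

r = {(1, 2), (2, 3), (3, 4)}
print(sorted(transitive_closure(r)))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

Because TR is defined in terms of R plus a strictly smaller TR problem, the process terminates even when R contains cycles; the nonrecursive postulate "R(x z) if R(x y) and R(y z)" offers no such base case.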

The  general  theory  of  definition 
and  the  theory  of  recursive  defini¬ 
tion  can  be  given  a  rigorous  syntacti¬ 
cal  formulation — it  would  be  inter¬ 
esting  to  see  an  exact  syntactical 
formulation  of  the  extension  used  by 
PROLOG.  From  a  computer  science 
point  of  view,  this  would  amount  to 
giving  syntactical  rules  to  rule  out 
non-logic-based  semantic  errors  (log¬ 
ic  errors  in  the  field  of  partial  recur¬ 
sive  functions  cannot  be  ruled  out  by 
syntactical  rules). 

The  deductive-axiomatic  method 
has  been  the  central  unifying  meth¬ 
odology  of  knowledge  of  the  West¬ 
ern  intellectual  tradition.  By  remov¬ 
ing  the  barrier  to  the  real-time  use  of 
the  deductive-axiomatic  method, 
PROLOG  may  have  an  impact  on 
knowledge  use  and  acquisition  that  is 
hard  to  overestimate.  After  all,  being 
able  to  query  Aristotle,  or  Goethe,  or 
Einstein  with  an  updated  database 
does  give  new  meaning  to  the  expres¬ 
sion  deus  ex  machina. 

Those  who  package  PROLOG 
should  have  the  courtesy  and  integri¬ 
ty  to  say  what  it  is  and  what  it  isn’t. 

DDJ 



PROGRAMMER'S  SERVICES 


OF  INTEREST 


Artificial  Intelligence 

ESP  Frame-Engine  is  an  expert  system 
shell  from  Expert  Systems  Interna¬ 
tional.  Designed  for  customized  en¬ 
hancements,  the  shell  uses  frames 
that  allow  for  rule-based  and  vari¬ 
able  groupings  as  well  as  inheritance. 
Object-oriented  programming  is  pos¬ 
sible  with  several  types  of  objects 
such  as  numeric,  Boolean,  text,  set, 
and  instance.  The  shell's  open-ended 
architecture  facilitates  interfacing  to 
Prolog-2,  C,  and  other  languages. 
Reader  Service  No.  16. 

Expert  Systems  International 
1700  Walnut  St. 

Philadelphia,  PA  19103 
(215)  735-8510 

The  second  edition  of  Experiments  in 
Artificial  Intelligence  for  Microcom¬ 
puters  by  John  Krutch  has  been  pub¬ 
lished  by  Howard  W.  Sams  &  Co. 
This  edition  contains  75  percent 
more  material  providing  step-by-step 
procedures  detailing  how  AI  can  be 
applied  to  a  variety  of  practical  activi¬ 
ties.  Programs  are  provided  in  BASIC 
for  the  Commodore  64  and  128,  with 
instructions  for  converting  them  to 
other  BASICS.  Reader  Service  No.  17. 
Howard  W.  Sams  &  Co. 

4300  W.  62nd  St. 

Indianapolis,  IN  46268 
(800)  428-SAMS 

Borland  International  has  released 
an  enhanced  version  of  Turbo  Prolog 
that  provides  more  support  for  the 
development  of  large  applications. 
The  new  version  (1.1)  has  a  faster 
compilation  speed  and  an  internal 
linker,  with  single-step  compiling  to 
executable  files.  It  also  requires  less 
space  in  memory  than  the  previous 
version. It costs $99.95 (free to registered owners of Version 1.0).
Reader Service No. 18.

Borland  International 
4585  Scotts  Valley  Dr. 

Scotts  Valley,  CA  95066 
(408)  438-8400 

Languages 

CET  Technology  has  released  CET  BA¬ 
SIC,  a  compiled  application  develop¬ 
ment  language  for  Intel-based  Unix 
and  Xenix  systems.  The  CET  BASIC 
compiler  is  compatible  with  OASIS, 
THEOS,  and  UX-BASIC,  and  CET  BASIC 
programs  can  be  intermixed  with 
programs  and  subroutines  in  other 
languages.  Additional  features  in¬ 
clude  multiuser  support  for  ISAM,  di¬ 
rect  and  sequential  files,  terminal  in¬ 
dependence,  error  trapping,  program 
chaining,  and  COBOL-like  formatting. 
The  CET  BASIC  compiler  is  available 
for  $695.  Reader  Service  No.  19. 

CET  Technology  Inc. 

5405  Garden  Grove  Blvd.,  Ste.  160 
Westminster,  CA  92683 
(714)  895-4345 

Rational  Systems  has  released  In¬ 
stant  C,  an  incremental  C  compiler 
that  pares  down  development  time 
by  processing  only  those  parts  of  a 
program  that  have  been  changed. 
The  compiler  incorporates  a  full¬ 
screen  editor;  source-level  debugger; 
object  code  linker;  source  code 
checker;  run-time  checker;  and  sup¬ 
port  for  linking  Lattice  C,  Versions  2.0 
and  3.0,  and  Microsoft  C,  Version  3.0, 
object  code  and  libraries.  It  runs  on 
computers  with  MS-DOS  or  Concur¬ 
rent  DOS  and  costs  $495.  Reader  Ser¬ 
vice  No.  20. 

Rational  Systems  Inc. 

P.O.  Box  480 
Natick,  MA  01760 
(617)  653-6194 

Lattice  now  offers  the  SSP/PC  library 
of  more  than  145  mathematical  sub¬ 
routines  for  use  in  scientific,  engi¬ 
neering,  and  statistical  computations. 
The  subroutines  can  be  called  from 
Lattice  C  and  provide  PC  program¬ 
mers  with  routines  similar  to  pack¬ 
ages  used  on  mainframes.  Most  rou¬ 
tines  yield  a  maximum  machine 
accuracy  of  15  significant  figures. 
The SSP/PC library sells for $350. Reader Service No. 21.

Lattice  Inc. 

P.O.  Box  3072 
Glen  Ellyn,  IL  60138 
(312)  858-7950 

Marshal  Pascal  from  Marshal  Lan¬ 
guage  Systems  is  a  code-optimized 
ISO  Pascal  compiler  for  MS-DOS,  CP/M- 
86,  and  Concurrent  DOS.  The  compil¬ 
er  lets  you  address  as  much  memory 
as  your  operating  system  allows,  and 
it  supports  a  variety  of  memory  mod¬ 
els.  Marshal  Pascal  costs  $189.  Reader 
Service  No.  22. 

Marshal  Language  Systems 
1136-P  Saranap  Ave. 

Walnut  Creek,  CA  94595 
(415)  947-1000 

Lifeboat Associates has introduced a C++ language for micros. Advantage
C++ provides extensions and enhancements, using C++ as a preprocessor
to emit pure C code. C++ allows you to design your own data types and
enables you to use object-oriented programming methods. Versions of
Advantage C++ are available for use with Lattice C, Microsoft C, and
other popular C compilers for $495. Reader Service No. 23.
Lifeboat  Associates  Inc. 

55  South  Broadway 
Tarrytown,  NY  10591 
(914)  332-1875 

The  Whitewater  Group  has  released 
a  new  object-oriented  programming 
language  that  incorporates  Microsoft 
Windows.  ACTOR  is  a  fast  and  power¬ 
ful  programming  environment  that 
uses  Pascal-style  syntax.  DDE  (Dynamic 
Data  Exchange)  has  been  implement¬ 
ed  for  Microsoft  Windows  to  allow  for 
the  simultaneous  transfer  of  informa¬ 
tion  between  separate  programs  on 
the  same  computer.  ACTOR  runs  on 
the  IBM  PC,  PC/XT,  or  PC/AT  and  sells 
for  $495.  Reader  Service  No.  24. 

The  Whitewater  Group 
Technology  Innovation  Center 
906 University Pl.

Evanston,  IL  60201 
(312)  491-2370 

Workman  &  Associates  has  released 
FTL Modula-2 for MS-DOS. The software provides a complete language
development system, including compiler, linker, integral editor, library
source,  assembler,  and  support.  MS- 
DOS  libraries  include  "peek  and  poke” 
throughout  DOS  memory,  low-level 
DOS  calls,  and  a  debugger.  The  compa¬ 
ny  also  sells  the  editor’s  source  code, 
which  is  written  almost  entirely  in 
Modula-2,  for  $39.95.  FTL  Modula-2 
costs  $49.95.  A  package  of  FTL  Modula- 
2  and  the  editor's  source  code  costs 
$79.95.  Reader  Service  No.  25. 
Workman  &  Associates 
1925  East  Mountain  St. 

Pasadena,  CA  91104 
(818)  791-7979 

Tools 

PC/Assembler from Computer Systems Documentation is an interactive
syntax-checking assembler for the Intel 80xx, 801xx, and 802xx and
the NEC V20/30 processors. PC/Assembler is used to write assembler
subprograms that can be invoked from a high-level language. It is
not copy-protected and costs $99. Reader Service No. 26.

Computer  Systems  Documentation 
P.O. Box 5478
Albuquerque,  NM  87115 

TurboMAGIC, a code generator for Turbo Pascal programmers, is now
available from Sophisticated Software. TurboMAGIC includes a
full-featured editor and the ability to create both pop-up and
pull-down menu systems. The form image can be stored either as a
typed constant or in a picture file. The software runs on the IBM
PC, PC/XT, PC/AT, or compatible computers with 256K and is not
copy-protected. It costs $99. Reader Service No. 27.

Sophisticated  Software 
6586  Old  Shell  Rd. 

Mobile,  AL  36608 
(205)  342-7026 

Expanding  the  IBM  PC 

Fort's Software has released NVRD, the Non-Volatile RAM-Disk, a
software package designed to work with expanded memory hardware or
with the company's Virtual Expanded Memory Manager (V-EMM). Combined
with V-EMM hardware, NVRD provides the improved performance of a RAM
disk with the nonvolatility of a hard disk. It runs on IBM PCs and
compatibles with DOS 2.0-3.2, 192K RAM, a fixed-disk drive and
fixed-disk adapter, and an EMS or V-EMM board. NVRD is available on
its own for $49.95 or bundled with the V-EMM for $119.90. Reader
Service No. 28.

Fort’s  Software 
P.O.  Box  396 
Manhattan, KS 66502
(913)  537-2897 

SOTA Technology has announced MotherCard 5.0, a plug-in card for IBM
PC/XTs and compatibles that offers full AT-compatibility with all
software written for the 80286, including protected mode operating
systems. The on-board 1-megabyte RAM is all usable, and a
DaughterCard connector is included that allows memory expansion to
16 megabytes. MotherCard 5.0 sells for $995. Reader Service No. 29.

SOTA  Technology 
657  N.  Pastoria  Blvd. 

Sunnyvale,  CA  94086 
(408)  245-3366 

An 8-megabyte memory expansion board for the IBM RT/PC is available
from Tall Tree Systems. The JRAM-RT is a 32-bit board that makes use
of the host motherboard's hardware to find and correct memory
errors. It sells for $3,995. Reader Service No. 30.
Tall  Tree  Systems 
1120  San  Antonio  Rd. 

Palo  Alto,  CA  94303 
(415)  964-1980 

Micro  Enhancer  from  Everex  Sys¬ 
tems  is  a  5-inch  short  card  that  makes 
EGA  capabilities  available  for  users 
with  limited  space  in  their  IBM  PC/XTs 
or  compatibles.  The  board  provides 
640  X  350-pixel  resolution  graphics  in 
16  colors  from  a  palette  of  64  colors 
and  is  100  percent  compatible  with 
the  IBM  Enhanced  Graphics  Adapter. 
To  simplify  using  the  board,  Everex 
also  supplies  its  EGMODE  menu-driven 
software.  Micro  Enhancer  costs  $499. 
Reader  Service  No.  31. 

Everex  Systems  Inc. 

48431 Milmont Dr.

Fremont,  CA  94538 
(415)  498-1111 

Personal Computer Support Group's half-slot speed-up board called
the Breakthru 286 replaces the CPU of an IBM PC or PC/XT with an
80286 microprocessor faster than the one found in the 6-MHz IBM
PC/AT. PCSG claims that the Breakthru 286 can beat the performance
of other caching speed-up boards and that better performance can be
expected in nearly all applications. The Breakthru 286 costs $595.
Reader Service No. 32.

Personal  Computer  Support  Group 
11035  Harry  Hines  Blvd.,  #207 
Dallas,  TX  75229 
(214)  351-0564 

For  the  Mac 

Datacopy Corp. has released two scanning systems, the JetReader and
the Model 730, that produce high-resolution images containing 300
dots per square inch. Once scanned, images can be formatted for
insertion into documents produced with a Macintosh
desktop-publishing program. The company's MacImage software lets you
control the scanner and manage, print, and view image files. The
JetReader with MacImage software is priced at $2,250; the Model 730
sells for $3,250. Reader Service No. 33.

Datacopy  Corp. 

1215  Terra  Bella  Ave. 

Mountain  View,  CA  94043 
(415)  965-7900 

THINK Technologies has introduced Lightspeed Pascal, a full ANSI
Pascal. Lightspeed Pascal supports the Macintosh Toolbox and
operating system and Apple's SANE extended numerics software.
Compatible with Macintosh Pascal and Lisa Pascal, it runs on
Macintosh computers with 512K RAM or more and is not copy-protected.
It costs $125. Reader Service No. 34.
THINK  Technologies  Inc. 

420  Bedford  St. 

Lexington,  MA  02173 
(617)  863-5590 

Miscellaneous 

Dynapro Systems has announced chronOS, a real-time multitasking
operating system that lets you use standard DOS programming tools to
write real-time applications. Source code is included for the
console and sound generator drivers, language interfaces, and demo
program. ChronOS runs on the IBM PC/XT or PC/AT and is tailored
specifically for the iAPX86 microprocessor line. A U.S. site license
costs $1,995, and a Canadian site license costs $2,495. Reader
Service No. 35.

Dynapro  Systems  Inc. 

1000-1200  W.  73rd  Ave. 

Vancouver,  B.C. 

Canada 
(604)  263-2638 

Alligator Transforms has released Prime Factor FFT, a fast Fourier
transform subroutine library for the IBM PC, PC/XT, PC/AT, and
compatibles equipped with an 8087 math coprocessor. The library can
be called from any high-level language, and interface examples are
provided in all languages. The library includes forward and inverse
FFTs for single- and double-precision, floating-point, complex
number sets. Users are not limited to radix-2 data set sizes. Prime
Factor FFT sells for $159. Reader Service No. 36.

Alligator  Transforms 
P.O.  Box  11386 
Costa  Mesa,  CA  92627 
(714)  662-0660 


The Model E232-51 is a new in-circuit emulator from Signum Systems
that provides real-time, transparent emulation for the 8031, 8051,
and 8751 microcontrollers. Connected to an IBM PC via the RS-232
interface, Model E232-51 features complete debugging facilities.
Along with a command-driven user interface, the emulator provides
users with windowing software and mouse support for controlling and
monitoring program execution. Model E232-51 with 64K of overlay
program memory is priced at $3,195. Reader Service No. 37.

Signum  Systems 
1820  14th  St.,  Ste.  203 
Santa  Monica,  CA  90404 
(213)  450-6096 


Lang-Allan has announced Version 2.00 of Bluestreak Plus, its
communication package for the IBM PC, PC/XT, PC/AT, and compatibles.
Bluestreak Plus is a full-featured software package that combines
PC-to-PC communication and PC-to-mainframe communication with an
open-architecture format for customization and modification at any
level. The programming interface allows you to develop applications
in many popular languages such as C, assembly language, Turbo
Pascal, and dBASE. Bluestreak Plus 2.00 sells for $89. Reader
Service No. 38.

Lang-Allan  Inc. 

2457  Aloma  Ave.,  Ste.  B 
Winter  Park,  FL  32792 
(305)  677-1539 

Full-function simulation models of the Motorola MC68000 and MC68010
microprocessors are available from Quadtree along with an extensive
library of models of associated peripherals and an optional graphic
microprocessor development system. The new models are part of
Quadtree's Designer's Choice library, a library of software
simulation models of standard, off-the-shelf digital devices. Call
for prices. Reader Service No. 39.
Quadtree  Software  Corp. 

1170  Rt.  22  East 
Bridgewater,  NJ  08807 
(201)  725-2272 

A 3½-hour C programming course is available on video cassette from
Berkeley Decision/Systems. The course is designed for expert
programmers who want to learn C or people who are familiar with the
language and want to learn more about it. The course includes a
manual and more than 40 complete C programs to demonstrate the
concepts presented in the video. A Programmer's Introduction to C is
available for $400 in VHS or Beta formats. Reader Service No. 40.
Berkeley  Decision/Systems 
150  Belvedere  Terr. 

Santa  Cruz,  CA  95062 
(408)  458-0500 

Zoom Telephonics is now shipping a 2,400-baud version of its
Zoom/Modem PC 1200. The new version has such additional features as
demon dialing, audio input and output ports, and a high-speed 16450
UART for assured compatibility with IBM PC/ATs and compatibles. The
Zoom/Modem PC 1200 can be upgraded to 2,400 baud with a plug-in
board that costs $249. The Zoom/Modem PC 2400 ST sells for $499, and
an XL version with more features costs $50 more. Reader Service
No. 41.

Zoom Telephonics Inc.

207  South  St. 

Boston,  MA  02111 
(617)  423-1072 

Access Associates has introduced Alegra, a memory expansion unit
designed to add 512K of external memory to the Commodore Amiga.
Alegra has a small footprint and allows for future expansion of up
to 2 megabytes by replacing memory and configuration devices. The
unit supports the auto-configuration architecture of the Amiga, and
power is supplied by the computer at the expansion connector. Alegra
sells for $379. Reader Service No. 42.

Access  Associates 
491  Aldo  Ave. 

Santa  Clara,  CA  95054 
(408)  727-8520 

Digitronix has released a low-cost Turbo upgrade kit called Veloz
that brings IBM PC/XTs and compatibles up to the speed of a PC/AT.
Veloz offers 100 percent compatibility with all major software
packages and can be run with either the 8088-2 or V20 with no need
to power down or replace the CPU. The price is $98. Reader Service
No. 43.

Digitronix 
2135  Junction  Ave. 

Mountain  View,  CA  94043 
(415)  964-5103 

PAX  from  Baker  &  Rabinowitz  is  a 
real-time  multitasking  executive  for 
the  IBM  PC.  It  runs  in  concert  with  MS- 
DOS  and  requires  no  licensing  or  in¬ 
corporation  fees.  The  system  kernel 
supports  up  to  32  concurrent  tasks 
and  is  fully  preemptive.  The  package 
is  priced  at  $149.95.  Reader  Service 
No.  44. 

Baker  &  Rabinowitz  Inc. 

3869  Kilbourne  Ave. 

Cincinnati,  OH  45209 
(513)  871-0886 

CodeWorks is a new magazine for people interested in BASIC
programming under MS-DOS, TRS-DOS, or CP/M. Each annual subscription
covers all six issues of the calendar year and costs $24.95.
Interested readers can write for a free sample issue. Reader Service
No. 45.

CodeWorks
Sample  Copy  Offer 
3838  South  Warner 
Tacoma,  WA  98409 

A  system  modeling  tool  called  Per¬ 
formance  Analysis  Tool  Box  is  avail¬ 
able  from  Computer  Technology 
Associates.  The  package  simulates  a 
variety  of  centralized  or  distributed 
computer  architectures,  allowing  de¬ 
signers  to  investigate  broad  ranges  of 
use  patterns  in  a  hypothetical  sys¬ 
tem.  It's  $10,000  for  the  IBM  PC.  Read¬ 
er  Service  No.  46. 

Computer  Technology  Associates 
7927  Jones  Branch  Dr.,  #600W 
McLean,  VA  22102 

Microsoft  Corp.  has  announced  the 
availability  of  extensions  to  the  MS- 
DOS  operating  system  that  support 
the  use  of  CD  ROM  disk  drives  with 
personal  computers.  The  MS-DOS  CD 
ROM  extension  consists  of  two  soft¬ 
ware  modules — a  hardware-inde¬ 
pendent  program  and  an  installable 
device  driver  that  must  be  custom¬ 
ized  by  each  manufacturer  to  work 
with its own hardware. Microsoft will supply the
hardware-independent program (the DOS extension that will handle
the much higher capacity of the CD ROMs) plus a sample device
driver and documentation.

With  the  new  software,  PCs  run¬ 
ning  DOS  3.1  or  3.2  can  read  data  from 
any  CD  ROM  disk  that  is  compatible 
with  the  High  Sierra  Group  file  for¬ 
mat  proposed  at  the  National  Com¬ 
puter  Conference  in  May  1986.  Mi¬ 
crosoft  will  license  these  extensions 
directly  to  CD  ROM  drive  manufactur¬ 
ers,  and  they  are  available  only  on  an 
OEM  basis.  Reader  Service  No.  47. 
Microsoft  Corp. 

16011  N.E.  36th  Way 
P.O.  Box  97017 
Redmond,  WA  98073-9717 
(206)  882-8080 

Practical Peripherals is now offering a stand-alone 1,200-bps modem,
the Practical Modem 1200 SA. It is fully Hayes-compatible, includes
auto-dial/auto-answer capabilities, supports virtually all
communications software, and includes an upgrade path for a
programmable enhancement card. The suggested retail price of the
modem is $239. Reader Service No. 48.

Practical  Peripherals 
31245  La  Baya  Dr. 

Westlake  Village,  CA  91362 
(818)  991-8200 

DogStar Software has announced subVol, a disk subvolume manager that
brings PC-DOS- and ProDOS-like subdirectories to Apple Pascal DOS.
SubVol works on any Pascal-formatted disk device and allows
hard-disk users to format directly and install a complete set of
subvolumes with Apple Pascal.

The program works by attaching virtual disk drivers to the unused
disk units in an Apple Pascal system. The virtual disk drivers cause
a portion of any real disk to behave as though it were a volume in
itself, including a subdirectory plus any files in that subvolume.
The product runs on Apple II computers with Apple Pascal (Version
1.1, 1.2, or 1.3). The price is $34; with source code it costs $75.
Reader Service No. 49.
dogStar  Software 
P.O.  Box  302 
Bloomington,  IN  47402 
(812)  333-5616 

Instant  Replay  by  Nostradamus  is  a 
demonstration  development  toolkit 
that  generates  tutorials,  demos,  pre¬ 
sentations,  menu  systems,  and  timed 
keyboard  macros.  It  is  not  copy-pro¬ 
tected  and  runs  on  the  IBM  PC,  PC/XT, 
and  PC/AT.  The  product  is  priced  at 
$89.95.  Reader  Service  No.  50. 
Nostradamus 
5320  South  900  E,  Ste.  110 
Salt  Lake  City,  UT  84117 
(801)  261-0769 


DDJ 




FORUM 


SWAINE'S  FLAMES 


Yes, I was kidding about BASIC in
November.  BASIC,  on  the  other 
hand,  seems  to  be  getting  serious, 
with  such  features  as  advanced  struc¬ 
ture  constructs,  new  data  types,  la¬ 
bels  for  branching,  internal  and  ex¬ 
ternal  subroutines,  multiline  and 
decision-making  functions,  recur¬ 
sion,  libraries,  modules,  better  file 
I/O,  low-level  and  DOS  access,  sophis¬ 
ticated  string  manipulation,  and  ad¬ 
vances  in  portability  and  standard¬ 
ization.  And  now  that  Borland  is 
bringing  out  a  Turbo  BASIC,  competi¬ 
tion  may  hasten  the  maturation 
process. 

Granted that the Turbo Pascal-Turbo BASIC-QuickBASIC type of
rapid-interaction compiler is not a reasonable substitute for C and
assembly language in a major software development task, it does seem
to be finding a place in certain kinds and stages of development.
The fact that both Borland and Microsoft have interactive C
compilers in the works suggests that they think so; it's hard to
believe that they expect to sell C to Sunday-afternoon programmers.
But it's more significant that there seems to be a growing interest
in enlarging the programmer's toolkit, offering more options in
development environments, and offering tools that make the
programmer more productive and efficient.

Good programming is often artful. Any engineering discipline is like
art in that both bring new things into the world and rely on skill
and serendipity, and unlike art in being more concerned with result
than with process. In fact, one goal of engineering is to refine
processes, automating portions in order to free the engineer to be
artful at another level. Does software development do this?

If not, it should. I recently reread Robert Heinlein's The Door into
Summer and was struck by how convincingly and appealingly Heinlein
portrayed the engineer as an artist whose very art provides the
means to improve his brush. If software development really is an
engineering discipline, it needs more Waldoes and Drafting Dans,
more tools to automate the repetitive processes.

John  Backus  does  not  believe  that 
programming  is  an  engineering  disci¬ 
pline  yet,  arguing  that  it  is  still  too 
much  an  art.  (See  "From  Function 
Level  Semantics  to  Program  Trans¬ 
formation  and  Optimization,”  Lec¬ 
ture  Notes  in  Computer  Science  185 
[1982].)  Solving  the  software  crisis,  he 
maintains,  requires  that  it  become  an 
engineering  discipline. 

Parallel  processing  is  the  wave 
of  the  future,  right? 

That was the gist of Gordon Bell's keynote address at the Fall Joint
Computer Conference in Dallas in November. The promise of
parallelism may have been overstated in some areas. Two former
advocates of parallel design for database machines, Haran Boral of
the Israel Institute of Technology and David DeWitt of the
University of Wisconsin at Madison, have come to believe that we
should not build database machines that attempt to maximize
throughput via massive parallelism. The problem, they contend, is
that I/O bandwidth per gigabyte of storage is decreasing rapidly and
that it is the I/O bandwidth issue that will be the bottleneck in
database machines. "Unless mechanisms for increasing the bandwidth
of mass storage devices are found, highly parallel database machines
are doomed to extinction," they conclude in "Database Machines: An
Idea Whose Time Has Passed?" (Database Machines, H.-O. Leilich and
M. Missikoff, eds. [Berlin: Springer-Verlag, 1983]).

Those  of  you  who  live  in  California 
may  have  discerned  a  bit  of  chauvin¬ 
ism  in  Massachusetts’  claiming  to  be 
the  Software  Capital  of  the  country. 
Those  of  you  who  do  not  live  in  Cali¬ 
fornia  probably  detected  even  more 
chauvinism  in  the  recent  passage  of  a 
ballot  proposition  making  English 
California’s  state  language.  No  doubt 
the  California  legislature  will  soon  be 
turning  arroyos  and  mesas  into 
gulches  and  hills  just  as  the  French, 
who  invented  chauvinism,  have  for 
years  been  kicking  Americanisms 
out of la langue française.

My  cousin  Corbett  is  cursing  what 
he  calls  "egotistical  linguistic  chau¬ 
vinism.”  He  had  worked  hard  on  an 
alternative  proposition  that  he  thinks 
is  much  more  appropriate  because  it 
actually  addresses  a  real  problem. 
Unfortunately,  it  did  not  get  on  the 
ballot  in  1986,  but  Corbett  plans  to  be 
more  successful  in  1988  and  is  start¬ 
ing  to  organize  now.  His  plan  is  to  re¬ 
place  English  with  C  as  California's 
language,  and  he  invites  your  partici¬ 
pation.  C,  he  points  out,  is  capable  of 
expressing  anything  anyone  could 
ever  need  to  express,  and  the  time 
has  come  for  fanatical  supporters  of 
minority  languages  to  put  aside  their 
pride,  accept  the  inevitable,  and  end 
the  Babel  of  incommensurable  dia¬ 
lects  once  and  for  all. 

“Des  egos,  et  encore  des  egos.” — Ed 
Faber,  in  Silicon  Valley,  the  French 
edition  of  Fire  in  the  Valley  by  Paul 
Freiberger  and  me. 

Michael Swaine

editor-in-chief 




#124  FEBRUARY  1987 


Dr. Dobb's Journal of Software Tools

FOR THE PROFESSIONAL PROGRAMMER

$2.95 ($3.95 CANADA)


TEXT  EDITORS: 


The  Baby  Duck  Syndrome 


New  Column  on 
Artificial  Intelligence 


6502  Killer  Hacks 


Efficient  Hashing 
Languages: 

Turbo  Pascal  to  Modula-2 
Modula-2  Criticisms 
True  BASIC 
C  Text  Formatter 
6502  Assembly 


FEBRUARY  1987 


CONTENTS 


VOLUME  12,  ISSUE  2 


Text editors: whaddaya want?

Hacking 6502 assembly

C text formatter

MS-DOS resources

Artificial intelligence today

Translating Turbo Pascal to Modula-2

Knuth's WEB

What's wrong with Modula-2?

Memories

ARTICLES 


TOOLS: Text Editors: In Matters of Taste...

by Levi Thomas and Nick Turner

The choice of a text editor is based partly on "hard" pragmatic
requirements and partly on more subjective considerations, such as
personal tastes and biases. In this month's cover story, DDJ talks
to some programmers about the agony and the ecstasy of searching for
the perfect text editor.

CODING: 6502 Hacks

by  Mark  S.  Ackerman 

Mark  gives  us  a  peek  into  the  magician's  hat.  He  describes 
some  killer  hacks  designed  to  squeeze  that  last  byte  and/or 
machine  cycle  out  of  an  assembly-language  program. 

ALGORITHMS:  Hashing  for  High  Performance 
Searching 

by  Edwin  T.  Floyd 

Edwin  explains,  demonstrates,  and  evaluates  four  different 
hashing  algorithms. 


COLUMNS 


About the Cover

Ethologists  call  it  imprinting — a 
baby  duck  emerges  from  a  shell 
and  regards  the  first  large  object  it 
sees  as  its  mother.  Is  this  why  so 
many  programmers  still  use 
WordStar? 


C  CHEST  90 

by  Allen  Holub 

The  first  in  a  series  of  installments  on  nr — Allen's  version 
of  the  Unix  formatting  utility,  nroff.  This  month  he 
discusses  symbol  table  maintenance  with  hashing 
functions,  expression  analysis,  and  a  method  for  printing 
Roman  numerals. 

16-BIT SOFTWARE TOOLBOX 102

by  Ray  Duncan 

A  look  at  the  MS-DOS  STACK  command,  techniques  for 
writing  memory-resident  programs,  and  books  for  MS-DOS 
programmers. 


This  Issue 

This  month,  we  delve  into  some  of 
the  issues  that  must  be  considered 
when  choosing — or  designing — a 
text  editor.  We  asked  some  pro¬ 
grammers  to  tell  us  about  their  fa¬ 
vorite  and  least  favorite  editing  pro¬ 
grams.  Also,  Allen  Holub  begins  a 
series  of  columns  on  his  own  text 
processing  program,  nr. 


ARTIFICIAL  INTELLIGENCE  108 

by  Ernest  R.  Tello 

Our  new  columnist  starts  things  off  with  some  AI  history, 
philosophy,  and  recommended  reading. 

STRUCTURED  PROGRAMMING  1*4 

by  Namir  Clement  Shammas 
Namir  discusses  program  translation. 


Next  Issue 

In  March,  DDJ  looks  at  data  com¬ 
pression  with  a  lead  article  that 
describes  a  recursive  scheme  for 
compressing  image  data  and  a 
comparative  review  of  microcom¬ 
puter  archival  programs. 


FORUM

EDITORIAL 6
by Michael Swaine

RUNNING LIGHT 8
by Nick Turner

ARCHIVES 8

LETTERS 10
by you

VIEWPOINT 14
by Mike Suman

DDJ ON LINE 134

SWAINE'S FLAMES 182
by Michael Swaine

PROGRAMMER'S SERVICES

THE STATE OF BASIC: 142
Modules in True BASIC

OF INTEREST: 146
New products out there

ADVERTISER INDEX: 181
Where to find those ads

Dr.  Dobb’s  Journal,  February  1987 



FORUM 


EDITORIAL 


Our feature article this month deals with editors, and it is in some
ways frustrating. It is certainly not one of those buyer's guide
pieces that lead the bewildered consumer through the maze of
creeping functions. We set out rather to examine the fundamental
issues in editing, issues that both the user and the designer of a
programmer's editor must face. What we found as we dug into the
topic was that the more fundamental an issue was, the more rabidly
subjective it was likely to be, and arbitrary. The Baby Duck
Syndrome is a force to be reckoned with. Again, frustrating.

Some  recent  developments, 
though  not  falling  precisely  in  the  do¬ 
main  of  programmers'  editors,  sug¬ 
gest  alternatives  to  traditional  ap¬ 
proaches  to  editing. 

One  of  these  is  a  product  called 
Guide  made  by  OWL  International. 
The  firm  claims  to  have  implement¬ 
ed  Ted  Nelson’s  vision  of  Hypertext  in 
this  Macintosh  product.  The  claim  it¬ 
self  is  relatively  uninteresting;  Hy¬ 
pertext  will  be  realized,  is  being  real¬ 
ized,  piecemeal,  just  like  the  grab  bag 
of  ideas  that  Alan  Kay  lumped  togeth¬ 
er  under  the  name  Dynabook. 

What  is  really  interesting  is  that 
Guide  has  been  explicitly  designed  to 
handle  electronic  documents.  It  will 
accept  text  files  from  conventional 
editors  and  will  in  turn  produce  flat 
text  for  conventional  editors,  but 
these  tasks  are  outside  its  purpose. 
The  document  you  structure  with  it 
lives  in  more  than  two  dimensions. 

OWL does not promote Guide as a tool for software development, but
some of its features are interesting to look at in that context.
Elements of the text can be suppressed or expressed at will, in a
similar manner to the way an outline processor allows collapsing of
detail but without the constraint of the outline structure. A
suppressed section is tied to a name, which it replaces when clicked
on. This suggests automatic expansion of macros or pulling in a
subroutine to examine its code. Guide also permits alternative text
elements to be defined and selected.

Another  interesting  development  is 
the  WEB  documentation  system,  cre¬ 
ated  by  Donald  Knuth  and  discussed 
in  The  WEB  System  of  Documentation 
(Report  Stan-CS-83-980)  available  from 
the  Stanford  University  Computer  Sci¬ 
ence  Department.  WEB  actually  im¬ 
plements  some  of  the  ideas  described 
above  in  a  tool  specifically  designed 
for  writing  structured  and  well-docu¬ 
mented  programs. 

WEB  is  a  high-level  description  lan¬ 
guage  that  produces  Pascal  code.  WEB 
programs  consist  of  short  sections, 
each  section  comprising  informal 
commentary,  macro  definitions,  and 
Pascal  code.  The  Pascal  code  can  con¬ 
sist  of  a  name  and  replacement  text, 
in  which  case  WEB  will  replace  the 
name  wherever  it  occurs  with  the  re¬ 
placement  text  (code). 
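
The name-and-replacement-text idea can be illustrated with a small
sketch. This is not Knuth's implementation and does not use WEB's
own notation: the `<<name>>` reference syntax, the `SECTIONS` table,
and the `expand` function are all assumptions made up for this
example. It only shows the general technique of replacing a name
with its replacement text wherever it occurs, recursively.

```python
import re

# Hypothetical section table: each name maps to its replacement text.
# The <<name>> reference syntax is invented for this sketch.
SECTIONS = {
    "program": "program demo;\n<<declarations>>\nbegin\n<<body>>\nend.",
    "declarations": "var i: integer;",
    "body": "for i := 1 to 3 do\n  writeln(i);",
}

# Matches one <<name>> reference and captures the name.
REF = re.compile(r"<<([^<>]+)>>")


def expand(name, sections, active=()):
    """Return the text of `name` with every reference replaced, recursively."""
    if name in active:
        # A section that (indirectly) refers to itself would never terminate.
        raise ValueError("circular reference: " + name)

    def substitute(match):
        return expand(match.group(1), sections, active + (name,))

    return REF.sub(substitute, sections[name])


if __name__ == "__main__":
    print(expand("program", SECTIONS))
```

Expanding "program" yields a flat Pascal text in compiler order,
while the table itself can be kept in whatever order suits the
human reader.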

One  idea  that  both  these  develop¬ 
ments  suggest  is  the  possibility  of  a 
program  that  would  exist  while  un¬ 
der  simultaneous  development  in  two 
complementary  forms:  the  working 
version  and  the  working-on  version. 
As  Knuth  puts  it,  "a  WEB  program  is  a 
Pascal  program  that  has  been  cut  up 
in  pieces  and  rearranged  into  an  or¬ 
der  that  is  easier  for  a  human  being  to 
understand.  A  Pascal  program  is  a 
WEB  program  that  has  been  rear¬ 
ranged  into  an  order  that  is  easier  for 
a  computer  to  understand.” 

Michael  Swaine 
editor-in-chief 


Software Tools

FOR  THE  PROFESSIONAL  PROGRAMMER 

Editorial 

Editor-in-Chief  Michael  Swaine 
Editor  Nick  Turner 
Managing  Editor  Vince  Leone 
Assistant  Editors  Sara  Noah  Ruddy 
Levi  Thomas 

Technical  Editor  Allen  Holub 
Contributing  Editors  Ray  Duncan 
Michael  Ham 
Bela  Lubkin 
Namir  Shammas 
Ernest  R.  Tello 

Copy  Editor  Rhoda  Simmons 

Production 

Production  Manager  Bob  Wynne 

Art  Director  Michael  Hollister 
Assoc.  Art  Director  Joe  Sikoryak 
Typesetter  Jean  Aring 
Cover  Photographer  Michael  Carr 
Circulation 

Circulation  Director  Maureen  Kaminski 
Newsstand  Sales  Mgr.  Stephanie  Barber 
Book  Marketing  Mgr.  Jane  Shaminghouse 
Circulation  Coordinator  Kathleen  Shay 
Administration 
Finance  Director  Kate  Wheat 
Business  Manager  Betty  Trickett 
Accounts Payable Supv. Mayda Lopez-Quintana
Accts.  Receivable  Supv.  Laura  Di  Lazzaro 
Advertising  Director 
Robert  Horton  (415)366-3600 
Account  Managers 
Michele  Beaty  (317)  875-8093 
Lisa  Boudreau  (415)366-3600 
Gary George (404) 897-1923
Michael  Wiener  (415)366-3600 
Cynthia  Zuck  (718)499-9333 
Promotions/Srvcs.Mgr.  Anna  Kittleson 
Advertising  Coordinator  Charles  Shively 

M&T Publishing Inc.

Chairman  of  the  Board  Otmar  Weber 
Director C.F. von Quadt
President  and  Publisher  Laird  Foshay 
Associate  Publisher  Michael  Swaine 

Dr. Dobb's Journal of Software Tools (USPS 307690) is published
monthly by M&T Publishing Inc., 501 Galveston Dr., Redwood City, CA
94063; (415) 366-3600. Second-class postage paid at Redwood City and
at additional entry points.

Article  Submissions:  Send  manuscripts  and  disk 
(with  article  and  listings)  to  the  Editor. 

DDJ  on  CompuServe:  Type  GO  DDJ 
Address  Correction  Requested:  Postmaster:  Send 
Form  3579  to  Dr.  Dobb's  Journal,  P.O.  Box  27809,  San 
Diego,  CA  92128.  ISSN  0888-3076 

Customer Service: For subscription problems call: outside CA (800)
321-3333; in CA (619) 485-9623 or 566-6947. For book/software order
problems call (415) 366-3600.

Subscriptions:  $29.97  per  1  year;  $56.97  for  2  years. 
Canada  and  Mexico  add  $27  per  year  airmail  or  $10 
per  year  surface.  All  other  countries  add  $27  per  year 
airmail.  Foreign  subscriptions  must  be  prepaid  in  U.S. 
funds  drawn  on  a  U.S.  bank.  For  foreign  subscriptions, 
TELEX:  752-351. 

Foreign Newsstand Distributor: Worldwide Media Service Inc., 386
Park Ave. South, New York, NY 10016; (212) 686-1520 TELEX: 620430
(WUI).

Entire contents copyright © 1987 by M&T
Publishing, Inc., unless otherwise noted on
specific articles. All rights reserved.

People's  Computer  Company 

Dr. Dobb's Journal of Software Tools is published by M&T
Publishing Inc. under license from People's Computer Company,
2682 Bishop Dr., Suite 107, San Ramon, CA 94583, a nonprofit
corporation.




Dr.  Dobb's  Journal,  February  1987 



RUNNING  LIGHT 


This month we focus on a subject that is near and dear to
the hearts of magazine editors and software authors alike:
text editors, and what makes them good or bad.

There  have  been 
many  new  develop¬ 
ments  in  the  field  re¬ 
cently,  and  we  wanted 
to  get  an  idea  of  what  you,  as  pro¬ 
grammers,  thought  of  them.  Specifi¬ 
cally,  what  is  desirable  in  a  program 
source  code  editor?  Your  opinions 
were  interesting.  Allen  Holub  also 
talks  about  editors  in  this  issue:  He  be¬ 
gins  a  series  of  columns  on  nr,  his  ver¬ 
sion  of  the  Unix  nroff  program. 

Also in this issue, we introduce our newest columnist,
Ernie Tello. Relatively sophisticated AI tools are finally
becoming available, and Ernie talks about some of them in
his first column. It's a welcome addition to DDJ, and I hope
you'll send us a mountain of mail about it. Let us know what
you think.

In  this  year’s  August  issue,  we’ll 
take  a  look  at  tools  for  C  program¬ 
mers.  We're  particularly  interested 
in  articles  that  demonstrate  the 
power  and  efficiency  of  really  well- 
written  C  code.  The  ideal  article 
would  be  between  1,000  and  3,000 
words  long  and  would  include  a  list¬ 
ing  between  100  and  400  lines  long 
(see  my  listing  advice  below).  Short 
file  processors,  keyboard  filters,  data 
compression  techniques,  and  other 
utilities  are  all  of  interest,  as  well  as 
math  algorithms,  string  handlers, 
and  other  black  box  routines.  If 
you've  invented  a  tight  new  routine, 
let  us  know  about  it. 

Here's more advice for authors, this time about program
listings. DDJ is proud to be one of the few computer
magazines that still regularly publishes source code
listings. You want them, and we provide them. But listings
frequently turn into a headache at the layout stage, when we
have to balance the cost of white space vs. the placement of
ads and the need for the listings to be large enough to be
readable. Another problem with listings is that frequently
we have to do a lot of time-consuming editing to get them
into a format that will print well on a laser printer.
Sometimes editing of listings is necessary for another
reason as well: the listing simply is not readable enough.

What  can  you  do  to  help?  It’s  really 
not  all  that  hard.  First,  keep  your  list¬ 
ings  as  clean  as  you  possibly  can.  That 
means  no  tab  characters,  for  exam¬ 
ple.  Instead,  whenever  possible  use 
spaces  to  indent  your  code.  Second, 
keep  your  listings  on  the  narrow 
side — less  than  60  columns  if  at  all 
possible.  If  your  comments  go  out  be¬ 
yond  that  point,  I  may  shorten  them 
or  your  article  may  be  pulled  from 
the  issue  because  the  listings  won’t 
fit.  Readability  is  also  important — if 
you  are  writing  in  C  or  Pascal  (or 
some  other  structured  language),  use 
at  least  four  columns  for  your  in¬ 
dents.  If  it  turns  out  that  you  can’t 
keep  your  source  code  under  60  col¬ 
umns  with  four-column  indents, 
then  you  have  too  many  levels  of 
nesting.  The  best  thing  to  do  in  such 
cases  is  to  remove  the  innermost  lev¬ 
els  and  make  them  a  separate  rou¬ 
tine.  Some  of  you  may  be  responding 
with  outrage  at  this.  You’re  the  ones 
who  put  big  banner  headlines  at  the 
tops  of  your  programs.  That’s  fine  for 
fanfold  paper,  but  magazine 
space  is  often  at  a  premium.  So  keep 
it  short  and  sweet,  please. 
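
For authors who would like to check a listing before mailing it, the two mechanical rules above (no tab characters, fewer than 60 columns) can be tested by a few lines of C. The following is only a sketch: the function name check_listing_line and the LISTING_ constants are made up for this illustration and are not part of any DDJ submission tool.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result flags for this sketch. */
enum {
    LISTING_OK       = 0,
    LISTING_HAS_TAB  = 1 << 0,  /* line contains a tab character */
    LISTING_TOO_WIDE = 1 << 1   /* line is 60 columns or wider   */
};

#define MAX_COLUMNS 60  /* the advice asks for "less than 60 columns" */

/* Returns a bitmask of problems found in one line of a listing.
   A trailing newline, if present, is not counted as a column. */
int check_listing_line(const char *line)
{
    int problems = LISTING_OK;
    size_t width = strcspn(line, "\r\n");   /* length without line end */

    if (memchr(line, '\t', width) != NULL)
        problems |= LISTING_HAS_TAB;
    if (width >= MAX_COLUMNS)
        problems |= LISTING_TOO_WIDE;
    return problems;
}
```

Feeding each line of a source file to this function (with fgets, for example) and printing the line numbers that come back nonzero would flag the spots that need attention. The nesting-depth rule is left out because it depends on the language.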


Nick  Turner 
editor 


ARCHIVES 


Program  Editors 

"It  seems  to  me  that  a  great  obstacle  to 
better  programming  is  the  lack  of  an  editor 
that is as well adapted to its purpose as VisiCalc
is adapted to calculation in rows and
columns."—"Re-Thinking Program Editors,"
William B. Brogden, DDJ, June 1981.

Magazine  Editors 

"Planning  each  issue  of  DDJ  is  like  play¬ 
ing  a  game  of  editorial  bingo.  We  sit  down 
at  a  long  table  with  our  game  card  and 
pieces  of  corn  to  lay  on  squares.  Across  the 
top  of  the  card  are  the  column  labels  — 
8080,  6502,  1802,  6800,  etc.  The  horizontal 
rows  are  categorized  into  algorithm,  lan¬ 
guages,  hardware,  programming  prob¬ 
lems,  and  more.  Now  and  then  our  pieces 
form a straight line and we all yell,
'Bingo!'"—editorial, Marlin Ouverson, DDJ,
August 1981.

Ten  Years  Ago  in  DDJ 

"Talking  dirty  in  a  cryptogram  is  super 
except  when  you  and  your  ten  year  old 
daughter  (for  real)  decode  it.  It  lost  a  lot  of 
its  humor  while  I  tried  explaining  it." — 
name withheld, letter to DDJ, February 1977.

"I  wish  to  offer  a  sincere  apology  for  this. 
When  I  scanned  the  article,  it  occurred  to 
me  to  check  the  sample  for  accuracy  (but  I 
didn’t).  However,  it  never  occurred  to  me 
to  check  it  for  immature  vulgarity. . .  .” — 
Jim  Warren,  editorial  response  to  the  above, 
DDJ,  February  1977. 

"For  $599,  Ohio  Scientific  Instruments  is 
selling  a  fully-assembled  diskette  drive  in¬ 
cluding  read/write  electronics,  manuals, 
mating  connectors,  system  interface  board 
(bare)  and  6502/6800  operating  system,  de¬ 
livery  guaranteed  to  be  less  than  120  days. 
User  supplied  parts  are  conservatively  esti¬ 
mated  to  cost  an  additional  $145.  Eight  to 
ten  evenings  of  assembly  time  and  testing 
are suggested."—DDJ, February 1977.

"Re: Jef Raskin's 'Cardboard Computer
Company'—I've had the same problem
(money),  so  for  a  couple  of  bucks,  I  got  some 
walnut  grained  contact  paper  which  I  used 
to  cover  cardboard,  plywood,  or  alumi¬ 
num  chassis  enclosures.  If  you  take  a  little 
time  to  smooth  the  wrinkles  and  air  bub¬ 
bles,  you  will  have  the  'Classy  Cardboard 
Computer  Company.'  .  . .  Oh  yea,  I  really 
enjoy  Jef's  articles — wish  he  had  written 
more  about  the  S-100  bus  earlier  before  I 
committed to a system that doesn't use said
standard."—W. B. Goldsmith, Jr., letter to
DDJ, February 1977.


Dr. Dobb's Journal of Computer Calisthenics & Orthodontia:
Running Light Without Overbyte




FORUM 


LETTERS 


80386 

Dear  DDJ, 

I  have  finished  reading  "Program¬ 
ming  on  the  80386”  (October  1986), 
and  I  am  both  excited  and  troubled 
by  all  of  the  changes  (enhancements) 
made  to  the  basic  8086. 

It  takes  several  years  for  the  com¬ 
munity  of  software  professionals  to 
master  a  machine — to  make  that 
computer  do  more  than  it  is  general¬ 
ly  believed  it  can  do.  There  is  no  ques¬ 
tion  that  the  new  operating  systems 
and  application  programs  for  the 
80386  are  going  to  be  very  sophisticat¬ 
ed  (or  should  be).  Just  using  the  ma¬ 
chine's  instructions  the  way  they  are 
intended  to  be  used  scares  and  at¬ 
tracts  me  at  the  same  time. 

But  it  takes  a  long  time  to 
master  a  computer  at  the  level 
of  a  good  assembly-language 
programmer.  It  takes  less  time 
for  higher-level  languages, 
but  a  state-of-the-art  applica¬ 
tion  writer  still  needs  to  know 
how to take advantage of the
services offered by the operating
system. This knowledge
is  best  gained  by  the  shared 
experience  of  many  users. 

Only  after  the  experience 
of  the  whole  community  has 
reached  a  certain  level  (which 
can  take  years)  does  it  become 
general  knowledge  what  not 
to  do  on  a  particular  machine. 

This  information  is  so  neces¬ 
sary  and  so  important  that  all 
major  bulletin-board  services 
have  subgroups  for  all  the  ma¬ 
jor  types  of  computers  that 
serve  as  a  forum  for  their  user 
communities. 

There  is  already  a  shortage 


of  software  professionals  who  pro¬ 
gram  on  dinosaur  computers  in  near- 
dead  languages.  And  the  new  lan¬ 
guages  (Ada,  for  example)  will 
require  several  years  of  hands-on  ex¬ 
perience  before  there  are  many  peo¬ 
ple  who  know  how  to  use  them. 

The  learning  curve  for  software 
professionals  lags  behind  the  ad¬ 
vancement  of  technology,  for  both 
hardware  and  software,  by  several 
years.  If  students  use  a  particular  ma¬ 
chine  that  is  state  of  the  art  when 
they  are  sophomores,  by  the  time 
they  are  seniors,  their  knowledge 
will  be  only  somewhat  useful.  If  dy¬ 
namic  programmers  stay  at  one  com¬ 
pany  doing  one  type  of  program¬ 
ming  on  one  type  of  machine  with 
one  type  of  operating  system  for 
more  than  just  a  few  years,  their 
knowledge  will  become  obsolete  un¬ 
less  they  make  a  conscious  effort  to 
be  aware  of  industry  changes. 

This  problem — the  too-rapid  ad¬ 
vancement  of  technology — is  noth¬ 
ing  new.  My  question  to  you  is,  How 
can  the  community  of  software  pro¬ 
fessionals  keep  up  with  all  the  ad¬ 
vances  in  hardware?  It's  difficult 
enough  just  to  read  a  fine  magazine 
such  as  DDJ  every  month.  But  to  have 


true  knowledge  of  how  to  program  a 
computer  well  is  something  that  you 
have  to  learn  by  doing.  We  learn 
from  our  mistakes. 

I  feel  that  the  solution  is  a  series  of 
hands-on  courses  for  professionals 
on  advanced  topics  in  hardware  and 
software.  I  know  computer  vendors 
offer  courses,  but  usually  they  are 
priced  for  the  budgets  of  data  pro¬ 
cessing  departments  instead  of  indi¬ 
viduals  and  they  rarely  explain  tech¬ 
niques  used  by  their  competitors. 

I’m  sure  there’s  a  market  for  this 
kind  of  education.  Just  computer 
consultants  and  students  alone 
would  fill  enough  seats  to  make  it 
worthwhile.  And  with  all  those  hot 
shots  under  one  roof,  the  class  discus¬ 
sions  would  be  interesting  (if  not 
more  informative  than  the  lessons). 
Robert  Rouse 
479  Northlake  Dr.,  #108 
San  Jose,  CA  95117 

Dear  DDJ, 

I  couldn’t  let  Table  1  in  Ross  Nelson's 
article  on  the  80386  in  the  October 
1986  issue  go  by  without  a  response. 
Although  I  am  no  fan  of  the  8088,  he 
has  made  it  look  worse  than  it  really 
is.  No  self-respecting  compiler  would 
generate  the  code  he  listed  in 
Table  1.  I  suggest  my  Table  1, 
page  12,  as  a  replacement  for 
his. 

Tom  Pennello 
MetaWare Inc.

903  Pacific  St.,  Ste.  201 
Santa  Cruz,  CA  95060 

Drowned  in  C 

Dear  DDJ, 

This  is  in  response  to  letters  in 
October  1986  responding  to 
the  June  1986  Viewpoint, 
"What's  Wrong  with  C.” 
Though  I  found  these  opin¬ 
ions  interesting,  I  think  they 
have  all  missed  what's  really 
wrong  with  C. 

Having  known  Pascal  for 
years,  I  began  learning  C  less 
than  a  year  ago.  In  the  time 
I’ve  known  Pascal,  I've  cursed 
BASIC  (the  favorite  program¬ 
ming  language  of  those  who 
don't know any other language) because it allows
programmers to make errors that are impossible in
most other languages.
(When  typing  code  into  interpreted 
BASICS,  you  can  easily  mistype  a  line 
number,  causing  that  line  to  be  some¬ 
where  else  in  the  program  and  possi¬ 
bly  erasing  a  previous  line  with  the 
same  number.)  While  learning  C,  I 
discovered  errors  that  are  unique  to 
it,  also.  Some  of  these  are  related  to 
the  implementation  I've  been  using 
(MS-DOS  Lattice),  but  others  are  a  re¬ 
sult  of  the  C  language  definition. 

Many  of  the  errors  I've  made  relate 
to  the  almost  total  lack  of  type  check¬ 
ing  in  C.  The  following  code  will 
demonstrate: 

    int i, ar[10];
    for (i = 0; i++; i < 10)
        ar[i] = 0;

If you don't see the problem with it immediately, you might
spend a few hours looking at other parts of the program, as I
did. The for statement first sets i to 0, then executes the
second statement (i++) and exits from the loop if its result
is 0. The statement i++ returns a 1, and so the loop
continues. The body of the for is executed, setting ar[i] to
0. Then the statement i < 10 is executed (which I obviously
wanted to be my terminating condition), and its value is not
used. The loop continues while i is not 0, which on a
compiler that uses 16-bit integers is 65,536 times. This
happened because I accidentally swapped parts of the for
statement, and because Boolean values are just integers in C,
the compiler blindly accepted and compiled it. Another
problem is that the integer array is indexed with this value
and, because of a lack of array bounds checking, initializes
131,072 bytes to 0—much more than was desired. Strangely
enough, my program did not crash, but a rather odd thing
happened—the output to the screen was "buffered," and
nothing would print until the internal "buffer" became filled
with 256 characters.
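
[A note for readers who want to try this: what follows is an editorial sketch, not code from the letter. Under the standard C rules, the postfix form i++ yields the value i had before the increment, so with i starting at 0 the test is false on the first pass and the body should never run. The runaway loop described above matches the prefix form ++i, or a compiler that treats the two alike. The sketch counts iterations for both forms instead of writing past the end of an array. It assumes a compiler that supplies <stdint.h>, and uses uint16_t to stand in for a 16-bit int so that the wraparound is well defined. The function names are invented for the example.]

```c
#include <stdint.h>

/* Counts how often the body of  for (i = 0; i++; i < 10)  runs.
   The (void) cast only silences "value not used" warnings. */
unsigned long count_post_increment(void)
{
    uint16_t i;
    unsigned long passes = 0;

    for (i = 0; i++; (void)(i < 10))
        passes++;           /* stands in for ar[i] = 0 */
    return passes;
}

/* The same loop with the prefix form ++i as the test. The test
   stays nonzero until the 16-bit counter wraps around to 0. */
unsigned long count_pre_increment(void)
{
    uint16_t i;
    unsigned long passes = 0;

    for (i = 0; ++i; (void)(i < 10))
        passes++;
    return passes;
}
```

With these definitions the postfix loop makes no passes at all, while the prefix loop makes 65,535 passes before the counter wraps to 0. Either way the point of the letter stands: the compiler accepts the swapped clauses without complaint.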

Another  type  checking  problem 
has  to  do  with  function  parameters. 
The  compiler  I  use  doesn't  check  that 
the  parameter  types  in  a  function 
definition  match  those  in  a  function 
call  or  even  whether  the  number  of 


parameters  is  the  same.  I  was  under 
the  impression  that  this  was  excus¬ 
able  only  in  older  languages,  such  as 
FORTRAN,  and  not  in  more  modern, 
structured  languages.  The  existence 
of  separate  utilities  such  as  lint  to 
check  such  things  tells  me  that  this 
type  of  error  checking  was  simply 
left  out  of  compiler  definitions. 

At  least  some  type  checking  is  done 
in  C,  though,  and  it's  with  the  return 
values  of  functions.  I  was  getting 
warning  messages  with  functions 
that  didn’t  return  any  value.  My  ini¬ 
tial  kludge  fix  was  to  add  the  state¬ 
ment  return  (0);  at  the  end  of  each 
function,  which  while  satisfying  the 
compiler,  I  presume  generated  at 
least  one  extra  instruction  to  set  the 
return  value,  which  wasn't  even 
used  in  the  function  call.  But  this  still 
didn’t  fix  the  warning  message  for 
function  main(  ).  Then  I  discovered 
that  the  void  keyword,  a  new  ANSI 
addition  to  C,  is  used  to  define  a  func¬ 
tion  that  returns  no  value.  But  this 
still  didn't  quite  work  for  me  until  I 
remembered  that  a  function  of  type 
other  than  integer  must  also  be  de¬ 
clared  explicitly,  either  in  the  calling 
function  or  as  a  global.  The  last  time  I 
remember  something  being  default 
unless  declared  otherwise  was  in 
FORTRAN, in which variables beginning
with the letters I through N
were  default  integer  and  others  were 
default  real.  As  I  became  more  profi¬ 
cient  in  FORTRAN  (shortly  after  learn¬ 
ing  Pascal),  I  started  defining  all  the 
variables  in  my  FORTRAN  programs 
explicitly  and  ignored  the  defaults. 
This  also  improved  the  clarity  of  my 
variable  names.  Both  Pascal  and  FOR¬ 
TRAN  allow  functions  that  don’t  re¬ 
turn  a  value,  with  the  keywords  Pro¬ 
cedure  and  SUBROUTINE  as  part  of  the 
original  language  definitions.  And 
Pascal,  by  not  having  defaults, 


doesn't  make  programmers  learn 
(and  possibly  forget)  extra  rules  and 
their  exceptions. 

One  more  source  of  possible  errors 
in  C  involves  macro  preprocessing, 
specifically  macros  that  look  like 
functions.  This  (and  other  side  ef¬ 
fects — C  seems  to  have  more  possible 
side  effects  than  any  other  language) 
is  actually  documented  in  compiler 
manuals.  A  sample  statement  looks 
like: 

    c = toupper(getchar());

in which toupper is defined as a macro. When expanded, this
line actually has more than one call to getchar(), resulting
possibly in several characters being read when only one was
wanted. Another aspect of this is that toupper() could be a
macro or a function (both are included with Lattice),
depending on what header files are included. If it's a
function, the preceding code will work just fine. It's not
hard to imagine someone porting that line of code and, on
seeing it, recognizing toupper() as a macro and including a
header that defines the macro toupper() and thus breaking
something that didn't need fixing.
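
[The effect is easy to reproduce. The sketch below is an editorial illustration, not the Lattice header: TOUPPER_MACRO is a hypothetical macro written in the style of older <ctype.h> implementations, and next_char() is a stand-in for getchar() that reads from a string and counts how often it is called.]

```c
#include <stdio.h>

/* Hypothetical macro: its argument appears three times in the
   expansion, so an argument with side effects runs repeatedly. */
#define TOUPPER_MACRO(c) \
    (((c) >= 'a' && (c) <= 'z') ? ((c) - 'a' + 'A') : (c))

/* The same logic as a function: the argument is evaluated once. */
static int toupper_function(int c)
{
    return (c >= 'a' && c <= 'z') ? (c - 'a' + 'A') : c;
}

/* Stand-in for getchar(): reads from a string, counts its calls. */
static const char *source_text;
static int read_calls;

static int next_char(void)
{
    read_calls++;
    return (*source_text != '\0') ? (unsigned char)*source_text++ : EOF;
}

static void start_reading(const char *text)
{
    source_text = text;
    read_calls = 0;
}

int upper_via_macro(const char *text)
{
    start_reading(text);
    return TOUPPER_MACRO(next_char());   /* next_char() runs 3 times */
}

int upper_via_function(const char *text)
{
    start_reading(text);
    return toupper_function(next_char()); /* next_char() runs once */
}

int calls_made(void)
{
    return read_calls;
}
```

Given the input "abc", the macro version consumes all three characters and returns 'C', while the function version consumes one character and returns 'A'. Reading the character into a variable first and passing the variable avoids the problem with either definition.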

I  see  the  biggest  problem  with  C  is 
its  growing  popularity  and  thus  the 
growing  availability  of  C  compilers, 
function  libraries,  and  utilities, 
which  take  efforts  away  from  sup¬ 
port  of  other  structured  languages 
such  as  Pascal  or  even  something  bet¬ 
ter — the  development  of  yet  another 
new  language,  one  that  all  of  us  could 
hope  would  not  have  such  problems. 
Ben  Bradley 
Telecorp  Systems  Inc. 

5825-A  Peachtree  Corners  E 
Norcross,  GA  30092 

(continued  on  page  130) 


8086              286               386
----------------  ----------------  -------------------
MOV  BX,I         MOV  BX,I         MOV  EAX,I
SHL  BX,1         SHL  BX,2
SHL  BX,1
FLD  FOO+12[BX]   FLD  FOO+12[BX]   FLD  FOO+12[EAX*4]
FSQRT             FSQRT             FSQRT

Table 1: Suggested replacements for Ross Nelson's implementation of
SQRT(FOO[I + 3])
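
[The constants in Table 1 follow from simple address arithmetic. Two left shifts by 1 (or one shift by 2, or the 386's *4 scale factor) multiply the index by 4, and the displacement of 12 is 3 elements times 4 bytes. The C sketch below is an editorial illustration that checks the two forms agree; the 4-byte element size is inferred from the table, and the function names are invented for the example.]

```c
#include <stddef.h>

#define ELEMENT_SIZE 4   /* inferred: FOO holds 4-byte reals */

/* Byte offset of FOO[i + 3] from the start of FOO, computed directly. */
size_t offset_direct(size_t i)
{
    return (i + 3) * ELEMENT_SIZE;
}

/* The same offset in the form the table's code uses:
   scale the index by 4, then add a fixed displacement of 12. */
size_t offset_as_in_table(size_t i)
{
    return (i << 2) + 12;
}
```

Folding the "+ 3" into a constant displacement is what lets each column get by with a single load of I.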




FORUM 


VIEWPOINT 


What's Wrong with High-Level Languages?

by Mike Suman

The code accompanying Brian R. Anderson's article "A 68000
Cross Assembler" (DDJ, April and May 1986) provides an
opportunity to criticize higher-order languages in general
and Modula-2 in particular. It is not my intention to
criticize Anderson. I enjoyed his article, and his Modula-2
code constitutes, at the very least, an impressive effort.
But the code itself, precisely consistent with the teachings
of our leading authorities, reveals deficiencies that cannot
be allowed to pass.

The code in Code Example 1 is a fragment from Anderson's
Listing Sixteen. I have left out comments and some details
not relevant to my points. The last seven lines of code are
followed by 116 similar entries, ending with

    INC(i);
    WITH Table68K[i] DO
        Mnemonic := "UNLK";
        Op := {14, 11, 10, 9, 6, 4, 3};
        AddrModeA := ModeA{Ry02};
        AddrModeB := ModeB{};
    END;

There then follows code to write out the array and finally
the line

    END InitOperationCodes.

Mike Suman, 332 Sturtevant Dr., Sierra Madre, CA 91024. Mike
started working with computers in 1949. He has been the
director of research and development at an aerospace firm and
is currently writing a book about computers.

First,  let  me  comment  on  a  minor 
matter  of  style  over  which  some  lead¬ 
ing  programmers  have  poured  major 
words.  In  this  program  fragment,  the 
constants  FIRST  and  LAST  have  been 
clearly  set  out  in  the  beginning.  As 
the  matter  is  taught,  this  is  supposed 
to  make  the  program  clearer,  more 
immediately  evident,  and  easier  to 
modify. 

But the only place in which FIRST and LAST are used is in
the phrase Table68K : ARRAY [FIRST..LAST] OF TableRecord.
This is surely not more immediately evident than saying
Table68K : ARRAY [1..118] OF TableRecord. And no one
can  maintain  that  it  is  really  easier,  or 
safer,  to  change  values  in  an  early 
definition  than  it  is  to  change  them  in 
the  only  place  in  which  they  are  used 
later,  when  it  is  obvious  what  the  ef¬ 
fects  of  the  change  are  going  to  be. 

As  a  general  principle,  common 
sense  and  literary  history  both  argue 


that  it  seldom  simplifies  or  clarifies 
complex  matters  to  introduce  new 
names  for  things  that  are  already 
named, as are numbers, or to separate numerics from the
phrases that reference them. You don't make "2 plus 2
equals 4" clearer by saying "x is 2 and y is 4 and x plus x
equals y."
The  issue  is  small,  but  the  point  is 
large:  We  do  not  tolerate  this  kind  of 
turgid  excess  in  writing;  why  then  do 
we  force  it  in  programming? 

Having  begun  on  a  small  point,  let 
me  move  to  a  larger  one.  Imagine 
that  you  were  handed  a  list  of  grocer¬ 
ies  that  had  been  "carefully  arranged 
for  clarity”: 

    Item        potatoes
    Quantity    3.2 lbs
    Unit cost   $0.65/lb
    Total cost  $2.08
    Item        oranges
    Quantity    5 lbs
    Unit cost   $0.88/lb
(continued  on  page  132) 


MODULE InitOperationCodes;

CONST
    FIRST = 1;
    LAST  = 118;

TYPE
    ModeTypeA = (RegMem3, Ry02, Rx911, . . . , OpM37);
    ModeTypeB = (Bit811, Size67, Size6, EA611);
    ModeA = SET OF ModeTypeA;
    ModeB = SET OF ModeTypeB;
    TableRecord = RECORD
        Mnemonic  : Token;
        Op        : BITSET;
        AddrModeA : ModeA;
        AddrModeB : ModeB;
    END;

VAR
    Table68K : ARRAY [FIRST..LAST] OF TableRecord;
    i : CARDINAL;

BEGIN
    i := 1;
    WITH Table68K[i] DO
        Mnemonic := "ABCD";
        Op := {15, 14, 8};
        AddrModeA := ModeA{Rx911, RegMem3, Ry02};
        AddrModeB := ModeB{};
    END;

    INC(i);
    WITH Table68K[i] DO
        Mnemonic := "ADD";
        Op := {15, 14, 12};
        AddrModeA := ModeA{OpM68D};
        AddrModeB := ModeB{OpEA05y};
    END;

Code Example 1: Fragment of code from Listing Sixteen, DDJ, May 1986




ARTICLES 


Text Editors: In Matters of Taste . . .


by Levi Thomas and Nick Turner


"Editors? You wanna talk editors? How's about something
trivial like life after death, religion, or politics?"—Don
Watkins, sysop on CompuServe's IBM NET.

History 

"The first interactive editor I ever used was Expensive
Typewriter on the PDP-1 at MIT (yeah, the one we played
Spacewar on)."—Dennis Brothers, author of MacTEP and
MicroPhone

Back  in  the  good  old  days,  people  seldom  had 
much  choice  when  it  came  to  text  editors.  (How 
many  of  you  remember  TECO  or  Expensive  Type¬ 
writer?  How  about  keypunches  and  Hollerith  cards?) 
There  wasn’t  a  lot  you  could  do  with  a  simple  keyboard 
and  a  hard-copy  printer.  Editing  was  almost  exclusively 
line-oriented,  and  the  only  way  to  get  an  idea  of  what  the 
text  actually  looked  like  was  to  list  a  section  of  the  file 
explicitly,  usually  by  specifying  a  range  of  line  numbers. 
Frequently  the  editing  commands  were  cryptic  and  hard 
to  learn,  such  as  34LSjl0U20D$$  (an  actual  command 
string  from  an  early  line  editor).  Things  have  certainly 
changed  since  those  days.  CRT  screens  and  mice  have 
been  invented,  and  all  sorts  of  user  interface  discoveries 
have  been  made.  You  might  think  that  by  now  someone 
would  have  invented  an  editor  so  nearly  perfect  that  it 
would  set  a  standard  imitated  by  all  the  rest — but  nope. 

Although the incomprehensible command strings of
yesterday are gone from the new editors, replaced in
most cases by sensible logical structures, the old editors
live on. People still cling to WordStar, with all its
hard-to-memorize control codes, despite the barrage of
attacks from other camps. In fact, programmers who admit to
still using WordStar often get the same kind of reception
as programmers who announce that they prefer BASIC to
Pascal.

Levi Thomas and Nick Turner, 501 Galveston Dr., Redwood
City, CA 94063. Levi and Nick are editors for DDJ.

So,  why  do  so  many  of  the  older,  presumably  outdated, 
editors  still  live?  Well,  the  "if  it's  not  broken,  don't  fix  it” 
philosophy  accounts  for  it  somewhat — some  folks  are 
quite  happy  with  their  “outdated”  editors  and  ignore  the 
"if  it’s  newer  it  must  be  better”  attitude  that  is  so  preva¬ 
lent  in  this  industry.  But  more  often  imprinting  seems  to 
have  a  lot  to  do  with  it. 

Old  Workhorses  and  Baby  Ducks 

"I  still  use  WordStar  in  nondocument  mode  when  editing 
programs.  OK,  so  kill  me." — Ray  Duncan,  DDJ  columnist 

Like  ducklings  that  adopt  the  first  moving  object  they 
see  as  a  mother,  programmers  often  adopt  the  first  editor 
they  learn  as  the  model  of  what  an  editor  is  and  should 
be.  Once  you've  learned  an  editor,  once  it  is  "burned  into 
your  brain,”  it  may  not  be  worth  the  effort  to  learn  anoth¬ 
er  (or,  more  to  the  point,  to  unlearn  the  first  editor)  no 
matter  how  comparatively  easy  it  is  or  how  many  "neat” 
features  it  has.  This  learning  process  is  especially  frustrat¬ 
ing  because  "everything  you  know  is  wrong” — you  are 
accustomed  to  being  in  control  of  the  process,  of  not  hav¬ 
ing  to  think  about  the  steps  that  stand  between  you  and 
what  you  want  done.  When  you  learn  a  different  editor, 
that  transparent  process  becomes  pitifully  opaque.  What 
a  gumption  trap!  You  see  a  similar  occurrence  when  writ¬ 
ers  try  to  learn  to  use  a  word  processor.  Even  though  the 
rewards are enormous, the first few hours or days are
hell. Your concentration is constantly interrupted by the
mundane  mechanics  of  word  processing,  and  you  feel  as 
though  there  is  suddenly  a  great  barrier  between  you  and 
your  work — a  barrier  you  didn't  notice  when  you  used 
the  typewriter.  So  once  you  become  comfortable  with  a 
word  processor  (or  an  editor),  you  will  be  pretty  reluctant 
to  start  over  again. 

Buddy  Can  You  Paradigm? 

There seems to be far more disagreement than agreement
when it comes to text editors. The dissent runs deep,
and it's not just a matter of taste (although that certainly
enters into it as well). Not only are there different ideals
for different tasks, but also individual style has a major
impact. Sometimes it seems there are as many schools of
thought as there are programmers. And programmers do
tend to be adamant about their likes and dislikes in this
matter.

What  Do  We  Agree  On? 

There are some features we all seem to agree are vitally
important in a "good" editor. These are the factors that
apply to all editing situations, on all systems . . . and they
are few.

Speed

"An editor must be fast. If it's not blindingly fast in screen
updates (or at least as fast as possible), I will probably put
my fist through the CRT screen after the 1,793th line of code
at 1 A.M. in the morning."—Darryl Okahata, programmer

No  one  likes  to  wait  for  anything,  especially  when  star¬ 
ing  at  a  computer  screen.  If  an  editor  pauses  for  more 
than  a  fraction  of  a  second  for  any  reason,  programmers 
often  perceive  it  as  a  serious  flaw. 

Today's editors use advanced techniques such as the
Boyer-Moore string search, hashing tables, and RAM
caches to speed up performance. More RAM has also
meant that entire files can often be held in memory,
whereas before they had to be paged or sometimes could
not be edited at all.
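
For readers who have not met the Boyer-Moore idea, the trick is to use the text character under the last position of the pattern to decide how far the pattern can safely jump, so most of the text is never examined. The sketch below is not taken from any editor mentioned in this article. It shows Horspool's simplified variant, which keeps only the "bad character" table of the full Boyer-Moore method, and the function name horspool_search is made up for the example.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first occurrence of pattern in text,
   or -1 if there is none. An empty pattern matches at offset 0. */
long horspool_search(const char *text, const char *pattern)
{
    size_t n = strlen(text);
    size_t m = strlen(pattern);
    size_t shift[UCHAR_MAX + 1];
    size_t i, pos;

    if (m == 0)
        return 0;
    if (m > n)
        return -1;

    /* A character that is not in the pattern allows a full jump of m. */
    for (i = 0; i <= UCHAR_MAX; i++)
        shift[i] = m;
    /* Otherwise jump so its last occurrence (excluding the final
       pattern position) lines up with the text character. */
    for (i = 0; i + 1 < m; i++)
        shift[(unsigned char)pattern[i]] = m - 1 - i;

    pos = 0;
    while (pos <= n - m) {
        /* memcmp compares left to right; the classic version compares
           right to left, but the result is the same. */
        if (memcmp(text + pos, pattern, m) == 0)
            return (long)pos;
        pos += shift[(unsigned char)text[pos + m - 1]];
    }
    return -1;
}
```

Searching "the quick brown fox" for "brown" with this routine returns 10. The longer the pattern, the larger the typical jump, which is why a search for a long string can finish faster than a search for a short one.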

Features 

Many of today's editors are packed with sophisticated
features, some of which are seldom really needed.
Although it is possible for an editor to be so feature-laden
that it becomes cumbersome and difficult to learn,
programmers do agree that an editor must have a basic
set of powerful functions. Just what that set of functions
might include is a source of much debate.

User Interface

"An editor is something you really curl up with. If two
editors have the same features, you're going to go with the
one that 'feels' right. It's like professional musicians are
about their instruments . . . some choose a Gibson
Hummingbird while others may prefer a Martin. Which is
better? It's a matter of personal bias—of taste."—Bob
Wallace, author of PC-Write

An important and highly subjective issue is ease of use.
The best editor is transparent: its use becomes so natural
that you forget it's there. Ease of use is a difficult thing to
measure, but some traits are worth examining. For example,
editors that are "modeless" (that is, the program
rather than the human using it keeps track of the mode)
have generally become accepted as superior to ones that
require humans to keep track. Of course, there is still a lot
of disagreement as to how this is best implemented.

when I used it . . . I'd rather use vi than goose quills, but
it's a close race."

WYSIWYG vs. Straight ASCII

"Why would I need WYSIWYG to edit a program? If it does
anything but edit straight ASCII, it's not for
programmers."—Roy LeBan

One of the most recent developments in text editors has
been the "what you see is what you get" philosophy,
which maintains that what's on the screen should be as
close as possible to the exact appearance of the text when
it's output onto paper. With newer high-resolution
graphics screens, this approach has become increasingly
feasible. WYSIWYG does have its merits, especially when
you're aiming for high-quality hard copy. There is,
however, a price to pay. The processing power and memory
required for WYSIWYG editing frequently result in editors
that are too slow and cumbersome for a large number of
users.

The WYSIWYG approach has been married to what
Steve Jasik, author of MacNosy, refers to as the "point and
grunt interface," that is, the mouse and pointer method.
This has perhaps been the most revolutionary change in
text editing on personal computers, but it's also been one
of the most controversial ones. Not only does the heavily
graphical WYSIWYG style slow down the editor, but also
the mouse itself is seen by some as an unnecessary
encumbrance. These people object to the need to move their

But almost any editor is easy to use once you've mastered
its syntax and commands. Just what kind of learning
curve you are willing to tackle is the big question.
Often the sweat and frustration of learning an editor with
complicated arcane commands is the price of gaining a
high degree of control over the editing process—a price
many programmers are willing to pay.

Roy  LeBan,  a  programmer  at  Ann  Arbor  Softworks, 
says: "It doesn't matter how long it takes to learn an editor,
as long as it does what I want it to once I've learned it."

Dennis  Brothers  says:  “My  all  time  favorite  editor  is 
TECO,  for  emotional  rather  than  rational  reasons.  It’s  an 
absolute  bitch  to  learn  to  use,  but  once  you've  learned  it, 
boy,  can  you  make  it  dance!” 

Dennis  Allison,  cofounder  of  the  People’s  Computer 
Company  (and  DDJ),  says  this  about  his  favorite  editor, 
Emacs:  "It’s  hard  to  remember  that  meta-Control-X  does 
such  and  such,  but  once  you  learn  your  way  around  you 
can  do  all  sorts  of  things.  You  can,  for  instance,  swap 
every  other  word  with  a  single  editing  command.” 

Fortunately,  a  variety  of  editors  available  today  are 
easy  to  use  and  require  a  lot  less  time  and  effort  to  learn 
than  the  earlier  editors.  In  many  cases  there  is  a  trade-off 
in  terms  of  features,  but  often  this  does  not  hinder  the 
editor’s  usefulness  for  most  jobs.  These  editors  are  a  wel¬ 
come  alternative  for  most  programmers,  particularly 
those who take umbrage at complicated, counterintuitive
command strings.

Alex  Pournelle  of  Workman  and  Associates  puts  it  suc¬ 
cinctly:  “I  might  adopt  Unix  if  it  had  an  editor  that  didn’t 
make  me  want  to  throw  the  terminal  through  the  wall 

Wish List: The Ultimate Editor

We  asked  some  programmers  to  list  the  features  they 
would  like  to  see  in  the  ultimate  editor  and  got  many 
good,  if  not  definitive,  responses.  This  list  came  from 
Chris  Dunford.  It's  important  to  realize  that  these  are  (and 
must  be)  personal  opinions.  Value  judgments  are  all  im¬ 
portant  when  evaluating  editors,  and  your  own  experi¬ 
ence  may  be  vastly  different.  Feel  free  to  make  up  a  list  of 
your  own  and  send  us  a  copy — we'd  be  interested  to  hear 
more  from  our  readers  on  this  topic. 

Must  Haves 

•  Line  orientation. 

•  Reconfigurable  keyboard. 

•  Macros  that  must  be  able  to  make  decisions,  not  just 
remember  keystroke  sequences,  and/or  programming 
language.  Must  be  able  to  assign  macros  to  keys. 

•  Line/block-move/copy/delete  (one-dimensional).  Op¬ 
tional  two-dimensional  blocks — both  character  stream 
and  arbitrary-rectangle  blocks. 

•  Global  search/replace,  case-insensitive  option,  with 
wildcards.  Search/replace  should  be  restrictable  to  speci¬ 
fied  portions  of  the  text,  including  column  orientation 
(replace  foo  with  boo  over  lines  n-m  from  columns  x-y). 

•  Shell  to  operating  system. 

•  No  menus,  or  at  least  the  ability  to  run  in  command 
mode  and  bypass  menus.  Command  mode  should  not  just 
be  obscure  keystrokes. 

• Multifile capability with interfile operations (for example,
move text from one file to another and perform operation
x on all files) for at least six files.

•  Tab  stops  user-definable,  and  the  option  to  use  tab  com¬ 
pression  when  writing  to  disk. 

•  Optional  auto-indent  (OK  if  done  via  macro). 

•  Internal  operations  must  be  fast. 

•  Must  have  "go  to  line  n”  or  "go  to  current  line  +  n”  and 
be  able  to  mark  locations  in  the  file  for  gotos. 

•  Some  undo  capability — for  example,  "restore  last  n 
lines  that  were  altered.” 

• Read/write blocks from disk (for example, insert file x
at  cursor  position;  write  marked  block  to  disk  file  y). 
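
The column-restricted search/replace in the list above (replace foo with boo over lines n-m from columns x-y) is easy to ask for but rarely spelled out. The Python sketch below is only an illustration of what is being requested. The function name replace_in_region and its 1-based, inclusive line and column arguments are assumptions made for this example, not the command set of any editor mentioned here.

```python
def replace_in_region(lines, old, new, first_line, last_line, first_col, last_col):
    """Replace old with new, but only inside the given lines and columns.

    Line and column numbers are 1-based and inclusive (an assumption of
    this sketch). Text outside the column range is left untouched.
    """
    result = list(lines)
    for i in range(first_line - 1, min(last_line, len(lines))):
        line = lines[i]
        left = line[:first_col - 1]            # before the column range
        region = line[first_col - 1:last_col]  # the only part we may change
        right = line[last_col:]                # after the column range
        result[i] = left + region.replace(old, new) + right
    return result


text = ["foo foo", "foo foo", "foo foo"]
for line in replace_in_region(text, "foo", "boo", 2, 3, 5, 7):
    print(line)
```

Only the second foo on lines 2 and 3 is changed; line 1 and columns 1 through 4 are left alone.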

Not  Necessary  for  Everybody 

•  Multiwindow  operations  with  configurable  window 
sizes  (should  be  capable  of  both  top-to-bottom  and  side-by- 
side  windows). 

•  Block  left/right  shifts. 

•  Text  overlay  operations. 

That’d  Be  Nice,  but  I  Can  Live  Without  It 

•  Edit  files  larger  than  memory. 

•  Some  optional  word  processing  features:  line  center, 
paragraph  reformatting,  case  conversions,  ability  to  tag 
groups  of  lines. 


hand  frequently  from  the  mouse  to  the  keyboard  and 
back. 

And  then  there  are  those  who  prefer  to  have  it  both 
ways. Dennis Brothers says: "My workhorse editor these
days  is  the  Macintosh  Programmer’s  Workshop  (MPW) 
shell.  It  functions  as  both  an  editor  and  the  command 
language  interface  for  MPW,  with  all  the  advantages  of 
both a Mac-like 'point and click' system and a programmable,
command-oriented editor. You ought to see me
fumbling  for  the  mouse  when  I  have  to  edit  something  on 
a  PC!” 

The  straight  (non-WYSIWYG)  text  editors  seem  to  fall  into 
two  categories.  On  one  hand,  you  have  the  WordStar- 
style  editors,  which  use  lots  of  special  control  characters 
to  add  power  and  speed  but  sacrifice  simplicity  (and  usu¬ 
ally  use  "weird”  file  formats  that  can’t  be  read  by  any 
other  editor  without  some  sort  of  conversion).  On  the 
other  hand,  you  have  straight  ASCII  editors  such  as  XY- 
Write,  PC  Write,  and  Apple’s  MDS  editor,  which  always 
keep  the  text  in  its  purest  form  and  generally  display 
every  single  character  of  the  file  on  the  screen,  including 
control  characters.  This  is  very  useful  when  uploading 
files  to  computer  networks  or  writing  something  that  will 
be  sent  to  a  typesetter.  Such  editors  often  have  a  com¬ 
mand  line  that  is  separate  from  the  text  window  or  a 
menu  bar  that  fills  the  same  function. 

Kibitz  Mode 

"The  so-called  syntax  checking  editors  are  good  for  about 
ten minutes, then you learn that only the people that spec'ed
them  actually  code  that  way." — Don  Watkins 

In  the  last  few  years  a  new  breed  of  editor  has  sprung 
up.  These  new  editors  are  supposed  to  ease  the  job  of 


On-Line Editing

With  the  increase  of  telecomputing  traffic  in  recent 
years,  another  important  editing  issue  has  arisen.  At  first, 
there  was  little  need  for  on-line  text  editing.  Messages 
were  usually  short  and  concise,  mainly  because  on-line 
time  was  expensive.  Now,  though,  electronic  mail  has  be¬ 
come  more  and  more  sophisticated,  and  with  it  has  come 
the  need  for  quality  on-line  editors.  The  problem,  simply 
stated,  is  that  there  is  no  standard  for  any  high-level  on¬ 
line  interface.  The  current  default  for  almost  all  systems 
is  something  called  TTY,  a  relic  from  the  days  when  all 
terminals  were  hard-copy  printers.  Under  the  TTY  inter¬ 
face,  there  is  no  way  to  go  back  to  a  previous  line  and 
change  it.  Once  something  is  printed,  it's  immutable.  So 
any editors have to be similar to the old TECO style. Given
the  TTY  constraints,  an  admirable  amount  of  progress  has 
nonetheless  been  made  in  on-line  editors.  Several  gener¬ 
ally  accepted  standards  have  evolved  for  the  command 
interface,  and  on-line  editors  today  seem  far  more  usable 
than  the  old  TTY  editors. 

The  final  solution  to  the  on-line  problem  really  comes 


software  designers,  chiefly  by  performing  real-time  syn¬ 
tax  checking  of  the  source  file  being  edited.  They  tend  to 
be  very  specialized,  requiring  a  different  version  for  each 
particular dialect of a language. These language-oriented
editors  might  be  great  for  relatively  inexperienced  pro¬ 
grammers,  who  can  often  save  a  great  deal  of  debug  time 
by  doing  a  syntax  check  without  even  leaving  the  editor. 
But  in  our  experience,  expert  programmers  tend  to  pre¬ 
fer  the  simplicity  of  a  straight  ASCII  editor. 

Chris  Dunford,  software  author  and  a  sysop  on  IBM  NET, 
says:  "The  current  smart  editors  are  so  restricting  that  it’s 
like  wearing  a  straightjacket.  I  have  a  friend  who  calls 
them 'Nazi' editors. They tell you what you can and can't
do,  and  that’s  the  wrong  attitude.  A  smart  editor  should 
be  one  that  watches  over  my  shoulder  while  I  use  it  free 
form  (that  is,  just  like  I  would  use  a  dumb  editor)  and 
figures  out  what  I  am  doing  and  does  helpful  things — 
[HAL-like  voice:]  Dave,  I  don’t  see  a  declaration  of  that  vari¬ 
able  you  just  used.  Would  you  like  me  to  declare  it  for 
you? — But  if  I  want  to  do  something  it  thinks  might  be 
wrong, well, by God it's gonna let me do it. Let's face it.
With  the  current  crop  of  smart  editors,  you  have  to  tell 
them exactly what you're about to do. How smart is that?"

Again,  for  programmers  who  prefer  using  one  editor 
for  all  their  work,  whether  it’s  source  code  or  documen¬ 
tation,  this  kind  of  editor  is  not  a  viable  option.  There 
seems  to  be  another  disadvantage  as  well.  Dennis  Allison 
mentions  a  syntax  editor  called  Program  Synthesizer  that 
was  at  one  time  used  at  Cornell  University:  “[It]  would  not 
let  you  write  a  bad  program.  When  the  students  tried  to 
write  a  program  using  Emacs,  they  couldn’t  write  one 
that  worked.”  Of  course,  Allison  also  notes  that  he  likes  a 
couple  of  the  syntax-driven  editors  ...  if  the  syntax- 
check  mode  can  be  turned  off. 

Do  What  I  Say 

An  issue  that  is  especially  pertinent  to  programmers  is 



in  two  forms:  the  first,  and  perhaps  most  obvious,  is  to 
edit  the  information  off-line  and  then  upload  it  to  the  on¬ 
line  system  in  one  pass.  The  other  solution  is  more  techni¬ 
cally  difficult  but  offers  more  power  in  the  long  run.  It  is 
to  install  a  front-end  program  at  the  user  end  that  imple¬ 
ments  a  quality,  full-screen  editing  interface  and  uses  a 
defined  protocol  to  send  data  back  to  the  host.  The  main 
advantage  of  this  over  off-line  editing  is  that  the  host’s  full 
database  can  be  made  available  interactively  during  such 
an  edit. 

Editing  for  Uploading 

Here  the  most  obvious  problem  is  compatibility — because 
you  don’t  know  much  about  the  destination  system,  often 
not  even  what  kind  of  computer  it  will  be,  you  must  usu¬ 
ally  send  a  completely  clean  ASCII  file.  Some  front-end 
packages  for  the  larger  information  services  include  their 
own  off-line  editors;  others  include  a  way  to  read  in  a  text 
file  and  send  it.  In  both  cases  the  text  is  likely  to  be  quite 
clean. 




programmability.  Some  editors  allow  you  to  construct 
macros  that  substitute  a  whole  chain  of  commands  for 
one  simple  command  (or  keystroke).  That’s  the  first  step. 
Then  there  are  editors  that  allow  you  to  create  minipro¬ 
grams,  complete  with  looping  control  structures  and 
if . . . then statements. Beyond that you have editors that
can  be  interfaced  to  custom-designed  program  modules, 
written  in  C,  assembly  language,  or  some  other  language. 
The  ultimate  in  programmability  is  perhaps  best  ex¬ 
pressed  by  Dennis  Allison  at  last  year’s  Hackers  Confer¬ 
ence:  “Just  give  me  the  source  code.” 

Programmers  tend  to  prefer  programmability  some¬ 
what  more  than  typical  users  do,  but  even  among  pro¬ 
grammers,  there's  a  wide  range  of  preferences  about  this 
feature,  too.  Some  go  so  far  as  to  write  their  own  editors, 
whereas  others  prefer  not  to  be  presented  with  so  many 
choices  in  editor  configuration,  preferring  to  learn  a  giv¬ 
en  set  of  commands  and  to  get  on  with  the  job  at  hand. 

The  Ultimate  Editor? 

Many  programmers,  unsatisfied  with  the  state  of  the  art, 
have  created  their  own  answers.  Jef  Raskin  of  Informa¬ 
tion  Appliance  Inc.  has  invented  a  special  interface  called 
the  SwyftCard  (see  DDJ,  June  1986,  for  a  review). 

Bob  Wallace  has  this  to  say  about  the  ultimate  editor: 
“There  are  some  identifiably  different  approaches  that 
people  have  to  the  world  in  general — some  people  are 
visually  oriented;  others  respond  to  things  more  in  terms 
of  touch,  spatial  relationships,  or  sound.  The  thing  about 
text  editors  and  word  processors  is  that  there  is  room  for 
all  these  user  interfaces,  and  there  is  no  reason  why  one 
editor  can’t  be  accessible  from  more  than  one  approach.” 

Summing  It  Up 

When  asked  to  give  his  views  on  text  editors,  former  DDJ 
columnist  Dave  Cortesi  deftly  sidestepped  specifics  and 
instead  gave  us  the  following  parable. 

"IBM  has  this  big  internal  network,  VNET,  that  every¬ 
body  uses  to  swap  messages  and  files.  And  there  is,  or 
was,  a  newsletter,  VNET  News,  distributed  to  hundreds  of 
users  around  VNET  each  month.  About  this  time  there 
was  a  great  proliferation  of  editor  programs  around  IBM; 
everybody  had  their  pet  editor  and  wouldn’t  look  at  any¬ 
body  else’s.  The  NIH  (not  invented  here)  factor  was  fierce. 

"So  I  was  young  and  stupid,  and  I  wrote  a  letter  to  the 
editor  of  VNET  News,  saying  essentially,  'It's  all  malarky. 
We  don’t  need  any  more  editors,  let’s  quit  experimenting 
with  these  trivial  variations  and  use  (the  editor  I  pre¬ 
ferred  at  the  time).’ 

"Nobody  could  agree  on  the  best  editor,  but  there  was 
one  thing  that  got  universal  agreement:  I  had  the  wrong 
attitude,  and  the  one  important  thing  was  to  keep  experi¬ 
menting.  Boy,  did  I  get  rebutted!  From  which  I  learned, 
never  discuss  religion,  sex,  or  editors  with  anybody — es¬ 
pecially  editors.” 



ARTICLES 


6502  Hacks 


by  Mark  S.  Ackerman 


Computers  and  controllers  us¬ 
ing  the  6502  CPU  often  de¬ 
mand  efficient  use  of  both 
processing  time  and  memory  space. 
With  an  address  space  of  only  64K, 
data  and  code  space  efficiency  can  be 
critical.  Moreover,  though  advance¬ 
ments  in  these  6502  systems  have  al¬ 
lowed  the  use  of  high-level  lan¬ 
guages,  there  will  always  be  a  need 
for  fast  subroutines  and  tight 
programs. 

This  article  is  a  result  of  writing 
code  for  an  Atari  VCS  2600  game  unit 
that  had  only  128  bytes  of  RAM  and  8K 
of  ROM.  Because  of  the  limited  graph¬ 
ics  hardware,  all  processing  was  in 
real  time  with  cycles  counted.  Sever¬ 
al  of  the  hacks  and  tips  presented 
here  are  also  transferable  to  other 
microprocessors.  Many  of  the  tips 
are  appropriate  to  any  true  real-time 
system — that  is,  any  system  where 
real-time  is  measured  in  very  small 
fractions  of  a  second.  My  6502  experi¬ 
ence  has  been  helpful  in  finding 
ways  to  crunch  memory  require¬ 
ments  and  timing  on  the  Intel  8086 
and  80286. 

In  short,  this  article  is  a  collection 
of  my  favorite  hacks  for  the  6502.  A 
fair  amount  of  6502  code  is  included, 
so  the  next  section  gives  a  brief  intro¬ 
duction  to  the  6502.  If  you've  pro¬ 
grammed  the  6502  extensively, 
you’ve  probably  been  forced  to 
memorize  the  instruction  set  in  hex, 
so  perhaps  you  should  skip  ahead. 

The  6502 

The  6502  has  only  five  registers.  The 
instruction pointer (IP) points to the


Mark  S.  Ackerman,  24  Chatham  St.,  Cam¬ 
bridge,  MA  02139.  Mark  is  a  computer  re¬ 
searcher  at  the  Massachusetts  Institute  of 
Technology,  specializing  in  graphics 
environments. 




next instruction, as on most computers,
and is not directly settable by the
user.  The  accumulator  (A)  is  the  main 
register,  and  it  is  the  only  register 
with  full  arithmetic-logic  unit  (ALU) 
functionality.  The  two  index  regis¬ 
ters,  X  and  Y,  are  used  for  indexed 
addressing.  The  stack  pointer  (S) 
points  to  the  top  of  the  stack. 

The 6502 also has a nonregular instruction
set. For example, the S register can be read
or set only by way of the X register,
using TSX (transfer S register to X
register) or TXS (yep, transfer X to S).
Moreover, if you want to transfer
a value between the X and Y registers,
you must transfer through the
accumulator or through memory.
For example, a typical sequence
might be:

TXA  ;tfr  X  to  accumulator 

TAY  ;tfr  accum  to  Y 

The  6502  also  has  several  flags.  The 
important  ones  are 

•  the  negative  flag  (or  minus  flag), 
which  is  set  when  loading  or  doing 
ALU  functions 

•  the  zero  flag,  which  is  set  in  similar 
situations 

•  the  carry  flag,  which  is  set  in  arith¬ 
metic  operations 

•  the  overflow  flag,  which  is  also  set 
in  arithmetic  operations 

There  are  also  flags  for  enabling  in¬ 
terrupts  and  for  decimal  mode. 


Basic  Hacks 

One  of  the  simplest  ways  to  save  code 
space  is  at  initialization  time.  The  fol¬ 
lowing  code,  for  example,  might  be 
used  to  initialize  a  few  variables: 

        LDA #0          ;load accumulator with 0
        STA FIRST       ;store accum in FIRST
        LDA #2
        STA SECOND
        LDA #1
        STA THIRD
        LDA #0
        STA FOURTH
        LDA #5
        STA FIFTH

This  requires  20  bytes  (2  per  instruc¬ 
tion)  and  a  minimum  of  25  cycles  (2 
per  load  immediate  and  3  per  store). 
The  following  would  be  cheaper: 

        LDX #0          ;X register = 0
        STX FIRST       ;store X in FIRST
        STX FOURTH
        INX             ;inc X by 1 (X now = 1)
        STX THIRD
        INX             ;X now = 2
        STX SECOND
        LDX #5          ;X now = 5
        STX FIFTH

This  takes  two  cycles  and  4  bytes  less; 
you  save  1  byte  each  per  INX  instruc¬ 
tion.  An  even  cheaper  result  can  be 
obtained  by: 


        LDA #5          ;accumulator = 5
        STA FIFTH       ;store accum in FIFTH
        LSR A           ;shift right (acc now = 2)
        STA SECOND
        LSR A           ;acc now = 1
        STA THIRD
        LSR A           ;acc now = 0
        STA FIRST
        STA FOURTH



This  last  example  uses  the  same  num¬ 
ber  of  cycles,  23,  but  costs  only  15 
bytes — a  reduction  of  25  percent  over 
the  first  example.  You  might  object 
that  this  is  not  "clean  code.”  Without 
adequate  documentation,  it  may  be 
less  than  clear,  but  it  does  save  bytes 
and  cycles. 
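
The byte and cycle totals for the three initialization sequences are easy to check mechanically. The Python tally below is a sketch added for illustration, not part of the original listings. It assumes that FIRST through FIFTH are all zero-page variables and uses the standard 6502 sizes and timings for the four instruction forms involved; the COST table and the tally function are names invented for this example.

```python
# (bytes, cycles) for the instruction forms used in the three sequences.
# Assumption: every variable lives in zero page.
COST = {
    "LD#": (2, 2),    # LDA or LDX immediate
    "STzp": (2, 3),   # STA or STX to zero page
    "INX": (1, 2),    # implied addressing
    "LSR": (1, 2),    # LSR A (accumulator addressing)
}


def tally(sequence):
    """Return total (bytes, cycles) for a list of COST keys."""
    total_bytes = sum(COST[op][0] for op in sequence)
    total_cycles = sum(COST[op][1] for op in sequence)
    return total_bytes, total_cycles


load_store = ["LD#", "STzp"] * 5
inx_version = ["LD#", "STzp", "STzp", "INX", "STzp",
               "INX", "STzp", "LD#", "STzp"]
lsr_version = ["LD#", "STzp", "LSR", "STzp", "LSR",
               "STzp", "LSR", "STzp", "STzp"]

for name, seq in [("load/store", load_store),
                  ("INX", inx_version),
                  ("LSR", lsr_version)]:
    print(name, tally(seq))
```

Under those assumptions the tally comes to 20 bytes and 25 cycles, 16 bytes and 23 cycles, and 15 bytes and 23 cycles.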

This  example  also  demonstrates  a 
reduction  principle:  general  reduc¬ 
tion  algorithms,  such  as  using  the  in¬ 
dex  registers  to  increment  instead  of 
loading  the  accumulator  with  imme¬ 
diate  values,  can  produce  significant 
savings.  The  best  savings,  however, 
require  a  sharp  eye  for  special 
situations. 

Incidentally,  if  you  use  a  loop  dur¬ 
ing  initialization,  remember  that  the 
counter register contains -1 or 0 at
the  end  of  the  loop.  You  can  use  this 
by-product  for  further  savings: 


        LDA #$C0
        LDX #7
LP      STA TABLE2,X    ;put acc at TABLE2 plus offset in X
        DEX             ;decrement X
        BPL LP          ;loop while X is not negative (i.e., >= 0)
        INX             ;at end of loop X = -1, so increment X
        STX ZERO        ;store X (= 0) in ZERO
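
To see why the counter is left at -1, it helps to step through the loop with an 8-bit register. The following Python model is a rough sketch for illustration only. The function name countdown_fill is invented, and the list table2 stands in for the eight bytes at TABLE2.

```python
def countdown_fill(value=0xC0, count=8):
    """Model LDX #count-1 / STA TABLE2,X / DEX / BPL on an 8-bit X."""
    table2 = [None] * count
    x = count - 1                 # LDX #7
    while True:
        table2[x] = value         # STA TABLE2,X
        x = (x - 1) & 0xFF        # DEX, wrapping from 0 to $FF
        if x & 0x80:              # BPL falls through once bit 7 is set
            break
    x_after_loop = x              # $FF, that is, -1
    x = (x + 1) & 0xFF            # INX
    zero = x                      # STX ZERO
    return table2, x_after_loop, zero


table2, x_after_loop, zero = countdown_fill()
print(hex(x_after_loop), zero)
```

The model shows X at $FF when the loop exits and 0 after the INX, which is the by-product the listing stores in ZERO.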

Zero-Page  Savings 

The  6502's  first  256  bytes  of  memory, 
zero  page,  have  a  unique  property. 
Reads  from  and  writes  to  zero  page, 
including  indexed  I/O  using  only  the 
X  register,  save  a  cycle  and  a  byte. 
Frequently  used  variables,  or  memo¬ 
ry  registers,  should  be  kept  in  this 
portion  of  RAM. 

It  is  critical  when  addressing  this 
memory  to  use  the  X  register.  Using 
the  Y  register  for  indexed  addressing 
such  as  this: 

LDA  ZEROPAGE,Y 

is actually an absolute addressed instruction;
that is, the 6502 ignores
whether the ZEROPAGE location is in
zero page or not. The Y-indexed instruction
needs an additional byte for the page
address, and it takes an additional cycle
for a store or when the index crosses a page boundary.

Using  Left-Over  Registers 

Use  all  the  registers.  The  index  regis¬ 
ters  should  be  used  for  intermediate 
results.  The  following  nonsense  ex¬ 


ample  assumes  that  the  stack  is  at 
$FF: 


        LDX LOOP_COUNT
LOOP    TXS             ;store loop ct, freeing X
        LDA WHATEVER
        LSR A
        TAX             ;store acc/2
        LSR A
        LSR A
        AND #07         ;get low 3 bits
        TAY             ;get offset of TABLE
        TXA             ;restore acc
        LDX TABLE,Y     ;new index for X
        ORA #$80        ;OR $80 to low 3 bits
        STA STORE,X     ;put at STORE plus offset from TABLE+Y
        TSX             ;restore loop counter
        DEX             ;dec loop counter
        BPL LOOP        ;loop if counter >= 0
        TXS             ;restore stack ptr (X = $FF here)

Note  that  the  TXS  instruction  can  be 
used  only  if  interrupts  have  been  dis¬ 
abled.  A  temporary  zero-page  vari¬ 
able  could  have  been  used  to  replace 
the  TXS  and  TSX  at  a  cost  of  two  cycles 
per  loop  execution  and  1  byte. 

Stack-Related  Savings 

It  is  often  cheaper  to  place  values 
onto  the  stack  than  to  store  them  to 
temporary  variables.  A  push  (PHA) 
and pull (PLA) take seven cycles and 2
bytes.  A  store  to  zero-page  memory 
with  a  following  load  takes  six  cycles 
and  4  bytes;  a  store  to  other  memory 
takes  eight  cycles  and  6  bytes. 

When  doing  a  substantial  amount 
of  I/O  to  temporary  variables,  it  may 
make  sense  to  actually  reposition  the 
stack  pointer.  This  works  only  with 
page  1  variables. 

Flags and the BIT Instruction

Careful  use  of  bit  flags  can  also  save 
bytes  and  cycles  substantially.  The 
BIT  instruction  does  a  nondestructive 
test  of  a  byte  in  memory.  Bit  7  (the 
high-order  bit)  of  the  byte  is  placed  in 
the  negative  flag,  and  bit  6  is  placed  in 
the  overflow  flag.  No  registers  are  af¬ 
fected. 
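
The flag behavior just described can be written down as a small model. This Python function is a sketch for illustration and the name bit_test is invented. It also reports the zero flag, which BIT sets from the AND of the accumulator and the memory byte; that detail is standard 6502 behavior rather than something stated above.

```python
def bit_test(acc, mem):
    """Model the flags produced by BIT: returns (negative, overflow, zero).

    Neither the accumulator nor the memory byte is changed.
    """
    negative = bool(mem & 0x80)       # bit 7 of the memory byte
    overflow = bool(mem & 0x40)       # bit 6 of the memory byte
    zero = (acc & mem & 0xFF) == 0    # Z reflects A AND memory
    return negative, overflow, zero


# Two Booleans packed into the top of one byte, tested without a load.
flags = 0b1000_0000                   # bit 7 set, bit 6 clear
negative, overflow, _ = bit_test(0x00, flags)
print(negative, overflow)
```

A BMI or BVS after the BIT then branches on either Boolean without disturbing any register.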

For  this  reason,  if  RAM  is  limited, 
the  two  high-order  bits  of  a  byte  are 
extremely  valuable  for  Booleans.  In 
fact, bit 7 is valuable because it also
sets  the  negative  flag  upon  loading: 

        LDA WORD        ;load acc with WORD
        BPL NOT_SET     ;go if + (bit 7 = 0)
IS_SET  AND #$0F
        TAX

In  addition,  the  carry  flag,  which  is 
set  or  cleared  in  shift  operations,  can 
be  used  to  store  a  flag  value  tempo¬ 
rarily.  In  this  case,  bits  7  and  0  are  the 
most  valuable. 

If  RAM  is  not  limited,  then  a  single 
Boolean in bit 0 allows the use of the
zero  flag  upon  loading.  The  BIT  in¬ 
struction  still  cannot  be  used  profit¬ 
ably  unless  the  Boolean  is  in  bit  7  or  6, 
however. 

More  Advanced  Hacking 

There  are  two  ways  to  depend  on 
preexisting  conditions.  The  first  is  to 
assume  that  the  carry  bit  is  either  set 
or  not  set  as  needed.  In  the  6502,  the 
carry  bit  signifies  just  that:  a  bit  is  be¬ 
ing  set  to  indicate  the  carry.  To  add 
two  16-bit  numbers,  then: 

        CLC
        LDA FIRST_VAR_FIRST_8
        ADC SECOND_VAR_FIRST_8
        LDA FIRST_VAR_SECOND_8   ;LDA leaves the carry alone
        ADC SECOND_VAR_SECOND_8  ;carry from first add is included

If  you  know  the  condition  of  the  car¬ 
ry,  such  as  in  the  following  sequence 
of  instructions: 


        LDA FIRST
        CMP #$18        ;comp acc to $18
        BCS BRANCH1     ;go if acc >= $18
        LDA VAR1
        ADC VAR2

then a CLC can be omitted. Why? Because
the branch was on a carry-set
condition, the only way into the addition
is with the carry clear.
In a similar manner, you can assume
that there is no carry from a previous
addition. For example, if VAR3 never
exceeds 16 and VAR4 never exceeds 5,
then the carry will never be set. So
the sequence:


        CLC
        LDA VAR3
        ADC VAR4
        STA TEMP1
        LDA NEW1
        ADC NEW2

can  be  used.  The  CLC  for  the  second 
addition  can  be  omitted  because  the 
carry  will  not  be  set.  This,  however, 
reduces  the  robustness  of  the  code. 
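
Both carry assumptions can be checked exhaustively in a few lines. The Python model below is a sketch for illustration, with invented function names. It treats ADC as a binary-mode 8-bit add with carry in and out, and CMP as setting the carry when the accumulator is greater than or equal to the operand.

```python
def adc(acc, operand, carry):
    """8-bit add with carry (binary mode): returns (result, carry_out)."""
    total = acc + operand + carry
    return total & 0xFF, total >> 8


def cmp_carry(acc, operand):
    """CMP sets the carry when acc >= operand (unsigned compare)."""
    return 1 if acc >= operand else 0


def add16(a, b):
    """16-bit add built from two ADCs, with one CLC at the start."""
    low, carry = adc(a & 0xFF, b & 0xFF, 0)      # CLC, then low bytes
    high, carry = adc(a >> 8, b >> 8, carry)     # carry flows into high bytes
    return (high << 8) | low, carry


# Falling through BCS after CMP #$18 means acc < $18, so carry is clear.
falls_through_clear = all(cmp_carry(a, 0x18) == 0 for a in range(0x18))

# With VAR3 <= 16 and VAR4 <= 5 the first addition can never carry.
small_sum_never_carries = not any(
    adc(v3, v4, 0)[1] for v3 in range(17) for v4 in range(6))

print(add16(0x12FF, 0x0001), falls_through_clear, small_sum_never_carries)
```

The check holds only as long as the stated limits on VAR3 and VAR4 hold, which is exactly the robustness cost mentioned above.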



Removing  clear  carries  or  set  car¬ 
ries — used  for  subtractions — can 
save  many  bytes.  You  may  need  to 
use  some  ingenuity.  If  the  carry  is  set, 
for  example: 

        LDA TEMP
        SBC #$FF        ;subtract -1

you may want to add 1 to TEMP by
subtracting -1.
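
The subtract-minus-one trick depends on how SBC uses the carry as an inverted borrow. The Python model below is a sketch for illustration; the function name sbc is invented and only binary mode is modeled. It confirms that with the carry already set, subtracting $FF gives the same result and the same carry as adding 1 with the carry clear.

```python
def sbc(acc, operand, carry):
    """8-bit subtract with borrow (binary mode): A - M - (1 - C).

    Returns (result, carry_out); carry_out is 1 when no borrow occurred.
    """
    total = acc - operand - (1 - carry)
    return total & 0xFF, 1 if total >= 0 else 0


def add_one(acc):
    """What CLC / ADC #1 would produce: (result, carry_out)."""
    total = acc + 1
    return total & 0xFF, total >> 8


same_for_all = all(sbc(a, 0xFF, 1) == add_one(a) for a in range(256))
print(same_for_all)
```

Because the carry is known to be set on entry, the SBC needs no SEC and the equivalent ADC would have needed a CLC, so a byte and two cycles are saved.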

Cheaper  Branching 

The  second  way  to  use  preexisting 
conditions  is  with  branching.  If  you 
know  that  a  flag  will  be  in  a  certain 
condition,  the  appropriate  branch  in¬ 
struction  can  be  used  for  an  uncondi¬ 
tional  jump.  This  will  save  a  byte  but 
no  cycles  in  the  6502;  the  uncondi¬ 
tional  JMP  instruction  takes  3  bytes 
whereas  a  conditional  branch  takes  2 
bytes.  Both  take  three  cycles  when 
the  branch  is  made.  (A  conditional 
branch in which no branch is executed
takes only two cycles.) For example,
if the carry is known to be set (perhaps from a
shift instruction as below or from a
subtraction), then the code:

        BCS JUMP_LOC
NEXT_LOC

forces  an  unconditional  branch  for 
the  savings  of  a  byte  over  a  JMP. 

Interestingly, the instruction sequence
for a Boolean in bit 0:

        LDA YOUR_FLAG
        AND #1              ;get bit 0
        BNE TRUE_SETTING    ;branch on 1
FALSE_SETTING

can  be  replaced  by: 

        LDA YOUR_FLAG
        LSR A               ;shift bit 0 into carry
        BCS TRUE_SETTING    ;go if set
FALSE_SETTING

at  a  savings  of  a  byte.  This  is  especial¬ 
ly  useful  for  testing  several  bit  flags  in 
a  single  byte.  It  also  nicely  sets  the 
carry bit for unconditional branching
for both branches of an if . . . else
structure. 


        LDA YOUR_FLAG
        LSR A
        BCS TRUE_SETTING
FALSE_SETTING               ;carry is clear
        some code
        BCC END_IF
TRUE_SETTING
        some code
END_IF


Table-Driven  Code 

You  can  make  very  large  savings  if 
you  can  replace  code  with  preset 
data  tables.  Instead  of  attempting  to 
compute  divide-by-17s  or  sines,  for 
example,  it  may  be  possible  to  have  a 
table  of  the  results  for  all  expected 
values.  For  example,  instead  of  com¬ 
puting  MOD7,  if  the  variable  will  nev¬ 
er  exceed  32,  it  will  be  far,  far  cheap¬ 
er  to  have  a  table: 


MOD7    DS 0,1,2,3,4,5,6
        DS 0,1,2,3 . . . etc.

Because  many  of  these  tables  can  be 
compressed  or  merged  with  other  ta¬ 
bles  (as  discussed  later),  the  cost  in 
bytes  is  reasonable.  This  method  is 
certainly  faster. 
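
A table like MOD7 is usually generated by a small program ahead of assembly rather than typed in by hand. The Python sketch below is only an illustration of that idea. The DS directive is copied from the listing above, while the helper name ds_lines and the choice of seven values per line are assumptions.

```python
# One entry for each value the variable can take: 0 through 32 inclusive.
MOD7 = [n % 7 for n in range(33)]


def mod7(value):
    """Table lookup standing in for LDA MOD7,X on the 6502."""
    return MOD7[value]


def ds_lines(label, values, per_line=7):
    """Format the table as assembler source, one DS line per chunk."""
    lines = []
    for start in range(0, len(values), per_line):
        chunk = ",".join(str(v) for v in values[start:start + per_line])
        prefix = label if start == 0 else ""
        lines.append(f"{prefix:<8}DS {chunk}")
    return lines


print("\n".join(ds_lines("MOD7", MOD7)))
```

The table costs 33 bytes, and each lookup is a single indexed load instead of a division loop.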

In  a  similar  manner,  decision  ta¬ 
bles,  game-play  paths,  or  timing  deci¬ 
sions  can  often  be  decided  prior  to 
compilation  rather  than  during  exe¬ 
cution.  In  games,  it  is  often  best  to 
store  the  delta  xs  and  delta  ys  instead 
of  trying  to  compute  sine  wave  pat¬ 
terns  on  the  fly. 

In many cases, you can use tables to
speed  up  operations  that  are  repeat¬ 
ed  often.  If,  for  example,  it  is  neces¬ 
sary  to  increment  only  the  bottom 
nybble  of  a  word,  a  normal  addition 
cannot  be  used  because  the  carry  will 
ruin  the  top  nybble.  You  could  write: 


        LDA WORD
        AND #$F0        ;get high nybble
        STA TEMP        ;store temporarily
        LDA WORD
        AND #$0F        ;get the low nybble
        CLC             ;assume worst: clr carry
        ADC #1          ;add 1
        AND #$0F        ;watch for wrap
        ORA TEMP        ;OR in high nybble
        STA WORD        ;store back out


This costs 25 cycles and 19 bytes. If
you  had  a  table: 

NEXTINC DS 1,1,1,1,1,1,1,1
        DS 1,1,1,1,1,1,1,-15


the cost could be reduced to 19 cycles and 13 bytes:


        LDA WORD
        AND #$0F        ;get current low nybble
        TAY             ;index into NEXTINC
        LDA WORD
        CLC             ;might not be needed
        ADC NEXTINC,Y   ;add, indexed by current value
        STA WORD

If  this  calculation  were  done  in 
many  locations  in  the  program,  the  ta¬ 
ble  would  quickly  become  much 
cheaper  than  the  calculation.  Inciden¬ 
tally,  the  add  instruction  could  be  re¬ 
placed  with  a  subtract  (or  even  a  logi¬ 
cal  OR)  instruction.  Your  choice  of 
which  instruction  to  use  might  de¬ 
pend  on  what  table  you  had  lying 
around! 
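
It is worth convincing yourself that the NEXTINC table really does match the mask-and-merge version for every possible byte. The Python check below is a sketch for illustration with invented function names. It models the 8-bit add by masking to one byte, so the table entry of -15 behaves like the $F1 the assembler would store.

```python
NEXTINC = [1] * 15 + [-15]        # 16 entries, indexed by the low nybble


def inc_low_nybble_table(word):
    """Table version: add the entry selected by the current low nybble."""
    y = word & 0x0F                       # AND #$0F / TAY
    return (word + NEXTINC[y]) & 0xFF     # CLC / ADC NEXTINC,Y


def inc_low_nybble_masked(word):
    """Longhand version: split the nybbles, add 1, mask, and merge."""
    high = word & 0xF0
    low = ((word & 0x0F) + 1) & 0x0F
    return high | low


mismatches = [w for w in range(256)
              if inc_low_nybble_table(w) != inc_low_nybble_masked(w)]
print(len(mismatches), hex(inc_low_nybble_table(0x3F)))
```

The two versions agree for all 256 values; for example, $3F becomes $30 with the top nybble untouched.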

Unrolling  Loops 

One  large  trade-off  between  time 
and  space  is  in  unrolling  loops.  The 
loop: 


        LDA #1          ;outside the loop
        LDX #4
LOOP    STA LOC,X
        DEX
        BPL LOOP

can  be  changed  to: 

        LDA #1
        STA LOC
        STA LOC+1
        STA LOC+2
        STA LOC+3
        STA LOC+4

This  is  more  costly  in  terms  of  bytes 
(12  bytes  vs.  9  bytes),  but  it  is  far  faster 
(17 cycles vs. 48 cycles). (The loop overhead
takes 4*(2+3) + 1*(2+2) cycles.) It is often
surprising how much time can be
saved  by  unrolling  simple  loops. 
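
The trade-off can be worked out for any loop length with a little arithmetic. The Python sketch below is for illustration only. It assumes LOC is in zero page, that the branch does not cross a page boundary, and that the loop stores the same value to n consecutive bytes; the function names are invented.

```python
def looped_cost(n):
    """(bytes, cycles) for LDA #1 / LDX #n-1 / STA LOC,X / DEX / BPL."""
    size = 2 + 2 + 2 + 1 + 2                   # fixed, whatever n is
    setup = 2 + 2                              # LDA #1, LDX #n-1
    stores = n * 4                             # STA zero page,X
    overhead = (n - 1) * (2 + 3) + (2 + 2)     # DEX + BPL taken, last not taken
    return size, setup + stores + overhead


def unrolled_cost(n):
    """(bytes, cycles) for LDA #1 followed by n zero-page STAs."""
    return 2 + n * 2, 2 + n * 3


for n in (2, 5, 16):
    print(n, looped_cost(n), unrolled_cost(n))
```

For the five-byte case this gives 9 bytes and 48 cycles looped against 12 bytes and 17 cycles unrolled. The byte cost of unrolling grows with n while the loop stays at 9 bytes, so where to stop unrolling depends on which resource is scarcer.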

It  is  also  possible  to  combine  loops, 
even  of  different  sizes,  saving  the 
costly  loop  overhead: 

        LDA PICKUP+7    ;move PICKUP's contents to TABLE
        STA TABLE+7
        LDA PICKUP+6
        STA TABLE+6
        LDX #5
        LDY #$80
LOOP2   LDA PICKUP,X    ;continue the move with loop
        STA TABLE,X




        STY TAB2,X      ;set TAB2 to $80
        DEX
        BPL LOOP2

Chaining  Subroutines 

One  of  the  simplest  ways  to  save 
bytes  is  to  use  subroutines  for  com¬ 
mon  code.  This  requires  the  time  cost 
of  the  JSR  (six  cycles)  and  the  RTS  (six 
cycles),  however. 

One  way  to  save  in  subroutines  is  to 
create  multiple-entry  subroutines. 
That  is,  if  two  subroutines  share  a 
common  ending,  do  not  put  that  com¬ 
mon  ending  in  another  subroutine. 
Instead,  create  a  single  subroutine.  In 
a  "pure”  multiple-entry  subroutine, 
you  fall  through  all  sections  of  code 
until  the  return  statement.  You  can 
also  jump  around  the  noncommon 
code: 

FIRST_ENTRY
        first section of code
        JMP END_SUB     ;also try BCC or the like
SECOND_ENTRY
        second section of code
END_SUB
        final processing
        RTS

The  calling  routines  can  call  either 
FIRST_ENTRY or SECOND_ENTRY with
their  different  processing.  Both  sub¬ 
routine  sections  will  exit  from  the 
same  RTS,  however. 

Finding  the  Extra 
Microprocessor  Cycle 

Sometimes, in coding for real-time processing, you may need to kill an
extra cycle or two. The 6502 has a two-cycle NOP (no operation)
instruction. What about a single cycle? The 6502 has no single-cycle
NOP.

Of course, sometimes a single cycle isn't needed, only an odd number
of cycles. You can get seven cycles by a combination of push and pull
stack operations; five cycles can be bought by doing a NOP and a load
from zero-page memory.

Single cycles can be gained only through other operations. One such
operation is the absolute addressed load using an index register.
Normally this instruction (LDA ADDRESS,X or LDA ADDRESS,Y) takes four
cycles. If there is a page boundary crossing (say that ADDRESS is at
C0F0 and the X register is 18), however, then the instruction takes
five cycles. To gain the extra cycle, the table can be placed to force
a page boundary crossing.
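The page-crossing rule is easy to model. This Python sketch assumes the usual 6502 timing for an indexed absolute load (four cycles, plus one when the effective address falls in a different 256-byte page from the base address) and reads the example values C0F0 and 18 as hexadecimal; the function name is hypothetical.

```python
def lda_abs_indexed_cycles(base, index):
    """Cycles for LDA base,X or LDA base,Y: 4, plus 1 on a page crossing."""
    effective = (base + index) & 0xFFFF
    crosses_page = (base & 0xFF00) != (effective & 0xFF00)
    return 5 if crosses_page else 4


print(lda_abs_indexed_cycles(0xC0F0, 0x18))  # 5: $C0F0 + $18 = $C108
print(lda_abs_indexed_cycles(0xC000, 0x18))  # 4: stays within page $C0
```

Placing a table so that every index in use crosses the boundary makes the extra cycle dependable rather than data-dependent.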

Occasionally you can use hardware memory mapping to the same effect.
In the Atari VCS, for example, page 1 memory and page 0 memory were
mapped together. Therefore, if ZEROADD were at 0065, it could also be
found at 0165. A zero-page fetch costs three cycles, but the same
fetch from page 1 memory costs four cycles.

It is also possible to branch to the next instruction depending on a
flag. This kills either two cycles (for a nonexecuted branch) or three
cycles (for a branch taken). This is occasionally useful for
synchronizing if ... then ... else code.

Savings  by  Stepping  Back 

Perhaps the greatest savings can be had by proper planning. Putting
two flags in bits 7 and 6 or in bits 0 and 1 makes more sense than
putting them in bits 3 and 5. Often it is necessary to make simple
changes in the midst of programming in order to crunch code. This type
of planning comes with experience.

Stepping back from the actual programming always helps. For example,
you may have converted one data structure to another through easily
understandable transformations, perhaps with intermediate data
structures. If you are used to high-level languages in which this is
encouraged, your assembler code will reflect it. Unfortunately, this
type of elegance often turns out to be costly. The type of elegance
that will benefit you in crunching will be the elegance of simple
algorithms (almost always single-pass algorithms) that do not require
special cases. Special cases cost.

Another type of planning that is often helpful is determining when
program actions will occur. It may not be necessary to have all
program functionality present at the same time. In a game, where time
is critical, the x,y positions, for example, do not need to be updated
every 1/60 second because the human eye does not demand that. In
business software, where space is more critical, you can overlay code.

Is  It  Really  Necessary? 

In  addition,  there  is  always  a  time  to 
ask  yourself,  is  that  feature  really 
necessary? 

In a time-sharing scheme I was working on, for example, it was
necessary to determine three-way time sharing. The ideal would have
been to have a scheduling sequence such as:

123 123 123 123

and so on, with each of the three actions (placing a graphics sprite
on the display) getting one-third of the time. This order was needed
rarely, however, the usual case being either two-way time sharing or
no time sharing. Divides-by-3 are extremely expensive on the 6502, so
I came up with this alternative:

1231 1232 1233 123x

where x was a skipped slice. It turned out that the result was not
significantly distinguishable, which saved many bytes and much time.
(This could be implemented by ANDing a timer with $07 and using a
lookup from a table.) This approach transformed a difficult
computation into a divide-by-4 problem.
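A table lookup of this kind might look like the following Python sketch. The table layout and the mask are assumptions: the sketch stores the whole 16-slot pattern shown above and therefore masks the timer with $0F, whereas the text mentions a mask of $07, which would suit a more compact table. The names SCHEDULE and sprite_for_tick are hypothetical.

```python
# The 16-slot pattern 1231 1232 1233 123x; 0 stands for the skipped slice.
SCHEDULE = [1, 2, 3, 1,
            1, 2, 3, 2,
            1, 2, 3, 3,
            1, 2, 3, 0]


def sprite_for_tick(timer):
    """Choose the sprite for this time slice with a mask and a table lookup."""
    return SCHEDULE[timer & 0x0F]


# Count how many of 256 consecutive ticks each sprite receives.
counts = {s: sum(1 for t in range(256) if sprite_for_tick(t) == s)
          for s in (0, 1, 2, 3)}
print(counts)  # {0: 16, 1: 80, 2: 80, 3: 80}
```

Each sprite gets 5 of every 16 slices instead of the ideal 5 1/3, and no division is needed.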

All of this is not to say that design can be separated from coding.
The inspired moment of coding often finds the necessary time and bytes
when all the planning weeks (or even months) ago failed. It takes
patience and skill to notice that TABLE2 can be created by taking the
TABLE1 entry, exclusive ORing $7F, and adding 5. It takes the same
patience and skill to rip out a section of code and rearrange it to
get the same overt behavior.

Killer  Hacks 

When  all  else  fails,  there  are  always  a 
few  tricks  left.  The  following  hacks 
are  not  for  novices,  though.  These 
methods  squeeze  the  final  cycles  and 
bytes  from  your  program.  They 
make  debugging  your  code  nearly 
impossible,  and  you  might  as  well 
forget  about  maintaining  the  code 
later. 

Please  don  your  safety  goggles. 

Chaining  Branches 

One ugly way to reduce the number of bytes of code is to chain
branches. As mentioned earlier, the 6502 uses 2 bytes for a relative
branch instruction and 3 bytes for an absolute jump instruction.
Unfortunately, the relative branch instruction can address a space of
only 127 bytes forward or backward. Therefore, the unconditional JMP
instruction is often used, even for implementing if ... then ... else
structures. It is possible to convert these JMPs to conditional
branches, saving bytes.

If a condition is known, such as the carry bit being set or the
overflow flag being clear, it is possible to branch to another branch.
By chaining branches, it is possible to have conditional branching of
more than 127 bytes distance. This is recommended only when it is
important to conserve space at the cost of execution time (two
branches have to be executed) and maintainability.

Self-Modifying  Code 

Self-modifying code can be used profitably in critically real-time
routines when literally every cycle counts. Instead of loading a loop
counter or some such from a zero-page variable, you can just change
the LDX immediate instruction on the fly. This saves a cycle.

In addition, instead of performing a load or store indirect indexed,
which uses 2 bytes of zero-page RAM:

        LDA (INDIRECT),Y

you can modify the destination address on the fly and perform an
indexed absolute load or store to save a cycle:

        LDA ADDRESS,Y

This, of course, will have you barred permanently from any MIS
employment for the rest of your life. Actually, I could not use this
on the VCS because of the lack of RAM, so I am not familiar with its
difficulties. I have been told, however, that with adequate
documentation, this sort of treachery can be maintained.

Use of the BRK Interrupt

Another nasty trick to save bytes is the use of the break (perform
interrupt) instruction. The JSR (call subroutine) instruction requires
3 bytes per call, whereas the BRK instruction requires only a single
byte. This method, however, requires that you not be expecting
maskable interrupts (IRQs), because BRK shares the IRQ vector.

To use this method, you must set the IRQ/BRK vector to the address of
the most frequently used subroutine. The BRK instruction can then be
used to call the routine. BRK, however, not only places the return
address onto the stack but also pushes the flag byte onto the stack.
If you do not return information in the flags, you can return from the
interrupt with an RTI. If you need the flags that were set in the
subroutine, however, return with:

        PLA             ;pop caller's flags
        RTS             ;return normally

Unfortunately, the BRK instruction takes seven cycles instead of the
JSR's six. If a PLA is needed, that will also add four cycles. After
you set the IRQ/BRK vector (at a cost of 2 bytes), however, each call
will save 2 bytes.


Overlapping  Code  and  Data 

You can squeeze data table space in two ways. The first sounds more
difficult to maintain but actually turns out to be easier in practice.
This is to find the appropriate data table values in your code space.
For example, you might need the following flag table:

TABLE   $80,00

If you will be testing only bit 7, however, the following table would
also do:

TABLE   $A9,00

This just happens to be a LDA #0 (load immediate of 0) instruction.
Finding the appropriate code in your program:

TABLE   LDA #0

also eliminates the 2 bytes from your data space. Although you might
want to comment this table extensively to remind yourself of what you
did, the only real problem you will have is to find the table again.
Note that this technique generally works only with small tables.

The second method is much more effective in finding bytes. If you have
four data tables:

TABLE1  DS 0,1,2,3,0,0,0
        DS $80,3,2,1,0
TABLE2  DS 3,2,1,0
TABLE3  DS 0,0,0
TABLE4  DS 1,0,6

they can be profitably reorganized as:

TABLE1  DS 0,1,2,3
TABLE3  DS 0,0,0
        DS $80
TABLE2  DS 3,2
TABLE4  DS 1,0,6

Confused? TABLE3 and TABLE2 are now completely contained in TABLE1.
TABLE1 also extends throughout most (but not all) of TABLE4. Note that
this has saved 9 bytes from the original 22 bytes.
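The overlap can be checked mechanically. The Python sketch below writes out the 13 merged bytes and confirms that each original table still appears intact somewhere inside them; find_offset is a hypothetical helper, not a tool the article describes.

```python
# The 13 bytes of the reorganized layout, in memory order.
MERGED = [0, 1, 2, 3, 0, 0, 0, 0x80, 3, 2, 1, 0, 6]

# The four original tables (22 bytes in all).
TABLES = {
    "TABLE1": [0, 1, 2, 3, 0, 0, 0, 0x80, 3, 2, 1, 0],
    "TABLE2": [3, 2, 1, 0],
    "TABLE3": [0, 0, 0],
    "TABLE4": [1, 0, 6],
}


def find_offset(blob, table):
    """Return the first offset at which table occurs inside blob, or -1."""
    for start in range(len(blob) - len(table) + 1):
        if blob[start:start + len(table)] == table:
            return start
    return -1


offsets = {name: find_offset(MERGED, data) for name, data in TABLES.items()}
saved = sum(len(data) for data in TABLES.values()) - len(MERGED)
print(offsets)  # {'TABLE1': 0, 'TABLE2': 8, 'TABLE3': 4, 'TABLE4': 10}
print(saved)    # 9
```

A check like this is also a cheap safeguard when one of the overlapped tables has to change later.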

Often the program can be altered slightly to increase this type of
savings. Using the nybble-increment example from the section on
table-driven code, the choice of instruction used might also depend on
what tables are available for merging. It is also possible to find the
table backward within another table: your indexing must then proceed
backward also, decrementing instead of incrementing.

Similarly huge savings are almost always possible. If you have a bug
and one of the tables must be changed, however, it will be extremely
difficult to separate the original data tables without adequate
documentation. I have always used this technique last.

Code  as  Data 

There are particular instances (especially when there is absolutely no
room left) when code can substitute completely for data tables. This
is especially effective for simulating random movements, such as for a
self-play mode (called the attract mode in games). Code for the 6502
tends to be somewhat heavy toward having bit 7 set, but otherwise it
can create effective random tables. It is often necessary to try many
code sections, however, for the desired effect in the software action.

The  Endless  Trade-Off 

The crunching process revolves around the standard trade-off of time
vs. space. Even a simple change, such as removing redundant code,
requires this trade-off. Subroutines have extra overhead and slow
processing down. In-line code, as with macros, can greatly speed up
critical processing but at the cost of an enormous code space.

Many  of  the  hacks  described  here 
require  this  trade-off.  Most  are  tricks 
that  sacrifice  one  for  the  other.  The 
other  hacks  all  have  the  added  cost  of 
reduced  maintainability  or  increased 
programmer  effort. 

When is it worth using these hacks? If you're writing Pascal code on
the Macintosh or C code on the PC, they cannot help much in removing
30K from your 200K program. But if you're in a situation in which you
need a fast interrupt routine, these techniques can help on any
machine. They can also be useful if you need to reduce the code space
for a desk accessory or a similar routine or if you're just the sort
of person who gets excited by realizing that a flag can be reused.

I’d  like  to  thank  the  folks  at  General 
Computer  for  all  their  help. 

DDJ 



Dr.  Dobb's  Journal,  February  1987 



ARTICLES 


Hashing for High-Performance Searching


by Edwin T. Floyd

Edwin T. Floyd, 4210 Pickering Dr., Columbus, GA 31907. Edwin heads
the Data Processing Department of the Hughston Sports Medicine
Foundation in Columbus. In addition to working on electromyographic
signal analysis, he maintains a medical database with more than
130,000 patient histories that goes back 40 years.


Programs that process symbolic information, such as compilers,
interpreters, assemblers, spelling checkers, and text formatters,
maintain an internal list of symbols or words: a symbol table. The
speed of the symbol table's search and update operations often
determines the performance of these programs. A hashing or scatter
storage symbol table is easy to program and nearly always performs
much better than a linear list or binary tree. In this article, I'll
describe a technique called open hashing, discuss some of its
performance factors, and then introduce a simple modification that can
more than double the speed of the technique.

Open Hashing

Each symbol in a symbol table is represented by an identifier that is
usually a string of alphanumeric characters, or a word. Each variable
name in a Pascal program is an identifier, for example. An identifier
is often associated with other data of interest to the application.
For instance, an entry in the variable symbol table for an interpreter
may be associated with a value for that variable, or an entry in the
word table for a text analysis program may be associated with a
reference count for that word. For now, don't worry about how to
associate data with a symbol or even how to store strings of
characters in memory. Suppose you can store the identifier string for
a symbol somewhere in memory and retain its location as an index
value, address, or pointer. Also suppose you can associate each
identifier with a pointer to another identifier; thus you can form a
list, or chain, of identifiers, each pointing to the next, until you
reach the end of the list, symbolized by (nil). For instance, you
might represent a chain of three identifiers by:

word1 -> word2 -> word3 -> (nil)

An open hash table is a linear array of pointers called buckets, each
pointing to the beginning of a list of symbol identifiers. For
example, you might represent an open hash table with 101 buckets as
shown in Table 1, page 35. Bucket number 0 points to the list of
identifiers left, help, replace; bucket number 1 is empty; number 2
points to the list down, word, print, align; and so on. The bucket
pointers are adjacent to one another in an ordinary array, but the
identifiers may be scattered about in dynamic memory or located in a
separate array and strung together by their pointers into a list.

How do you decide to which bucket a word belongs? You decide with a
hash function. For example, you might add up the ASCII character
values of the letters in the identifier string, divide by the size of
the table, and use the remainder as the bucket number. (This is
actually a widely used method and is one of the four I analyze later.)
The remainder after integer division, or modulo (MOD), is always less
than the divisor, so if you use the table size as the divisor, the
remainder is always a valid bucket number. When you insert an
identifier in the table, you first compute its hash number, which is
the bucket number. If the bucket pointer already points to an
identifier, this is called a collision, and you add the identifier to
the list of symbols already associated with that bucket. If the bucket
is empty, you simply point it to the new identifier. When you wish to
search the symbol table for an identifier, you again compute its hash
number, which determines the bucket containing the identifier if it is
in the table. You then need to search only the list of symbols
associated with that bucket. Using this method with 101 buckets, on
average you will search about 1 percent of the full list.
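The insert and search steps just described can be sketched in a few lines of Python. This is not a transcription of the Pascal in Listings One and Two: it uses Python lists in place of pointer chains, the class and function names are hypothetical, and the hash is the plain sum of character codes modulo the table size.

```python
TABLE_SIZE = 101  # number of buckets, as in the example above


def hash_sum(identifier, size=TABLE_SIZE):
    """Sum of the character codes, reduced modulo the table size."""
    return sum(ord(ch) for ch in identifier) % size


class OpenHashTable:
    def __init__(self, size=TABLE_SIZE):
        # One chain per bucket; an empty list plays the role of (nil).
        self.buckets = [[] for _ in range(size)]

    def insert(self, identifier):
        chain = self.buckets[hash_sum(identifier, len(self.buckets))]
        if identifier not in chain:
            chain.append(identifier)  # a collision just lengthens the chain

    def search(self, identifier):
        """Return (found, probes); probes counts identifier comparisons."""
        chain = self.buckets[hash_sum(identifier, len(self.buckets))]
        for probes, candidate in enumerate(chain, start=1):
            if candidate == identifier:
                return True, probes
        return False, len(chain)


table = OpenHashTable()
for word in ["left", "help", "ab", "ba"]:
    table.insert(word)
print(table.search("ba"))  # (True, 2): "ab" and "ba" share a bucket
```

Because the hash only sums characters, any two anagrams collide, which is one reason the later sections compare it against other functions.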

Pascal data structures for a hash pointer table and symbols are
presented in Listing One, page 44. Listing Two, page 44, provides
routines to initialize, insert, and locate symbols, and Listing Three,
page 47, provides four different hash functions.

Performance 

As part of a performance test, I analyzed a text file consisting of
about 1,200 lines of Pascal code. The first pass through the file
collected all unique identifiers in a 101-bucket hash table using the
"sum of the characters" hash function described earlier. Out of 3,198
noncomment words, 243 were unique identifiers. The second pass
searched the hash table again for each of the 3,198 words and
simultaneously counted "probes," or comparisons of the search word
with a symbol table identifier. The second pass counted 7,216 probes,
an average of 2.26 probes to find any given word. This result is about
what you would expect from 243 identifiers distributed to 101 buckets.
In contrast, with the same data in a balanced binary tree or a sorted
list, you would average about seven or eight probes to find each word,
and in a simple linear list you would average more than a hundred.
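The comparison figures can be sanity-checked with simple arithmetic. The sketch below assumes that a balanced tree or binary search costs about log2(n) probes and that a linear list costs (n + 1) / 2 probes when every identifier is equally likely to be sought; both are textbook approximations, not measurements from the article.

```python
import math

words, unique, probes = 3198, 243, 7216

print(round(probes / words, 2))     # 2.26 probes per search, as measured
print(round(math.log2(unique), 1))  # 7.9 probes for a balanced tree
print((unique + 1) / 2)             # 122.0 probes for a linear list
```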

Evidently, the number of probes necessary to find any given symbol
appears to be approximately the number of identifiers stored in the
table divided by the number of buckets. (Actually, for a uniform word
distribution, you would expect it to be about half this number, but
it's not because the word distribution is not uniform, as you shall
see.) Therefore, one way to improve the performance of a hash table is
to increase the number of buckets. With the same Pascal source file in
a 203-bucket hash table, for example, you should more often than not
find the search symbol on the first probe.

I also analyzed one quick-to-compute hash function: the sum of the
first (times a constant) and last characters plus the identifier
length.

Aho, Sethi, and Ullman [1] describe and analyze a hash function they
call HashPJW (see Listing Three). In studying HashPJW, I was struck by
its similarity to software routines for computing cyclic redundancy
check (CRC) codes used to detect errors in disk storage and data
communications. I suppose this makes intuitive sense: CRC code
algorithms are designed to generate as many unpredictably different
codes as possible for widely varying input data, with the hope that a
packet or sector with an error will have a different CRC from the
original and thus the error will be detected. A hashing algorithm
based on CRC should do quite well on the average, and fast,
table-driven routines exist in the public domain. One such routine is
also presented in Listing Three.
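For readers without Listing Three at hand, the two functions can be approximated as follows. Both are sketches under stated assumptions: hash_pjw follows the widely published 32-bit form of P. J. Weinberger's hash, and crc16 is a bit-by-bit CRC-16 with the reflected polynomial $A001 and a zero initial value. The article's own CRC routine is table-driven and may use different parameters.

```python
def hash_pjw(identifier):
    """32-bit HashPJW: shift in 4 bits per character, fold the top nibble back."""
    h = 0
    for ch in identifier:
        h = ((h << 4) + ord(ch)) & 0xFFFFFFFF
        g = h & 0xF0000000
        if g:
            h ^= g >> 24
            h ^= g  # clear the top four bits
    return h


def crc16(data, poly=0xA001):
    """Bit-by-bit CRC-16 over a bytes object (slower than a table-driven one)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
    return crc


def bucket(identifier, size=101):
    """Bucket numbers under each function for a 101-bucket table."""
    return (hash_pjw(identifier) % size,
            crc16(identifier.encode("ascii")) % size)


print(hex(crc16(b"123456789")))  # 0xbb3d, the usual check value for these parameters
print(hash_pjw("ab"))            # 1650, that is 97 * 16 + 98
```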

A poor hash function can reduce performance by overusing some buckets
and underusing others. A hash function may be quite acceptable for one
set of symbols and horrible for another. Aho, Sethi, and Ullman also
describe a statistic (here called U(h,t)) that characterizes the
uniformity of a hash function (h) in distributing a given set of
symbols (t) to hash buckets. A routine to compute U(h,t) is given in
Listing Four, page 48; it computes the results for each hash function
and text file discussed. The lower the U(h,t) number, the more uniform
is the distribution: a perfectly random distribution would have a
U(h,t) of 1.000, and a distribution more uniform than random would
have a U(h,t) less than 1.000.
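One way to compute such a statistic is sketched below. It assumes U(h,t) is the ratio of the observed search cost, summing b(b+1)/2 over buckets holding b symbols, to the cost expected from a random distribution of n symbols over m buckets, (n/2m)(n + 2m - 1). That denominator is what the expected value works out to for random placement; Listing Four may arrange the computation differently.

```python
def uniformity(bucket_counts):
    """Ratio of observed chain cost to the cost of a random distribution."""
    m = len(bucket_counts)   # number of buckets
    n = sum(bucket_counts)   # number of symbols stored
    observed = sum(b * (b + 1) / 2 for b in bucket_counts)
    expected = (n / (2 * m)) * (n + 2 * m - 1)
    return observed / expected


print(uniformity([2, 2]) < 1.0)  # True: perfectly even, better than random
print(uniformity([4, 0]) > 1.0)  # True: everything in one bucket
```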

In summary, the four hash algorithms I analyzed were:

• (sum of the characters + length) MOD 101
• (first * 256 + last + length) MOD 101
• HashPJW MOD 101
• CRC-16 MOD 101
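The first two formulas in the list are short enough to write out directly. In this sketch the function names are hypothetical, identifiers are assumed to be non-empty ASCII strings, and "first" and "last" are taken to mean the character codes of the first and last characters.

```python
def hash_sum_len(word, size=101):
    """(sum of the characters + length) MOD size."""
    return (sum(ord(c) for c in word) + len(word)) % size


def hash_first_last_len(word, size=101):
    """(first * 256 + last + length) MOD size."""
    return (ord(word[0]) * 256 + ord(word[-1]) + len(word)) % size


print(hash_sum_len("ab"))         # 96
print(hash_first_last_len("ab"))  # 86

# Same first character, last character, and length: all in one bucket.
labels = ["X0010", "X0020", "X0030"]
print(len({hash_first_last_len(w) for w in labels}))  # 1
```

The last line shows the weakness of the second function on machine-generated labels that differ only in their middle characters.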

I analyzed three text files:

• a 1,200-line, 3,198-word Pascal program (comments and strings
  excluded) with 243 unique identifiers
• a 1,300-line, 9,025-word legal deposition outline with 1,068 unique
  words
• a 1,150-line, 5,516-word business report with 1,402 unique words

My results are summarized in Tables 2, 3, and 4, below. In general,
the choice of hash function didn't appear to make a significant
difference to the performance of the program. The difference in number
of probes between the best performing and the worst performing hash
function was never more than 23 percent. The second hash function did
remarkably well considering the amount of information discarded by
ignoring everything but the first and last characters and the length.
This routine would be expected to do very poorly on identifiers that
are all the same length and begin with the same character, such as
those generated by some assembler macro processors. The CRC routine
was unexpectedly worst on the deposition outline (long, but with not
many unique words) but did very well on the others. The U(h,t)
function generally did pretty well in characterizing the search
performance of the hash functions. I would be interested to hear the
results of similar tests of these hash functions on other text files
and of other hash functions.

  0. -> left -> help -> replace -> (nil)
  1. -> (nil)
  2. -> down -> word -> print -> align -> (nil)
  3. -> tab -> (nil)
  4. -> (nil)
  5. -> screen -> insert -> off -> format -> (nil)
  6. -> find -> save -> (nil)
  ...
100. -> quick -> turn -> line -> (nil)

Table 1: Pointer identifier list

Hash Method      No. of Probes    U(h,t)    Probes/Search
Sum of chars              7216     1.022             2.26
F + L + len               7611     1.127             2.38
HashPJW                   6410     1.045             2.00
CRC                       6925     0.981             2.16

Table 2: Pascal source code, average 2.40 identifiers per bucket

Hash Method      No. of Probes    U(h,t)    Probes/Search
Sum of chars             81544     1.034             9.04
F + L + len              84702     1.178             9.38
HashPJW                  82780     0.993             9.17
CRC                      87147     1.014             9.66

Table 3: Legal deposition outline, average 10.57 words per bucket

Hash Method      No. of Probes    U(h,t)    Probes/Search
Sum of chars             63036     1.004            11.43
F + L + len              72479     1.217            13.14
HashPJW                  59807     1.001            10.84
CRC                      59608     0.995            10.81

Table 4: Business report, average 13.88 words per bucket

A  Self-Organizing  List 

Computer and human language text, particularly computer language text,
do not contain a uniform distribution of words. If you were to count
the occurrences of each unique word in a large sample of text and rank
the words from most-often-used to least-often-used, you would find a
wide disparity in word counts. You would also find a pattern: The
product of the occurrence count and the rank order would be
approximately constant. In 1949, George Zipf [2] described this word
distribution pattern for several human languages and, oddly,
populations of cities; it's now called Zipf's law. In addition,
computer language text tends to be "clustery"; that is, references to
the same identifier tend to be clustered together rather than
uniformly distributed throughout the text. You can use this
information to improve search algorithms.

Notice that, after computing the bucket number with a hash function,
searching an open hash table boils down to searching a linear list
anchored in the chosen bucket. If you could organize the list so that
the more likely an identifier is to be referenced next, the closer it
is to the front of the list, the search time would be improved. If,
every time you find a symbol during a search, you move it to the front
of the list, the list will tend to organize itself in just the optimum
way for the clustery, unevenly distributed references that occur in
human and computer language text.
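The move-to-front idea is small enough to show on a single bucket. This Python sketch is not the Pascal of Listing Five; it treats one bucket's chain as a Python list, and search_mtf is a hypothetical name.

```python
def search_mtf(chain, identifier):
    """Search one bucket's chain; on a hit, move the identifier to the front.

    Returns the number of probes used, or 0 if the identifier is absent.
    """
    for i, candidate in enumerate(chain):
        if candidate == identifier:
            if i:
                chain.insert(0, chain.pop(i))
            return i + 1
    return 0


chain = ["left", "help", "replace"]
print(search_mtf(chain, "replace"))  # 3
print(search_mtf(chain, "replace"))  # 1
print(chain)                         # ['replace', 'left', 'help']
```

A run of references to the same identifier pays the full chain length once and a single probe thereafter, which is where the benefit on clustery text comes from.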

The symbol search routine in Listing One requires seven additional
lines of code to implement a self-organizing list with "move to front"
(MTF). The modified search routine is presented in Listing Five, page
50, and the analyses of the four hash functions on the three text
files are presented in Tables 5, 6, and 7, below. Clearly, the MTF
optimization has a far greater effect on the number of probes than any
of the hash function variants. On Pascal source, the number of probes
dropped at least 30 percent. On business report text, the drop was
about 55 percent, and on the deposition outline, the drop was an
incredible 66 percent or more.
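These percentages follow directly from the probe counts reported in the article's tables. The short Python check below pairs each count without MTF with the corresponding count with MTF, in the order sum of chars, F + L + len, HashPJW, CRC; the variable names are illustrative.

```python
# (probes without MTF, probes with MTF) for each hash method.
RESULTS = {
    "Pascal source": [(7216, 4326), (7611, 4552), (6410, 4414), (6925, 4550)],
    "Deposition outline": [(81544, 25659), (84702, 28043),
                           (82780, 25001), (87147, 26079)],
    "Business report": [(63036, 27882), (72479, 33314),
                        (59807, 27806), (59608, 27802)],
}


def drop_percent(before, after):
    """Percentage reduction in probes."""
    return 100.0 * (before - after) / before


smallest_drop = {name: min(drop_percent(b, a) for b, a in pairs)
                 for name, pairs in RESULTS.items()}
for name, drop in smallest_drop.items():
    print(f"{name}: at least {drop:.1f}% fewer probes")
```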

In general, the more clustery and skewed the distribution, and the
more items per hash bucket, the more improvement MTF offers. I expect
that Pascal source, arguably the most skewed and clustery of all text,
would have shown more dramatic improvement had the average length of
the word list at each hash bucket been 10 or 15, as in the other text
samples, rather than 2 or 3. In the two human language text samples,
the number of probes per search dropped to significantly less than
half the average number of words per bucket. If words were distributed
randomly in the text with uniform frequency, you would expect the
number of probes per search to be about half the average number of
words per bucket. Interestingly, with MTF the performance of CRC on
the deposition outline text climbs into third place, well ahead of
first + last + length and more in line with expectations. The U(h,t)
function remains unchanged because it depends only on the hash
function and the text under analysis.

Hash Method      No. of Probes    U(h,t)    Probes/Search
Sum of chars              4326     1.022             1.35
F + L + len               4552     1.127             1.42
HashPJW                   4414     1.045             1.38
CRC                       4550     0.981             1.42

Table 5: Pascal source code searched with MTF

Hash Method      No. of Probes    U(h,t)    Probes/Search
Sum of chars             25659     1.034             2.84
F + L + len              28043     1.178             3.11
HashPJW                  25001     0.993             2.77
CRC                      26079     1.014             2.89

Table 6: Legal deposition outline searched with MTF

Hash Method      No. of Probes    U(h,t)    Probes/Search
Sum of chars             27882     1.004             5.05
F + L + len              33314     1.217             6.04
HashPJW                  27806     1.001             5.04
CRC                      27802     0.995             5.04

Table 7: Business report searched with MTF

Conclusions

Hashing is a simple and effective organization scheme for symbol
tables when rapid searching of the table is important but ordering
(for example, alphabetically) is not. The choice of hashing function
appears, on the basis of the tests I conducted, to have little effect
on the overall performance of the search, with the caveat that it is
always possible to deliberately or accidentally contrive data that
will defeat any hashing function. I would tend to favor HashPJW or
CRC, which seem to be more difficult to fool. The incorporation of MTF
self-organizing lists greatly improves the performance of the linear
search phase at minimal coding cost.

Some  Topics  for  Further 
Investigation 

1. It's difficult to believe that the choice of hashing function has
as little effect on performance as I observed. I would like to see the
test results of the four hashing algorithms on a wider variety of
textual material in human and computer languages. I can be reached at
CompuServe ID [76067,747] or by U.S. Snail. Do you know a hashing
function that consistently does better on a wide variety of text
materials than the four presented here?

2. Hester and Hirschberg [3] point out that MTF is not the only way to
implement a self-organizing list. Some research (mostly assuming
uniform distributions) indicates that exchanging the found item with
the one preceding it on the list works better than MTF as the number
of searches becomes arbitrarily large. How does this method perform
compared to MTF with practical text files?

MTF and Exchange are opposite ends of the general "move ahead n"
method (Exchange is move ahead 1, MTF is move ahead all the way). Is
there some n, perhaps a function of the number of symbols on the list
or the nearness of the found symbol to the front of the list, that
produces better results than MTF?

Wild tangent: Huffman data compression depends on a distributed
ordering of words or characters. Can a self-organizing list be used in
a Huffman-like data compression routine that makes only one pass on
the file?

3. If the identifier or search key is very long and the number of
collisions is high, the time required to compare keys may become
significant or even dominant. Morris (note 4) suggests computing a
hash code at least three times longer than needed, using part of the
code to determine the bucket number, and storing the remainder in the
table instead of the identifier. The search routine then compares hash
codes only, not the full identifier. This admits the possibility of a
false match, but Morris believes the probability is acceptably low
(which seems to me to be a blatant challenge to Murphy!). This has the
attraction of eliminating the need to store a variable length
character string. Do you have any experience with such a scheme, and
would you be willing to share it?

4. What if you had a small list of identifiers that you needed to
detect very rapidly in a text file, for example, the list of reserved
words in Pascal? Is it possible to create a "perfect" hash function in
which each identifier would hash to a unique bucket with no collisions
and the number of buckets would equal the number of identifiers, or at
least be a minimum? According to Sager (note 5) it is, and the
creation of such a function can even be partially automated. I would
like to hear about experiences with Sager's algorithms implemented and
used on a microcomputer. He reports creating minimal perfect hash
functions for lists of up to 512 identifiers using an IBM 4341; what
is the practical maximum for an IBM PC?

5. I've discussed hashing as a RAM-resident symbol table
implementation technique. It has also been used as a disk database
indexing technique. How is hashing on disk different from hashing a
RAM symbol table? Is MTF optimization or any other kind of
self-organizing list appropriate for disk hashing?
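
Topic 3 above describes storing leftover hash bits in place of the identifier. The following C fragment is only a sketch of that idea under stated assumptions: the names `vst_put`, `vst_get`, `wide_hash`, and `NBUCKETS` are hypothetical, and the 32-bit FNV-1a function is a stand-in wide hash, not one of the four functions tested in this article. The bucket number is taken from the hash modulo the table size, and the quotient is kept as the stored tag.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 101u            /* assumed table size for this sketch */

struct entry {
    struct entry *next;
    uint32_t      tag;           /* leftover hash bits, stored instead of the key */
    int           data;
};

struct table {
    struct entry *bucket[NBUCKETS];
};

/* Stand-in wide hash (32-bit FNV-1a); an assumption, not from the article. */
static uint32_t wide_hash(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Insert without storing the key text. Returns 0 if memory runs out. */
int vst_put(struct table *t, const char *key, int data)
{
    uint32_t h = wide_hash(key);
    struct entry *e = malloc(sizeof *e);
    if (e == NULL)
        return 0;
    e->tag  = h / NBUCKETS;      /* the part of the code not used for the bucket */
    e->data = data;
    e->next = t->bucket[h % NBUCKETS];
    t->bucket[h % NBUCKETS] = e;
    return 1;
}

/* Look up by comparing stored tags only; a false match is possible but rare. */
int vst_get(const struct table *t, const char *key, int *data)
{
    uint32_t h;
    const struct entry *e;

    assert(t != NULL && key != NULL && data != NULL);
    h = wide_hash(key);
    for (e = t->bucket[h % NBUCKETS]; e != NULL; e = e->next) {
        if (e->tag == h / NBUCKETS) {
            *data = e->data;
            return 1;
        }
    }
    return 0;
}
```

Every entry has the same fixed size, which is the attraction noted in topic 3; the price is that two different identifiers with the same full hash code cannot be told apart.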

Listing  Notes 

All listings are in Turbo Pascal (Version 3, any operating system) for
accessibility. The algorithms are almost trivial and should be readily
convertible to assembly or any higher-level language. These routines
use the following nonstandard Turbo Pascal features, which I feel
serve to enhance the clarity of the code with minimal impact on its
portability:

Declaration: STRING [n]: variable length string variables, equivalent
to ARRAY [0..n] OF CHAR where element 0 of the array contains the
length of the string. Assignment, concatenation, and comparisons are
permitted on string variables.

LENGTH (x): returns the length (0..255) of the string parameter;
equivalent to ORD (x [0]).

COPY (x, i, j): returns the substring of string x beginning with
character i and continuing for j characters or until the end of
string x.

SIZEOF (x): pseudofunction returning the size in bytes of a variable
or declaration. A constant value may be substituted.

GETMEM (ptrvar, len): akin to NEW, returns a pointer to a chunk of
dynamic memory of the specified length.

Infix bitwise operators: AND, OR, and XOR are recognized as bitwise
operators when used with integer operands, as are the bit shift
operators SHL (shift left) and SHR (shift right).

Listing  Two 

The procedure symbol_put in Listing Two, which inserts the symbol name
and data in the table, does not check for duplicate symbols. If you
want to disallow duplicate symbols, use something such as:

IF NOT symbol_get (...) THEN
  symbol_put (...);

On exit, this_symbol points to the heap copy of the symbol just
inserted, and s_data has been copied into this_symbol^.sym_data.

symbol_put attempts to conserve heap space by obtaining only enough
memory to hold the identifier and its associated data, not the unused
portion of the sym_name string. The technique used here depends on
nonstandard features and characteristics of Turbo Pascal, Version 3,
and may not be directly portable to other languages or future versions
of Turbo Pascal. Nevertheless, I believe it to be the clearest
exposition of the algorithm and defend it on those grounds.
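
For readers converting the routines to C, the same space-saving idea can be approximated with a C99 flexible array member. This is a sketch only, not a translation of Listing Two: the names `struct symbol` and `symbol_new` are hypothetical, and the single `usecount` field stands in for the symbol data record.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical C counterpart of the symbol record; the name is the last
 * member so that only the bytes actually needed are allocated for it. */
struct symbol {
    struct symbol *chain;      /* next symbol in the same bucket */
    int            usecount;   /* data associated with the identifier */
    char           name[];     /* C99 flexible array member */
};

/* Allocate a symbol just large enough for its name plus the terminator.
 * Returns NULL if memory is exhausted. */
struct symbol *symbol_new(const char *name, int usecount, struct symbol *chain)
{
    size_t len;
    struct symbol *s;

    assert(name != NULL);
    len = strlen(name);
    s = malloc(sizeof *s + len + 1);
    if (s == NULL)
        return NULL;
    s->chain = chain;
    s->usecount = usecount;
    memcpy(s->name, name, len + 1);   /* copies the terminating '\0' too */
    return s;
}
```

As in the Pascal version, the saving depends on the name being the final field of the record.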

Notes 

1. A. V. Aho, R. Sethi, and J. D. Ullman, Compilers (Reading, Mass.:
Addison-Wesley, 1986), 434-436.

2. G. K. Zipf, Human Behaviour and the Principle of Least Effort
(Reading, Mass.: Addison-Wesley, 1949).

3. J. H. Hester and D. S. Hirschberg, "Self-Organizing Linear Search,"
ACM Computing Surveys, vol. 17 no. 3 (1985): 295-311. A survey of the
current state of knowledge on self-organizing searches, particularly
with Zipf distributions.

4. R. Morris, "Scatter Storage Techniques," Communications of the ACM,
vol. 11 no. 1 (1968): 38-44 or vol. 26 no. 1 (1983): 39-42. The
earliest description I found of open hashing, here called "scatter
index tables." It also describes "virtual scatter tables."

5. T. J. Sager, "A Polynomial Time Generator for Minimal Perfect Hash
Functions," Communications of the ACM, vol. 28 no. 5 (1985): 523-532.
Graph/network-theory related algorithms for finding perfect hash
functions.
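
To make the notion of a perfect hash function in topic 4 and note 5 concrete, here is a deliberately naive C sketch. It is not Sager's algorithm: it simply tries successive multipliers until no two keys share a bucket. The names `seeded_hash`, `is_collision_free`, and `find_perfect_seed` are hypothetical, as is the multiply-and-add hash itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BUCKETS 256        /* assumed upper limit for this sketch */

/* Multiply-and-add hash controlled by a seed, reduced to m buckets. */
static unsigned seeded_hash(const char *s, unsigned seed, unsigned m)
{
    unsigned h = 0;
    while (*s)
        h = h * seed + (unsigned char)*s++;
    return h % m;
}

/* Return 1 if every key lands in its own bucket for this seed, else 0. */
int is_collision_free(const char *const keys[], size_t n,
                      unsigned seed, unsigned m)
{
    unsigned char used[MAX_BUCKETS];
    size_t i;

    assert(m > 0 && m <= MAX_BUCKETS);
    memset(used, 0, sizeof used);
    for (i = 0; i < n; i++) {
        unsigned b = seeded_hash(keys[i], seed, m);
        if (used[b])
            return 0;
        used[b] = 1;
    }
    return 1;
}

/* Try seeds 1 .. limit-1; return the first collision-free seed, or -1. */
long find_perfect_seed(const char *const keys[], size_t n,
                       unsigned m, unsigned limit)
{
    unsigned seed;

    for (seed = 1; seed < limit; seed++)
        if (is_collision_free(keys, n, seed, m))
            return (long)seed;
    return -1;
}
```

Calling `find_perfect_seed` with `m` equal to the number of keys asks for a minimal perfect hash. A blind search of this kind is unlikely to succeed for more than a handful of keys, which is why a constructive method such as Sager's is of interest.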

Bibliography 

Bentley and McGeoch. "Amortized Analysis of Self-Organizing Sequential
Search Heuristics." Communications of the ACM, vol. 28 no. 4 (1985):
404-411. This paper indicates a preference for MTF for linked list
implementations.

Knuth, D. E. The Art of Computer Programming, Volume 3: Sorting and
Searching. Reading, Mass.: Addison-Wesley, 1983. On page 514 Knuth
describes open hashing as "collision resolution by chaining" and
suggests using a self-organizing list but offers no implementation.
Page 397 discusses Zipf's law and other probability distributions;
pages 398-401 describe self-organizing files.

Peterson, W. W. "Addressing for random access storage." IBM J.
Research and Development, vol. 1 no. 2 (1957): 130-146. This is
generally credited as the first scientific paper on hashing, though
Knuth reports finding references as early as 1953.

Availability 

All the source code for articles in this issue is available on a
single disk. To order, send $14.95 to Dr. Dobb's Journal, 501
Galveston Dr., Redwood City, CA 94063 or call (415) 366-3600 ext. 216.
Please specify the issue number and disk format (MS-DOS, Macintosh,
Kaypro).



Listing One (Text begins on page 34.)

{Listing One: Declarations of data structures
 used by hashing and symbol table routines.}

CONST
  symbol_hash_size = 100;
    {Number of buckets - 1 in the hash table.
     I believe it should be a prime - 1.}

TYPE
  str255 = STRING [255];       {General large str}
  symbol_data = RECORD
      {Data to be associated with identifier}
    usecount: INTEGER;
  END;
  symbol_name = STRING [255];
      {Symbol identifier is a string}
  symbol_ptr = ^symbol_type;
  symbol_range = 0..symbol_hash_size;
  symbol_type = RECORD
      {identifier and its data}
    sym_chain: symbol_ptr;
      {Ptr to next symbol in list}
    sym_data: symbol_data;
      {Type declared in the main program}
    sym_name: symbol_name;
      {Symbol name or identifier}
  END;
  symbol_control = RECORD
      {Declare one of these in main program for
       each symbol table to be used}
    symbols, searches, notfound: INTEGER;
    probes: REAL;
      {Real because some counts exceed 32767}
    this_bucket: symbol_range;
      {Bucket # of last referenced symbol}
    this_symbol: symbol_ptr;
      {Pointer to last referenced symbol}
    sym_ptr: ARRAY [symbol_range] OF symbol_ptr;
      {Buckets}
  END;

End Listing One


Listing  Two 


{Listing 2: Routines to initialize the symbol
 table, insert a symbol, and locate a symbol,
 without MTF.}

FUNCTION symbol_size
  (VAR s_name: symbol_name): INTEGER;
  {Return the size of memory required to contain
   a symbol named in s_name.}
BEGIN
  symbol_size := SIZEOF (symbol_ptr)
               + SIZEOF (symbol_data)
               + SUCC (LENGTH (s_name));
END;

PROCEDURE symbol_init (VAR sym: symbol_control);
  {Initialize symbol control pointers. Call this
   before the first use of a symbol_control area.}
VAR
  i: symbol_range;
BEGIN
  WITH sym DO BEGIN
    FOR i := 0 TO symbol_hash_size
      DO sym_ptr [i] := NIL;
    this_bucket := 0;
    this_symbol := NIL;
    symbols := 0;
    searches := 0;
    probes := 0.0;
    notfound := 0;
  END;
END;

PROCEDURE symbol_put (VAR sym: symbol_control;
  s_name: symbol_name; VAR s_data: symbol_data);
  {Insert symbol name and data in table. This
   routine does not check for duplicate symbol.}
BEGIN
  WITH sym DO BEGIN
    this_bucket := symbol_hash (s_name);
    GETMEM (this_symbol, symbol_size (s_name));
    WITH this_symbol^ DO BEGIN
      sym_chain := sym_ptr [this_bucket];
      sym_data := s_data;
      sym_name := s_name;
      sym_ptr [this_bucket] := this_symbol;
    END;
    symbols := SUCC (symbols);
  END;
END;

FUNCTION symbol_get
  (VAR sym: symbol_control; s_name: symbol_name;
   VAR s_data: symbol_data): BOOLEAN;
  {Retrieve a symbol. If the symbol is found,
   set s_data to the data stored by the last call
   to symbol_put specifying the same symbol name,
   point this_symbol to the symbol table entry, and
   return TRUE. If the symbol is not found leave
   s_data unchanged, leave this_symbol undefined,
   and return FALSE. This version does NOT
   implement the MTF algorithm.}
VAR
  p: symbol_ptr;   {work pointer}
BEGIN
  WITH sym DO BEGIN
    this_bucket := symbol_hash (s_name);
    p := sym_ptr [this_bucket];
    symbol_get := FALSE;
    searches := SUCC (searches);
    IF p = NIL THEN
      notfound := SUCC (notfound);
    WHILE p <> NIL DO WITH p^ DO BEGIN
      probes := probes + 1.0;
      IF s_name = sym_name THEN BEGIN
        {found it!}
        s_data := sym_data;
        this_symbol := p;
        p := NIL;
        symbol_get := TRUE;
      END ELSE BEGIN
        {not this one, chain to the next}
        p := sym_chain;
        IF p = NIL THEN
          notfound := SUCC (notfound);
      END;
    END;
  END;
END;

End Listing Two

Listing  Three 

{Listing 3: Hash functions, presented as a
 single Pascal function with case statement
 controlled by a global variable "hashtype"
 to select one of the four routines.}

{First the table used by the CRC-16 routine,
 this from a public domain file uncompression
 program: DeArc, by Bela Lubkin.}

const crctab : array [0..255] of integer =
 ($0000, $C0C1, $C181, $0140, $C301, $03C0, $0280, $C241,
  $C601, $06C0, $0780, $C741, $0500, $C5C1, $C481, $0440,
  $CC01, $0CC0, $0D80, $CD41, $0F00, $CFC1, $CE81, $0E40,
  $0A00, $CAC1, $CB81, $0B40, $C901, $09C0, $0880, $C841,
  $D801, $18C0, $1980, $D941, $1B00, $DBC1, $DA81, $1A40,
  $1E00, $DEC1, $DF81, $1F40, $DD01, $1DC0, $1C80, $DC41,
  $1400, $D4C1, $D581, $1540, $D701, $17C0, $1680, $D641,
  $D201, $12C0, $1380, $D341, $1100, $D1C1, $D081, $1040,
  $F001, $30C0, $3180, $F141, $3300, $F3C1, $F281, $3240,
  $3600, $F6C1, $F781, $3740, $F501, $35C0, $3480, $F441,
  $3C00, $FCC1, $FD81, $3D40, $FF01, $3FC0, $3E80, $FE41,
  $FA01, $3AC0, $3B80, $FB41, $3900, $F9C1, $F881, $3840,
  $2800, $E8C1, $E981, $2940, $EB01, $2BC0, $2A80, $EA41,
  $EE01, $2EC0, $2F80, $EF41, $2D00, $EDC1, $EC81, $2C40,
  $E401, $24C0, $2580, $E541, $2700, $E7C1, $E681, $2640,
  $2200, $E2C1, $E381, $2340, $E101, $21C0, $2080, $E041,
  $A001, $60C0, $6180, $A141, $6300, $A3C1, $A281, $6240,
  $6600, $A6C1, $A781, $6740, $A501, $65C0, $6480, $A441,
  $6C00, $ACC1, $AD81, $6D40, $AF01, $6FC0, $6E80, $AE41,
  $AA01, $6AC0, $6B80, $AB41, $6900, $A9C1, $A881, $6840,
  $7800, $B8C1, $B981, $7940, $BB01, $7BC0, $7A80, $BA41,
  $BE01, $7EC0, $7F80, $BF41, $7D00, $BDC1, $BC81, $7C40,
  $B401, $74C0, $7580, $B541, $7700, $B7C1, $B681, $7640,
  $7200, $B2C1, $B381, $7340, $B101, $71C0, $7080, $B041,
  $5000, $90C1, $9181, $5140, $9301, $53C0, $5280, $9241,
  $9601, $56C0, $5780, $9741, $5500, $95C1, $9481, $5440,
  $9C01, $5CC0, $5D80, $9D41, $5F00, $9FC1, $9E81, $5E40,
  $5A00, $9AC1, $9B81, $5B40, $9901, $59C0, $5880, $9841,
  $8801, $48C0, $4980, $8941, $4B00, $8BC1, $8A81, $4A40,
  $4E00, $8EC1, $8F81, $4F40, $8D01, $4DC0, $4C80, $8C41,
  $4400, $84C1, $8581, $4540, $8701, $47C0, $4680, $8641,
  $8201, $42C0, $4380, $8341, $4100, $81C1, $8081, $4040
 );

FUNCTION symbol_hash
  (VAR s_name: symbol_name): symbol_range;
  {Hash the symbol name to a number between
   0 and the hash table size.}
VAR
  i, j: INTEGER;
BEGIN
  CASE hashtype OF
    1: BEGIN {Sum of the characters + length}
         j := 0;
         FOR i := 0 TO LENGTH (s_name) DO
           j := j + ORD (s_name [i]);
         symbol_hash :=
           j MOD SUCC (symbol_hash_size);
       END;
    2: BEGIN {First + Last + Length}
         symbol_hash :=
           ((ORD (s_name [1]) SHL 8)
            + ORD (s_name [LENGTH (s_name)])
            + LENGTH (s_name))
           MOD SUCC (symbol_hash_size);
       END;
    3: BEGIN {HashPJW}
         j := 0;
         FOR i := 1 TO LENGTH (s_name) DO BEGIN
           j := (j SHL 4) + ORD (s_name [i]);
           IF (j AND $F000) <> 0 THEN
             j := j XOR (j SHR 12) AND $0FFF;
         END;
         symbol_hash := (j AND $7FFF)
           MOD SUCC (symbol_hash_size);
       END;
    4: BEGIN {CRC-16}
         j := 0;
         FOR i := 1 TO LENGTH (s_name) DO
           j := (j SHR 8) XOR
                crctab [(j XOR ORD
                         (s_name [i])) AND $00FF];
         symbol_hash := (j AND $7FFF)
           MOD SUCC (symbol_hash_size);
       END;
    ELSE symbol_hash := 0;
      {Not specified, punish the user}
  END;
END;

End Listing Three

Listing  Four 

{Listing 4: Symbol distribution function U(h,t)}

FUNCTION symbol_distribution
  (VAR sym: symbol_control): REAL;
  {Compute the distribution test as outlined in
   Aho, et al. This function approaches 1.0 as
   the "randomness" of the hashing improves.}
VAR
  p: symbol_ptr;
  b, n, m, r: REAL;
  i: symbol_range;
  j: INTEGER;
BEGIN
  r := 0.0;
  WITH sym DO BEGIN
    FOR i := 0 TO symbol_hash_size DO BEGIN
      p := sym_ptr [i];
      j := 0;
      WHILE p <> NIL DO
        WITH p^ DO BEGIN {count the list}
          p := sym_chain;
          j := SUCC (j);
        END;
      b := j;
      r := r + (b * (b + 1.0)) / 2.0;
    END;
    m := SUCC (symbol_hash_size);
    n := symbols;
    symbol_distribution := r /
      ((n / (2.0 * m)) * (n + 2.0 * m - 1.0));
  END;
END;
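
The ratio computed by Listing Four can be tried by hand on small cases. The C sketch below applies the same formula to an array of chain lengths; the function name `hash_uniformity` and the array interface are hypothetical, and returning 0.0 for an empty table is a convention of this sketch only.

```c
#include <assert.h>
#include <stddef.h>

/* Sum b*(b+1)/2 over all m chains, divided by (n/2m)*(n + 2m - 1),
 * where n is the total number of symbols. */
double hash_uniformity(const unsigned chain_len[], size_t m)
{
    double sum = 0.0;   /* numerator: work to find every symbol once */
    double n = 0.0;     /* total number of symbols */
    size_t i;

    assert(m > 0);
    for (i = 0; i < m; i++) {
        double b = chain_len[i];
        sum += b * (b + 1.0) / 2.0;
        n += b;
    }
    if (n == 0.0)
        return 0.0;     /* ratio is undefined for an empty table */
    return sum / ((n / (2.0 * m)) * (n + 2.0 * m - 1.0));
}
```

With eight symbols spread evenly over four buckets the ratio is 12/15 = 0.8; with all eight in one bucket it is 36/15 = 2.4. Values well above 1.0 therefore indicate clustering.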


End Listing Four


Listing Five

{Listing 5: Symbol search with MTF}


{Note: for the purposes of the test program, the
 application of MTF is controlled by a global
 boolean variable "mtf" set by the main program.
 The test for this boolean should be removed in a
 production version of the routine. MTF with all
 its performance advantages is accomplished with
 the addition of seven lines of code!}

FUNCTION symbol_get
  (VAR sym: symbol_control; s_name: symbol_name;
   VAR s_data: symbol_data): BOOLEAN;
  {Retrieve a symbol. If the symbol is found, s_data
   is set to the data stored by the last call to
   symbol_put specifying the same symbol name,
   this_symbol points to the symbol table entry found,
   the symbol is moved to the front of the chain, and
   the function returns TRUE. If the symbol is not
   found s_data is unchanged, this_symbol is undefined,
   and the function returns FALSE.}
VAR
  p: symbol_ptr;   {work pointer}
BEGIN
  WITH sym DO BEGIN
    this_bucket := symbol_hash (s_name);
    p := sym_ptr [this_bucket];
    symbol_get := FALSE;
    this_symbol := NIL;
    searches := SUCC (searches);
    IF p = NIL THEN notfound := SUCC (notfound);
    WHILE p <> NIL DO WITH p^ DO BEGIN
      probes := probes + 1.0;
      IF s_name = sym_name THEN BEGIN
        {found it!}
        IF this_symbol <> NIL THEN IF mtf THEN
        BEGIN {Move it to the front}
          this_symbol^.sym_chain := sym_chain;
          sym_chain := sym_ptr [this_bucket];
          sym_ptr [this_bucket] := p;
        END;
        s_data := sym_data;
        this_symbol := p;
        p := NIL;
        symbol_get := TRUE;
      END ELSE BEGIN
        {not this one, chain to the next}
        this_symbol := p;
        p := sym_chain;
        IF p = NIL THEN
          notfound := SUCC (notfound);
      END;
    END;
  END;
END;


End Listings
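
Listing Five hard-wires MTF. The more general "move ahead n" rule raised in topic 2 of "Some Topics for Further Investigation" can be sketched in C as follows, as a starting point for experiments rather than a tested recommendation. The node type and the name `find_move_ahead` are hypothetical; n = 1 behaves like Exchange, and a very large n such as `(size_t)-1` behaves like MTF.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node type for this sketch; not the article's symbol_type. */
struct node {
    struct node *next;
    const char  *key;
};

/* Search the chain at *head for key. On a hit, move the node up to n
 * places toward the front. Returns the node found, or NULL. */
struct node *find_move_ahead(struct node **head, const char *key, size_t n)
{
    struct node **link = head;     /* link that points at the current node */
    struct node **target = head;   /* link n places behind, where a hit is re-inserted */
    size_t pos = 0;                /* index of the current node */

    assert(head != NULL && key != NULL);
    while (*link != NULL) {
        struct node *cur = *link;

        if (strcmp(cur->key, key) == 0) {
            if (link != target) {
                *link = cur->next;       /* unlink the found node */
                cur->next = *target;     /* splice it in n places earlier */
                *target = cur;
            }
            return cur;
        }
        link = &cur->next;
        pos++;
        if (pos > n)                     /* keep target exactly n places behind */
            target = &(*target)->next;
    }
    return NULL;
}
```

Making n depend on `pos` (for example, half the distance to the front) is a one-line change at the point where `target` is advanced, which is one way to explore the question posed in topic 2.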




C  CHEST 


Listing  One  (Text  begins  on  page  90. ) 

char    *cpy( dest, src )
register char   *dest, *src;
{
        /*      Works like strcpy but returns a pointer
         *      to the new end of string (i.e., to the null).
         */

        while( *src )
                *dest++ = *src++ ;

        *dest = 0;
        return dest;
}


End Listing One


Listing  Two 


/*      DATE.C          Get the date from DOS
 */

#include <dos.h>

extern int      intdos( union REGS*, union REGS* );

/*----------------------------------------------------------------------*/

date( month, day, year, day_of_the_week )
int     *month, *day, *year, *day_of_the_week ;
{
        /*      Return the month, day, year, and day of the
         *      week (0 = Sunday, 6 = Saturday).
         */

        union REGS      regs;

        regs.h.ah = 0x2a ;

        intdos( &regs, &regs );

        *month           = regs.h.dh ;
        *day             = regs.h.dl ;
        *year            = regs.x.cx ;
        *day_of_the_week = regs.h.al ;
}

/*----------------------------------------------------------------------*/

#ifdef DEBUG

main()
{
        int     month, day, year, day_of_the_week;

        date( &month, &day, &year, &day_of_the_week );

        printf("date is %d/%d/%d, day of the week = %d\n",
                        month, day, year, day_of_the_week );
}

#endif


End  Listing  Two 


Listing  Three 


/*      HASH.H          Header required by the hash functions in hash.c  */

#define MAXNAME 32

typedef struct element_
{
        struct element_  *next;
        struct element_  **prev;
        char             sname[ MAXNAME + 1 ];
}
BUCKET;

typedef struct hash_tab
{
        BUCKET  **table   ;     /* Pointer to hash table                 */
        int     size      ;     /* Max number of elements in table       */
        int     numsyms   ;     /* number of elements currently in table */
}
HASH_TAB;

/*      symname() extracts the name field from a BUCKET:
 *      p is a pointer returned from findsym, evaluates to the
 *      contents of the sname field.
 */

#define symname(p)      ( ((BUCKET*)(p) - 1)->sname )

extern char      *addsym  (HASH_TAB *, char *, int      );
extern void      delsym   (HASH_TAB *, BUCKET *         );
extern char      *findsym (HASH_TAB *, char *           );
extern HASH_TAB  *maketab (unsigned int                 );
extern void      pstats   (HASH_TAB *                   );
extern int       ptab     (HASH_TAB *, void (*)()       );

End Listing Three


Listing Four


#include <stdio.h>
#include <ctype.h>
#include <hash.h>

/*      HASHTAB.C       General-purpose hash table functions.
 *
 *      (C) 1986, Allen I. Holub. All rights reserved.
 *
 *  The hash table structures are defined in /include/hash.h. A HASH_TAB
 *  is a structure that contains the table size, the number of elements in
 *  the table and a pointer to the table itself, this last an array of
 *  BUCKET pointers. Collisions are resolved by putting the BUCKETS into
 *  a doubly linked list:
 *
 *      +---------------+   +---------------+
 *      V               |   V               |
 *    +---+    +------+-+-+---+    +------+-+-+---+
 *    | *-+--->|      | * | *-+--->|      | * | 0 |
 *    +---+    +------+---+---+    +------+---+---+
 *    +---+      name  prev next     name  prev next
 *    +---+
 *
 *  The leftmost box is the array pointed at by the HASH_TAB structure.
 *  It's an array of pointers to BUCKETS. The other boxes are BUCKETS. The
 *  "next" field points at the next bucket in the chain or is NULL if there
 *  isn't another bucket. The "prev" field points at the "next" field of the
 *  previous bucket. In the case of the leftmost bucket, it will point at
 *  the actual hash-table element.
 *
 *  The BUCKET itself is actually a header, similar to the one used by
 *  malloc():
 *
 *              +--------+
 *              | BUCKET |
 *              +--------+
 *              |  user  | <--- pointer returned from addsym() and findsym()
 *              = memory =
 *              |        |
 *              +--------+
 *
 *  The pointer returned by addsym() or findsym() can be used as a
 *  structure pointer by the applications program.
 */

#define max(a,b)  ((a) > (b) ? (a) : (b))
#define min(a,b)  ((a) < (b) ? (a) : (b))

#define WLEN    ( sizeof(unsigned) * 8 )  /* # of bits in an unsigned int */
#define MAXINT  (((unsigned)~0) >> 1)     /* Largest signed integer       */
#define MAXLEN  128                       /* Used by pstat(), max number  */
                                          /* of expected chain lengths.   */

/*----------------------------------------------------------------------
 *  Prototypes for static functions (the global function prototypes are
 *  in hashtab.h):
 */

static unsigned hash    ( char*, HASH_TAB*   );
static int      symcmp  ( BUCKET**, BUCKET** );

/*----------------------------------------------------------------------*/

static unsigned hash( name, tabp )
register char   *name;
HASH_TAB        *tabp;
{
        /*      Compute hash value. The MOD table-length is applied
         *      here, so callers can use the result as an index.
         */

        register unsigned h, g;

        for( h = 0; *name ; h += *name++ )
                ;

        return( h % tabp->size );
}


/*----------------------------------------------------------------------*/

HASH_TAB        *maketab( maxsyms )
unsigned        maxsyms;
{
        /*      Make a hash table of the indicated size. It's a good
         *      idea to make maxsyms a prime number (though that's not
         *      required). Some useful primes are:
         *      47 61 89 113 127 157 193 211 257 293 359 401
         */

        extern HASH_TAB *calloc();
        HASH_TAB        *p;

        if( p = calloc(1, (maxsyms * sizeof(BUCKET *)) + sizeof(HASH_TAB)) )
        {
                p->table   = (BUCKET **)( p + 1 );
                p->size    = maxsyms ;
                p->numsyms = 0       ;
        }
        else
        {
                err("Insufficient memory for symbol table\n");
                exit( 1 );
        }

        return p;
}


/*----------------------------------------------------------------------*/

char            *addsym( tabp, name, size )
HASH_TAB        *tabp;
char            *name;
{
        /*      Add a symbol to the hash table.
         */

        BUCKET  **p, *tmp ;
        BUCKET  *sym;

        if( !(sym = (BUCKET *) calloc( size + sizeof(BUCKET), 1)) )
                ferr("Can't get memory for symbol\n");

        strncpy( sym->sname, name, MAXNAME );

        p = & (tabp->table)[ hash(name, tabp) ];

        tmp       = *p  ;
        *p        = sym ;
        sym->prev = p   ;
        sym->next = tmp ;

        if( tmp )
                tmp->prev = &sym->next ;

        tabp->numsyms++;
        return (char *)(sym + 1);
}


/*----------------------------------------------------------------------*/

char            *findsym( tabp, name )
HASH_TAB        *tabp;
char            *name;
{
        /*      Return a pointer to the hash table element having the
         *      indicated name or NULL if the name isn't in the table.
         *      If more than one such entry is in the table, the most-
         *      recently added one is found.
         */

        BUCKET  *p ;

        p = (tabp->table)[ hash(name, tabp) ];

        while( p && strncmp( name, p->sname, MAXNAME ) )
                p = p->next;

        return (char *)( p ? p + 1 : NULL );
}


void            delsym( tabp, p )
HASH_TAB        *tabp;
BUCKET          *p;
{
        /*      Remove a symbol from the hash table and free the memory.
         *      "p" is a pointer returned from a previous findsym() call
         *      (it will actually be pointing just below the BUCKET header)
         *      and "tabp" is a pointer to the HASH_TAB structure.
         */

        if( !p )
                ferr( "Internal error: bad pointer to delsym()" );

        --tabp->numsyms;
        --p;

        if( *(p->prev) = p->next )
                p->next->prev = p->prev ;

        free(p);
}


static  symcmp( s1, s2 )
BUCKET  **s1, **s2 ;
{
        return( strcmp( (*s1)->sname, (*s2)->sname ) );
}


ptab( tabp, print )             /* Print the hash table, sorted by key. */
HASH_TAB        *tabp;          /* pointer to the table                 */
void            (* print)() ;   /* print routine that is passed the     */
                                /* name and a pointer to the            */
                                /* applications part of a BUCKET.       */
{
        BUCKET  **outtab, **outp, *sym, **symtab ;
        int     i;

        /*      Allocate memory for the outtab, an array of pointers to
         *      BUCKET, and initialize it. The outtab is different from
         *      the actual hash table in that every outtab element points
         *      to a single BUCKET structure, rather than to a linked list
         *      of them.
         */

        if( !(outtab = (BUCKET **) malloc(tabp->numsyms * sizeof(BUCKET*))) )
        {
                err("Insufficient memory to print symbols");
                return;
        }

        outp = outtab;

        for( symtab = tabp->table, i = tabp->size ; --i >= 0 ; symtab++ )
                for( sym = *symtab ; sym ; sym = sym->next )
                        *outp++ = sym;

        /*      Sort the outtab and then print it. The (*outp)+1 in the
         *      print call increments the pointer past the header part
         *      of the BUCKET structure.
         */

        qsort( outtab, tabp->numsyms, sizeof( BUCKET* ), symcmp );

        for( outp = outtab, i = tabp->numsyms; --i >= 0 ; outp++ )
                (*print)( (*outp)->sname, (*outp)+1 );

        free( outtab );
}


/*----------------------------------------------------------------------*/

void            pstats( tabp )
HASH_TAB        *tabp;
{
        /*      Print out various statistics showing the lengths of the
         *      chains (number of collisions) along with the mean depth
         *      of non-empty chains, standard deviation, etc.
         */

        BUCKET  *p;                 /* Pointer to current hash element   */
        int     i;                  /* counter                           */
        int     chain_len;          /* length of current collision chain */
        int     maxlen = 0;         /* maximum chain length              */
        int     minlen = MAXINT;    /* minimum chain length              */
        int     lengths[ MAXLEN ];  /* indexed by chain length, holds    */
                                    /* the # of chains of that length.   */
        int     longer = 0 ;        /* # of chains longer than MAXLEN    */

        memset( lengths, 0, sizeof(lengths) );

        for( i = 0; i < tabp->size ; i++ )
        {
                chain_len = 0;
                for( p = tabp->table[i] ; p ; p = p->next )
                        chain_len++;

                if( chain_len >= MAXLEN )
                        ++longer;
                else
                        ++lengths[chain_len];

                minlen = min( minlen, chain_len );
                maxlen = max( maxlen, chain_len );

                newsample( chain_len );
        }

        printf("%d entries in %d element hash table, ",
                                tabp->numsyms, tabp->size );

        printf("%d (%1.0f%%) empty.\n",
                lengths[0], ((double)lengths[0]/tabp->size) * 100.00);

        printf("Mean chain length: %d, max=%d, min=%d, deviation=%d\n",
                running_mean(), maxlen, minlen, deviation() );

        for( i = 0; i < MAXLEN; i++ )
                if( lengths[i] )
                        printf("%3d chains of length %d\n", lengths[i], i );

        if( longer )
                printf("%3d chains of length %d or longer\n", longer, MAXLEN);
}



/*----------------------------------------------------------------------*/
#ifdef DEBUG

dptab( addr )
HASH_TAB        *addr;
{
        BUCKET  **p, *bukp ;
        int     i;

        printf("HASH_TAB at 0x%04x (%d element table, %d symbols)\n",
                                addr, addr->size, addr->numsyms );

        for( p = addr->table, i = 0 ; i < addr->size ; ++p, ++i )
        {
                if( !*p )
                        continue;

                printf("Htab[%3d] 0x%04x:", i, p);

                for( bukp = *p; bukp ; bukp = bukp->next )
                {
                        printf("* 0x%x (%s) p=0x%x, n=0x%x, user=0x%x\n",
                            bukp, bukp->sname, bukp->prev, bukp->next, bukp+1);

                        printf("        ");
                }

                putchar('\r');
        }
}

#endif
/*----------------------------------------------------------------------*/

#ifdef MAIN

typedef struct
{
        char    str[16];
        int     count;
}
STAB;

/*----------------------------------------------------------------------*/


getword( buf )
char    *buf;
{

#ifdef RANDOM           /* ---- Generate 500 random words ---- */

        static int      wordnum = 500;
        int             num_letters, let;

        if( --wordnum < 0 )
                return 0;

        while( (num_letters = rand() % 16) < 3 )
                ;

        while( --num_letters >= 0 )
        {
                let    = (rand() % 26) + 'a' ;  /* 26 letters in english */
                *buf++ = (rand() % 10) ? let : toupper(let) ;
        }

#else                   /* ---- Get words from standard input ---- */

        int     c;

        while( (c = getchar()) != EOF  &&  !(isalnum(c) || c=='_') )
                ;

        if( c == EOF )
                return 0;
        else
        {
                *buf++ = c;
                while( (c = getchar()) != EOF  &&  (isalnum(c) || c=='_') )
                        *buf++ = c;
        }

#endif

        *buf = '\0' ;
        return 1;
}


main( argc, argv )
{
    char        word[80];
    STAB        *sp;
    HASH_TAB    *tabp;

    tabp = maketab( 127 );

    while( getword( word ) )
    {
        if( sp = (STAB *) findsym( tabp, word ) )
        {
            if( strcmp(sp->str, "123456789abcdef") != 0 )
            {
                printf("NODE HAS BEEN ADULTERATED\n");
                exit( 1 );
            }

            sp->count++;
        }
        else
        {
            sp = (STAB *) addsym( tabp, word, sizeof(STAB) );
            strcpy( sp->str, "123456789abcdef" );
            sp->count = 1;
        }
    }

    pstats( tabp );             /* Print statistics */

#ifdef DEBUG
    dptab( tabp );
#endif

}

#endif

End  Listing  Four 
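
The chain-length bookkeeping done by pstats() above can be tried in isolation. The following is a minimal sketch, not Holub's code: the BUCKET and CHAIN_STATS types and the chain_stats() name are hypothetical stand-ins, and the running mean and deviation that pstats() prints are left out. It walks every chain in the table, counts its links, and records a histogram plus the shortest and longest chain seen.

```c
#include <limits.h>
#include <stddef.h>

#define MAXLEN 8            /* chains of MAXLEN or more share one counter */

typedef struct bucket       /* hypothetical bucket: only the link matters */
{
    struct bucket *next;
} BUCKET;

typedef struct              /* hypothetical result record                 */
{
    int lengths[MAXLEN];    /* lengths[k] = number of chains of length k  */
    int longer;             /* number of chains of length >= MAXLEN       */
    int minlen, maxlen;
} CHAIN_STATS;

void chain_stats(BUCKET **table, int size, CHAIN_STATS *st)
{
    int     i, len;
    BUCKET *p;

    for (i = 0; i < MAXLEN; i++)
        st->lengths[i] = 0;

    st->longer = 0;
    st->maxlen = 0;
    st->minlen = (size > 0) ? INT_MAX : 0;

    for (i = 0; i < size; i++)
    {
        len = 0;
        for (p = table[i]; p; p = p->next)      /* count links in chain i */
            len++;

        if (len >= MAXLEN)
            ++st->longer;
        else
            ++st->lengths[len];

        if (len < st->minlen) st->minlen = len;
        if (len > st->maxlen) st->maxlen = len;
    }
}
```

A table whose entries cluster in a few long chains shows up immediately in the histogram: lengths[0] is large and longer is nonzero, which usually points at a poor hash function or a table size that is too small.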

Listing  Five 

/*      ITOASCII.C
 *
 *      Routines to convert integers to ascii strings.
 *      itoroman converts to a roman numeral.
 *      itoalpha converts to a string of letters such as are used
 *               in an outline.
 *      itoascii does all of the above + arabic format.
 *
 *      None of these routines check for string overflow. The
 *      largest number of characters generated by the routines
 *      are:
 *              format   routine    longest #           #digits
 *
 *              alpha    itoalpha   AVLG                4
 *              roman    itoroman   MMMMDCCCLXXXVIII    16
 *              arabic   -          32767               5
 *              total:   itoascii                       16
 *
 *      The exception to the above is "english" format where the
 *      longest string is:
 *        "minus twenty seven thousand seven hundred seventy-seven"
 *      56 characters including the terminating null.
 *      Allowing an extra space for a potential leading - sign
 *      and another for a terminating blank we get a worst case
 *      of 18 characters.
 */


#define SSIZE   16      /* Maximum size of expanded number */

extern  char    *cpy    (char*, char* );
extern  void    strcpy  (char*, char* );


/*----------------------------------------------------------------------*/

itoeng( dest, uppercase, n )
int     n, uppercase;
char    *dest;
{
    /*  Convert number to english string "one", "two", etc. If
     *  uppercase then the first letter is capitalized. The
     *  number of characters in the string is returned.
     *  Longest possible string is:
     *    "minus twenty seven thousand seven hundred seventy seven"
     *  (56 characters including the null).
     */

    static char *onestab[] =
    {
        "zero",    "one",     "two",       "three",    "four",
        "five",    "six",     "seven",     "eight",    "nine",
        "ten",     "eleven",  "twelve",    "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"
    };

    static char *tenstab[] =
    {
        "",      "ten",   "twenty",  "thirty", "forty",
        "fifty", "sixty", "seventy", "eighty", "ninety"
    };

    char    *first;

    first = dest;

    if( n == 0 )
    {
        cpy( dest, uppercase ? "Zero" : "zero" );
        return( 5 );
    }

    if( n < 0 )
    {
        dest = cpy( dest, "minus " );
        n = -n;
    }

    if( 20000 > n  &&  n >= 10000 )
    {
        dest = cpy( dest, onestab[ n / 1000 ] );
        dest = cpy( dest, (n %= 1000) ? " thousand, " : " thousand" );
    }
    else if( n >= 1000 )
    {
        if( n >= 20000 )
        {
            dest = cpy( dest, (n >= 30000) ? "thirty" : "twenty" );
            if( n %= 10000 )
                dest = cpy( dest, (n >= 1000) ? "-" : " " );
        }

        if( n >= 1000 )
        {
            dest = cpy( dest, onestab[ n / 1000 ] );
            dest = cpy( dest, " " );
        }

        dest = cpy( dest, (n %= 1000) ? "thousand, " : "thousand" );
    }

    if( n >= 100 )
    {
        dest = cpy( dest, onestab[ n/100 ] );
        dest = cpy( dest, (n %= 100) ? " hundred " : " hundred" );
    }

    if( n >= 20 )
    {
        dest = cpy( dest, tenstab[ n / 10 ] );

        if( n %= 10 )
            dest = cpy( dest, "-" );
    }

    if( n )
        dest = cpy( dest, onestab[n] );

    if( uppercase )
        *first = toupper( *first );

    return( (dest - first) + 1 );
}

/*----------------------------------------------------------------------*/


itoroman( dest, uppercase, n )
int     n, uppercase;
char    *dest;
{
    /*  Convert integer to a Roman numeral.
     *  Return number of characters put into dest.
     *
     *  Bugs: 1) Numbers larger than 4999 are not printed. This
     *           is because characters are required which don't
     *           exist in the ASCII character set (i.e. letters
     *           with lines over them).
     *
     *        2) The number 0 is represented as a '0' which didn't
     *           exist in Roman numerals. Similarly, negative
     *           numbers have a preceding sign.
     */

    register char   *cp, **rp, *start;
    register int    i ;

    static char     *rnums[] =
    {
        "",  "C",  "CC",  "CCC",  "CD",
        "D", "DC", "DCC", "DCCC", "CM",

        "",  "X",  "XX",  "XXX",  "XL",
        "L", "LX", "LXX", "LXXX", "XC",

        "",  "I",  "II",  "III",  "IV",
        "V", "VI", "VII", "VIII", "IX"
    };

    start = dest;

    if( n <= -5000 || n >= 5000 )   /* Number can't be represented */
    {
        strcpy( dest, "********" );
        return 0;
    }

    if( n == 0 )                    /* Deal with a zero */
        *dest++ = '0';

    else if( n < 0 )                /* Print preceding - if necessary */
    {
        *dest++ = '-';
        n = -n;
    }

    while( n >= 1000 )              /* Take care of leading thousands */
    {
        *dest++ = (uppercase) ? 'M' : 'm' ;
        n -= 1000;
    }

    rp = rnums;
    for( i = 10*10 ; n>0 && i>=1 ; i/=10 )
    {
        cp  = *(rp + (n/i));        /* Find the appropriate str. */
        n  %= i;
        rp += 10;

        for(; *cp ; cp++)
            *dest++ = (uppercase) ? *cp : *cp + ('a'-'A') ;
    }

    *dest = '\0' ;
    return(dest - start);
}


/*----------------------------------------------------------------------*/

itoalpha( dest, uppercase, n )
int     n, uppercase ;
char    *dest;
{
    /*  Convert integer to an ASCII string as follows:
     *
     *          0, a, b, c, ... y, z, aa, ab, ac, ad ...
     *
     *  Return number of characters in expanded string.
     *
     *  This routine is very similar to itoa(). The problem here
     *  is that the above sequence is not a base 26 number.
     *  That is, there is no equivalent to a zero.
     *  (z + 1 = aa  (if a=0, b=1 ... then z + 1 would = ba )
     *  or to look at it another way, if a=0, b=1, etc. then
     *  aa should equal 0. It doesn't.
     */

    char    scratch[SSIZE+1], *p, *start ;

    start = dest;

    if( n < 0 )
    {
        *dest++ = '-' ;
        n = -n;
    }

    if( n == 0 )
        *dest++ = '0';
    else
    {
        --n ;
        p = scratch;

        do{
            *p++ = (n % 26) + (uppercase ? 'A' : 'a') ;

        } while( (n = (n/26)-1) >= 0 );

        while( --p >= scratch )     /* Copy scratch space to  */
            *dest++ = *p ;          /* destination string.    */
    }

    *dest = '\0' ;
    return(dest - start);
}


/*----------------------------------------------------------------------*/

itoascii( str, fmt, n )
char    *str;
{
    /*  Convert integer "n" to an ascii string according to
     *  "fmt" and put it in "str":
     *
     *          fmt == 'i'  lower case roman numerals
     *          fmt == 'I'  upper case roman numerals
     *          fmt == 'a'  lower case alphabetic
     *          fmt == 'A'  upper case alphabetic
     *          fmt == 'e'  spelled out in lower case.
     *          fmt == 'E'  spelled out w/ 1st char capitalized.
     *          fmt == '1'  arabic, field zero padded to 1 char
     *          fmt == '2'  arabic, field zero padded to 2 char
     *          etc.
     *
     *  Return width of string in str.
     */

    register int    rval    = 0 ;

    register char   *format = "%01d" ;
    register char   *mfmt   = "-%01d" ;

    if( '0' <= fmt  &&  fmt <= '9' )
    {
        if( n < 0 )
        {
            /*  This kludge makes up for a bug in sprintf.
             *  sprintf( str, "%04d", -2 ) loads str with:
             *  "000-2"
             */

            mfmt[3] = fmt;
            n       = -n;
            rval    = sprintf( str, mfmt, n );
        }
        else
        {
            format[2] = fmt;
            rval      = sprintf( str, format, n );
        }
    }
    else if( fmt == 'i'  ||  fmt == 'I' )
    {
        rval = itoroman( str, fmt == 'I', n );
    }
    else if( fmt == 'a'  ||  fmt == 'A' )
    {
        rval = itoalpha( str, fmt == 'A', n );
    }
    else if( fmt == 'e'  ||  fmt == 'E' )
    {
        rval = itoeng( str, fmt == 'E', n );
    }

    return( rval );
}


#ifdef DEBUG

main()
{
    int     n;
    char    str[80] ;

    while( 1 )
    {
        printf("Enter a decimal number: ");
        scanf("%d", &n );

        itoascii( str, '9', n );    printf("%s\n", str );
        itoascii( str, '0', n );    printf("%s\n", str );
        itoascii( str, 'a', n );    printf("%s\n", str );
        itoascii( str, 'A', n );    printf("%s\n", str );
        itoascii( str, 'i', n );    printf("%s\n", str );
        itoascii( str, 'I', n );    printf("%s\n", str );
        itoascii( str, 'e', n );    printf("%s\n", str );
        itoascii( str, 'E', n );    printf("%s\n", str );
    }
}

#endif
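
The comment in itoalpha() notes that the outline sequence a, b, ... z, aa, ab ... is not a true base 26 number because it has no zero digit. The following is a minimal sketch of the same idea written for an ANSI compiler; the name to_alpha() and its argument order are hypothetical, and it handles only positive numbers in lower case. The trick is to subtract one before every division, so that each digit falls in the range 0 to 25 and 'a' can stand for the first value rather than for zero.

```c
#include <string.h>

/* Convert n (n >= 1) to the outline sequence a..z, aa, ab, ...
 * Returns the number of characters written, not counting the null.
 * dest must have room for the digits plus the terminating null.
 */
int to_alpha(int n, char *dest)
{
    char scratch[16];               /* digits, least significant first */
    int  len = 0, i;

    if (n <= 0)
    {
        dest[0] = '\0';
        return 0;
    }

    while (n > 0)
    {
        n--;                        /* shift 1..26 down to 0..25        */
        scratch[len++] = (char)('a' + n % 26);
        n /= 26;
    }

    for (i = 0; i < len; i++)       /* reverse into the destination     */
        dest[i] = scratch[len - 1 - i];

    dest[len] = '\0';
    return len;
}
```

With a 16-bit int the largest input is 32767, which comes out as four letters and agrees with the AVLG entry in the table at the top of ITOASCII.C.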

End Listing Five

Listing Six

#include <stdio.h>
#include <ctype.h>
#include <stdarg.h>

/*      PARSE.C         Expression parser for infix desk calculator
 *
 *      Copyright (c) 1986, Allen I. Holub. All rights reserved.
 *
 *      General purpose expression analyzer. Can evaluate any expression
 *      consisting of numbers and the following operators (listed according
 *      to precedence level):
 *
 *              ()  -  !  'str'str'
 *              *   /  %
 *              +   -
 *              <   <=  >  >=  ==  !=
 *              &&  ||
 *
 *      In a string (i.e., in strcmp()) * and ? work as in DOS.
 *      All operators associate left to right unless () are present.
 *      The top - is a unary minus. exists() evaluates to true if the
 *      filename exists. strcmp() evaluates to true if the strings match.
 *      isfile() evaluates to true if the file exists and is a file (as
 *      compared to a directory). All whitespace is ignored and " counts
 *      as whitespace.
 *
 *      <expr>   ::= <term> <expr1>
 *      <expr1>  ::= && <term> <expr1>
 *               ::= || <term> <expr1>
 *               ::= epsilon
 *
 *      <term>   ::= <fact> <term1>
 *      <term1>  ::= <  <fact> <term1>
 *               ::= <= <fact> <term1>
 *               ::= >  <fact> <term1>
 *               ::= >= <fact> <term1>
 *               ::= == <fact> <term1>
 *               ::= != <fact> <term1>
 *               ::= epsilon
 *
 *      <fact>   ::= <part> <fact1>
 *      <fact1>  ::= + <part> <fact1>
 *               ::= - <part> <fact1>
 *               ::= epsilon
 *
 *      <part>   ::= <const> <part1>
 *      <part1>  ::= * <const> <part1>
 *               ::= / <const> <part1>
 *               ::= epsilon
 *
 *      <const>  ::= ( <expr> )
 *               ::= - ( <expr> )
 *               ::= - <const>
 *               ::= ! <const>
 *               ::= 's1's2'            (* like strcmp(s1,s2) *)
 *               ::= NUMBER
 *
 *      Note that several productions are combined into single
 *      subroutines. For example <expr> and <expr1> are combined into
 *      a single subroutine. Similarly, all right recursion has been
 *      replaced with "for" loops. External subroutines are:
 *
 *      VTYPE parse( expr_p )   char **expr_p ;
 *
 *      which parses the expression in the string pointed to by *expr_p
 *      and returns the parsed value. VTYPE (defined below) is currently
 *      a double but can be changed by modifying the typedef and
 *      recompiling. *expr_p is advanced past the expression. Parsing will
 *      stop when the first character that can't be part of an expression
 *      is encountered. This includes numbers as in: "3+5 6"
 *      where parse will return 8 and *expr_p will be pointing at the '6'.
 *
 *      int startexpr( str )    char *str;
 *
 *      returns true if the first token in str is in FIRST(expr).
 *
 */


extern  double  atof( char * );
extern  long    atol( char * );

typedef double  VTYPE;

static  char    *Str;
static  char    *Start_str;

int     find    (int    );
int     match   (char * );
void    error   (char *, ... );
VTYPE   expr    ();
VTYPE   term    ();
VTYPE   fact    ();
VTYPE   part    ();
VTYPE   const   ();
VTYPE   constant();

97 

98 

fdefine 

advance (amt) 

99 

>; 


/*  Static  global  varialbes 


/*  local  static  subroutines 


/*----------------------------------------------------------------------*/

int     startexpr( str )
char    *str;
{
    /* Returns true if the first token in str is in FIRST(expr). */

    register int c;
    c = *str;

    return( isdigit(c) || c=='(' || c=='-' || c=='.' || c=='!' || c=='\'' );
}

/*----------------------------------------------------------------------*/

static int find( c )
{
    /* Advance Str to c or to EOS */

    while( *Str  &&  *Str != c )
        Str++;

    if( !*Str )
    {
        error("missing %c in expression", c );
        return 0;
    }

    return 1;
}

/*----------------------------------------------------------------------*/

static int match( token )
char    *token;
{
    register char *p;

    while( isspace(*Str) || *Str == '"' )
        Str++;

    for( p = Str; *token && *token == *p ; p++, token++ )
        ;

    return( *token == '\0' );
}

/*----------------------------------------------------------------------*/

static void error( fmt )        /* has a variable number of args */
char    *fmt;
{
    register char   *p;
    va_list         args;


153 

154 

va  start  (  args,  fmt  ) ; 

155 

156 

printf (M%s\nM,  Start_str  ); 

157 

158 

f or (  p  =  Start  str;  p  <  Str; 

P++  ) 

159 

printf ; 

160 

161 

printf (  “  ); 

162 

vprintf(  fmt,  args  ); 

163 

printf  (  "\n"  ); 

164  } 


/*----------------------------------------------------------------------*/

static VTYPE expr()
{
    VTYPE   left;

    left = term();

    for(;;)
    {
        if     ( match("&&") )  { advance(2); left = term() && left; }
        else if( match("||") )  { advance(2); left = term() || left; }
        else break;
    }

    return left;
}

/*----------------------------------------------------------------------*/

static VTYPE term()
{
    VTYPE   left;

    left = fact();

    for(;;)
    {
        if     ( match("<=") )  { advance(2); left = left <= fact(); }
        else if( match("<" ) )  { advance(1); left = left <  fact(); }
        else if( match(">=") )  { advance(2); left = left >= fact(); }
        else if( match(">" ) )  { advance(1); left = left >  fact(); }
        else if( match("==") )  { advance(2); left = left == fact(); }
        else if( match("!=") )  { advance(2); left = left != fact(); }
        else break;
    }

    return left;
}

/*----------------------------------------------------------------------*/

static VTYPE fact()
{
    VTYPE   left;

    left = part();

    for(;;)
    {
        if     ( match("+") )   { advance(1); left += part(); }
        else if( match("-") )   { advance(1); left -= part(); }
        else break;
    }

    return left;
}


/*----------------------------------------------------------------------*/

static VTYPE part()
{
    VTYPE   left;
    VTYPE   tmp;

    left = const();

    for(;;)
    {
        if( match("*") )
        {
            advance(1);
            left *= part();
        }
        else if( match("%") )
        {
            advance(1);
            left = (long)left % (long)part();
        }
        else if( match("/") )
        {
            advance(1);

            if( tmp = part() )
                left /= tmp;
            else
                error( "Divide by 0\n" );
        }
        else break;
    }

    return left;
}

/*----------------------------------------------------------------------*/

static VTYPE const()
{
    register VTYPE  rval = 0;
    int             sign = 1, logical_not = 0;
    static char     *s1, *s2;

    if( match("-") )    { advance(1); sign        = -1; }
    if( match("!") )    { advance(1); logical_not =  1; }

    if( match("(") )
    {
        advance(1);
        rval = expr();

        if( match(")") )
            advance(1);
        else
            error("Mis-matched parenthesis\n");
    }
    else if( match("'") )
    {
        s1 = ++Str;

        if( !find( '\'' ) )     goto abort;

        *Str++ = 0;
        s2     = Str;

        if( !find( '\'' ) )     goto abort;

        *Str++ = 0;

        rval = strcmp( s1, s2 );

        s2 [-1] = '\'';
        Str[-1] = '\'';
    }
    else
    {
        rval = ( sizeof(VTYPE) == sizeof(double) )
                                ? atof( Str )
                                : (VTYPE) atol( Str )
                                ;

        while( isdigit(*Str) || *Str == '.' )
            Str++;
    }

abort:
    return( logical_not ? !rval : rval * sign );
}

/*----------------------------------------------------------------------*/

VTYPE   parse( expr_p )
char    **expr_p;
{
    /*  Return the value of "expression" or 0 if any errors were
     *  found in the string. "*Err" is set to the number of errors.
     *  "Parse" is the "access routine" for expr(). By using it you
     *  need not know about any of the global variables used by expr().
     */

    VTYPE   rval;

    Start_str = Str = *expr_p ;

    if( !Str || !*Str )
        return 0;

    rval = expr();

    *expr_p = Str;
    return rval;
}


#ifdef MAIN

main( argc, argv )
char    **argv;
{
    /*  Desk calculator program to test parse(). If no cmd line
     *  arguments are present, works in interactive mode and
     *  exits with a 0 status. If command line arguments are
     *  present, the expressions (one per argv entry) are
     *  evaluated and the result printed. The exit status is
     *  the result of the last expression evaluated, truncated
     *  to unsigned char (0-255). If -s is specified, nothing
     *  is printed but the same exit status is returned. You
     *  can then test $status in a shell script to get the
     *  status. For example, the following script prints "50":
     *
     *          expr -s "10 * 5"
     *          echo $status
     */

    char    buf[128], *p ;
    int     silent = 0;
    VTYPE   rval;

    if( argc == 1 )
    {
        printf("Enter expression or blank line to exit\n>");

        for( ; gets(buf); printf(">") )
        {
            if( !*buf )
                break;

            if( !startexpr(buf) )
                printf( "%s not an expression\n", buf );

            p = buf;
            printf( "%g\n", parse( &p ) );

#ifdef DEBUG
            printf("tail = <%s>\n", p );
#endif
        }
    }
    else
    {
        if( argv[1][0] == '-'  &&  argv[1][1] == 's' )
        {
            silent = 1;
            argv++;
            argc--;
        }

        for( p = *++argv; --argc > 0 ; p = *++argv )
        {
            rval = parse( &p );

            if( !silent )
                printf( "%s = %g\n", *argv, rval );
        }
    }

    exit( (unsigned char) rval );
}

#endif

End  Listing  Six 
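
The header comment of PARSE.C says that each pair of productions is folded into one subroutine and that right recursion is replaced by "for" loops. The following is a minimal sketch of that structure, cut down to the four arithmetic operators, unary minus and parentheses. It is not Holub's code: the names eval(), sum(), product() and factor() are hypothetical, strtod() stands in for the atof()/skip-digits pair, and division by zero simply yields 0 instead of calling an error routine.

```c
#include <ctype.h>
#include <stdlib.h>

static const char *Cur;             /* current position in the input   */

static double sum(void);

static void skip_blanks(void)
{
    while (isspace((unsigned char)*Cur))
        Cur++;
}

static double factor(void)          /* ( sum ) | - factor | NUMBER     */
{
    double  v;
    char   *end;

    skip_blanks();

    if (*Cur == '-') { Cur++; return -factor(); }

    if (*Cur == '(')
    {
        Cur++;
        v = sum();
        skip_blanks();
        if (*Cur == ')')
            Cur++;
        return v;
    }

    v   = strtod(Cur, &end);        /* end is left just past the number */
    Cur = end;
    return v;
}

static double product(void)         /* factor { (*|/) factor }         */
{
    double left = factor(), d;

    for (;;)                        /* loop replaces right recursion   */
    {
        skip_blanks();
        if      (*Cur == '*') { Cur++; left *= factor(); }
        else if (*Cur == '/') { Cur++; d = factor();
                                left = (d != 0.0) ? left / d : 0.0; }
        else break;
    }
    return left;
}

static double sum(void)             /* product { (+|-) product }       */
{
    double left = product();

    for (;;)
    {
        skip_blanks();
        if      (*Cur == '+') { Cur++; left += product(); }
        else if (*Cur == '-') { Cur++; left -= product(); }
        else break;
    }
    return left;
}

double eval(const char *s)          /* access routine, like parse()    */
{
    Cur = s;
    return sum();
}
```

Because each loop accumulates into left as it goes, operators of equal precedence associate left to right, so "10 - 4 - 3" evaluates to 3. As in parse(), evaluation stops at the first character that cannot continue the expression, so "3+5 6" yields 8.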


Listing  Seven 

#include <ctype.h>

char    *skipspace( p, esc )
register char   *p ;
register int    esc ;
{
    /*
     *  Skip all ' ' characters. \<space>, where \ is the
     *  "esc" character, doesn't count as a space.
     */

    while( *p == ' ' )
    {
        if( *p != esc )
            p++ ;

        else if( *++p )         /* skip escaped characters */
            p++;
    }

    return (p) ;
}


End  Listing  Seven 


Listing  Eight 


char    *skipto( c, p, esc )
register char   *p ;
register int    c, esc ;
{
    /*  Skip to c or to end of string. If c is preceded by
     *  the esc character it is skipped. Return a pointer
     *  to c.
     */

    while( *p  &&  *p != c )
    {
        if( *p != esc )
            p++ ;

        else if( *++p )         /* skip escaped characters */
            p++;
    }

    return (p) ;
}

End Listing Eight


Listing Nine


uatoi(s)
char    **s;
{
    /* Like atoi but updates s to point past the number. */

    register char   *str;
    register int    num = 0 ;

    for( str = *s ; '0' <= *str && *str <= '9'; str++ )
        num = (num * 10) + (*str - '0') ;

    *s = str;
    return num;
}

#ifdef DEBUG
#include <stdio.h>

main()
{
    char    buf[80], *bp;
    int     i ;

    while( 1 )
    {
        printf("enter string: ");
        gets(buf);
        bp = buf;
        i  = uatoi( &bp );

        printf("num = %d, bp = <%s>\n", i, bp );
    }
}
#endif


End  Listings 
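
Because uatoi() leaves its argument pointing just past the digits, several numbers can be pulled out of one string without rescanning. The following is a minimal sketch of that usage pattern in ANSI C; scan_uint() is a restatement of uatoi() with prototypes, and parse_dims() with its "WIDTHxHEIGHT" input format is a hypothetical example, not part of the listings.

```c
/* Like uatoi(): convert leading digits, advance *s past them. */
int scan_uint(const char **s)
{
    const char *p   = *s;
    int         num = 0;

    while ('0' <= *p && *p <= '9')
        num = num * 10 + (*p++ - '0');

    *s = p;
    return num;
}

/* Parse a string such as "640x480" into two integers.
 * Returns 1 if the whole string was consumed, 0 otherwise.
 */
int parse_dims(const char *text, int *w, int *h)
{
    const char *p = text;

    *w = scan_uint(&p);

    if (*p != 'x')              /* the separator must follow the width */
        return 0;
    p++;

    *h = scan_uint(&p);

    return *p == '\0';          /* nothing may trail the height        */
}
```

The caller never has to count digits; the updated pointer says where the next field starts.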




STRUCTURED  PROGRAMMING 


Listing  One  (Text  begins  on  page  124.) 

Listing 1. Turbo Pascal listing of a four-function calculator program.

PROGRAM Calculate;
(* Four function calculator example *)

VAR OpError : BOOLEAN;
    Operation, OK : CHAR;
    X, Y, Result : REAL;

BEGIN
  REPEAT
    ClrScr;
    WRITE('Enter first number ');  READLN(X); WRITELN;
    WRITE('Enter operation ');  READLN(Operation); WRITELN;
    WRITE('Enter second number ');  READLN(Y); WRITELN;
    OpError := FALSE;
    CASE Operation OF
      '+' : Result := X + Y;
      '-' : Result := X - Y;
      '*' : Result := X * Y;
      '/' : IF Y <> 0.
              THEN
                Result := X / Y
              ELSE BEGIN
                WRITELN('Cannot divide by zero!');
                WRITELN;
                OpError := TRUE
              END
      ELSE OpError := TRUE;
    END; (* CASE *)
    IF NOT OpError THEN BEGIN
      WRITELN(X, ' ', Operation, ' ', Y, ' = ', Result);
      WRITELN;
    END;
    WRITE('Perform another operation? (Y/N) ');
    READLN(OK); WRITELN;
  UNTIL (OK <> 'Y') AND (OK <> 'y');
END.


End  Listing  One 


Listing  Two 

Listing 2. Modula-2 listing of a four-function calculator program.


MODULE Calculate;

FROM ScreenHandler
  IMPORT ClrEol, ClrScr, DelLine, InsLine, GotoXY, WhereX, WhereY,
         CrtInit, CrtExit, LowVideo, NormVideo, HighVideo, SetAttribute,
         GetAttribute, normalAtt, boldAtt, reverseAtt, underlineAtt,
         blinkAtt, boldUnderlineAtt, blinkUnderlineAtt, boldBlinkAtt,
         reverseBlinkAtt, boldUnderlineBlinkAtt;

FROM TRealIO IMPORT ReadReal, WriteReal;

FROM TTextIO
  IMPORT ReadInt, ReadCard, ReadChar, ReadString, ReadLn, ReadBuffer,
         WriteInt, WriteCard, WriteChar, WriteString, WriteBool, WriteLn,
         Eoln, SeekEof, SeekEoln;

FROM TKernelIO
  IMPORT File, FileType, OptionMode,
         StatusProc, ReadProc, WriteProc, ErrorProc,
         stdinout, input, output, con, trm, kbd, lst, aux, usr,
         conStPtr, conInPtr, auxInPtr, usrInPtr, conOutPtr, lstOutPtr,
         auxOutPtr, usrOutPtr, errorPtr, IOresult, KeyPressed, IOBuffer,
         IOCheck, DeviceCheck, CtrlC, InputFileBuffer, OutputFileBuffer;

(* Four function calculator example *)

VAR
  OpError: BOOLEAN;
  Operation, OK: CHAR;
  X, Y, Result: REAL;

BEGIN
  REPEAT
    ClrScr;
    WriteString(stdinout, 'Enter first number ', 0);
    ReadBuffer(on);
    ReadReal(stdinout, X);
    ReadLn(stdinout);
    ReadBuffer(off);
    WriteLn(stdinout);
    WriteString(stdinout, 'Enter operation ', 0);
    ReadBuffer(on);
    ReadChar(stdinout, Operation);
    ReadLn(stdinout);
    ReadBuffer(off);
    WriteLn(stdinout);
    WriteString(stdinout, 'Enter second number ', 0);
    ReadBuffer(on);
    ReadReal(stdinout, Y);
    ReadLn(stdinout);
    ReadBuffer(off);
    WriteLn(stdinout);
    OpError := FALSE;
    CASE Operation OF
      '+':
        Result := X+Y
    | '-':
        Result := X-Y
    | '*':
        Result := X*Y
    | '/':
        IF Y <> 0. THEN
          Result := X/Y
        ELSE
          WriteString(stdinout, 'Cannot divide by zero!', 0);
          WriteLn(stdinout);
          WriteLn(stdinout);
          OpError := TRUE
        END
    ELSE
      OpError := TRUE;
    END; (* CASE *)
    IF NOT OpError THEN
      WriteReal(stdinout, X, 18, -10);
      WriteString(stdinout, ' ', 0);
      WriteChar(stdinout, Operation, 0);
      WriteString(stdinout, ' ', 0);
      WriteReal(stdinout, Y, 18, -10);
      WriteString(stdinout, ' = ', 0);
      WriteReal(stdinout, Result, 18, -10);
      WriteLn(stdinout);
      WriteLn(stdinout);
    END;
    WriteString(stdinout, 'Perform another operation? (Y/N) ', 0);
    ReadBuffer(on);
    ReadChar(stdinout, OK);
    ReadLn(stdinout);
    ReadBuffer(off);
    WriteLn(stdinout);
  UNTIL (OK <> 'Y') AND (OK <> 'y');
END Calculate.

Listing  Three 

Listing 3. Turbo Pascal program for text pattern matching.


PROGRAM Pattern_Search_Test;
(*$V-*)

CONST MAX = 100;
      DEFAULT_LINE = 'Namir Clement Shammas';

TYPE STRING40 = STRING[40];
     STRING80 = STRING[80];
     STRING255 = STRING[255];

VAR Line : STRING255;
    Pattern : STRING40;

FUNCTION Pattern_Search(Text_Line : STRING255; Pattern : STRING40) : INTEGER;
(* Scan Text_Line with Pattern string, containing possible *)
(* combination of wildcards. *)

VAR Num_Tokens : INTEGER;
    Token : ARRAY [1..MAX] OF STRING40;




PROCEDURE INC(VAR A : INTEGER);
BEGIN
  A := A + 1;
END;


FUNCTION Offset_Pos(Str1, Str2 : STRING80; Ptr : INTEGER) : INTEGER;
VAR Ptr2 : INTEGER;
BEGIN
  Delete(Str1,1,Ptr-1);
  Ptr2 := Pos(Str2, Str1);
  IF Ptr2 > 0 THEN Ptr2 := Ptr2 + Ptr - 1;
  Offset_Pos := Ptr2;
END; (* Offset_Pos *)


PROCEDURE Scan_Pattern;
VAR Char_Pos, Pattern_Length, Ptr, Ptr2 : INTEGER;
BEGIN
  Char_Pos := 1; Num_Tokens := 0; Ptr := 1;
  Pattern_Length := Length(Pattern);
  WHILE Char_Pos <= Pattern_Length DO BEGIN
    CASE Pattern[Char_Pos] OF
      '?' : BEGIN
              IF Char_Pos > Ptr
                THEN BEGIN
                  INC(Num_Tokens);
                  Token[Num_Tokens] :=
                    Copy(Pattern, Ptr, (Char_Pos - Ptr));
                END; (* IF *)
              Ptr2 := Char_Pos;
              WHILE Pattern[Ptr2] = '?' DO INC(Ptr2);
              INC(Num_Tokens);
              Token[Num_Tokens] :=
                Copy(Pattern,Char_Pos,(Ptr2 - Char_Pos));
              Ptr := Ptr2; Char_Pos := Ptr2
            END;
      '*' : BEGIN
              (* Resolve any pending strings *)
              IF Char_Pos > Ptr
                THEN BEGIN
                  INC(Num_Tokens);
                  Token[Num_Tokens] :=
                    Copy(Pattern,Ptr,(Char_Pos - Ptr));
                END;
              INC(Num_Tokens);
              Token[Num_Tokens] := Pattern[Char_Pos];
              Ptr := Char_Pos + 1;
            END;
    END; (* CASE *)
    INC(Char_Pos)
  END; (* WHILE *)
  (* Store any trailing characters *)
  IF Char_Pos > Ptr
    THEN BEGIN
      INC(Num_Tokens);
      Token[Num_Tokens] := Copy(Pattern,Ptr,(Pattern_Length - Ptr + 1));
    END;
END; (* Scan_Pattern *)


FUNCTION Locate_Pattern : INTEGER;

VAR I, First_Char, Ptr, Ptr2 : INTEGER;

BEGIN
  First_Char := 0; Ptr := 1; Ptr2 := 1; I := 1;
  WHILE I <= Num_Tokens DO BEGIN
    IF Pos('?',Token[I]) > 0
      THEN BEGIN
        (* Sub-pattern has one or more '?' *)
        Ptr := Ptr + Length(Token[I]);
        (* does the text following the match ? *)
        Ptr2 := Offset_Pos(Line, Token[I+1], Ptr);
        IF Ptr <> Ptr2 THEN Ptr2 := 0;
        INC(I);



    IF (Pos('?',Token[I]) > 0) OR (Token[I] <> '*')
      THEN Ptr2 := Offset_Pos(Line, Token[I], Ptr);
    IF (Token[I] <> '*')
      THEN BEGIN
        IF (Ptr2 = 0) AND (First_Char > 0)
          THEN BEGIN
            First_Char := 0;
            I := I - 1
          END
          ELSE
            IF (Ptr2 = 0) AND (First_Char = 0)
              THEN
                I := Num_Tokens
              ELSE BEGIN
                IF (First_Char = 0) THEN First_Char := Ptr2;
                Ptr := Ptr2 + Length(Token[I]);
              END;
      END;
    INC(I);
  END; (* WHILE I *)
  Locate_Pattern := First_Char;
END; (* Locate_Pattern *)

BEGIN (* Pattern_Search *)
  Scan_Pattern;
  Pattern_Search := Locate_Pattern
END; (* Pattern_Search *)


BEGIN (* - MAIN - *)
  ClrScr;
  WRITELN('Default string is : ',DEFAULT_LINE); WRITELN;
  WRITE('Enter string '); READLN(Line); WRITELN;
  IF Line = '' THEN Line := DEFAULT_LINE;
  WRITE('Enter search pattern string '); READLN(Pattern); WRITELN;
  WRITELN('Matches at position ', Pattern_Search(Line,Pattern));
  WRITELN;
END.

End Listing Three

Listing  Four 

Listing  4.  Modula-2  program  for  text  pattern  matching. 


MODULE PatternSearchTest;

FROM Strings
  IMPORT Assign, Insert, Delete, Pos, Copy, Concat, Length, CompareStr;
FROM ScreenHandler
  IMPORT ClrEol, ClrScr, DelLine, InsLine, GotoXY, WhereX, WhereY,
         CrtInit, CrtExit, LowVideo, NormVideo, HighVideo, SetAttribute,
         GetAttribute, normalAtt, boldAtt, reverseAtt, underlineAtt,
         blinkAtt, boldUnderlineAtt, blinkUnderlineAtt, boldBlinkAtt,
         reverseBlinkAtt, boldUnderlineBlinkAtt;
FROM TTextIO
  IMPORT ReadInt, ReadCard, ReadChar, ReadString, ReadLn, ReadBuffer,
         WriteInt, WriteCard, WriteChar, WriteString, WriteBool, WriteLn,
         Eoln, SeekEof, SeekEoln;
FROM TKernelIO
  IMPORT File, FileType, OptionMode,
         StatusProc, ReadProc, WriteProc, ErrorProc,
         stdinout, input, output, con, trm, kbd, lst, aux, usr,
         conStPtr, conInPtr, auxInPtr, usrInPtr, conOutPtr, lstOutPtr,
         auxOutPtr, usrOutPtr, errorPtr, IOresult, KeyPressed, IOBuffer,
         IOCheck, DeviceCheck, CtrlC, InputFileBuffer, OutputFileBuffer;

(* Copyright (c) 1986, Namir Clement Shammas *)

(* Compiler Directive not supported (*$V-*) *)
(* See the MODULA-2 feature 'Open Array Parameters' *)

CONST
  MAX = 100;
  HI = 32000;  (*--> line added *)

TYPE
  STRING40 = ARRAY [0..40-1] OF CHAR;



  STRING80 = ARRAY [0..80-1] OF CHAR;
  STRING255 = ARRAY [0..255-1] OF CHAR;

VAR
  Line, DEFAULTLINE: STRING255;
  Pattern: STRING40;

(* Function to scan Text_Line with Pattern string, containing possible *)
(* combination of wildcards. *)

PROCEDURE PatternSearch(TextLine: (*--> STRING255 *) ARRAY OF CHAR;
                        Pattern: (*--> STRING40 *) ARRAY OF CHAR): INTEGER;

VAR
  NumTokens: INTEGER;
  Token: ARRAY [1..MAX] OF STRING40;

  (*--> PROCEDURE INC was removed *)

  PROCEDURE OffsetPos(Str1, Str2: (* STRING80 *) ARRAY OF CHAR;
                      Ptr: INTEGER): INTEGER;

  VAR
    Ptr2: INTEGER;
  VAR OffsetPosResult: INTEGER;
  BEGIN
    (*--> Delete(Str1, 1, Ptr-1); *)
    IF Ptr > 0 THEN Delete(Str1, 0, Ptr-1) END;
    Ptr2 := Pos(Str2, Str1);
    (*--> IF Ptr2 > 0 THEN *)
    IF CARDINAL(Ptr2) <= HIGH(Str1) THEN
      Ptr2 := Ptr2 + Ptr - 1;
    ELSE (*--> ELSE clause added *)
      Ptr2 := HI;
    END;
    OffsetPosResult := Ptr2;
    RETURN OffsetPosResult
  END OffsetPos; (* Offset_Pos *)


  PROCEDURE ScanPattern;
  VAR
    CharPos, PatternLength, Ptr, Ptr2: INTEGER;
  BEGIN
    (*--> CharPos := 1; *) CharPos := 0;
    NumTokens := 0;
    (*--> Ptr := 1; *) Ptr := 0;
    PatternLength := Length(Pattern);
    (*--> WHILE CharPos <= PatternLength DO *)
    WHILE CharPos < PatternLength DO
      CASE Pattern[CharPos] OF
        '?' :
          IF CharPos > Ptr THEN
            INC(NumTokens);
            Copy(Pattern, Ptr, (CharPos-Ptr), Token[NumTokens]);
          END; (*--> IF *)
          Ptr2 := CharPos;
          WHILE Pattern[Ptr2] = '?' DO
            INC(Ptr2)
          END;
          INC(NumTokens);
          Copy(Pattern, CharPos, (Ptr2-CharPos), Token[NumTokens]);
          Ptr := Ptr2;
          CharPos := Ptr2
      | '*' :
          (* Resolve any pending strings *)
          IF CharPos > Ptr THEN
            INC(NumTokens);



            Copy(Pattern, Ptr, (CharPos-Ptr), Token[NumTokens]);
          END;
          INC(NumTokens);
          Token[NumTokens, 0] := Pattern[CharPos];
          Token[NumTokens, 1] := 0C;
          Ptr := CharPos+1;
      ELSE
      END; (* CASE *)
      INC(CharPos)
    END; (* WHILE *)
    (* Store any trailing characters *)
    IF CharPos > Ptr THEN
      INC(NumTokens);
      Copy(Pattern, Ptr, (PatternLength-Ptr+1), Token[NumTokens]);
    END;
  END ScanPattern; (* Scan_Pattern *)


  PROCEDURE LocatePattern(): INTEGER;
  VAR
    I, FirstChar, Ptr, Ptr2: INTEGER;
  VAR LocatePatternResult: INTEGER;
  BEGIN
    FirstChar := 0;
    (*--> Ptr := 1; *) Ptr := 0;
    (*--> Ptr2 := 1; *) Ptr2 := 0;
    I := 1;
    WHILE I <= NumTokens DO
      IF
      (*--> INTEGER(Pos('?', Token[I])) > 0 THEN *)
      Pos('?', Token[I]) <= HIGH(Token[I]) THEN
        (* Sub-pattern has one or more '?' *)
        INC(Ptr, INTEGER(Length(Token[I])));
        (* does the text following the match ? *)
        Ptr2 := OffsetPos(Line, Token[I+1], Ptr);
        IF Ptr <> Ptr2 THEN
          (*--> Ptr2 := 0 *) Ptr2 := HI
        END;
        INC(I);
      END;
      (*--> IF (INTEGER(Pos('?', Token[I])) > 0) OR (Token[I] <> '*') THEN *)
      IF (Pos('?', Token[I]) <= HIGH(Token[I])) OR
         (Token[I,0] <> '*') THEN
        Ptr2 := OffsetPos(Line, Token[I], Ptr)
      END;
      (*--> IF (Token[I] <> '*') THEN *)
      IF Token[I,0] <> '*' THEN
        IF (*--> (Ptr2 = 0) AND (FirstChar > 0) THEN *)
           (Ptr2 = HI) AND (FirstChar < HI) THEN
          (*--> FirstChar := 0; *) FirstChar := HI;
          DEC(I, 1)
        ELSIF (*--> (Ptr2 = 0) AND (FirstChar = 0) THEN *)
           (Ptr2 = HI) AND (FirstChar = HI) THEN
          I := NumTokens
        ELSE
          IF (*--> (FirstChar = 0) *)
             (FirstChar = HI) THEN
            FirstChar := Ptr2
          END;
          Ptr := Ptr2+INTEGER(Length(Token[I]));
        END;
      END;
      INC(I);
    END; (* WHILE I *)
    LocatePatternResult := FirstChar;
    RETURN LocatePatternResult
  END LocatePattern; (* Locate_Pattern *)

VAR PatternSearchResult: INTEGER;

BEGIN (* Pattern_Search *)
  ScanPattern;


  PatternSearchResult := LocatePattern();
  RETURN PatternSearchResult
END PatternSearch; (* Pattern_Search *)

BEGIN (* - MAIN - *)
  ClrScr;
  DEFAULTLINE := 'Namir Clement Shammas';
  WriteString(stdinout, 'Default string is : ', 0);
  WriteString(stdinout, DEFAULTLINE, 0);
  WriteLn(stdinout);
  WriteLn(stdinout);
  WriteString(stdinout, 'Enter string ', 0);
  ReadBuffer(on);
  ReadString(stdinout, Line);
  ReadLn(stdinout);
  ReadBuffer(off);
  WriteLn(stdinout);
  IF (*--> Line = '' THEN *)
     Line[0] = 0C THEN
    (*--> Line := DEFAULTLINE *)
    Assign(DEFAULTLINE, Line)
  END;
  WriteString(stdinout, 'Enter search pattern string ', 0);
  ReadBuffer(on);
  ReadString(stdinout, Pattern);
  ReadLn(stdinout);
  ReadBuffer(off);
  WriteLn(stdinout);
  WriteString(stdinout, 'Matches at position ', 0);
  WriteInt(stdinout, (PatternSearch(Line, Pattern) + 1), 0);
  WriteLn(stdinout);
  WriteLn(stdinout);
END PatternSearchTest.

End Listing Four


Listing  Five 

Listing  5.  Turbo  Pascal  program  that  uses  sets  for  character-counting 
histograms. 

PROGRAM Sets;

(* Program for the demonstration of translating sets *)
(* by Namir C. Shammas *)

TYPE CharSet = SET OF CHAR;
     STRING30 = STRING[30];
     LSTRING = STRING[255];

VAR DigitSet, UpperCaseSet, LowerCaseSet : CharSet;
    OK, C : CHAR;
    I, J, Count_Digits, Count_Upper,
    Count_Lower, Count_Others : INTEGER;
    Filename : STRING30;
    Line : LSTRING;
    InFile : TEXT;

PROCEDURE Display_Histogram(Row, Count : INTEGER);
BEGIN
  GOTOXY((11 + Count div 100), Row);
  WRITE('*');
END;

BEGIN
  REPEAT
    DigitSet := ['0'..'9'];
    UpperCaseSet := ['A'..'Z'];
    LowerCaseSet := ['a'..'z'];
    WRITE('Enter filename ');
    READLN(Filename); WRITELN;
    ClrScr;
    WRITELN('Digits ');
    WRITELN('Uppercase ');
    WRITELN('Lowercase ');
    WRITELN('Others ');
    Count_Digits := 0;
    Count_Upper := 0;
    Count_Lower := 0;
    Count_Others := 0;
    Assign(InFile, Filename);




    Reset(InFile);
    WHILE NOT EOF(InFile) DO BEGIN
      READLN(InFile, Line);
      FOR I := 1 TO Length(Line) DO BEGIN
        C := Line[I];
        IF C IN DigitSet THEN
          Count_Digits := Count_Digits + 1
        ELSE IF C IN UpperCaseSet THEN
          Count_Upper := Count_Upper + 1
        ELSE IF C IN LowerCaseSet THEN
          Count_Lower := Count_Lower + 1
        ELSE
          Count_Others := Count_Others + 1;
      END;
      Display_Histogram(1, Count_Digits);
      Display_Histogram(2, Count_Upper);
      Display_Histogram(3, Count_Lower);
      Display_Histogram(4, Count_Others);
    END;
    Close(InFile);
    GOTOXY(1,20); WRITE('Want to scan another file? (Y/N) ');
    READLN(OK);
  UNTIL NOT (OK IN ['Y','y']);
  GOTOXY(1,20); ClrEol; WRITELN('End of program');
END.

End Listing Five

Listing  Six 

Listing  6.  Modula-2  program  that  uses  sets  for  character-counting 
histograms. 


MODULE Sets;

FROM Strings
  IMPORT Assign, Insert, Delete, Pos, Copy, Concat, Length, CompareStr;
FROM ScreenHandler
  IMPORT ClrEol, ClrScr, DelLine, InsLine, GotoXY, WhereX, WhereY,
         CrtInit, CrtExit, LowVideo, NormVideo, HighVideo, SetAttribute,
         GetAttribute, normalAtt, boldAtt, reverseAtt, underlineAtt,
         blinkAtt, boldUnderlineAtt, blinkUnderlineAtt, boldBlinkAtt,
         reverseBlinkAtt, boldUnderlineBlinkAtt;
FROM TFileIO
  IMPORT Append, AssignFile, Close, Erase, Flush, Rename,
         Reset, Rewrite, Truncate, Eof;
FROM TTextIO
  IMPORT ReadInt, ReadCard, ReadChar, ReadString, ReadLn, ReadBuffer,
         WriteInt, WriteCard, WriteChar, WriteString, WriteBool, WriteLn,
         Eoln, SeekEof, SeekEoln;
FROM TKernelIO
  IMPORT File, FileType, OptionMode,
         StatusProc, ReadProc, WriteProc, ErrorProc,
         stdinout, input, output, con, trm, kbd, lst, aux, usr,
         conStPtr, conInPtr, auxInPtr, usrInPtr, conOutPtr, lstOutPtr,
         auxOutPtr, usrOutPtr, errorPtr, IOresult, KeyPressed, IOBuffer,
         IOCheck, DeviceCheck, CtrlC, InputFileBuffer, OutputFileBuffer;

(* The following two IMPORT statements are manually added *)
FROM LongSet
  IMPORT BuildSet, InSet, SetOfChar, MakeEmptySet, Include;
FROM SYSTEM IMPORT WORD;

(* Program for the demonstration of translating sets *)
(* by Namir C. Shammas *)

TYPE
  STRING30 = ARRAY [0..30-1] OF CHAR;
  LSTRING = ARRAY [0..255-1] OF CHAR;

VAR
  DigitSet, UpperCaseSet, LowerCaseSet,
  YesNo (*--> this one is added *) : SetOfChar;
  OK, C: CHAR;
  I, J, CountDigits, CountUpper, CountLower, CountOthers: INTEGER;
  Filename: STRING30;
  Line: LSTRING;
  InFile: File;

PROCEDURE DisplayHistogram(Row, Count: INTEGER);
BEGIN



  GotoXY((11+(Count DIV 100)), Row);
  WriteChar(stdinout, '*', 0);
END DisplayHistogram;

BEGIN
  REPEAT
    (* The following three statements were edited from the original lines *)
    BuildSet(DigitSet, ORD('0'), ORD('9'));
    BuildSet(UpperCaseSet, ORD('A'), ORD('Z'));
    BuildSet(LowerCaseSet, ORD('a'), ORD('z'));
    (* The next three lines are inserted to support sets for Yes/No answers *)
    MakeEmptySet(YesNo);
    Include(YesNo, WORD(ORD('y')));
    Include(YesNo, WORD(ORD('Y')));
    WriteString(stdinout, 'Enter filename ', 0);
    ReadBuffer(on);
    ReadString(stdinout, Filename);
    ReadLn(stdinout);
    ReadBuffer(off);
    WriteLn(stdinout);
    ClrScr;
    WriteString(stdinout, 'Digits ', 0);
    WriteLn(stdinout);
    WriteString(stdinout, 'Uppercase ', 0);
    WriteLn(stdinout);
    WriteString(stdinout, 'Lowercase ', 0);
    WriteLn(stdinout);
    WriteString(stdinout, 'Others ', 0);
    WriteLn(stdinout);
    CountDigits := 0;
    CountUpper := 0;
    CountLower := 0;
    CountOthers := 0;
    AssignFile(InFile, Filename, text);
    Reset(InFile, 0);
    WHILE NOT Eof(InFile) DO
      ReadString(InFile, Line);
      ReadLn(InFile);
      (* Loop limits were shifted by 1 from Pascal version *)
      FOR I := 0 TO Length(Line)-1 DO
        C := Line[I];
        (* InSet() is used to test char C instead *)
        (* of BITSET() generated by the Translator *)
        IF InSet(DigitSet, ORD(C)) THEN
          INC(CountDigits, 1)
        ELSIF InSet(UpperCaseSet, ORD(C)) THEN
          INC(CountUpper, 1)
        ELSIF InSet(LowerCaseSet, ORD(C)) THEN
          INC(CountLower, 1)
        ELSE
          INC(CountOthers, 1)
        END;
      END;
      DisplayHistogram(1, CountDigits);
      DisplayHistogram(2, CountUpper);
      DisplayHistogram(3, CountLower);
      DisplayHistogram(4, CountOthers);
    END;
    Close(InFile);
    GotoXY(1, 20);
    WriteString(stdinout, 'Want to scan another file? (Y/N) ', 0);
    ReadBuffer(on);
    ReadChar(stdinout, OK);
    ReadLn(stdinout);
    ReadBuffer(off);
  UNTIL NOT (InSet(YesNo, ORD(OK))); (* Boolean expression was edited *)
  GotoXY(1, 20);
  ClrEol;
  WriteString(stdinout, 'End of program', 0);
  WriteLn(stdinout);
END Sets.

End Listing Six

Listing  Seven 

Listing  7.  Turbo  Pascal  program  that  performs  direct  screen  memory  access 
by  using  simple  absolute  variables. 




Program Screen;

(* Program to demonstrate direct memory access in Turbo Pascal *)

TYPE STRING80 = STRING[80];

VAR Message : STRING80;
    Row, Col : INTEGER;

PROCEDURE DISP_STR(S : STRING80; Row, Col : INTEGER);
(* Procedure to write a string to the screen memory *)

TYPE SCREEN80 = ARRAY [1..25,1..80,1..2] OF CHAR;

VAR DISP : SCREEN80 Absolute $B000:0000;
    (* For color display use *)
    (* DISP : SCREEN80 Absolute $B800:0000; *)
    I, J : INTEGER;

BEGIN
  J := Length(S);
  FOR I := 1 TO J DO
    DISP[Row, Col + I - 1, 1] := S[I];
END;

BEGIN
  ClrScr;
  WRITELN('Enter message ');
  READLN(Message);
  ClrScr;
  Col := 1;
  FOR Row := 22 DOWNTO 1 DO
    DISP_STR(Message, Row, Col+Row);
  WHILE NOT Keypressed DO;
  ClrScr;
END.

End Listing Seven


Listing  Eight 

Listing 8. Modula-2 program that performs direct screen memory access
by  using  simple  absolute  variables. 


MODULE Screen;

FROM Strings
  IMPORT Assign, Insert, Delete, Pos, Copy, Concat, Length, CompareStr;
FROM ScreenHandler
  IMPORT ClrEol, ClrScr, DelLine, InsLine, GotoXY, WhereX, WhereY,
         CrtInit, CrtExit, LowVideo, NormVideo, HighVideo, SetAttribute,
         GetAttribute, normalAtt, boldAtt, reverseAtt, underlineAtt,
         blinkAtt, boldUnderlineAtt, blinkUnderlineAtt, boldBlinkAtt,
         reverseBlinkAtt, boldUnderlineBlinkAtt;
FROM TTextIO
  IMPORT ReadInt, ReadCard, ReadChar, ReadString, ReadLn, ReadBuffer,
         WriteInt, WriteCard, WriteChar, WriteString, WriteBool, WriteLn,
         Eoln, SeekEof, SeekEoln;
FROM TKernelIO
  IMPORT File, FileType, OptionMode,
         StatusProc, ReadProc, WriteProc, ErrorProc,
         stdinout, input, output, con, trm, kbd, lst, aux, usr,
         conStPtr, conInPtr, auxInPtr, usrInPtr, conOutPtr, lstOutPtr,
         auxOutPtr, usrOutPtr, errorPtr, IOresult, KeyPressed, IOBuffer,
         IOCheck, DeviceCheck, CtrlC, InputFileBuffer, OutputFileBuffer;

(* Program to demonstrate direct memory access in Turbo Pascal *)
TYPE
  STRING80 = ARRAY [0..80-1] OF CHAR;




VAR
  Message: STRING80;
  Row, Col: INTEGER;

(* Procedure to write a string to the screen memory *)
PROCEDURE DISPSTR(S: ARRAY OF CHAR; (* Edited *)
                  Row, Col: INTEGER);

TYPE
  SCREEN80 = ARRAY [1..25], [1..80], [1..2] OF CHAR;
VAR
  DISP [0B000H:00H] : SCREEN80;
  (* For color display use *)
  (* DISP [0B800H:00H] : SCREEN80 *)
  I, J: INTEGER;

BEGIN
  J := Length(S);
  (*--> FOR I := 1 TO J DO *)
  FOR I := 0 TO J-1 DO
    DISP[Row, Col+I, 1] := S[I] (* Edited *)
  END;
END DISPSTR;

BEGIN
  ClrScr;
  WriteString(stdinout, 'Enter message ', 0);
  WriteLn(stdinout);
  ReadBuffer(on);
  ReadString(stdinout, Message);
  ReadLn(stdinout);
  ReadBuffer(off);
  ClrScr;
  Col := 1;
  FOR Row := 22 TO 1 BY -1 DO
    DISPSTR(Message, Row, Col+Row)
  END;
  WHILE NOT KeyPressed() DO
  END;
  ClrScr;
END Screen.


Listing Nine

Listing  9.  Modula-2  program  that  displays  and  alters  the  states  of  the 
[Caps  Lock]  and  [Num  Lock]  keys  on  an  IBM  PC  or  compatible. 




MODULE KeyLock;

(*
 * This program interactively displays and changes the states
 * of the Caps Lock and Num Lock keys on an IBM PC or compatible.
 * It is particularly useful with keyboards that have LEDs on these
 * keys; these sometimes get out of sync.
 *
 * This program is adapted from FL.COM by Patrick Swayne, from
 * the June 1986 issue of REMark magazine.
 *
 * The declaration of keyflag is a feature of the Logitech compiler.
 *
 * James Janney, June 1986
 *)

FROM Terminal IMPORT ReadString, WriteString, WriteLn;

CONST
  NumLock = 5;   (* Num Lock bit *)
  CapsLock = 6;  (* Caps Lock bit *)

VAR
  keyflag [40H:17H] : BITSET;  (* ROM-BIOS keyboard status word *)
  cmd : ARRAY [1..80] OF CHAR;

BEGIN
  LOOP
    WriteString("1. Caps Lock is ");
    IF CapsLock IN keyflag THEN
      WriteString("on")
    ELSE
      WriteString("off")
    END;
    WriteLn;

    WriteString("2. Num Lock is ");
    IF NumLock IN keyflag THEN
      WriteString("on")
    ELSE
      WriteString("off")
    END;
    WriteLn;

    WriteString("Enter option to change: ");
    ReadString(cmd);
    CASE cmd[1] OF
      '1' : keyflag := keyflag / {CapsLock}  (* toggle Caps Lock *)
    | '2' : keyflag := keyflag / {NumLock}   (* toggle Num Lock *)
    ELSE
      EXIT
    END;

    WriteLn; WriteLn;
  END;
END KeyLock.

End Listings


Dr. Dobb's Journal, February 1987



COLUMNS 


C  CHEST 


Nroff: Hashing, Expressions, and Roman Numerals

by Allen Holub

After the shell was published in this column, I vowed that I'd stay away
from long programs. Publishing them was just too much work. I'm going to
break my own rule for the next few months, however, by presenting nr, my
version of the Unix text formatter, nroff. There are two reasons for
this. First, nr includes a bunch of subroutines that are useful even if
you're never going to use it as a word processor. Good examples are the
hash table management functions and the general-purpose expression
analyzer, which I'll discuss this month. Second, the program is an almost
complete implementation of nroff and it includes several of troff's
features as well.[1] It does hyphenation and proportional spacing, it can
format equations and matrices, and it's easily configurable to most
printers. I have it driving a Brother HR-15 (a Diablo-compatible
daisy-wheel printer), a ThinkJet, and an HP LaserJet laser printer. Nr
can do as much as many word processors that cost several hundred dollars.
It is also considerably more powerful than any of the various roff
spin-offs available from various users' groups and bulletin boards.
Unlike most of these, it doesn't derive from the word processor in
Software Tools in Pascal.[2]

I hadn't published the program before now because I'd just about
convinced myself that it was obsolete. My research into the newer
wizzywig editors has convinced me otherwise, however.

Text formatters, like many classes of computer tools, can be divided into
two categories. The interpretive formatters (such as WordStar or
Microsoft Word) combine text entry or editing together with text
formatting: adjusting margins and line length, hyphenating, and so on.
The compiled word processors, such as nroff and TeX, work like compiled
languages do. You create an ASCII input file composed of mixed text and
formatting commands and then submit this file to a second program that
actually does the formatting.

Both these techniques have their advantages and disadvantages, something
that was made painfully clear to me a few months ago when I tried to
typeset a book with Version 3.0 of Microsoft Word. I really like the idea
of being able to see what the document will look like as I'm typing it.
Word, for example, actually displays italicized text in italics on the
screen. Using Word turned out to be a serious mistake, however. It may do
a nice job on a newsletter or a medium-length paper, but when it comes to
something real, such as a book, it just can't do the job. If you can't
type, if you're afraid of computers, and if you're never going to try to
create a document longer than a newsletter, Word is the program for you.
If any of the above don't apply to you, stay away from it.

First of all, it takes two to three times longer for me to enter text
using Word than it does with my normal editor. I'm a touch-typist; if I
don't have to lift my hands from the home row of the keyboard, I can type
about 85 words a minute. Microsoft Word doesn't permit this, however.
Many things have to be done with the function keys. Then there's that
idiot mouse. The nonmouse commands are so involved that they're almost
useless. Many of them use ten or more keystrokes to do a simple action
such as deleting a block of text. The mouse is really an integral part of
the program. So, every time you need to delete some text, change a font,
or whatever, you have to let go of the keyboard, find the idiot mouse,
which has buried itself under piles of paper, clear off an area of desk
big enough for the thing to move, and then execute the command. If you
can't type, this might not be a problem, but I'd think that the
mouse-related problems alone would severely limit the utility of Word in
a normal office environment. All this is compounded by Word's refusal to
implement even the simplest editing control codes. For example, Word does
not recognize Ctrl-H as a backspace; you have to find the backspace key,
which is way up in the boonies on the IBM keyboard. I found I couldn't
type anything without ProKey installed under Word, and even then text
entry was difficult.

The next problem is that Word doesn't really show you what you're
getting. It doesn't display text in the correct point sizes. It lets you
see how many words will be on the output line, but when you do this, the
tab stops don't display correctly and the right margin isn't adjusted
anymore. Word doesn't really format the text as you type either; at
least, it reformats the text as part of the printing process. The
"compute page break" command takes as long to execute as my compiled word
processor takes to format the entire document.

The final problem, and the one that made the program utterly worthless to
me, is that Word evidently keeps the entire document in memory at once.
Though I have a 640K system, this isn't enough to typeset a book. The
program just can't handle that much text. Word has a text-merge feature,
but evidently it can't be used if you're assembling an index or table of
contents.

So I filed Word in the circular file and went back to my own text
formatter, nr. It supports all the things you need to do most word
processing applications: multiple fonts and point sizes, multicolumn
text, hyphenation, proportional spacing, both footnotes and endnotes,
automatic index and table of contents generation, and so forth. I can use
my normal editor to enter text.[3] Moreover, nr supports an extensive
macro language that effectively lets you change the way the program works
to suit your needs of the moment. Most nr (and nroff) input documents
don't actually use the primitive commands supported by the word processor
itself; rather, they use various macro packages that use these
primitives. The macros can be used in the same way as you use
subroutines; you can pass them arguments, define and modify global
variables, and so forth. Though WYSIWYG isn't available, nr supports an
adequate screen-preview mode. I've configured it so that, when ANSI.SYS
is installed, underlined text is displayed on the screen underlined,
boldface is shown in high-intensity video, and overstruck text blinks.

It's going to take several months to get the whole program printed in
this column, so you may find include file or subroutine references that
aren't part of the current month's listings. I'll try to minimize this,
but a little forward referencing is inevitable. The complete program will
be available on a disk from DDJ in March. The remainder of this month's
column is a discussion of several nr support routines. In the next column
I'll give a detailed users' manual along with more code. I'll finish the
code in a third column and also present a detailed program description at
that time (once all the code is available).

Much of the support code for nr has already been published in previous
columns. This puts me in something of a quandary because most of this
code has undergone minor modifications since its original publication.
After a great deal of soul-searching, I've decided not to reprint these
routines, even though there are minor changes. The algorithms are
basically the same, so the earlier articles will serve as an adequate
reference. I'm also assuming that no one in their right mind would
actually type the whole thing in by hand when the entire program will be
made available on a disk. Nr uses the following, previously published,
support routines:

DDJ, May 1985: stoi(), getargs()

DDJ, June 1985, queues: makequeue(), show_rw}d(), enqueue(), dequeue(),
spoused()

DDJ, June 1985, bit maps: setbit(), testbit(), makebitmap()

DDJ, October 1985, hyphenation: hyphen()

Most of the support routines printed this month are adequately commented
and can stand alone. There are three exceptions: a set of general-purpose
symbol table maintenance functions; a binary-to-ASCII conversion function
that can print in Roman numerals and English words as well as the usual
digits; and a powerful expression analyzer.

Symbol Table Maintenance with Hashing

Listing Three, page 52, presents a set of symbol table maintenance
functions that use a hashing strategy for table maintenance.[4] They were
originally written for a compiler project but are useful anywhere you
need a list of objects ordered by name. I've made the routines as general
purpose as possible. As was the case in the AVL tree routines printed in
the August 1986 issue of DDJ, I've separated the actual maintenance
functions from the part of the program that's going to use the table.
That is, all knowledge of how the table is organized is hidden within the
maintenance functions. The application program doesn't know (or care
about) the actual details. By the same token, the maintenance functions
don't know about anything the application is doing with the table. All
they know about is how to put things in tables and take them out again.

All the routines use the hash.h file given in Listing Four, page 58. For
now let's avoid the details and consider the one thing that's needed from
this file to use the maintenance functions: the HASH_TAB structure. A
HASH_TAB is used in the same way as the FILE structure used by the
buffered I/O routines. The subroutine:

HASH_TAB *maketab( size )
unsigned size;

is analogous to fopen(); it creates a table having the indicated size and
returns a pointer to it. It turns out that hash functions work best if
size is a prime number. Some likely candidates are 47, 61, 89, 113, 127,
157, 193, 211, 257, 293, 359, and 401. The larger the table, the quicker
the access time. On the other hand, the larger the table, the larger the
table. I've found 127 and 257 to be reasonable numbers. Maketab() won't
return if it can't make the table (it exits with a status of 1 in this
case).
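
The behavior described for maketab() can be sketched roughly as follows. This is a minimal illustration, not the code from Listing Three: the names SKETCH_TAB, BUCKET, is_prime(), and make_table() are hypothetical, the field names are borrowed from the hash_tab figure, and ANSI prototypes are used in place of the K&R declarations shown above. The is_prime() helper is only there to check a candidate size against the prime-number advice.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical chain node; the real layout is defined in hash.h. */
typedef struct bucket {
    struct bucket *next;
} BUCKET;

/* Hypothetical table header; field names follow the hash_tab figure. */
typedef struct {
    BUCKET  **table;    /* array of `size` chain heads        */
    unsigned  size;     /* number of slots in the array       */
    unsigned  numsyms;  /* number of symbols currently stored */
} SKETCH_TAB;

/* Return 1 if n is prime, 0 otherwise (simple trial division). */
int is_prime(unsigned n)
{
    unsigned d;

    if (n < 2)
        return 0;
    for (d = 2; d <= n / d; d++)
        if (n % d == 0)
            return 0;
    return 1;
}

/* Create an empty table with `size` slots (size must be nonzero).
 * Like the routine described in the text, this never returns on
 * failure: it prints a message and exits with a status of 1. */
SKETCH_TAB *make_table(unsigned size)
{
    SKETCH_TAB *tabp = malloc(sizeof *tabp);
    unsigned    i;

    if (tabp)
        tabp->table = malloc(size * sizeof *tabp->table);
    if (!tabp || !tabp->table) {
        fprintf(stderr, "make_table: out of memory\n");
        exit(1);
    }
    for (i = 0; i < size; i++)
        tabp->table[i] = NULL;      /* every chain starts out empty */
    tabp->size    = size;
    tabp->numsyms = 0;
    return tabp;
}
```

A caller would write something like make_table(127) once at startup and pass the returned pointer to every later table operation, much as a FILE pointer is passed around after fopen().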

Once a table has been created, objects can be inserted with a call to:

char *addsym( tabp, name, size )
HASH_TAB *tabp;
char *name;

where tabp is a pointer returned from a previous maketab() call, name is
a symbol name, and size is the size of a block of memory that the
application will use. Only the first 32 characters in name are
significant; additional characters are ignored. Addsym() won't return if
it can't get the memory. It prints an error message and exits from the
program with a status of 1 in this case. On success, addsym() returns a
pointer to a block of memory that can be used by the application just as
if it had been returned from malloc(). The symbol name is stored
automatically by addsym(), so it need not be stored again in the
application area. Note that the memory is already attached to the table
at this point. That is, the allocate and add functions have been merged
into a single subroutine.

The  application  can  look  for  a  spe¬ 
cific  symbol  with  a  call  to: 

char *findsym( tabp, name )
HASH_TAB *tabp;
char     *name;

where tabp is a pointer returned from maketab() and name is a symbol name. If the named symbol is in the table, the same pointer that was returned from the original addsym() call is returned; otherwise, NULL is returned. The application can use this pointer as it sees fit. Note that findsym() does not remove the node from the table; it just returns a pointer to the node's application area.

A  symbol  is  deleted  by  calling: 

delsym( tabp, symp )
HASH_TAB *tabp;
char     *symp;


where tabp is, again, a pointer returned from maketab() and symp is a pointer returned from findsym().

Delsym() both removes the symbol from the table and deletes all memory associated with that symbol, including the application area.

The basis of all hash strategies is the conversion of a string into a random number. This number is then used as an index into a large array, the hash table itself. In the best hash functions, there is virtually no relation between the original string and the hashed number. The purpose of the randomization is to scatter the names as evenly as possible throughout the entire table. In practice, there are always a few strings that will hash to the same numeric value. This condition, called a collision, can be resolved in several ways. The easiest is to make the hash table itself an array of pointers to structures. Each of these points to the head of a linked list. When a collision occurs, the new node is just linked into the head of the list.

This strategy is particularly convenient in a compiler's symbol table, where two symbols might have the same name but different scope. If colliding nodes are linked to the head of the list, they will shadow previously declared nodes associated with the same name.

The HASH_TAB and BUCKET structures defined in hash.h are used to define the table. A HASH_TAB is the actual table. It looks like:

typedef struct HASH_TAB_
{
    BUCKET **table;
    int      size;
    int      numsyms;
}
HASH_TAB;

The table field points at the actual array. This array is created with a malloc() call when maketab() is called. Size holds the number of elements in the table, as was passed to maketab(). Numsyms is a count of the number of symbols currently in the table. It's useful only for statistical purposes; the hash functions themselves don't use it.

Each element of the hash table array points at a BUCKET structure, defined in hash.h as:


typedef struct element_
{
    struct element_  *next;
    struct element_ **prev;
    char              name[ MAXNAME + 1 ];
}
BUCKET;

A BUCKET is actually a header, prefixed onto the top of a block of memory used by the application program. This technique is used by malloc(), and I used the same technique in the AVL tree routines a few months ago. Addsym() returns a pointer to the application area, just below the BUCKET. The BUCKETs themselves hold the symbol name and two pointers used to form a doubly linked list.
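The header technique can be sketched in a few lines. What follows is only an illustration, not the code from Listing Four: it is written in ANSI C rather than the K&R style used in the column, the hash() helper (plain addition over the significant characters) and the error messages are assumptions, and the application area is only guaranteed to be aligned as well as the BUCKET header itself.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXNAME 32              /* significant characters in a name */

typedef struct element_ {
    struct element_  *next;     /* next BUCKET in the chain                */
    struct element_ **prev;     /* address of the pointer that points here */
    char              name[MAXNAME + 1];
} BUCKET;

typedef struct {
    BUCKET **table;             /* array of chain heads        */
    int      size;              /* number of elements in table */
    int      numsyms;           /* symbols currently stored    */
} HASH_TAB;

/* Hypothetical helper: additive hash over at most MAXNAME characters. */
static unsigned hash(const char *name)
{
    unsigned h = 0;
    int      i;

    for (i = 0; i < MAXNAME && name[i]; i++)
        h += (unsigned char)name[i];
    return h;
}

HASH_TAB *maketab(unsigned size)
{
    HASH_TAB *t = malloc(sizeof *t);

    if (!t || !(t->table = calloc(size, sizeof *t->table))) {
        fprintf(stderr, "maketab: out of memory\n");
        exit(1);
    }
    t->size    = (int)size;
    t->numsyms = 0;
    return t;
}

void *addsym(HASH_TAB *t, const char *name, size_t size)
{
    BUCKET **head;
    BUCKET  *b = calloc(1, sizeof *b + size);   /* header + application area */

    if (!b) {
        fprintf(stderr, "addsym: out of memory\n");
        exit(1);
    }
    strncpy(b->name, name, MAXNAME);    /* calloc already zeroed the terminator */
    head    = &t->table[hash(b->name) % (unsigned)t->size];
    b->next = *head;                    /* link in at the head of the chain */
    b->prev = head;
    if (*head)
        (*head)->prev = &b->next;
    *head = b;
    t->numsyms++;
    return b + 1;                       /* application area follows the header */
}

void *findsym(HASH_TAB *t, const char *name)
{
    BUCKET *b = t->table[hash(name) % (unsigned)t->size];

    for (; b; b = b->next)
        if (strncmp(b->name, name, MAXNAME) == 0)
            return b + 1;
    return NULL;
}
```

A caller would create a table with maketab(127), store an int with addsym(t, "count", sizeof(int)), and get the same pointer back later from findsym(t, "count"). Because new nodes go on the head of the chain, a second addsym() with the same name shadows the first.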

The whole system of structures is shown in Code Example 1, below. The example shows a length 5 table with two symbols inserted into it. The symbols both hash to the same number, so a collision condition is shown. The hash table itself is an unnamed array pointed to by the table field of the HASH_TAB structure. The array entry forms the head of a linked list, pointed to by the next fields in the various BUCKETs. The list is double-linked, which means that not only is there a pointer to the next node in the chain but also a pointer to the previous node is kept. This lets you remove a symbol from the list without having to chase down the entire list. All you need is a pointer to the actual node you wish to delete.

[Code Example 1: A complete hash table. The diagram shows the hash_tab structure (fields table, size, and numsyms), the five-element array that table points at, and a chain of two BUCKETs (fields name, prev, and next), each followed by its application area.]

Note that the prev pointer points not at an entire BUCKET structure but rather at the next field of the previous structure. This way, the leftmost node in the chain is not a special case. All backward pointing pointers point at objects of the same type: pointers to BUCKETs. Assuming that p is a pointer to a BUCKET that you want to delete, you can remove the BUCKET with:

if( *(p->prev) = p->next )
    p->next->prev = p->prev;

The assignment in the if statement makes the next pointer of the previous node point around the current node. The next line is executed only if the deleted node has a successor (that is, if p->next isn't NULL). In this case the prev field of the node that follows the one you want to delete is made to point at the next field of the node to the left of the one you want to delete. If the deleted node is at the head of the chain, the hash table array itself is modified.
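The pointer-to-pointer trick is easier to trust once you see it in isolation. The sketch below is an assumption-laden illustration rather than the delsym() code itself: NODE, push_front(), and unlink_node() are hypothetical names, and a lone head pointer stands in for one element of the hash table array.

```c
#include <stddef.h>

typedef struct node_ {
    struct node_  *next;    /* next node in the chain                      */
    struct node_ **prev;    /* address of whichever pointer points at us   */
    int            value;
} NODE;

/* Link n in at the head of the chain whose head pointer is *head. */
void push_front(NODE **head, NODE *n)
{
    n->next = *head;
    n->prev = head;
    if (*head)
        (*head)->prev = &n->next;
    *head = n;
}

/* Unlink n without walking the chain and without knowing where the head is. */
void unlink_node(NODE *n)
{
    if ((*(n->prev) = n->next) != NULL)     /* predecessor now skips n    */
        n->next->prev = n->prev;            /* successor points back past n */
}
```

Since prev holds the address of a pointer rather than the address of a node, the head of the chain needs no special case: for the first node, prev is simply the address of the head pointer.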

One final point worth mentioning is the hash algorithm itself. More paper than I care to think about has been wasted talking about "efficient" hash algorithms. I've always been suspicious of these fancy functions, especially the ones that use several multiplies and divides. It doesn't seem that the inefficiencies inherent in the more complex algorithms can be countered by any better performance in the collision resolution department. How long can it take to chase down a linked list anyway? So being my own untrusting self, I tried out about 20 hash functions empirically. I was not surprised to find that the fanciest functions were so slow as to be virtually useless. The three best algorithms I tried are shown in Table 1, right. The first two are derived from the HashPJW function described in the "dragon" book.5 To my surprise, however, adding together the characters in the name is just about as good as HashPJW, and addition is a lot faster than all that shifting, type conversion, and XORing. I used addition in my own hash function. The keywords used in the test are names I extracted from various C programs in my own library.
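For comparison, here is a rough sketch of the three kinds of function discussed above. These are not the versions from Table 1 or Listing Four: hash_pjw() follows the dragon book's formulation with the word size fixed at 32 bits, and all three function names are assumptions made for the example.

```c
#include <stdint.h>

/* HashPJW in the dragon book's form, fixed at 32 bits. */
uint32_t hash_pjw(const char *s)
{
    uint32_t h = 0, g;

    for (; *s; s++) {
        h = (h << 4) + (unsigned char)*s;
        if ((g = h & 0xF0000000u) != 0) {   /* anything in the top 4 bits? */
            h ^= g >> 24;                   /* fold it back into the low bits */
            h ^= g;                         /* and clear the top 4 bits       */
        }
    }
    return h;
}

/* Shift-and-add only: HashPJW without the fold. */
uint32_t hash_shift(const char *s)
{
    uint32_t h = 0;

    for (; *s; s++)
        h = (h << 4) + (unsigned char)*s;
    return h;
}

/* Plain addition of the characters in the name. */
uint32_t hash_add(const char *s)
{
    uint32_t h = 0;

    for (; *s; s++)
        h += (unsigned char)*s;
    return h;
}
```

In every case the table index is the returned value modulo the table size. One property of plain addition worth knowing about is that it ignores character order, so names that are anagrams of one another always land on the same chain.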

Binary  to  Roman  Numerals 

Listing Five, page 60, shows itoascii(), a fancy binary-to-integer conversion routine. Unlike itos(), ecvt(), and the like, itoascii() can convert to several formats, depending on the value of its second argument. The various formats are listed in Table 2, page 97. The alphabetic output goes like:

a b c ... y z aa ab ac ... az ba bb bc ... bz ca ...

It's useful if you're making outlines. The spelled out formats print:

one 

two 

three 


thirteen 

fourteen 

fifteen 

twenty-one 

twenty-two 

two thousand, one hundred thirty-one

two thousand, one hundred thirty-two


HASHPJW:

    for( h = 0; *name ; h = (h << 4) + *name++ )
        if( g = h & ~((unsigned)(~0) >> 4) )
            h = (h ^ (g >> WLEN-8)) ^ g;

1110 entries in 127 element hash table, 0 (0%) empty.
Mean chain length: 8, max=20, min=3, deviation=2

     3 chains of length  3
     5 chains of length  4
     7 chains of length  5
    11 chains of length  6
    14 chains of length  7
    21 chains of length  8
    15 chains of length  9
    22 chains of length 10
    12 chains of length 11
     8 chains of length 12
     5 chains of length 13
     1 chains of length 14
     1 chains of length 15
     1 chains of length 17
     1 chains of length 20

SIMPLIFIED HASHPJW:

    for( h = 0; *name ; h = (h << 4) + *name++ )

1110 entries in 127 element hash table, 0 (0%) empty.
Mean chain length: 9, max=25, min=1, deviation=3

     2 chains of length  1
     4 chains of length  2
     4 chains of length  3
     8 chains of length  4
    13 chains of length  5
     5 chains of length  6
    12 chains of length  7
    17 chains of length  8
    13 chains of length  9
    16 chains of length 10
     9 chains of length 11
     7 chains of length 12
     2 chains of length 13
     4 chains of length 14
     1 chains of length 15
     3 chains of length 16
     3 chains of length 17
     1 chains of length 19
     1 chains of length 20
     1 chains of length 21
     1 chains of length 25

ADDITION:

    for( h = 0; *name ; h += *name++ )

1110 entries in 127 element hash table, 0 (0%) empty.
Mean chain length: 7, max=16, min=2, deviation=2

     1 chains of length  2
     4 chains of length  3
     1 chains of length  4
     9 chains of length  5
    14 chains of length  6
    15 chains of length  7
    16 chains of length  8
    21 chains of length  9
    10 chains of length 10
    16 chains of length 11
     7 chains of length 12
     4 chains of length 13
     7 chains of length 14
     1 chains of length 15
     1 chains of length 16

Table 1: Performance of various hash functions




Roman numerals (i, ii, iii, iv, v, vi, vii, viii, ix, ...) are somewhat limited. In particular, numbers larger than MMMMM can't be printed because an overscore would be required and most printers don't support overline, only underline. Also, because there's no zero in the Roman counting system, an Arabic 0 is used if n is zero.

The Arabic conversions are all done using sprintf(); the spelled-out, Roman, and alphabetic formats are done by itoeng() (line 35), itoroman() (line 128), and itoalpha() (line 202), respectively. Of these, itoeng() is the most straightforward. The main problems have to do with getting the commas and hyphens in the right place. The other two conversions are a little harder because neither number system has a zero in it. Consequently, you can't just modify a basic itoa() routine. Itoroman() uses a lookup table to assemble the numerals. (See Code Example 2, page 98.) The Ms used for thousands are printed with a simple loop on lines 178 to 181 of Listing Five. The rest of the digits are printed by the for loop, also in Code Example 2.
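Put together, the table-driven approach might look something like the following. This is a minimal sketch and not the itoroman() of Listing Five: the name to_roman(), its argument order, and the choice to return NULL for values outside 0 to 5000 are all assumptions made for the example.

```c
#include <stddef.h>

static const char *const rnums[] = {
    "", "C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM",   /* hundreds */
    "", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC",   /* tens     */
    "", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"    /* ones     */
};

/* Write n as a Roman numeral into dest (at least 17 bytes).
 * Returns dest, or NULL if n is outside 0..5000. Zero is written as "0". */
char *to_roman(char *dest, int n, int uppercase)
{
    const char *const *rp = rnums;
    const char        *cp;
    char              *p = dest;
    int                i;

    if (n < 0 || n > 5000)
        return NULL;
    if (n == 0)
        *p++ = '0';                         /* no Roman zero: use Arabic 0 */
    for (; n >= 1000; n -= 1000)            /* thousands are a run of Ms   */
        *p++ = uppercase ? 'M' : 'm';
    for (i = 100; n > 0 && i >= 1; i /= 10) {
        cp  = rp[n / i];                    /* string for the current digit */
        n  %= i;
        rp += 10;                           /* move to the next decade      */
        for (; *cp; cp++)
            *p++ = uppercase ? *cp : (char)(*cp + ('a' - 'A'));
    }
    *p = '\0';
    return dest;
}
```

With a 32-byte buffer, to_roman(buf, 1987, 1) leaves "MCMLXXXVII" in buf and to_roman(buf, 4, 0) leaves "iv".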

The outer loop executes at most three times. Rp starts out pointing to the hundreds part of the table (the first ten entries). The line:

    cp = *(rp + (n/i));

computes a pointer to the correct string, where n is the current value of the number and i is the current multiplier. The multiplier i is 100 the first time through the loop, 10 the second time, and 1 the last time. The expression could also be written:

    cp = rp[ n/i ];

N and i are then adjusted for the next iteration, and 10 is added to rp to make it point at the next part of the table. The number is printed by the inner for loop.

    int  itoascii( str, fmt, n )
    char *str;      /* Output buffer */
    int  n;         /* Integer to convert */
    int  fmt;       /* One of the following:

                       fmt:   Output format:
                       'i'    Lowercase Roman numerals
                       'I'    Uppercase Roman numerals
                       'a'    Lowercase alphabetic
                       'A'    Uppercase alphabetic
                       'e'    Spelled out in lowercase
                       'E'    Spelled out, first letter capitalized
                       '1'    Arabic numerals in a 1-character wide field
                       '2'    Arabic numerals in a 2-character wide field, zero padded
                       etc.
                    */

Table 2: Itoascii() calling syntax

Itoalpha() is very atoi()-like. The lack of a zero complicates things here, too. In particular, the alphabetic numbers aren't a simple base-26 counting system. That is, if you map a to 0, b to 1, c to 2, and so forth, the output series would look like:

a b c d ... ba bb bc ... ca cb cc ...

instead of:

a b c d ... aa ab ac ... ba bb bc ...

Changing the mapping by leaving out the 0 and mapping a to 1, b to 2, and so forth gives you:

a b c d ... a_ aa ab ac ... b_ ba bb bc

which is closer to the desired series, but now the zero causes problems, represented by the underscore in the strings (placed where the zero would go if you had one). Itoalpha() actually does the conversion with:

    do{
        *p++ = (n % 26) + (uppercase ? 'A' : 'a');
    } while( (n = (n/26)-1) >= 0 );

Here p is a pointer to the output buffer. The n % 26 selects the current digit in the expected way. You can't just divide by 26 to get the next digit because of that 0, so you compensate for the 0 by dividing and then subtracting 1 from the result. This subtraction is done in the while part of the do ... while statement.
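The loop above produces the letters least significant first and treats n as zero-based (0 gives a), so a complete routine needs a little scaffolding around it. The sketch below is one way to supply it, not the itoalpha() from Listing Five: the name to_alpha(), the decrement that makes 1 map to a, the temporary buffer, and the NULL return for n less than 1 are all assumptions.

```c
#include <stddef.h>

/* Write n (1 = a, 26 = z, 27 = aa, ...) into dest, which must hold at
 * least 8 bytes for a 32-bit int. Returns dest, or NULL if n < 1.
 * Assumes the letters are contiguous in the character set, as in ASCII. */
char *to_alpha(char *dest, int n, int uppercase)
{
    char  tmp[16];
    char *p   = tmp;
    char *out = dest;

    if (n < 1)
        return NULL;
    n--;                                    /* make the count zero-based */
    do {
        *p++ = (char)((n % 26) + (uppercase ? 'A' : 'a'));
    } while ((n = (n / 26) - 1) >= 0);

    while (p > tmp)                         /* digits came out backwards */
        *out++ = *--p;
    *out = '\0';
    return dest;
}
```

Under these assumptions 26 yields z, 27 yields aa, 53 yields ba, and 703 yields aaa.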


    static char *rnums[] =
    {
        "",  "C",  "CC",  "CCC",  "CD",  "D",  "DC",  "DCC",  "DCCC",  "CM",
        "",  "X",  "XX",  "XXX",  "XL",  "L",  "LX",  "LXX",  "LXXX",  "XC",
        "",  "I",  "II",  "III",  "IV",  "V",  "VI",  "VII",  "VIII",  "IX"
    };

    rp = rnums;
    for( i = 10*10 ; n > 0 && i >= 1 ; i /= 10 )
    {
        cp  = *(rp + (n/i));
        n  %= i;
        rp += 10;

        for( ; *cp ; cp++ )
            *dest++ = (uppercase) ? *cp : *cp + ('a' - 'A');
    }

Code Example 2: Lookup table used by itoroman()

Expression Analysis

An arithmetic expression analyzer, along with a discussion of the underlying theory, was published in the September 1985 C Chest. That analyzer had several problems that I knew about at the time but ignored in order to make the grammar more intuitively obvious. A real application, however, needs a more realistic expression analyzer. One is presented here in Listing Six, page 64. If you don't know what a formal grammar is or how a recursive descent parser works, you should either go back and read the previous C Chest or look at my book The C Companion.6

The  analyzer  is  called  with: 

VTYPE parse( expr_p )
char **expr_p;

VTYPE is currently defined to double, but you can change it to any type that's convenient and recompile if you wish. Changing to an integral type will make the program smaller. Expr_p is a pointer to a string pointer. The string itself holds the expression to analyze. The expression is parsed, *expr_p is adjusted to point past the parsed string, and the result of the expression evaluation is returned. Evaluation terminates when the first character that can't be part of an expression is encountered.

Expressions  are  composed  of: 

spaces: All white space is ignored.

numbers: If VTYPE is a floating-point type (float or double), then numbers with decimal points are permitted; otherwise, a period is an illegal character and should be shunned.

operators: Several operators are supported, as shown in Table 3, right. Operators on higher lines have higher precedence. They all associate left to right and evaluate as in C. Unlike C, evaluation of && and || does not terminate when the truth or falsity of the expression can be determined. The minus sign on the top line is unary minus; the one on line 3 is binary minus. The 'str1'str2' operator works just like strcmp(str1,str2) does. Parentheses are for grouping rather than for subroutine calls.

The grammar used is shown on lines 27 to 56 of Listing Six. It is a classic LL(1) expression grammar. For the most part, the subroutines follow the productions quite closely. Note that I've merged several productions into a single subroutine when possible. An example of this merging is shown in Code Example 3, below. A description of the merging process is in The C Companion, cited earlier.
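To see the merged-production style and the parse() calling convention working together, here is a deliberately tiny sketch. It is not Listing Six: it handles only numbers, parentheses, and binary + and -, the helpers match(), advance(), and skip() are simplified stand-ins, and leaning on strtod() for number scanning is an assumption made to keep the example short.

```c
#include <ctype.h>
#include <stdlib.h>

typedef double VTYPE;

static char *ip;                            /* current input position */

static void skip(void)     { while (isspace((unsigned char)*ip)) ip++; }
static int  match(int c)   { skip(); return *ip == c; }
static void advance(void)  { ip++; }

static VTYPE fact(void);

/* part ::= number | ( fact ) */
static VTYPE part(void)
{
    VTYPE v;

    if (match('(')) {
        advance();
        v = fact();
        if (match(')'))
            advance();
        return v;
    }
    return strtod(ip, &ip);                 /* leaves ip just past the number */
}

/* fact ::= part fact1, with the recursive fact1 folded into a loop */
static VTYPE fact(void)
{
    VTYPE left = part();

    for (;;) {
        if (match('+'))      { advance(); left += part(); }
        else if (match('-')) { advance(); left -= part(); }
        else                 break;
    }
    return left;
}

/* Evaluate the expression at *expr_p and advance *expr_p past it. */
VTYPE parse(char **expr_p)
{
    VTYPE v;

    ip      = *expr_p;
    v       = fact();
    *expr_p = ip;
    return v;
}
```

Given the string "1 + 2 - (4 - 10) ;rest", parse() returns 9 and leaves the caller's pointer on the semicolon, the first character that can't be part of an expression.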

Notes 

1. A note on nomenclature. The two Unix text formatters, nroff and troff, are almost identical. The main differences have to do with commands that are unique in typesetting (point changes and so on) as compared to simple text formatting. Nr implements all of nroff, but the implementation of the troff features is spotty. For simplicity's sake, I'll say nroff throughout, even if I'm talking about what is really a feature of troff.

2. Brian Kernighan and P. J. Plauger, Software Tools in Pascal (Reading, Mass.: Addison-Wesley, 1981).

3. I'm using a version of the Unix vi editor called PC/VI. It's $149 from Custom Software Systems, P.O. Box 678, Natick, MA 01760, and I recommend it highly if you're a vi addict like me. PC/VI is a solid and very complete implementation of vi for MS-DOS.

4. Hash algorithms are discussed in greater depth by Robert Kruse, Data Structures and Program Design (Englewood Cliffs, N.J.: Prentice-Hall, 1984), pages 112-128, and also by Aaron Tenenbaum and Moshe Augenstein, Data Structures Using Pascal, 2d ed. (Englewood Cliffs, N.J.: Prentice-Hall, 1986), pages 521-573. Both these books are very good. The examples, all in nicely done Pascal, are informative and useful.

5. Aho, Sethi, and Ullman, Compilers: Principles, Techniques, and Tools (Reading, Mass.: Addison-Wesley, 1986), 436.

6. Allen I. Holub, The C Companion (Englewood Cliffs, N.J.: Prentice-Hall, 1987), 189-211. The descriptions in the Companion are a bit better than those in the original C Chest articles.

DDJ 

(Listings  begin  on  page  52.) 



The productions:

    <fact>   ::=  <part> <fact1>
    <fact1>  ::=  + <part> <fact1>
             ::=  - <part> <fact1>
             ::=  epsilon

can be implemented as:

    static VTYPE fact()
    {
        VTYPE left;

        left = part();
        for(;;)
        {
            if( match("+") )
                { advance(1); left += part(); }
            else if( match("-") )
                { advance(1); left -= part(); }
            else
                break;
        }
        return left;
    }

Code Example 3: Merging two productions into a single subroutine




COLUMNS 


16-BIT SOFTWARE TOOLBOX

by Ray Duncan

Resources for MS-DOS Programmers

Jourdain, Robert. Programmer's Problem Solver for the IBM PC, XT, and AT. New York: Brady Communications, 1986. 473 pages with index. $22.95. ISBN 0-89303-787-7

Topics covered include determining the system configuration; managing interrupts; memory allocation; reading and setting timers; controlling video adapters; creating tones; controlling the keyboard interface, printer, and serial port; reading and writing disks; and device drivers. For the most part, each topic is accompanied by three example program listings: a high-level routine in BASIC, an intermediate-level routine in assembly language that calls the operating system or ROM BIOS, and a low-level routine in assembly language that accesses the hardware directly. Almost anything you can imagine wanting to do to, with, or on an IBM PC can be found in this book, including reading and writing files to a cassette recorder! But the book is not a tutorial and will be most useful in combination with a book such as The Peter Norton Programmer's Guide to the IBM PC, Angermeyer and Jaeger's MS-DOS Developer's Guide, or my own Advanced MS-DOS (yes, that was a plug).

Rollins, Dan. IBM-PC 8088 Macro Assembler Programming. New York: Macmillan, 1985. 435 pages with index. $25.50. ISBN 0-02-403210-7

A very nice primer on 8086/88 assembly language, starting at the most elementary level but progressing to advanced topics such as structures, macros, and conditional assembly. It has a brief section on file and record operations under MS-DOS, but this only covers the now-obsolete file control block functions. The book ends with a detailed discussion of programming for the IBM PC video interface, including graphics modes, that is helpful and practical.

Sargent, Murray, III, and Shoemaker, Richard L. The IBM PC from the Inside Out. Rev. ed. Reading, Mass.: Addison-Wesley, 1986. 483 pages including index. $19.95. ISBN 0-201-06918-0

This book covers a wide range of topics and is the only MS-DOS book I know of that is directed at the hardware hacker. The PC system bus, peripheral chips, and controllers are discussed in great detail with many programming examples, and the book winds up with a chapter on how to breadboard your own interfaces. It's a real pity that Addison-Wesley didn't see fit to invest in decent production for this book, especially as it was popular enough to warrant a second edition; instead, the book was offset from camera-ready copy prepared on a daisy-wheel printer with a particularly crowded and tiring sans serif font.

King, Richard Allen. The MS-DOS Handbook. 2d ed. Berkeley, Calif.: Sybex Inc., 1986. 338 pages including index. $19.95. ISBN 0-89588-352-X

I discussed the first edition of this book in my May 1986 column. The second edition covers essentially the same ground but has some additional material on networking and MS-DOS, Version 3. Incidentally, previously I commented that this book "jumbled together" material about MS-DOS and PC-DOS "with very little distinction." In a letter that accompanied the second edition, Mr. King said, "You are right, ... [but] my claim is that it does not matter, and few people make the distinction anyway." The year that has passed since I originally wrote those carping words has vindicated Mr. King. The incredible domination of the marketplace by the IBM PC architecture has made any distinction between generic MS-DOS and IBM versions of MS-DOS a moot point.

DeVoney, Chris. Using PC-DOS. Indianapolis, Ind.: Que Corp., 1986. 519 pages including index. $21.95. ISBN 0-88022-170-4

This is not a programming book but is a superuser manual to PC-DOS that eclipses all other such books, including all the Microsoft and IBM manuals. The last section of the book is a thorough, alphabetical reference to PC-DOS commands, including version dependency information and a detailed list of error messages.

The  MS-DOS  STACK 
Command 

People who have switched from earlier versions of MS-DOS to Version 3.2 have sometimes been surprised to find that previously healthy software caused the system to halt with the mysterious message "Internal Stack Error."

It turns out that the boys from Boca got concerned because when network cards were active a great many interrupts could occur in rapid succession, causing a program's stack to overflow with consequent unpredictable damage to the system. They therefore instigated a new scheme in MS-DOS 3.2 such that when an interrupt occurs, the system automatically switches to a stack allocated from an internal pool before passing control to the handler. When the interrupt service is complete, the stack is released back to the pool. When sufficient interrupts occur in a brief period and are all simultaneously in various stages of processing, the stack pool can be exhausted and the system halts with the previously mentioned error message.

The number and size of stacks in the internal pool can be controlled at system initialization time by adding the new MS-DOS 3.2 STACKS=n,s command to the CONFIG.SYS file, where n is the number of stack frames (8 to 64, with a default of 9) and s is the size (in bytes) of each stack frame (32 to 512, with a default of 128).
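As a quick way to check a setting before editing CONFIG.SYS, the limits quoted above can be wrapped in a small helper. This is only a sketch: stacks_line() is a hypothetical function, not part of MS-DOS or any library, and the figure it returns is simply n times s, which ignores whatever bookkeeping overhead DOS itself adds to the pool.

```c
#include <stdio.h>

/* Format a STACKS= line for CONFIG.SYS into buf.
 * Returns the pool size in bytes (n * s), or -1 if n is outside 8..64
 * or s is outside 32..512, the ranges given in the text. */
long stacks_line(char *buf, size_t len, int n, int s)
{
    if (n < 8 || n > 64 || s < 32 || s > 512)
        return -1;
    snprintf(buf, len, "STACKS=%d,%d", n, s);
    return (long)n * s;
}
```

With the defaults, stacks_line(buf, sizeof buf, 9, 128) produces the line STACKS=9,128 and reports a pool of 1152 bytes.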

Remember that this "feature" was specifically added to IBM's version of MS-DOS and may or may not be present (or documented) in other OEM versions of MS-DOS.

Boyer-Moore  Algorithm 

Heartfelt thanks to all the readers who wrote in with comments, explanations, and improvements on the Boyer-Moore routine published in the October 1986 16-Bit Software Toolbox column. By an interesting coincidence, the November 1986 issue of Computer Language carried a lengthy article on the same subject. Frankly, I am still digesting that opus and all the feedback I got on the subject from DDJ readers, so I will defer a resumption of this discussion until the future.

TSRs  and  File  I/O 

Terry Flanagan, of GSD Development Corp., Chicago, Illinois, responded with the following letter to Gary Cramblitt's appeal, printed in the September 1986 16-Bit Toolbox column, for help with Terminate and Stay Resident utilities: "The problem is, as Gary mentioned, that MS-DOS is not reentrant. It is a single-user operating system, and as such, reentrancy is not a requirement. This situation severely limits the processing that can be performed by a memory-resident program, however.

"The  problem  has  been  resolved  to 
some extent by various software companies that have developed memory-resident utilities. There are still conflicts between these various utilities, and there is still no officially sanctioned method for providing compatibility between memory-resident programs. A multitasking version of MS-DOS will probably appear before a TSR standard is adopted. During the interim, developers of memory-resident programs are on their own.

"There  are  several  methods  I’ve 
read about and/or experimented with to determine when MS-DOS is 'safe.' One method, which Gary mentioned in his letter, is the DOS Critical Flag. This is a byte within the DOS code segment that is incremented upon entry to the DOS function dispatcher (int 21h) for any potentially 'unsafe' function calls and decremented upon exit. The function that returns the address of that flag is int 21h, function 34h. The DOS Critical Flag, therefore, is a counter that indicates the number of DOS calls in progress. Normally, this counter never exceeds 1. So it is theoretically safe to call DOS whenever this counter is 0. Unfortunately this is not usually the case.

"Certain  DOS  calls  are  relatively 
safe to interrupt. These are the character I/O calls (int 21h, functions 1 through 0Ch). DOS normally resets its own internal stack for each function call. If a character I/O call has been interrupted, however, as indicated by an internal flag, then DOS will adjust the stack pointer upward so as not to collide with the stack of the character I/O call in progress. The problem is how to determine when character I/O is in progress.

"Borland’s  SideKick  is  a  pioneer  in 
this area. Borland decided to play it safe by letting SideKick run off with as many interrupt vectors as it could carry. This is typical of many memory-resident utilities and results in the utilities stealing vectors from one another and causing further confusion. SideKick even uses the timer tick to steal back the interrupt vectors. (Sooner or later someone will just reprogram the 8259 to hide the real interrupt vectors from everyone else and really mess things up.)

"Anyway,  one  of  the  vectors  that 
SideKick 'borrows' is the DOS function dispatcher (int 21h). By intercepting the DOS calls the program can determine when it is safe to call DOS from the memory-resident code. The program can also block access to DOS or reschedule DOS calls if desired.

"Another  method  for  determining 
when DOS calls are safe involves int 28h. I've read that this interrupt occurs while COMMAND.COM is waiting for keyboard input and DOS is basically in an idle state. The DOS background print processor supposedly uses this interrupt to perform print operations.

"From  what  I’ve  seen,  this  interrupt 
is called from DOS prior to nondestructive reads to the console to check for things such as Control-C. This would normally occur during character I/O calls. Prior to calling int 28h, DOS pushes a byte flag on the stack (really a word, but only the low-order byte matters) that indicates whether a 'safe' character I/O call is in progress or not. This byte contains a 1 if the call is safe; otherwise it is 0. So it would seem that if the DOS Critical Flag is 0 or the Critical Flag is 1 and the byte pushed before the int 28h call is 1, then it would be safe to call DOS. I'm not sure I would even want to think about what happens when a 'critical error' occurs on a DOS call that interrupts a character I/O call, though.

"In  addition  to  determining  wheth¬ 
er  DOS  is  safe  or  not,  a  memory-resi¬ 
dent  program  must  be  reactivated 
somehow.  Most  of  the  currently 
available  utilities  use  the  concept  of  a 
'hot  key’  to  trigger  program  process¬ 
ing.  The  hot  key  can  be  detected  by 
monitoring  keyboard  input  through 
the  hardware  interrupt  (int  9)  and/or 
through the keyboard BIOS driver
interrupt vector (int 16h). Some programs
use a combination of these
methods.  Some  utilities  also  use  the 
timer  interrupt  (int  8)  to  monitor  the 
state  of  the  keyboard  shift  flags  if  the 
hot  key  involves  a  shift-key  combina¬ 
tion. 

"Once  the  hot  key  is  detected,  the 
program  can  wait  for  DOS  to  become 
safe.  This  usually  involves  setting 
flags  and  then  using  the  timer  inter¬ 
rupt  to  trigger  hot-key  processing,  al¬ 
though  there  are  numerous  possibili¬ 
ties  here.  For  Gary's  screen  dump  it 
would  probably  be  advisable  to  save 
the  screen  in  memory  when  the  hot 
key  is  detected,  then  dump  it  as  soon 
as  DOS  is  safe. 

"One  other  point  that  may  be 
worth  mentioning  deals  with  the  PSP 


Dr.  Dobb's  Journal,  February  1987 


(program  segment  prefix).  The  PSP  is 
used  to  store  things  such  as  the  envi¬ 
ronment  pointer,  command  tail,  file 
handles,  caller’s  stack  pointer  during 
DOS  calls,  and  so  on.  It  is  also  the  DOS 
equivalent  of  a  process  ID  (PID).  When 
a  memory-resident  program  is  reacti¬ 
vated,  DOS  is  still  assuming  the  PSP  of 
the  currently  executing  program. 
Therefore  it  is  advisable  to  switch  to 
the  PSP  of  the  memory-resident  pro¬ 
gram  [to  ensure  that  MS-DOS  is  look¬ 
ing  at  the  correct  table  of  file  handles  if 
the  memory-resident  program  needs 
to  perform  file  I/O — Hay].  At  installa¬ 
tion  time,  the  TSR  should  use  the  DOS 
function  call  62h  (MS-DOS  3.0  and  later) 
to  obtain  its  PSP  address  and  store  it  for 
future  reference  (versions  of  DOS  pri¬ 
or  to  3.0  should  use  the  undocument¬ 
ed  function  51h  to  do  the  same).  When 
the memory-resident program is
reactivated, it should use another
undocumented function (50h) to change
the  PSP  and  restore  it  before  exiting. 
These  are  safe,  if  undocumented,  DOS 
calls. 


"There  are  no  guarantees  for  any 
of  these  methods,  especially  those  in¬ 
volving  undocumented  DOS  calls 
(why  does  Microsoft  keep  these  a  se¬ 
cret  anyhow?).  I  hope  that  some  sort 
of  standard  for  TSR  programs  will  be 
developed  and  that  these  mysteries 
will  be  solved  in  the  near  future.” 
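
The safety test and the deferred hot-key handling that Terry describes can be modeled in a few lines of Python. This is only a simulation of the logic, not resident code. The names dos_call_is_safe and ResidentUtility are invented for illustration, and the flag values are assumed to have been read elsewhere (the Critical Flag through the address returned by int 21h function 34h, the second value from the word DOS pushes before int 28h).

```python
from typing import Callable, List, Optional


def dos_call_is_safe(critical_flag: int, int28_word: Optional[int] = None) -> bool:
    """Apply the rule from the letter.

    critical_flag: current value of the DOS Critical Flag.
    int28_word:    word pushed by DOS before int 28h, or None when the
                   check is not being made from inside an int 28h handler.
    """
    if critical_flag == 0:
        return True
    if critical_flag == 1 and int28_word is not None:
        # Only the low-order byte of the pushed word matters.
        return (int28_word & 0xFF) == 1
    return False


class ResidentUtility:
    """Hypothetical screen-dump utility: capture now, call DOS later."""

    def __init__(self, is_safe: Callable[[], bool],
                 dump: Callable[[List[str]], None]) -> None:
        self.is_safe = is_safe
        self.dump = dump
        self.pending: Optional[List[str]] = None

    def on_hot_key(self, screen: List[str]) -> None:
        # Keyboard handler: copy the screen to memory, make no DOS calls.
        if self.pending is None:
            self.pending = list(screen)

    def on_timer_tick(self) -> bool:
        # Timer handler: do the real work only once DOS is safe.
        if self.pending is not None and self.is_safe():
            screen, self.pending = self.pending, None
            self.dump(screen)
            return True
        return False
```

Copying the screen in on_hot_key means the dump shows what the user saw when the key was pressed, even if the foreground program keeps writing before DOS becomes safe.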

It  is  true,  as  Terry  notes,  that  the 
MS-DOS  calls  he  mentions  are  undocu¬ 
mented  and  therefore  not  officially 
supported.  It  is  my  feeling,  however, 
that  it  is  quite  safe  to  use  these  calls  in 


MS-DOS,  Versions  2  and  3,  because 
these  versions  of  the  operating  sys¬ 
tem  seem  to  be  quite  stable  and  it 
does  not  appear  that  DOS  4  will  ever 
be  released  for  8086/88  machines  in 
the  United  States.  The  calling  se¬ 
quences  for  these  MS-DOS  functions 
are  in  Table  1,  below. 
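
The save, switch, and restore discipline for the PSP can be sketched as a Python simulation. FakeDos is a stand-in invented here; its two methods only mimic what functions 51h (or 62h) and 50h are described as doing, and the segment values are arbitrary.

```python
from contextlib import contextmanager


class FakeDos:
    """Hypothetical model of the one piece of DOS state that matters here."""

    def __init__(self, current_psp: int) -> None:
        self.current_psp = current_psp

    def get_psp(self) -> int:
        # Stands in for int 21h function 51h (62h in MS-DOS 3.0 and later).
        return self.current_psp

    def set_psp(self, segment: int) -> None:
        # Stands in for int 21h function 50h.
        self.current_psp = segment


@contextmanager
def running_as(dos: FakeDos, resident_psp: int):
    """Make the resident program's PSP current, then put the old one back."""
    interrupted_psp = dos.get_psp()
    dos.set_psp(resident_psp)
    try:
        yield
    finally:
        # Restore even if the resident code fails part way through.
        dos.set_psp(interrupted_psp)
```

A resident program would record resident_psp once at installation time and wrap its file I/O in the equivalent of running_as each time it is reactivated.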

Table 1: Undocumented MS-DOS calls 34h, 50h, and 51h

MS-DOS Call                     Call With                    Returns
------------------------------  ---------------------------  ------------------------------------
int 21h, function 34h *         ah = 34h                     es:bx = address of DOS critical flag
Get DOS Critical Flag address
int 21h, function 50h           ah = 50h                     nothing
Set current PSP                 bx = PSP (segment) address
int 21h, function 51h **        ah = 51h                     bx = PSP (segment) address
Get current PSP

*  Interestingly, function 34h is documented in the Heath/Zenith Version 2 Programmer's Utility Pack manual but is absent from all other OEM editions of the MS-DOS Programmer's Reference.

** Apparently identical to int 21h, function 62h in MS-DOS 3.0 and later (so why give it another function number, rather than just document function 51h? Another MS-DOS mystery ....)

On the subject of conflicting Terminate
and Stay Resident utilities, there
was a great flurry of press releases
about a year ago to the effect that
Microsoft, Borland, and other worthies


were  going  to  work  out  and  cospon¬ 
sor  a  TSR  specification  that  would 
solve  these  problems  forever.  No 
word  of  progress  on  this  document 
has  been  forthcoming  for  the  last  six 
months,  however,  and  it  appears  to 
have  died  the  lingering  and  painful 
death  of  neglect.  The  torch  has  been 
taken  up  by  a  small  committee  that 
includes  Chip  Rabinowitz,  Chris  Dun- 
ford,  Jim  Kyle,  Neil  Rubenking,  and 
Lane  Ferris,  who  have  been  develop¬ 


ing  a  public-domain  protocol  for  TSRs 
called  Ringmaster.  The  Ringmaster 
effort  has  been  quietly  gathering  sup¬ 
port  and  seems  likely  to  have  become 
the  de  facto  standard  by  the  time  you 
read  this. 

Removing  Extra  EOF 
Characters 

Robert  Seabock,  of  Tamal,  California, 
wrote:  "Concerning  your  report  [Sep¬ 
tember  1986]  of  that  bizarre  bug  in 


Version  4  of  the  Microsoft  Macro  As¬ 
sembler  [where  extra  EOF  characters 
cause  problems  in  include  files]  and 
Microsoft’s  suggested  'fix’:  For  those 
people  (myself  included)  who  would 
rather  pick  up  a  skin  rash  than  a  copy 
of  EDLIN,  the  easiest  way  to  remove 
EOF  characters  from  the  end  of  a  text 
file  is  to  use  the  MS-DOS  TYPE  com¬ 
mand,  redirecting  output  to  a  file.  For 
example: 

TYPE  myfile  >  temp 

DEL  myfile 

REN  temp  myfile 

would  do  the  job  quite  nicely.” 

Thanks, Robert, you are right. But
to my chagrin, Robert Thrun of Adelphi,
Maryland, pointed out that there
is an even easier way to strip off
those nasty extra EOF characters, to
wit:

COPY  myfile  /A  newfile 

The  /A  (for  ASCII)  switch  is  present  in 
the  COPY  command  just  for  this  type 
of  problem!  It  copies  a  file  only  up 
until  the  first  EOF  mark,  then  stops. 
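
For readers working outside DOS, the "copy up to the first EOF mark" behavior described above can be sketched in Python. The MS-DOS EOF mark is the Ctrl-Z character (1Ah). The function name copy_ascii is invented for illustration, and the sketch models only the truncation described in the text, not anything else COPY may do to the destination file.

```python
CTRL_Z = b"\x1a"  # the MS-DOS end-of-file mark


def copy_ascii(data: bytes) -> bytes:
    """Return the bytes that precede the first Ctrl-Z, or all of them."""
    end = data.find(CTRL_Z)
    return data if end == -1 else data[:end]


def strip_eof_marks(source_path: str, dest_path: str) -> None:
    """File-level wrapper: read source, write the truncated copy."""
    with open(source_path, "rb") as src:
        data = src.read()
    with open(dest_path, "wb") as dst:
        dst.write(copy_ascii(data))
```

Opening both files in binary mode matters here, because the whole point is to see the Ctrl-Z bytes exactly as they are stored.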


 0000                cseg         segment para public 'CODE'
                                  assume  cs:cseg,ds:cseg,ss:cseg
 0100                             org     100h
 0100                test         proc    near
 =                   test_equ     equ     offset ($) - offset test
 = 0000              test_equals  =       offset ($) - offset test
 0100  0000                       dw      test_equ
 0102  0002                       dw      test_equ
 0104  0000                       dw      test_equals
 0106  0000                       dw      test_equals
                     test         endp
 0108                cseg         ends
                                  end     test

Code Example 1: The difference between MASM's equ and = operators



Macro  Assembler  Equates 

Microsoft  MASM's  equ  and  =  opera¬ 
tors  are  used  to  associate  a  value  or 
address  with  a  name  and  facilitate 
the  creation  of  readable,  well-docu¬ 
mented,  maintainable  assembler 
code.  There  is  a  subtle  distinction  be¬ 
tween  these  two  operators  that  is  not 
obvious  from  the  manual  but  that  can 
nail  you  at  unexpected  times!  The  dif¬ 
ference  is  that  equ  basically  creates  a 
text  macro,  whereas  the  =  operator 
is  evaluated  when  it  is  encountered 
and  assigns  an  absolute  value  to  a 
symbol.  This  distinction  is  perhaps 
best  illustrated  by  a  simple  example. 
(See  Code  Example  1,  page  105.) 

In this example, I've assigned the
expression offset ($) - offset test to
two different labels created with equ
and =. As you can see, the symbol
test_equ created with equ is evaluated
when it is invoked, in this case
resulting in a different value each time
it is used. The symbol test_equals created
with = is evidently evaluated at
the time it is declared, so it has a
constant value.
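
The same distinction can be modeled in Python as deferred versus immediate evaluation. This is a toy model, not a description of MASM's internals: the Assembler class and its method names are invented for illustration, with equ storing the expression for re-evaluation at each use and assign (standing in for =) storing the value computed at the point of definition.

```python
class Assembler:
    """Toy assembler with a location counter and two kinds of symbol."""

    def __init__(self, origin: int) -> None:
        self.loc = origin          # the location counter, "$"
        self.symbols = {}          # name -> fixed value
        self.text_equates = {}     # name -> expression, re-evaluated on use
        self.words = []            # (offset, emitted value) pairs

    def label(self, name: str) -> None:
        self.symbols[name] = self.loc

    def equ(self, name: str, expr) -> None:
        self.text_equates[name] = expr        # keep the expression itself

    def assign(self, name: str, expr) -> None:
        self.symbols[name] = expr(self)       # evaluate right now

    def value(self, name: str) -> int:
        if name in self.text_equates:
            return self.text_equates[name](self)
        return self.symbols[name]

    def dw(self, name: str) -> None:
        self.words.append((self.loc, self.value(name)))
        self.loc += 2


def listing_words():
    """Replay Code Example 1 and return the words it emits."""
    def here_minus_test(asm):
        return asm.loc - asm.symbols["test"]

    asm = Assembler(0x100)
    asm.label("test")
    asm.equ("test_equ", here_minus_test)
    asm.assign("test_equals", here_minus_test)
    for name in ("test_equ", "test_equ", "test_equals", "test_equals"):
        asm.dw(name)
    return asm.words
```

Calling listing_words() yields the values 0, 2, 0, 0 at offsets 100h through 106h, which matches the second column of the listing.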

Lay  Down  Your  Pencils 

Hans  Pufal's  pop  quiz  in  the  Septem¬ 
ber  1986  16-Bit  Software  Toolbox  un¬ 
fortunately  went  somewhat  awry 
because  of  my  own  bad  eyes  and  er¬ 
rant  typing.  Hans’  original  code  was 
correct,  but  a  typo  in  the  printed  list¬ 
ing  led  a  number  of  readers  to  send 
comments  along  the  lines  of  ‘‘the 
code  doesn't  do  what  its  author 
thinks  it  does.” 

The  real  answer  is  that  the  little 
routine  converts  a  hexadecimal  nyb¬ 
ble  in  register  al  to  its  ASCII  character 
equivalent,  leaving  the  result  in  al. 
The  corrected  code  appears  in  Code 
Example  2,  below. 



Code  Example  2:  Corrected  code 
for  Hans  Pufal’s  pop  quiz 
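
As an illustration of the conversion described above, and not a reconstruction of Hans Pufal's 8086 routine, here is the same mapping written out in Python. The function name and the choice of uppercase letters for the digits A through F are assumptions made for this sketch.

```python
def nybble_to_ascii(al: int) -> int:
    """Map a value 0..15 to the ASCII code of its hexadecimal digit."""
    if not 0 <= al <= 0x0F:
        raise ValueError("expected a 4-bit value")
    if al < 10:
        return al + ord("0")       # 0..9  -> '0'..'9'
    return al - 10 + ord("A")      # 10..15 -> 'A'..'F'
```

The branch is the part an assembly version would work to avoid, because the codes for '9' and 'A' are not adjacent in ASCII.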




COLUMNS 


ARTIFICIAL  INTELLIGENCE 


What  Progress  Is  Being  Made  in  AI? 


As  this  is  the  first  of  the  monthly 
columns  I  will  be  writing  on  ar¬ 
tificial  intelligence,  it  seemed  that  a 
good  way  to  begin  would  be  to  give  a 
progress  report  on  the  state  of  the  art 
in  AI.  Much  of  what  I  will  report  of  a 
factual  nature  is  based  on  a  close 
monitoring  of  the  newest  develop¬ 
ments  in  the  field,  especially  as  dem¬ 
onstrated  at  the  main  AI  conferences 
of  the  last  few  years — the  Interna¬ 
tional  Joint  Conference  on  Artificial 
Intelligence  (IJCAI)  in  Los  Angeles  in 
1985  and  the  AAAI  conference  in  Phil¬ 
adelphia  in  August  1986.  Unless  oth¬ 
erwise  stated,  of  course,  the  opinions 
and  extrapolations  are  solely  my 
own. 

It  may  well  be  true  that  AI  still  has  a 
long  way  to  go  to  overtake  all  the 
hype,  but  if  you  have  been  watching 
closely  over  the  last  two  years  or  so, 
you  may  have  noticed  that  the  field 
has  been  making  some  interesting 
progress  each  year.  At  the  1986  AAAI 
conference,  for  example,  there  was  a 
noticeable  presence  of  some  serious 
AI  applications,  not  just  more  and 
better  tools.  The  main  mood  of  the 
field  at  this  time  seems  to  be  that  of 
proving  its  usefulness  in  as  many  le¬ 
gitimate  areas  as  it  can.  Some  power¬ 
ful  development  tools  have  been 
available  commercially  for  a  few 
years,  and  an  impressive  and  diverse 
array  of  applications  has  begun  to 
appear. 

Many  programmers  recognize  in¬ 
stinctively  that  they  can  learn  some¬ 
thing  from  AI  that  will  make  them 


by  Ernest  R.  Tello 


better  programmers,  even  if  they  are 
not  planning  to  do  work  specifically 
in  the  AI  field.  One  common  miscon¬ 
ception,  though,  is  that  the  most  they 
can  learn  is  some  new  techniques  to 
add  to  their  arsenal  of  programming 
devices.  AI  has  much  more  to  offer 
than  this.  You  will  be  hearing  a  lot 


about  programming  paradigms  both 
in  my  monthly  column  here  and 
elsewhere,  so  this  is  a  good  time  to 
explain  the  difference  between  pro¬ 
gramming  techniques  and  program¬ 
ming  paradigms. 

One  thing  all  people  in  AI  agree 
about,  regardless  of  how  vehemently 
they  may  disagree  on  other  matters, 
is  that  we  still  do  not  know  how  to 
program  computers  very  well — 
there  are  many  things  still  to  be 
learned  that  can  greatly  improve  our 
programming.  Another  thing  that  is 
generally  agreed  upon  among  peo¬ 
ple  in  the  AI  field  is  that  nearly  all 
conventional  programming  comes 
under  one  major  paradigm,  namely 
that  of  procedural  programming. 
Once  this  is  pointed  out,  it  is  almost 
unnecessary  to  dwell  on  the  fact  that 
for  most  programmers  designing  a 
program  simply  involves  planning 
procedures  for  the  computer  to  exe¬ 
cute.  As  things  stand  now,  though, 
procedural  programming  is  just  one 
of  many  paradigms  at  the  disposal  of 
AI  programmers. 

The  object-oriented  paradigm  rep¬ 
resents  another  approach  entirely. 
Arising  originally  through  AI  re¬ 
search,  and  currently  one  of  the  most 
popular  of  all  the  paradigms  in  AI,  it 
is  just  now  penetrating  the  commer¬ 
cial  programming  market.  The  rule- 
based  programming  paradigm  has 
also  entered  the  commercial  main¬ 
stream,  mainly  through  its  successful 
use  in  the  hot  field  of  expert  system 
technology.  The  logic  programming 
paradigm  as  well  as  the  declarative 
and  rule-based  programming  para¬ 
digms  were  all  combined  to  create 


the  powerful  new  programming  lan¬ 
guage  PROLOG,  another  recent  new¬ 
comer.  Other  less-known  program¬ 
ming  paradigms  that  have  also 
originated  in  AI  research  include 
constraint  propagation,  access-ori¬ 
ented  programming,  the  actor  para¬ 
digm,  the  neural  network  paradigm, 
and  various  parallel  and  distributed 
programming  paradigms  such  as  the 
connectionist  and  data-flow  ap¬ 
proaches.  As  you  can  see,  the  field  of 
AI  today  is  large  and  complex,  but 
some  clear  trends  show  a  definite 
unity  amid  the  diversity. 

Another  good  thing  for  program¬ 
mers  to  know  is  that  there  are  several 
clear  indications  that,  for  better  or 
worse,  the  goals  of  AI  are  on  the 
verge  of  undergoing  a  paradigm 
shift.  According  to  Mark  Stefik  at  Xe¬ 
rox  PARC,  AI  is  now  in  the  process  of 
becoming  the  latest  vehicle  for  the 
development  and  proliferation  of  a 
new  form  of  "knowledge  media.” 
Knowledge  media,  according  to  Ste¬ 
fik,  embrace  not  only  the  representa¬ 
tion  and  storage  of  knowledge  but 
also  its  transmission  throughout  soci¬ 
ety.  At  this  point,  at  least,  AI  systems 
are  not  a  major  medium  for  the  trans¬ 
mission  of  knowledge.  As  AI  technol¬ 
ogy  matures,  though,  Stefik  argues 
that  it  will  increasingly  take  on  this 
role  as  a  carrier  and  transmitter. 

It  is  obvious  that  AI  systems  already 
differ  from  most  other  knowledge 
media  such  as  books,  diagrams,  and 
films  in  that  they  are  not  entirely  fro¬ 
zen  and  inert.  The  knowledge  in  an 
AI  system  is  dynamic  and  can  be  exe¬ 
cuted  for  solving  problems  in  a  more 
flexible  way  than  is  possible  with 
any  other  knowledge  medium.  In 
this respect, it already resembles humans.
But, so far, the ways in which
AI systems can use knowledge to
solve problems are still extremely limited
compared with humans.

Has AI given up on the quest to create
machine intelligence as an artifact,
then, and settled upon the far
easier goal of machine knowledge?
This is both a crucial and an extremely
difficult question. Part of the
problem  is  that,  for  the  short  term  at 
least,  knowledge  appears  to  be  far 
more  useful  than  is  intelligence.  This 
may  sound  surprising,  but  it  is  impor¬ 
tant  to  remember  that  human  intelli¬ 
gence  must  undergo  a  long  training 
period  before  it  can  be  trusted  with 
any  serious  responsibilities.  Just 
imagine  buying  a  program  that  you 
had  to  teach  for  16  years  before  it 
could  do  anything  useful!  But  a 
knowledge-based  system,  even 
though  it  does  not  have  much  true 
intelligence,  can  do  some  useful 
things,  usually  without  any  training 
at  all.  All  the  same,  I  think  that,  in 
planning  long-range  projects  and 
even  shorter-range  ones,  to  lean  on 
knowledge  too  heavily  and  ignore 
the  issue  of  how  to  make  the  process¬ 
ing  of  knowledge  more  intelligent  is 
tantamount  to  sidestepping  one  of 
the  most  challenging  issues  AI  faces. 

The field of AI today is like a cell in
which mitosis has occurred so that
there are now really two independent
bodies, both of which seem to be
thriving. I am referring to the two related
fields of commercial applications
of AI and basic AI research. I
intend to give both some representation
here because I think each of
them is important and interesting.

Although  some  noteworthy  pro¬ 
gress  is  being  made  in  AI  today,  I  think 
it  is  fair  to  say  that  the  field  suffers 
from  extreme  fragmentation.  A  few 
major  figures  see  the  "big  picture” 
and  have  a  vision  of  sorts,  but  there 
still  seem  to  be  far  too  many  research¬ 
ers  who  see  only  the  trees  (sometimes 
just  the  leaves  on  the  trees)  and  never 
the  forests.  So  far,  it  has  been  primari¬ 
ly  the  Defense  Advanced  Research 
Projects  Agency  (DARPA)  that  has  tried 
to  impose  some  type  of  an  overall 
agenda  on  AI  research,  but  for  several 
reasons, this has not succeeded and
cannot succeed. The changes have to come from
within  AI  itself.  A  new  vision  based  on 
realistic  long-range  objectives  and  a 
scenario  for  its  realization  will  have  to 
emerge. 

In  spite  of  the  lip  service  that  is  fre¬ 
quently  paid  to  emulating  the  way 


the  human  brain  works,  many  AI  re¬ 
searchers,  even  those  who  express 
great  sympathy  for  this  approach,  do 
not,  in  my  opinion,  make  a  serious  at¬ 
tempt  to  design  systems  that  show  an 
appreciation  of  the  high-level  organi¬ 
zation  of  the  brains  of  even  lower  ver¬ 
tebrates.  For  example,  nothing  has 
been  built  as  yet  that  even  remotely 
approaches  the  functioning  of  the 
cerebellum  of  even  the  lower  am¬ 
phibians  and  reptiles.  AI  is  still  a  com¬ 
puter-oriented  discipline  at  this  point, 
but  in  the  area  of  computer  science 




some  substantial  headway  is  being 
made. 

Real-World  AI 

Many  of  the  large  AI  tool  vendors 
have  announced  commercial  appli¬ 
cations  that  had  been  developed  with 
their  products  this  year.  Companies 
such  as  Corning  Glass,  Campbell 
Soup,  and  American  Express  all  have 
interesting  expert  system  applica¬ 
tions  that  are  either  completed  or  in 
stages  nearing  completion.  American 
Express  has  developed  The  Authoriz- 
er's  Assistant,  a  system  that  provides 
an  automated  authorization  service 
for  merchants  who  accept  American 
Express charge cards. Another interesting
real-world application of the
technology  is  the  Expert  Publishing 
System  developed  by  Crossfield  CSI. 
This  firm  provides  a  system  that 
helps  newspaper  staffs  control,  inte¬ 
grate,  and  coordinate  the  activities  of 
various  departments  that  have  to  co¬ 
operate  to  bring  out  the  daily  paper. 
The  system  is  intended  to  be  capable 
of  fully  integrating  the  layout  of 
newspaper  pages  and  tying  together 
the  functions  of  the  editorial,  adver¬ 
tising,  and  production  departments. 

Another  interesting  commercial  ex¬ 


pert  system  is  the  SEATS  system  devel¬ 
oped  by  Sperry  for  Northwest  Air¬ 
lines.  SEATS  addresses  the  problem  of 
handling  discount  prices  in  such  a 
way  as  to  sell  as  many  seats  on  sched¬ 
uled  flights  as  possible.  It  acts  as  an 
intermediate  operator  that  interacts 
with  two  other  software  systems — 
the  Airline  Reservation  System  and 
the  Airline  Revenue  Enhancement 
System. 

Several  firms  specialize  in  selling 
finished  expert  system  applications 
for  the  large  business  market,  too. 
For  example,  Applied  Expert  Sys¬ 
tems  is  currently  selling  a  generic 
business  expert  system,  called  Plan- 
Power,  that  provides  assistance  in  fi¬ 
nancial  planning.  The  system  is  capa¬ 
ble  of  analyzing  more  than  125 
different  types  of  financial  asset, 
such  as  securities,  insurance,  real  es¬ 
tate,  and  various  types  of  investments 
as  well  as  providing  a  comprehen¬ 
sive  plan  for  a  five-year  period  that 
integrates  various  recommendations 
for  a  specific  client  or  user. 

PlanPower  is  sold  as  a  turnkey  sys¬ 
tem,  packaged  with  a  Xerox  1186  AI 
workstation  and  an  HP  Laserjet  print¬ 
er.  One  important  advantage  of  the 
Xerox  workstation  is  that  it  includes 
an  interface  that  allows  it  to  run  soft¬ 
ware  written  for  IBM  PC  series  com¬ 
puters.  The  price  for  all  this  is  about 
$50,000.  PlanPower  is  supposed  to  be 
able  to  recommend  various  financial 
strategies  based  upon  people's  atti¬ 
tudes  and  objectives  as  well  as  their 
financial  circumstances.  Another 
company  selling  generic  business  ex¬ 
pert  systems  is  Palladian,  which  of¬ 
fers  the  Capital  Investment  Expert 
System,  and  the  Manufacturing  and 
Logistics  Expert  System. 

It  is  not  true  by  any  means  that  all 
serious  expert  system  development  is 
being  done  on  expensive  LISP  ma¬ 
chines  with  $50,000  development 
tools. A surprising amount of substantial
AI has been done already on
IBM PCs, PC/ATs, and Apple Macintoshes.
ARCO, for example, has developed
the Cementing Expert System
with the M.1 tool from Teknowledge
running on an IBM PC/AT. Another oil
company that has developed an expert
system using M.1 is Phillips Petroleum.
Other firms that have developed
expert systems using the M.1
tool include Gould Electronics and
the First National Bank of Chicago.




The  expert  system  tool  that  captured 
the  knowledge  of  retiring  Aldo  Ci- 
mino  at  Campbell  Soup  was  the  Per¬ 
sonal  Consultant  tool  developed  by 
Texas  Instruments  for  PCs  and 
compatibles. 

Although  there  was  no  particular 
presence  of  it  at  either  the  1985  IJCAI 
or  the  AAAI  conference  in  1986,  it 
would  be  impossible  to  discuss  the 
state  of  the  art  in  AI  without  some 
mention of the CYC project that is being
undertaken by Doug Lenat at MCC
in  Austin,  Texas.  Dr.  Lenat  is  best 
known  for  his  work  on  the  Eurisko 
Discovery  System  program  devel¬ 
oped  while  he  was  at  Stanford  and 
Xerox  PARC. 

The  CYC  system  is  one  of  the  most 
ambitious,  long-range  AI  projects 
ever  attempted.  Over  the  next  ten 
years,  it  aspires  to  develop  a  knowl¬ 
edge  system  of  truly  encyclopedic 
size  that  has  a  knowledge  base  of 
common  sense  knowledge  as  well. 
Although  its  breadth  and  depth  are 
admittedly  of  epic  proportions,  in 
other  respects  the  project  is  not  as 
revolutionary  as  it  might  seem  be¬ 
cause  it  is  based  largely  on  today's  AI 
technology  rather  than  the  AI  tech¬ 
nology  of  ten  years  from  now. 

What  Lenat  proposes  to  do  is  liter¬ 
ally  to  use  encyclopedias  as  a  knowl¬ 
edge  source  and  to  build  a  deep 
frame-based  system  that  not  only  en¬ 
codes  the  knowledge  presented  in 
encyclopedia  articles  but  also  the 
common  sense  implied  in  them.  The 
latter,  in  particular,  is  the  most  ambi¬ 
tious  aspect  of  the  project.  Common 
sense,  oddly  enough,  is  the  area  that 
researchers  still  have  tremendous 
difficulty  in  making  consciously  ex¬ 
plicit.  It  is,  for  example,  not  at  all 
clear  that  a  frame-based  conceptual 
hierarchy  will  be  a  powerful  enough 
tool  to  model  common  sense.  Done 
properly,  a  large  conceptual  hierar¬ 
chy  is  certainly  a  powerful  tool.  But  it 
is  powerful  for  providing  the  struc¬ 
ture  of  knowledge  rather  than  the  ac¬ 
tive  processes  that  are  at  work  when 
we  use  know-how  to  solve  problems. 
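
To make the "structure rather than process" point concrete, here is a minimal Python sketch of a frame hierarchy with slot inheritance. It is a generic textbook-style illustration, not the representation used in CYC; the Frame class and the example frames are invented for this purpose.

```python
class Frame:
    """A named bundle of slots that inherits from an optional parent."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = dict(slots)

    def get(self, slot):
        # Look locally first, then climb the hierarchy.
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(slot)


animal = Frame("animal", alive=True)
bird = Frame("bird", animal, can_fly=True, legs=2)
penguin = Frame("penguin", bird, can_fly=False)   # local slot overrides
```

Everything the hierarchy can do is lookup: penguin.get("legs") finds 2 by inheritance, but nothing here decides what to do with that fact, which is the gap the paragraph above points to.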

The  CYC  system  is  representative  of 
the  same  trend  you  see  at  work  in 
Stefik's  approach  at  the  Xerox  PARC 
Intelligent  Systems  Group:  the  refo¬ 
cusing  of  AI  goals  from  intelligence  as 
an  artifact  to  that  of  dynamic  ma¬ 
chine  knowledge  systems.  And  it  is 
also  represented  in  the  "knowledge 


is  power”  slogan  made  popular  by 
Edward  Feigenbaum.  It  is  now  no 
longer  intelligence  but  executable 
knowledge  that  is  power,  it  seems. 
Nevertheless,  in  the  often  mundane 
world  of  knowledge  engineering, 
some  intelligent  new  ways  of  han¬ 
dling  knowledge  are  emerging. 

Starplan  II 

I  want  to  devote  some  space  here  to 
discussing  the  new  Starplan  II  archi¬ 
tecture,  an  expert  system  for  satellite 
diagnosis  and  repair  under  develop¬ 
ment  at  Ford  Aerospace — how  it  dif¬ 
fers  from  the  Starplan  I  configuration 
and  why  such  drastic  changes  had  to 
be  made  in  its  earlier  design.  Starplan 
I  was  already  an  interesting  expert 
system  design  because  of  its  use  of  a 
construct  the  developers,  Ron  Sie¬ 
mens  and  Jay  Ferguson,  called  guard¬ 
ians.  This  was  a  conscious  attempt  by 
them  to  apply  Marvin  Minsky’s  "soci¬ 
ety  of  experts”  idea.  For  example,  the 
Starplan  I  system  had  monitor  and 
metamonitor  experts,  among  others, 
that  each  operated  as  independent 
knowledge  sources  in  the  system. 
Another  similar  construct  was  the 
alarm  demons,  sleeping  processes 
that  awoke  as  each  guardian  became 
initialized  and  attached  themselves 
to  appropriate  values  in  a  telemetry 
database. 
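
The idea of a demon that attaches itself to a value and wakes when that value changes can be sketched in Python as follows. This is a generic "active value" illustration under assumed names (TelemetryDatabase, make_alarm_demon, and the channel and limits used below); it is not taken from the Starplan design.

```python
class TelemetryDatabase:
    """Stores telemetry values and wakes any demons attached to them."""

    def __init__(self):
        self._values = {}
        self._demons = {}

    def attach(self, channel, demon):
        self._demons.setdefault(channel, []).append(demon)

    def write(self, channel, value):
        self._values[channel] = value
        # Every write wakes the demons sleeping on this channel.
        for demon in self._demons.get(channel, []):
            demon(channel, value)

    def read(self, channel):
        return self._values[channel]


def make_alarm_demon(low, high, alarms):
    """Build a demon that records readings outside [low, high]."""
    def demon(channel, value):
        if not low <= value <= high:
            alarms.append((channel, value))
    return demon
```

A guardian would attach its demons once, at initialization, and from then on the database itself triggers them, so no component has to poll.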

In  order  to  perform  its  task  proper¬ 
ly,  an  expert  system  such  as  Starplan 
has  to  carry  out  several  interrelated 
functions,  including  monitoring,  situ¬ 
ation  assessment,  diagnosis,  goal  de¬ 
termination,  and  real-time  planning. 
The  Starplan  I  system  consisted  of  five 
main  components:  guardians,  moni¬ 
tors,  metamonitors,  a  simulator,  and  a 
relational  database. 

In  Starplan  II  this  architecture  un¬ 
derwent  a  drastic  revision.  The  de¬ 
velopers  decided  that  each  of  the 
main  functions  enumerated  above 
would  have  to  be  implemented  as  en¬ 
tirely  separate  functions.  In  Starplan 
I,  they  were  all  incorporated  to  some 
degree  in  the  role  of  the  monitors 
and  metamonitors,  which  led  to  con¬ 
siderable  overlap  and  redundancy. 
The  five  components  of  the  new  sys¬ 
tem  were  organized  entirely  accord¬ 
ing  to  function.  They  consisted  of  an 
active  database,  situation  assessment, 
causal  diagnosis,  goal  determination, 
and  planning  and  command.  The 
knowledge  base  became  completely 


unified  with  each  of  the  five  modules 
operating  on  it  in  the  same  shared 
memory.  The  knowledge  represen¬ 
tation  was  done  with  the  utmost 
completeness.  Every  object  that 
could  be  reasoned  about  was  repre¬ 
sented.  And  each  of  the  objects  repre¬ 
sented  was  defined  with  three  com¬ 
ponents:  the  object's  own  attributes, 
its  relation  to  the  other  objects  in  the 
satellite,  and  a  "behavioral”  descrip¬ 
tion  of  the  object. 

Starplan  II's  developers  made  ex¬ 
traordinary  claims  for  the  resulting 
system.  They  developed  a  hybrid 
knowledge  representation  system  in 
which  they  set  out  to  incorporate  the 
strong  points  of  each  of  the  major 
knowledge  representation  para¬ 
digms  and  to  eliminate  their  weak¬ 
nesses  by  overriding  them.  In  their 
presentation  at  the  AAAI  conference 
last  summer,  they  claimed  to  have 
succeeded  in  this  ambitious  objective. 

ATRANS

At the Société Générale Bank in Brussels,
an important AI application
called  the  Automatic  Funds  Transfer 
Telex  Reader  (ATRANS)  is  currently 
undergoing  its  final  testing  phase.  As 
the  name  suggests,  it  is  a  program  for 
reading  telex  messages  in  natural  lan¬ 
guage  and  automatically  translating 
them  into  the  machine-readable  for¬ 
mat  of  the  bank’s  automatic  payment 
system.  Developed  by  Cognitive  Sys¬ 
tems  of  New  Haven,  Connecticut, 
ATRANS  uses  a  method  of  knowledge- 
based  parsing  and  text  analysis.  The 
main  difficulty  associated  with  this 
application  is  the  variety  of  expres¬ 
sions  that  are  used  during  actual  rou¬ 
tine  telex  transactions.  Because  the  ba¬ 
sic  content  of  most  messages  is 
predictable,  however,  knowledge  of 
the  nature  of  the  types  of  transactions 
can  be  used  as  a  basis  for  routines  that 
can  handle  various  expressions  that 
might  otherwise  confuse  a  computer. 

The  average  monetary  transfer 
message  might  contain  a  mixture  of 
irrelevant  text,  misspellings,  non¬ 
standard  names  for  various  banks, 
and  various  unusual  phrases  and  am¬ 
biguous  references.  ATRANS  uses  a 
large  complex  script  about  how  mon¬ 
ey  transfers  are  made  between  inter¬ 
national  banks  to  guide  its  analysis  of 
the  text  of  incoming  messages.  What 
is  unique  about  ATRANS  is  the  minute 
detail  with  which  it  analyses  these 




telex  messages.  Unlike  most  other 
programs  of  its  type,  it  carefully  ana¬ 
lyzes  each  and  every  word  and  pro¬ 
duces  a  highly  complex,  detailed  en¬ 
coding  of  its  content.  And  it  does  this 
surprisingly  fast  for  such  a  large  LISP 
application.  Running  on  a  VAX  11/785 
under  the  VMS  operating  system, 
ATRANS  processes  an  average  funds 
transfer  telex  in  less  than  20  seconds. 

PRIDE 

I also find it encouraging that several
expert  systems  are  now  being  devel¬ 
oped  in  the  area  of  mechanical  engi¬ 
neering  design.  In  the  past,  nearly  all 
the  design  systems  attempted  were 
intended  for  VLSI  and  other  types  of 
circuit  design.  One  of  the  most  inter¬ 
esting  mechanical  design  programs  is 
the  PRIDE  system,  developed  at  Xerox 
PARC.  The  system  is  intended  for  the 
design  of  paper  transport  devices  in¬ 
side  copy  machines.  Although  this 
may  not  sound  all  that  earthshaking, 
it  is  a  difficult  achievement  to  develop 


software  that  can  design  anything 
properly.  It  is  just  these  tedious  and 
routine  sorts  of  devices  that  you  want 
machines  to  design,  though,  to  leave 
human  designers  more  time  to  work 
on  creative  and  challenging 
problems. 

PRIDE  is  interesting  too  because  of 
the  general  framework  and  basic  ap¬ 
proach  it  uses  to  accomplish  its  goal 
successfully.  This  framework  is  delib¬ 
erately  configured  to  handle  a  whole 
class  of  design  tasks — namely,  those 
that  can  be  characterized  as  having  a 
well-defined  search  space.  Given  this 
condition,  the  assumption  is  that  the 
design  process  can  be  reduced  to  the 
task  of  searching  the  space  of  possible 
designs.  The  search  technique  used, 
of  course,  is  a  special  one  uniquely 
suited  to  this  type  of  problem.  De¬ 
tailed  knowledge  is  used  to  configure 
the  search  space  by  the  creation  of 
partial  designs  according  to  various 
design  constraints,  and  further 
knowledge  is  employed  to  reconfig¬ 
ure  the  design  when  it  turns  out  that 
constraints  have  been  violated. 

Each of the different types of
knowledge used in PRIDE is organized
into design plans. The main
problem  solver  that  executes  these  de¬ 
sign  plans  has  the  ability  to  search  the 
entire  design  space  if  necessary.  The 
basic  AI  technique  used  to  implement 
the  problem  solver  is  a  dependency- 
directed  backtracking  mechanism 
equipped  with  an  advice  mechanism 
that  allows  information  about  why  a 
design  failed  to  be  used  to  select  a  like¬ 
ly  direction  in  which  to  backtrack. 

So  in  all,  the  framework  utilized  to 
build  the  PRIDE  system  exploits  four 
main  types  of  knowledge:  ordering 
knowledge  that  defines  the  dimen¬ 
sions  of  the  design  space,  knowledge 
that  guides  the  choices  along  each  di¬ 
mension,  constraints  on  design  pa¬ 
rameters,  and  modification  advice 
for  aiding  redesign.  These  types  of 
knowledge  have  been  skillfully  inte¬ 
grated  into  knowledge  structures 
that  operate  as  usable  plans.  To  en¬ 
able  this  to  work,  plans  are  organized 
in  terms  of  goals  for  making  specific 
decisions  about  design  parameters. 
In  the  PRIDE  paper  handler  design 
system,  these  goals  include  things 
such  as  the  design  of  paper  path, 
driver  roll,  and  driver  width.  As  you 
might  expect,  each  of  these  design 
goals  is  responsible  for  a  certain  set  of 
design  variables  that  correspond  to 
its  task. 

Each  of  the  design  goals  in  the  PRIDE 
system  also  has  various  design  meth¬ 
ods  assigned  to  it  that  define  alterna¬ 
tive  ways  in  which  decisions  about 
design  parameters  can  be  made.  Some 
examples  of  design  methods  are  gen¬ 
erators,  which  actually  specify  sets  or 
ranges  of  design  parameters;  calcula¬ 
tions,  which  apply  math  operators  to 
previously  determined  design  vari¬ 
ables;  and  subplans,  which  specify 
subgoals  that  are  needed  to  satisfy 
higher  level  goals. 
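To make the interplay of generators, calculations, constraints, and advice more concrete, here is a minimal sketch in Python. It is not PRIDE's code, which the published description does not show; the parameter names, candidate values, and the `revise` field are all invented for illustration. The one idea it tries to capture is that a failed constraint names the parameter to reconsider, so the solver jumps back there instead of blindly retrying the most recent choice.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Design = Dict[str, float]


@dataclass
class Constraint:
    name: str
    check: Callable[[Design], bool]
    revise: str  # hypothetical "advice": the parameter to reconsider on failure


def solve(generators: Dict[str, List[float]],
          calculations: Dict[str, Callable[[Design], float]],
          constraints: List[Constraint]) -> Optional[Design]:
    names = list(generators)
    choice = {n: 0 for n in names}      # current candidate index per parameter
    while True:
        # Generators propose values; calculations derive dependent variables.
        design = {n: generators[n][choice[n]] for n in names}
        for out, formula in calculations.items():
            design[out] = formula(design)
        failed = next((c for c in constraints if not c.check(design)), None)
        if failed is None:
            return design
        # Backtrack to the advised parameter, not simply the latest one.
        i = names.index(failed.revise)
        while i >= 0:
            if choice[names[i]] + 1 < len(generators[names[i]]):
                choice[names[i]] += 1
                for later in names[i + 1:]:
                    choice[later] = 0   # later choices start over
                break
            i -= 1                      # advised parameter exhausted; go earlier
        else:
            return None                 # nothing left to try


# Invented toy problem: pick a roll diameter and a roll spacing.
generators = {"roll_diameter": [10.0, 20.0, 30.0],
              "roll_spacing": [50.0, 100.0, 150.0]}
calculations = {"spacing_ratio": lambda d: d["roll_spacing"] / d["roll_diameter"]}
constraints = [
    Constraint("spacing fits sheet", lambda d: d["roll_spacing"] <= 100.0,
               revise="roll_spacing"),
    Constraint("ratio small enough", lambda d: d["spacing_ratio"] <= 4.0,
               revise="roll_diameter"),
]
design = solve(generators, calculations, constraints)
print(design)
```

In this toy case the first candidate fails the ratio constraint, and the advice sends the solver straight to a larger roll diameter rather than through the remaining spacings, which could only make the ratio worse. The shortcut skips combinations, so a search of this kind is only as complete as its advice is sound.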

The  PRIDE  system  was  developed  as 
a  collaborative  effort  between  Xerox 
PARC  and  the  Xerox  Reprographics 
Business  Group.  A  prototype  version 
was  field-tested  for  more  than  a  year, 
during  which  time  it  was  tested  on 
actual  design  problems  in  various  on¬ 
going  copier  projects  at  Xerox.  Ac¬ 
cording  to  the  evaluations  made  of 
the  prototype  version,  it  was  able 
both  to  develop  new  designs  success¬ 
fully  and  evaluate  the  shortcomings 
of  designs  produced  by  engineers. 
Research is now continuing in order
to improve the advice mechanism to
handle  those  difficult  situations  in 
which  many  constraints  fail  simulta¬ 
neously  and  in  which  conflicts  be¬ 
tween  different  advice  options  have 
to  be  resolved. 

New Developments in AI Research

Agora 

One  of  the  more  ambitious  AI  re¬ 
search  projects  underway  is  the 
work  being  done  on  the  Agora  sys¬ 
tem  at  Carnegie-Mellon  University. 
Agora  is  really  a  general-purpose  AI 
development  environment  that  has 
been  designed  specifically  to  support 
applications  written  in  multiple  lan¬ 
guages  and  those  that  support  highly 
parallel  problem-solving  approach¬ 
es.  The  way  Agora  goes  about  doing 
this  is  by  providing  a  virtual  machine 
that  is  independent  of  any  particular 
programming  model  or  language 
and  that  can  be  mapped  onto  a  vari¬ 
ety  of  different  computer  architec¬ 
tures.  The  current  system  is  actually 
the  result  of  two  different  designs 
and  implementations  that  were 
made  in  1985.  One  is  being  used  for  a 
prototype  speech-recognition  system 
running  on  a  network  of  MicroVAXs 
and  Perqs  and  will  be  extended  this 
year  to  support  a  shared-memory 
parallel  computer  as  well  as  addition¬ 
al  Sun  and  IBM  RT/PC  workstations. 

Agora  was  used  to  develop  the  AN¬ 
GEL  speech-recognition  system, 
which  currently  consists  of  more 
than  100,000  lines  of  C  and  Common 
LISP  code  running  at  a  speed  of  about 
1,000  MIPS.  ANGEL  has  been  developed 
by  a  team  of  more  than  15  program¬ 
mers,  and  because  of  this  large  num¬ 
ber  of  programmers,  several  different 
computation  styles  have  been  used  to 
implement  the  complete  system. 

Agora  has  a  multilayered  architec¬ 
ture  that  makes  particular  use  of  the 
blackboard  approach  to  knowledge- 
based  processing.  On  the  first  layer  is 
the  heterogeneous  mix  of  distributed 
hardware  mentioned  earlier.  On  the 
next  layer  is  the  MACH  operating  sys¬ 
tem,  a  Unix-compatible  multiproces¬ 
sor  OS  that  uses  the  techniques  of 
message  passing,  shared  memory, 
and  threads  to  provide  the  basic  soft¬ 
ware  that  addresses  the  different 
types  of  hardware.  The  next  layer 
above  MACH  is  the  parallel  virtual 
machine  layer,  which  forms  the  as¬ 


sembly-language  level  of  the  Agora 
machine.  At  this  level  computations 
are  programmed  in  C  and  Common 
LISP  that  use  basic  Agora  primitives  to 
compile  higher-level  routines  and  as¬ 
sign  them  to  the  various  processors. 
Above  this  is  the  framework  layer, 
which  is  the  level  at  which  applica¬ 
tion  programmers  work.  Finally, 
there  is  the  layer  of  the  actual  knowl¬ 
edge  source  clusters. 

Detailed  simulations  of  the  Agora 
virtual  machine  have  been  conduct¬ 
ed  and  the  speedup  factors  deter¬ 
mined  for  its  use  on  a  variety  of  dif¬ 
ferent  hardware  types,  including 
custom  VLSI  multiprocessor  chips. 
Initial  reports  indicate  favorable  per¬ 
formance  on  a  broad  mix  of  hard¬ 
ware.  The  Agora  architecture  is  an 
important  step  forward  in  designing 
hardware-independent  AI  systems 
for  distributed  and  parallel  process¬ 
ing  and  is  bound  to  be  the  coming 
trend  in  large  systems,  such  as  those 
used  aboard  the  NASA  space  station. 

BACAS 

Researchers  at  the  University  of  Es¬ 


sex  in  the  U.K.  are  developing  a  new 
parallel,  content-addressable  memo¬ 
ry  computer  architecture  called  the 
Binary  and  Continuous  Activation 
System  (BACAS).  This  system  is  intend¬ 
ed  specifically  for  storage  and  re¬ 
trieval  of  knowledge  structures  that 
turn  out  to  be  particularly  useful  for 
natural  language  understanding. 
Currently,  BACAS  is  a  two-layered 
system  with  a  total  of  10  different 
types  of  main  knowledge  structures, 
or  K-structures,  and  46  types  of  small¬ 
er  knowledge  structures,  or  TKUs. 

In  some  respects  this  system  is  rath¬ 
er  simple  compared  with  many 
schemes  for  natural  language  pro¬ 
cessing  and  content-addressable 
memory,  but  in  others  it  is  rather 
subtle.  It  has  only  two  "levels.”  On 
one  level  are  "micro-units,”  the  ele¬ 
mentary  actions  or  indivisible  items 
of  knowledge  out  of  which  larger 
knowledge  structures  are  built.  On 
the  upper  level  are  the  threshold 
knowledge  units  (TKUs),  which  are 
built  from  the  micro-units.  The  TKUs 
are  like  individual  scenes  in  a  movie 
that can be constructed into larger
scripts. It is these larger structures
built  from  TKUs  that  form  the  ten 
types  of  K-structure. 

The  TKUs  are  a  kind  of  hybrid  be¬ 
tween  the  older  logic  threshold  unit 
idea  of  Minsky  and  Papert  in  their 
work  on  perceptrons  and  the  newer 
neural  net  pattern  completion  ap¬ 
proach  developed  by  John  J.  Hop- 
field,  which  is  currently  so  fashion¬ 
able.  The  main  problem  that  BACAS  is 
designed  to  solve  is  to  build  a  system 
that can make easy and rapid transitions
between successive knowledge
structures  in  the  course  of  single  lan¬ 
guage  interpretation  tasks.  Earlier, 
the  researchers  had  attempted  to  use 
the  Boltzmann  machine  approach 
but  found  it  unworkable  for  enabling 
such  transitions  easily. 
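The BACAS report is not reproduced here, so the following Python fragment is only a guess at the flavor of a threshold knowledge unit, not the Essex design. It assumes that each TKU holds a weight for every micro-unit it is built from, fires when the summed weight of the active micro-units reaches a threshold, and then fills in its missing micro-units, which is the pattern-completion half of the hybrid. The unit names, weights, and thresholds are invented.

```python
from typing import Dict, List, Set, Tuple


def activation(weights: Dict[str, float], active: Set[str]) -> float:
    """Summed weight of the micro-units that are currently active."""
    return sum(w for unit, w in weights.items() if unit in active)


def complete(tkus: Dict[str, Dict[str, float]],
             thresholds: Dict[str, float],
             active: Set[str]) -> Tuple[List[str], Set[str]]:
    """Fire each TKU that reaches its threshold and add its micro-units."""
    fired = []
    completed = set(active)
    for name, weights in tkus.items():
        if activation(weights, active) >= thresholds[name]:
            fired.append(name)
            completed |= set(weights)   # pattern completion
    return fired, completed


# Invented example: two scenes from a restaurant script.
tkus = {
    "enter_restaurant": {"walk": 1.0, "door": 1.0, "restaurant": 1.0},
    "order_food": {"menu": 1.0, "waiter": 1.0, "restaurant": 1.0},
}
thresholds = {"enter_restaurant": 2.0, "order_food": 2.0}
fired, completed = complete(tkus, thresholds, {"door", "restaurant"})
print(fired, sorted(completed))
```

Two of the three micro-units are enough to fire the first scene and supply the missing one, while the second scene, with only one match, stays below threshold.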

At  the  time  the  BACAS  system  was 
reported,  it  still  lacked  many  of  the 
facilities  that  it  will  ultimately  need 
to  function  effectively,  such  as  repre¬ 
sentation  of  causality  and  mecha¬ 
nisms  for  temporally  ordering  units 
and  role  binding.  As  work  continues 
on  this  system,  it  will  be  interesting  to 
see  how  the  complete  system  takes 


shape.  Continuing  research  is  also 
oriented  toward  developing  a  learn¬ 
ing  mechanism  for  BACAS,  which 
makes  this  a  promising  project  to 
track.  It  is  another  example  of  a  mar¬ 
riage  between  state-of-the-art  soft¬ 
ware  and  hardware  concepts  that 
has  considerable  potential. 

JUDGE 

JUDGE is another interesting program,
mainly from a computer science
vantage point because a commercial
version would hardly be
likely  to  become  a  best-seller.  It  is  a 
case-based  reasoning  program  that 
assumes  the  role  of  criminal  court 
judge. It was written at Yale University
by William M. Bain and is a sample
of the latest in scripts from the Schankian
school.

A  case-based  program  is  one  that 
uses  previous  examples  of  a  particu¬ 
lar  activity  to  produce  new  perfor¬ 
mances  of  the  activity  to  meet  new 
circumstances.  Fifty-five  real  cases  of 
manslaughter  and  assault  were  read 
into  the  system,  selected  for  their  var¬ 
ied  characteristics  from  the  view¬ 
point  of  the  program’s  parameters. 
To  provide  the  model  for  the  sentenc¬ 
ing  judge,  various  presiding  judges  in 
the  Superior  Court  of  Connecticut 
were  interviewed.  The  resulting  sys¬ 
tem  has  five  stages  of  operation  that 
are  used  to  formulate  a  criminal's 
sentence:  interpretation  of  the  case; 
retrieval  of  similar  cases;  difference 
analysis  of  the  current  case  and  those 
retrieved;  strategy  application  and 
modification,  which  adjust  past  sen¬ 
tences  when  necessary;  and  the  gen¬ 
eralization  facility,  which  allows 
JUDGE  to  derive  rules  based  on  its 
findings  about  similar  cases. 
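The retrieval, difference analysis, and modification stages can be sketched in a few lines of Python. This is an illustration of case-based reasoning in general, not JUDGE itself: the case features, the similarity score, and the adjustment factors below are all invented, and the interpretation and generalization stages are left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Case:
    charge: str
    provoked: bool
    injury: int               # invented scale: 1 (minor) to 5 (fatal)
    sentence_months: int = 0  # unknown for a new case


def similarity(a: Case, b: Case) -> int:
    """Crude, invented similarity score between two cases."""
    score = 2 if a.charge == b.charge else 0
    score += 1 if a.provoked == b.provoked else 0
    return score - abs(a.injury - b.injury)


def sentence(new: Case, memory: List[Case]) -> Tuple[int, Case]:
    # Retrieval: find the most similar remembered case.
    best = max(memory, key=lambda old: similarity(new, old))
    months = best.sentence_months
    # Difference analysis and modification: adjust the old sentence.
    if new.provoked and not best.provoked:
        months = round(months * 0.75)
    elif best.provoked and not new.provoked:
        months = round(months * 1.25)
    months += 6 * (new.injury - best.injury)
    return max(months, 0), best


memory = [
    Case("assault", provoked=False, injury=2, sentence_months=24),
    Case("manslaughter", provoked=True, injury=5, sentence_months=60),
]
months, precedent = sentence(Case("assault", provoked=True, injury=3), memory)
print(months, precedent.charge)
```

The new sentence is never computed from first principles; it is always an old sentence nudged by the differences between the two cases, which is the essential contrast with rule-based diagnosis.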

The  purpose  of  a  research  system 
such  as  JUDGE  is  not  to  provide  a  proto¬ 
type  for  an  actual  automated  criminal 
court  judge.  Rather,  it  is  to  explore  a 
type  of  reasoning  that  differs  signifi¬ 
cantly  from  most  of  the  types  of  rea¬ 
soning  that  AI  systems  have  so  far  ex¬ 
plored.  The  emphasis  is  really  on 
case-based  reasoning  and  learning 
problems  as  they  apply  in  legal  situa¬ 
tions  generally,  rather  than  the  prob¬ 
lem  of  criminal  sentencing  per  se.  The 
program,  therefore,  attempts  to  break 
new  ground  in  the  modeling  of  sub¬ 
jective  human  reasoning  that  is  of  a 
very different kind from the diagnostic
reasoning used by doctors and
equipment troubleshooters.

Butterfly  LISP 

Currently,  several  projects  are  afoot 
to  develop  new  parallel  hardware 
specifically  for  the  rigors  of  AI  appli¬ 
cations.  In  most  cases,  an  extended  di¬ 
alect  of  LISP  is  created  specifically  to 
make  use  of  these  machines.  One  of 
the  most  interesting  of  the  new  crop 
of  AI  supercomputers  is  the  Butterfly 
Machine that has been built at Bolt
Beranek and Newman. Basically,
the  Butterfly  Machine  is  an  example 
of  a  coarse-grained  parallel  architec¬ 
ture  as  opposed  to  the  massive  paral¬ 
lelism  of  the  Connection  Machine. 
The  Butterfly  Machine  uses  numer¬ 
ous  processing  nodes  (up  to  256)  each 
of which consists of a 68000, between
1 and 4 megabytes of RAM, and a custom
processor node controller (PNC).
It  is  from  the  PNC  units  that  the  But¬ 
terfly  Machine  gets  its  name.  The 
PNCs  are  programmed  in  microcode 
to  enable  inward  and  outward  But¬ 
terfly  switch  transactions  and  to  ex¬ 
tend  the  instruction  set  of  the  Motor¬ 
ola  68000  for  multiprocessing. 

Originally,  all  the  programming  on 
the  Butterfly  Machine  was  done  in  C. 
Building on work done by Robert
Halstead and the Multilisp group at MIT,
however, an extended version of
Common LISP that also draws on
some features of the Scheme dialect
has been implemented for the Butterfly
Machine. One of the main features
of Butterfly LISP is the future
construct.  Its  form  is  simple.  The  syn¬ 
tax  used  is: 

(future <expression>)

where  the  expression  can  be  any  LISP 
expression  whatsoever.  The  future 
construct  is  used  as  the  basic  task-cre¬ 
ating  mechanism  in  Butterfly  LISP. 
When  the  user  makes  a  call  to  the  sys¬ 
tem  using  the  future  construct,  if  re¬ 
sources  are  available,  the  computa¬ 
tion  is  begun  and  control  returns 
immediately  to  the  function  that 
made  the  call  to  future,  returning  a 
novel  LISP  object  called  an  "undeter¬ 
mined  future.”  This  future  object 
then  acts  as  a  temporary  placeholder 
for  the  ultimate  value  of  the  expres¬ 
sion  and  as  such  can  be  stored  or  ma¬ 


nipulated  in  any  fashion  just  as  the  fi¬ 
nal  value  would  be.  This,  of  course,  is 
extremely  significant  because  it 
means  that  the  various  computations 
often  do  not  have  to  wait  for  the  value 
of  the  needed  expression  but  can  con¬ 
tinue  on  with  their  own  operations  as 
if  the  needed  value  were  already 
available.  Naturally,  however,  any 
operation  that  includes  a  conditional 
that  depends  on  the  value  of  the  fu¬ 
ture  expression  will  have  to  be  sus¬ 
pended  until  that  value  becomes 
available.  The  implications  of  this  are 
far  reaching.  It  means  that  the  results 
of  parallel  processing  can  be  manipu¬ 
lated  without  explicit  synchroniza¬ 
tion  and  that  the  form  of  parallel  LISP 
programs  can  be  essentially  similar  to 
the  same  programs  written  for  se¬ 
quential  machines. 
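Readers without a Butterfly Machine can get a feel for the idea from the futures in Python's standard concurrent.futures module. The sketch below is an analogy rather than Butterfly LISP: `fib` and `parallel_sum` are invented names, and Python makes you ask for the value explicitly with `result()`, whereas the construct described above lets the placeholder stand in for the value until something actually needs it.

```python
from concurrent.futures import ThreadPoolExecutor


def fib(n: int) -> int:
    """Deliberately slow recursive Fibonacci, standing in for real work."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def parallel_sum(values):
    with ThreadPoolExecutor() as pool:
        # Roughly (future (fib v)): submit returns at once with a placeholder.
        futures = [pool.submit(fib, v) for v in values]
        # The placeholders can be stored and passed around like any object.
        # Only asking for a value blocks, and only if it is not ready yet.
        return sum(f.result() for f in futures)


print(parallel_sum([10, 15, 20]))
```

Apart from the two marked lines, the function reads just like its sequential counterpart, which is the point the future construct is meant to make.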

New  AI  Products 

In  later  columns  I’ll  be  giving  con¬ 
crete  examples  of  what  some  of  the 
newest  and  most  powerful  AI  tools 
can  do.  Here  is  a  rather  extensive  pre¬ 
view  of  some  of  the  products  that 
you'll  be  reading  about  in  far  more 
detail  in  the  months  ahead. 

ACORN 

First,  then,  is  ACORN  from  Gold  Hill 
Computers.  ACORN  is  a  LISP-based  ex¬ 
pert  system  development  tool  that 
seems  to  be  one  of  the  largest  and 
most  sophisticated  that  has  appeared 
so  far  for  microcomputers.  It  runs  on 
the  large-model  286  Developer  ver¬ 
sion  of  Golden  Common  LISP  for  IBM 
PC/ATs and compatibles. One of the
reasons  why  I  am  excited  about 
ACORN  is  the  definite  influence  I  see 
of  the  ideas  of  MIT’s  Dr.  Carl  Hewitt, 
the  primary  advocate  of  the  actor 
model of programming. Gerry Barber,
the chief scientist at Gold Hill,
was, of course, a student of Hewitt's
at  MIT.  ACORN  is  not  a  true  actor  sys¬ 
tem,  but  it  would  probably  be  correct 
to  say  that  it  is  the  expert  system  tool 
that  shows  the  most  influence  of  the 
actor  programming  model. 

ART  3.0 

It  would  be  wrong  not  to  mention  the 
new  version  of  the  ART  tool,  Release 
3.0,  even  though  its  cost  still  keeps  it 
well  out  of  most  people's  reach.  It  is  a 
LISP-based  expert  system  tool  from 
Inference  Corp.  that  was  originally 
sold  exclusively  for  LISP  machines 


but  now  will  be  available  in  a  C  ver¬ 
sion  for  Sun  workstations  and  the  IBM 
RT/PC.  One  of  the  real  delights  of  ART 
is  the  Artist  system,  an  impressive 
graphics  and  animation  package. 
ART  was  the  first  commercial  expert 
system  tool  to  offer  the  capability  of 
multiple  hypotheses  reasoning  using 
the  viewpoints  construct. 

ExperOPS5-Plus  and 
ExperProlog  II 

Expertelligence  has  made  two  im¬ 
portant  additions  to  its  line  of  prod¬ 
ucts  for  the  Macintosh.  ExperOPS5- 
Plus  is  a  more  powerful  version  of 
the  ExperOPS5  product  and  includes 
a  toolkit  for  adding  dialog  boxes  to 
provide  a  much-needed  user  inter¬ 
face  to  systems  built  with  OPS5.  Dialog 
boxes  can  be  used  to  display  bit¬ 
mapped  images  that  have  been  de¬ 
veloped  with  MacPaint  or  MacDraw. 
Another  addition  in  ExperOPS5-Plus 
is  access  to  the  real-time  speech  syn¬ 
thesis  functions  provided  with  Ex- 
perLisp  Plus,  available  separately. 
ExperProlog  II  is  an  implementation 
of  PROLOG  II,  the  new  standard  for 
the  language  proposed  by  Alain  Col- 
merauer,  the  inventor  of  PROLOG.  Ex¬ 
perProlog  II  takes  full  advantage  of 
the  Mac  desktop  to  provide  a  conve¬ 
nient  interactive  development  envi¬ 
ronment,  as  well  as  a  facility  for  par¬ 
titioning  knowledge  bases  into 
separate  contexts  or  worlds.  Exper¬ 
Prolog  II  can  be  configured  to  use  up 
to  4  megabytes  of  memory. 

Expert  Development  Package 

Arity  Corp.  is  now  selling  an  expert 
system  shell,  called  the  Expert  Devel¬ 
opment  Package,  that  runs  in  Arity’s 
powerful  implementation  of  PRO¬ 
LOG.  The  package  comprises  two  re¬ 
lated  languages — the  Taxonomy  lan¬ 
guage,  which  offers  a  frame-based 
object  hierarchy  with  built-in  inheri¬ 
tance,  and  a  rule-based  language  that 
supports  an  English-like  syntax  for 
rules.  The  system  is  designed  for  PCs 
and  PC  compatibles  with  640K  and 
support  for  the  Above  Board  expand¬ 
ed  memory  standard.  Initial  testing 
of  this  package  indicates  that  the  ex¬ 
panded  memory  option  is  necessary 
for  any  serious  application  work. 
The  advantages  of  this  system  are  the 
open  architecture,  which  allows  a 
two-way  interface  between  it  and 
PROLOG, and the fact that Arity's PROLOG
compiler can be used to develop
stand-alone expert systems in standard
.EXE machine-code format.

The  Intelligence/Compiler 

The  Intelligence/Compiler  from  In- 
telligenceWare  is  another  exciting  ex¬ 
pert  system  shell  that  is  helping  to 
make  all  those  people  who  said  that 
professional  AI  systems  cannot  be 
supported  on  microcomputers  think 
again.  This  system  supports  four  dif¬ 
ferent  programming  paradigms:  rule- 
based  programming,  frame-based 
representation,  logic-based  program¬ 
ming,  and  procedural  programming. 
Other  pluses  with  The  Intelligence/ 
Compiler  include  a  smart  editor  and  a 
built-in  B-tree  database  manager.  The 
editor  includes  its  own  built-in  parser 
that  can  detect  syntax  errors,  and 
word  completion  and  spelling  check¬ 
ing  are  also  planned. 

KEE  3.0 

Intellicorp  has  taken  the  first  step  to¬ 
ward  updating  its  KEE  system  to  keep 
pace  with  the  competition.  In  Release 
3.0  it  has  introduced  the  KEEworlds  fa¬ 
cility,  which  is  intended  to  provide 
KEE  with  the  capability  of  hypotheti¬ 
cal  reasoning  with  multiple  hypo¬ 
thetical  continuations.  The  KEEworlds 
facility is based on the ATMS (assumption-based
truth maintenance) approach
to truth maintenance developed
by Johan de Kleer at Xerox
PARC. Full integration with all other
facilities  of  KEE  is  provided  with  this 
new  capability,  and  an  interactive 
user  interface  featuring  a  World- 
Browser  has  also  been  provided. 

Nexpert  Object 

The  only  way  to  classify  Neuron  Da¬ 
ta's  new  Nexpert  Object  system  for 
IBM  PC/ATs  and  compatibles — a  sig¬ 
nificantly  more  powerful  version  of 
the  system  available  for  the  Mac — is 
as  an  extremely  imaginative  product 
that  is  very  much  in  a  category  all  its 
own.  This  is  easily  the  tool  that  comes 
the  closest  of  any  so  far  to  creating  an 
AI  workstation  environment  on  a  mi¬ 
cro.  In  addition  to  the  lush  user  inter¬ 
face  running  under  Microsoft  Win¬ 
dows,  Nexpert  Object  has  several 
powerful  and  unique  features.  One 
of my favorites is the common rule
format for both forward and backward
chaining. As far as I know, this
is  the  only  commercial  expert  system 
shell  that  offers  this  feature. 

Most  expert  system  shells  that  offer 
fully  specifiable  forward  and  back¬ 
ward  chaining  use  a  different  format 
for  forward  and  backward  chaining 
rules,  which  means  that  the  same 
knowledge  cannot  be  used  in  both 
reasoning  modes.  With  a  common 
format  for  both  types,  the  opportuni¬ 
ty  exists  for  writing  "pure  knowl¬ 
edge”  rules  that  are  usable  for  a  vari¬ 
ety  of  purposes  and  can  be  accessed 
in  either  type  of  inference. 
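What a common format buys you is easy to show with a toy. The Python sketch below is not Nexpert's rule language, which is not reproduced here; the rule representation (a set of conditions plus one conclusion) and the fact names are invented. The same `RULES` list is read by a forward chainer and a backward chainer without any change.

```python
# Each rule is (conditions, conclusion); this representation is an assumption.
RULES = [
    ({"pressure_high", "valve_closed"}, "risk_of_rupture"),
    ({"risk_of_rupture"}, "shut_down_pump"),
]


def forward_chain(facts, rules=RULES):
    """Data-driven: keep firing rules until no new fact appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules=RULES):
    """Goal-driven: prove the goal from facts (assumes no circular rules)."""
    if goal in facts:
        return True
    return any(
        all(backward_chain(c, facts, rules) for c in conditions)
        for conditions, conclusion in rules
        if conclusion == goal
    )


facts = {"pressure_high", "valve_closed"}
print(sorted(forward_chain(facts)))
print(backward_chain("shut_down_pump", facts))
```

Neither function knows anything about pumps or valves; all the domain knowledge sits in the rules, which is what makes them reusable in both directions.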

The  main  new  addition  to  the  sys¬ 
tem  is  a  major  one.  Nexpert  Object 
has  a  true  object-oriented  class  hier¬ 
archy  system  whereas  Nexpert  for 
the  Mac  does  not.  An  object  editor  is 
provided  to  make  this  facility  accessi¬ 
ble  to  users.  Each  object  has  proper¬ 
ties  (or  slots)  as  well  as  subobjects.  A 
subobject  is  a  part  of  the  object  that 
constitutes  an  object  in  its  own  right, 
such  as  an  organ  that  is  part  of  a  per¬ 
son's  body.  An  object  also  has  metas¬ 
lots,  which  provide  various  ways  for 
the  object  to  utilize  various  custom 
features  regarding  such  things  as  its 
salience,  inheritance  relations,  or 
sources  of  information. 

Office  Automation  Toolkit 

One interesting tool available for LISP
programmers is the Office Automation
Toolkit from Grandmaster. This
is  a  high-level  library  of  program¬ 
ming  tools  that  facilitate  the  develop¬ 
ment  of  business-oriented  applica¬ 
tions  in  LISP.  This  package  is  built  for 
the  muLISP-86  dialect  available  from 
Soft  Warehouse  and  Microsoft.  Func¬ 
tions  provided  in  this  toolkit  allow 
programmers  to  develop  applica¬ 
tions  that  use  multiple  buffers,  multi¬ 
ple  windows,  menus,  forms,  and  nat¬ 
ural  language  commands.  A  new 
addition  to  the  toolkit  also  provides 
functions  for  programming  applica¬ 
tions  that  utilize  the  B-F-tree  data¬ 
base  structure.  One  of  the  attractions 
of  this  toolkit  is  that,  like  libraries 
written  for  compiled  languages,  no 
royalties  have  to  be  paid  on  stand¬ 
alone  applications  that  use  it. 

PC  Scheme  2.0 

Difficult  as  it  may  be  to  choose,  if 
pressed,  I  would  have  to  say  that  the 
most  remarkable  implementation  of 


LISP,  as  well  as  my  favorite  object-ori¬ 
ented  programming  environment 
for PCs so far, is PC Scheme from Texas
Instruments. Scheme is the
most modern and streamlined of the
LISP dialects and is the one many
think should have been the basis for
Common LISP. PC Scheme provides
an  object-oriented  extension  called 
SCOOPS  that  features  the  Mixin  multi¬ 
ple-inheritance  capability  and  active 
values,  formerly  found  mainly  on  ex¬ 
pensive  LISP  machines.  PC  Scheme  in¬ 
cludes  a  compiler  and  a  smart, 
Emacs-style,  full-screen  editor.  At  less 
than  $100,  this  is  easily  the  best  buy  in 
LISP systems for PCs. If it were not for
TI's extreme low profile in marketing
this product, it probably would
have already become the LISP counterpart
to Turbo Pascal. It undoubtedly
will in any case.

Personal  Consultant  Plus 

TI  has  also  rewritten  and  significant¬ 
ly  updated  its  expert  system  shell 
product,  which  now  runs  in  the  PC 
Scheme  environment.  Personal  Con¬ 
sultant  Plus  offers  four  main  features 
that  are  of  particular  interest:  frame- 
based  representation,  metarules,  ac¬ 
tive  value  access  methods,  and  an 
open  architecture.  The  open  archi¬ 
tecture  in  this  case  is  particularly  in¬ 
teresting  because  it  means  that  it  is 
possible  to  construct  some  advanced 
"deep”  systems  with  Personal  Con¬ 
sultant  Plus  by  exploiting  SCOOPS,  the 
object-oriented  extension  to  PC 
Scheme.  Other  useful  features  are 
the  ability  to  display  graphic  images 
and  a  Snapshot  utility  that  can  import 
graphics  developed  using  any  other 
software  package. 

VP-Expert 

Paperback  Software's  VP-Expert  pro¬ 
gram  is  likely  to  become  the  same 
kind  of  spoiler  in  the  commercial  AI 
software  category  that  Turbo  Pascal 
was  in  the  structured  programming 
segment  of  the  market.  It  has  just  too 
many  features  to  be  sold  for  less  than 
$100 — or  so  all  its  competitors  will 
undoubtedly  feel.  The  irrepressible 
Adam  Osborne  obviously  has  differ¬ 
ent  thoughts  on  the  subject.  First  im¬ 
pressions  of  it  are  that  it  is  something 
like  a  poor  man’s  Guru  or  a  poor 
man's M.1. But this is not quite right
either  because  VP-Expert  has  other 
features that make it comparable in
some ways to programs such as Expert
Ease. Suffice it to say that it is
now  easily  the  runaway  winner  for 
the  "best  buy”  award  in  PC  expert 
system  software. 

New AI Books

Right  now,  with  the  enormous 
changes  in  AI — both  those  already 
occurring  and  the  ones  still  ahead — 
books  are  playing  as  much  or  more  of 


a  role  in  leading  the  way  as  are  im¬ 
portant  AI  programs  and  applica¬ 
tions.  Like  it  or  not,  books  reach  a  lot 
more  people  than  great  programs  do. 
Here  are  some  new  ones  that  look  es¬ 
pecially  interesting: 

Agha,  Gul.  ACTORS.  Cambridge, 
Mass.:  MIT  Press,  1986. 

Galambos,  James  A.,  et  al.  Knowledge 
Structures.  Hillsdale,  N.J.:  Lawrence 
Erlbaum  Assoc.,  1986. 

Schank,  Roger.  Explanation  Patterns. 
Hillsdale,  N.J.:  Lawrence  Erlbaum  As¬ 
soc.,  1986. 


Sterling,  Leon,  and  Shapiro,  Ehud. 
The  Art  of  Prolog.  Cambridge,  Mass.: 
MIT  Press,  1986. 

Wilensky, Robert. Common LISPcraft.
New  York:  Norton,  1986. 

Winograd,  T.,  and  Flores,  F.  Under¬ 
standing  Computers  and  Cognition. 
Norwood,  N.J.:  Ablex  Publishing 
Corp.,  1986. 

Bibliography 

Bain,  W.  M.  "A  Case-Based  Reasoning 
System  for  Subjective  Assessment.” 
AAAI-86. 

Bisiani,  R.  "A  Software  and  Hardware 
Environment  for  Developing  AI  Ap¬ 
plications  on  Parallel  Processors.” 
AAAI-86. 

Blelloch, G.E. "CIS: A Massively Concurrent
Rule-Based System." AAAI-86.

Brooks, R., et al. "A Mobile Robot
with Onboard Parallel Processor and
Large Workspace Arm." AAAI-86.

Edelson, T. "Can a System Be Intelligent
if It Never Gives a Damn?" AAAI-86.

Hammond,  K.  "CHEF:  A  Model  of 
Case-Based  Planning.”  AAAI-86. 

Hon  Wai  Chun.  "A  Representation 
for  Temporal  Sequence  and  Duration 
in  Massively  Parallel  Networks." 
AAAI-86. 

Lenat, D., et al. "CYC: Using Common
Sense Knowledge to Overcome Brittleness
and Knowledge Acquisition
Bottlenecks." AI Magazine (Winter
1986).

Lozano-Perez,  T.  "A  Simple  Motion 
Planning  Algorithm  for  General  Ro¬ 
bot  Manipulators.”  AAAI-86. 

Lytinen, S., and Gershman, A.
"ATRANS: Automatic Processing of
Money Transfer Messages." AAAI-86.

Sharkey, N., et al. "Mixing Binary and
Continuous Connection Schemes for
Knowledge Access." AAAI-86.

Siemens, R., et al. "Starplan II: Evolution
of an Expert System." AAAI-86.

Stefik, M. "The Next Knowledge Medium."
AI Magazine (Spring 1986).

Steinberg, S., et al. "The Butterfly LISP
System." AAAI-86.

COLUMNS 


STRUCTURED  PROGRAMMING 


Language Translations

by Namir Clement Shammas

This column begins a series of discussions
regarding the translation of programs between
different languages or implementations. I'll
start by looking at selected cases of
translating Turbo Pascal programs
into Logitech Modula-2.

Logitech has recently released the
Translator, which allows the migration
of memory-hungry Turbo Pascal
programs to Modula-2/86. Discussing
the Logitech Translator at length
would fill a book; instead, I'll present
four sample Turbo Pascal programs
and discuss their translations.

The first question you might ask
about the Translator is, "Does it translate
100 percent?" For most applications,
the oversimplified answer is no.
The Translator may include several
warning messages in a generated
Modula-2 listing, but the absence of
such messages does not necessarily
mean that the program compiles correctly.
As a matter of fact, the Translator
deliberately leaves some types of
errors (those due to incomplete translation)
for the compiler to detect.

Before you translate a program,
you should first identify its components:
expressions, loops, decision-making
constructs, procedures, functions,
and so on. The Translator can
translate similar language aspects,
such as loops, decision-making constructs,
and routines, from Pascal to
Modula-2 without the need for "manual"
editing. Any difficulties in translation
arise partly from differences
in data types and their
manipulation.

Numeric  Manipulation 

Listing  One,  page  70,  shows  a  Turbo 
Pascal  program  that  implements  a 
simple  four-function  calculator.  The 
program  deals  with  REAL  numeric 


manipulations.  Listing  Two,  page  70, 
contains  the  translated  Modula-2 
code.  No  hand-coded  patches  were 
needed  to  make  the  Modula-2  version 
run correctly. Note that the REPEAT...UNTIL
loop and the CASE and IF...THEN
constructs have been
translated  correctly.  The  Modula-2 
version  contains  more  code,  especial¬ 
ly  for  I/O  operations — for  example,  a 
single  Pascal  WRITELN  statement  has 
been  replaced  by  several  Modula-2 
procedures.  Notice  the  use  of  stdinout 
as  the  standard  I/O  device  in  the 
Modula-2  translation.  In  addition,  the 
output  procedures  in  Modula-2  con¬ 
tain  arguments  for  formatting  the 
displayed  numbers. 

A  note  on  translating  a  Turbo  Pas¬ 
cal  CASE  statement  without  the  ELSE 
clause:  in  Turbo  Pascal,  if  the  value  of 
the  CASE  variable  does  not  match  any 
option,  program  flow  simply  re¬ 
sumes  outside  the  CASE  construct.  A 
similar  situation  in  Modula-2  yields  a 
run-time  error.  Thus,  the  Translator 
inserts  an  ELSE  clause  in  the  CASE 
construct  if  one  is  not  present  in  the 
Pascal  program. 

The  first  example  translates  with¬ 
out  any  snags.  Programs  that  tackle 
sorting  and  searching  of  INTEGERS 
and  REALS  fall  into  the  same  category. 
However,  operations  such  as  bit  ma¬ 
nipulation  using  INTEGERS  and  im¬ 
plicit  numeric  type  conversion  need 
additional  editing.  The  core  Modula-2 
language does not support bit manipulation using INTEGERS or CARDINALS; after all, that's what the BITSET type is for. This can be frustrating for Pascal programmers who take integer-bit manipulation for granted.
The  Translator  library  does,  howev¬ 
er,  contain  procedures  that  support 
manual  editing  for  integer-bit  ma¬ 
nipulation.  For  example,  you  can  re¬ 
code  the  following  Pascal  expression: 

(Base_Address AND Offset) OR (Mem_Loc Shl 3)

by hand into the following function-call (prefix) expression in Modula-2:

OR(AND(BaseAddress, Offset), Shl(MemLoc, 3))
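The function-call style is easy to try outside Modula-2. The Python sketch below is an illustration only: bit_and, bit_or, and shl are hypothetical stand-ins for the Translator library's AND, OR, and Shl procedures, whose real signatures are not shown in this article, and the 16-bit mask assumes operands the size of a Turbo Pascal INTEGER.

```python
MASK16 = 0xFFFF  # assumption: operands behave like 16-bit words


def bit_and(a, b):
    """Bitwise AND, truncated to 16 bits."""
    return (a & b) & MASK16


def bit_or(a, b):
    """Bitwise OR, truncated to 16 bits."""
    return (a | b) & MASK16


def shl(value, count):
    """Shift left; bits pushed past bit 15 are discarded."""
    return (value << count) & MASK16


def combine(base_address, offset, mem_loc):
    """Mirrors (Base_Address AND Offset) OR (Mem_Loc Shl 3)."""
    return bit_or(bit_and(base_address, offset), shl(mem_loc, 3))
```

Nesting the calls makes the evaluation order explicit, which is what the hand-edited Modula-2 expression does.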

String  Manipulation 

The  next  example  deals  with  string 
manipulations.  Listings  Three,  page 
72,  and  Four,  page  78,  contain  pro¬ 
grams  that  perform  text  pattern 
searches. The '?' and '*' wildcards are
supported  and  follow  the  convention 
familiar  to  CP/M  and  MS-DOS  users. 
String  manipulation  is  the  most  com¬ 
mon  area  in  which  a  good  deal  of  ad¬ 
ditional  editing  is  required.  First,  in 
Turbo  Pascal  the  STRING  type  is  pre¬ 
defined  with  a  lower  index  of  1  and  a 
maximum length of 255. (The zero index of a Turbo Pascal STRING stores its current length.) In Modula-2,
strings  are  handled  quite  differently. 
Basically,  they  are  treated  as  arrays 
of  characters.  The  current  Modula- 
2/86  compiler  requires  that  string 
types  have  a  lower  array  index  of  0. 
The  reason  for  this  seems  to  be  a  fea¬ 
ture  that  enables  the  compiler  to 
automatically  insert  an  end-of-string 
delimiter  (as  in  the  C  language)  when 
assigning  a  string  constant  to  a  string 
variable,  as  in: 

Name : ARRAY [0 .. 79] OF CHAR;

(* more definitions and statements *)
Name := 'Don Johnson';

Otherwise,  the  programmer  is  re¬ 
sponsible  for  inserting  the  control  0 
delimiter,  as  in: 




Dr.  Dobb's  Journal,  February  1987 


Name : ARRAY [1 .. 80] OF CHAR;

(* more definitions and statements *)
Name := 'Don Johnson';
Name[12] := 0C;

Strings  in  Modula-2/86  can  be  up  to 
65,535  characters  long,  and  future 
implementations  may  permit  this 
upper  limit  to  be  around  two  billion 
(the  upper  range  of  the  long  integer 
type  LONGINT).  Moreover,  the  sup¬ 
port  of  open  arrays  in  Modula-2  does 
away  with  the  Turbo  Pascal  directive 
'{$V-}'  to  relax  the  strict  string  type 
checking  with  routine  parameters. 

Listing  Four  contains  comments 
beginning  with  -->  strings.  These 
point  out  the  lines  emitted  by  the 
Translator  that  needed  further  edit¬ 
ing.  A  few  lines  were  inserted,  and 
the  INC  procedure  was  removed  be¬ 
cause  Modula-2  already  has  it  as  a 
predefined  routine  that  performs  the 
same  task.  The  constant  declaration 
section  has  a  new  identifier  HI, 
which  is  used  in  conjunction  with 
the  substring  scanning  function 
Pos(  ).  In  Turbo  Pascal,  a  value  of  0  is 
returned  if  no  match  is  found.  In 
Modula-2,  the  index  value  of  0  points 

to the first character. The implementors of the Modula-2 version of Pos( ) have elected to return a value greater than the upper string dimension limit. Thus, in Turbo Pascal you test for a substring match using:

Actor := 'Don Johnson';
(* Index is an INTEGER *)
Index := Pos('Sellek', Actor);
IF Index > 0 THEN (* statements *)

whereas in Modula-2 the same test is written, using the HIGH( ) function, as:

Actor := 'Don Johnson';
(* Index is a CARDINAL *)
Index := Pos('Sellek', Actor);
IF Index <= HIGH(Actor) THEN (* statements *)

The constant HI is assigned a "consistent" large integer value that signals a substring mismatch status. It is
assigned  to  variables  that  are  used  to 
monitor  the  first  character  position 
at  which  a  match  occurs. 
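The two mismatch conventions can be compared side by side. This Python sketch is an illustration only: pos_pascal and pos_modula are hypothetical names, and the Modula-2 style function is assumed to return HIGH + 1 on a mismatch, which is just one value "greater than the upper string dimension limit"; the actual library may return a different out-of-range value.

```python
def pos_pascal(pattern, s):
    """Turbo Pascal style: 1-based index of the first match, 0 if none."""
    return s.find(pattern) + 1  # str.find returns -1 when absent


def pos_modula(pattern, s, high):
    """Modula-2 style: 0-based index; a value above high means no match."""
    index = s.find(pattern)
    return index if index >= 0 else high + 1  # assumed sentinel


actor = "Don Johnson"
HIGH = 79  # upper bound of ARRAY [0 .. 79] OF CHAR

found_pascal = pos_pascal("Sellek", actor) > 0
found_modula = pos_modula("Sellek", actor, HIGH) <= HIGH
```

Because 0 is a legitimate match position in the 0-based scheme, the test has to compare against the upper bound rather than against zero.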

You may have noticed I have used open arrays in the PatternSearch function instead of the STRING255 and STRING40 types for declaring the two string parameters. The declaration can be rewritten as:

PROCEDURE PatternSearch(TextLine, Pattern : ARRAY OF CHAR) : (* function returns an *) INTEGER;

The  body  of  the  function  has  been 
edited  to  use  constant  HI  and  to  ac¬ 
count  for  the  difference  between  the 
indexing  schemes  that  Turbo  Pascal 
and  Modula-2  use. 

The Modula-2 version of PatternSearch has a new identifier, PatternSearchResult, to return the function result. The Translator inserts the new identifier because in Modula-2 functions return their results via the RETURN <identifier> syntax and do not use the function name in an assignment statement.

In the procedure ScanPattern and the function LocatePattern, the variables tracking character positions have all been shifted by 1 and assigned an initial value of 0. Using the string-copy function Copy( ) seems to require no alteration. In the function LocatePattern, most of the tests in the IF statements needed editing. The original statements are still included, enclosed in comments. Also notice that the result of the string length function, which returns a CARDINAL, has been converted into an INTEGER to match the assigned variable type.

In the main program segment, the IF Line = '' THEN Pascal code has been rewritten as IF Line[0] = 0C THEN for Modula-2. In addition, string assignment between two variables needs the Assign( ) procedure instead of Pascal's := assignment operator.

Sets

The third example deals with sets. Logitech Modula-2 specifies that sets can have 16 members at most, not enough to handle the character sets that Pascal supports. The Translator emits BITSET{ } when it encounters Pascal sets such as ['Y', 'y'] or ['A' .. 'Z']. The translation needs manual editing, using a much needed LongSet module supplied with the Translator. Listing Five, page 84, shows a simple Pascal program that performs character counting. The program prompts you for a text file; reads it; and classifies each character as either a digit, uppercase, lowercase, or "others." It then draws a simple histogram (each mark represents a count of one hundred) for each character category. The program is used to demonstrate handling small and large (more than 16 elements) sets.

The Translator produces a Modula-2 program (Listing Six, page 85) that requires a fair amount of editing. First, I'll discuss the small set. In the Turbo Pascal listing, a two-member set ['Y', 'y'] is used to examine the user's response in the UNTIL clause. In the Modula-2 version, I import MakeEmpty( ), Include( ), and InSet( ) and the SetOfChar type from module LongSet. In the Modula-2 program, I declare YesNo as a variable of type SetOfChar and use the MakeEmpty procedure to make YesNo an empty set. The Y and y set members are inserted using the Include procedure. Notice the nested use of WORD( ) and ORD( ) to transform the character's ASCII code into a WORD. The above steps prepare the YesNo set for testing of membership. The Boolean function InSet( ) is used with the ordinal value of OK, the character-typed response.

Handling long sets is different. The BuildSet( ) procedure takes three arguments: the set-typed variable and the ordinal values of the first and last set members. Three calls are made to BuildSet( ) to create the three sets in question (I am treating DigitSet as a long set). The InSet( ) function is used to determine the set membership of each file character read.

Absolute Variables

The last example deals with simple absolute variables. Listings Seven and Eight, pages 86 and 87, show Pascal and Modula-2 programs that write strings directly to the screen memory of an IBM PC with a monochrome monitor. The Pascal listing contains an identifier DISP defined as:

DISP : SCREEN80 absolute $B000:0000;

which is edited in Modula-2 to become:

DISP [B000H:00H] : STRING80;



STRUCTURED  PROGRAMMING 

(continued  from  page  126) 


to  conform  with  the  standard  Modu¬ 
la-2  syntax  for  absolute  variables. 
More complicated types of absolute variables in Turbo Pascal may require more hacking. The translated program has its FOR I := 1 TO J DO loop edited into FOR I := 0 TO J-1 DO to reflect the different string indexing schemes.

Creating  Library  Modules 

This  discussion  of  translating  Turbo 
Pascal  programs  into  Modula-2  is  not 
complete  without  considering  the 
strong  modular  nature  of  Modula-2. 
Turbo  Pascal  programs  use  included 
files,  as  well  as  chained  code  seg¬ 
ments  and  overlays,  to  tackle  large 
programs.  Putting  many  of  these  pos¬ 
sibly  reusable  routines  in  library 
modules  is  indeed  a  sound  decision. 

You  can  use  the  Translator  to  cre¬ 
ate  library  modules  by  following 
these  steps: 

1.  Create  a  single  complete  file  from 
all  the  related  included  files.  By  com¬ 
plete,  I  mean  that  it  does  not  rely  on 
any  external  or  further  declarations 
or  code. 

2.  Enclose  this  code  in  an  empty  pro¬ 


gram.  Add  the  program  heading  and 
an  empty  main  body  section. 

3. Compile the Turbo Pascal code to make sure the file was assembled correctly.

4.  Process  the  correct  Pascal  code 
through  the  Translator.  This  gives 
you  a  Modula-2  version  of  your  "do- 
nothing”  program. 

5.  Apply  any  hand-coded  editing  re¬ 
quired  on  the  Modula  code. 

6.  Decide  what  to  export  from  the 
module,  and  create  the  DEFINITION 
and  IMPLEMENTATION  modules  ac¬ 
cordingly. 

The next time you ask a Turbo Pascal programmer, "Parlez-vous Modula-2?" expect the answer to be "TRUE."

Open  Loops  in  Modula-2 

You  can  use  open  loops  in  Modula-2 
to  prevent  "telescoped”  error  han¬ 
dling  code.  Table  1,  below,  shows  a 
skeleton  Modula-2  program  that  tests 
three  error  conditions.  Notice  how 
telescoped  the  code  is.  Table  2,  below, 
shows  the  same  program,  but  this 
time  an  open  loop  is  used  to  enclose 
the  main  body.  Each  tested  error  con¬ 
dition  uses  an  EXIT  statement,  which 
allows the rest of the main body to be
on  the  same  level  regardless  of  the 
number  of  error  conditions  tested. 


The  open  loop  ends  with  an  IF  state¬ 
ment  that  is  guaranteed  to  exit.  Nest¬ 
ed  open  loops  can  be  used  in  a  similar 
way  to  control  multilevel  errors. 
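The same early-exit shape exists in most languages. As an illustration only, the Python sketch below uses while True with break in place of LOOP and EXIT; the function name, the step labels, and the fail_at parameter are hypothetical and exist only to make the control flow observable.

```python
def process(fail_at=None):
    """Return (log, error); fail_at names the step assumed to fail."""
    log = []
    error = None
    while True:                      # stands in for LOOP
        log.append("step 1")
        if fail_at == 1:
            error = "ErrorCondition1"
            break                    # stands in for EXIT
        log.append("step 2")
        if fail_at == 2:
            error = "ErrorCondition2"
            break
        log.append("step 3")
        if fail_at == 3:
            error = "ErrorCondition3"
            break
        log.append("final statements")
        break                        # unconditional exit, like IF 1 > 0 THEN EXIT END
    # Error handling section: every exit path arrives here at one nesting level.
    return log, error
```

However many conditions are tested, the statements between them stay at a single indentation level, which is the point of the open-loop form.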

From  Readers 

I  received  a  short  utility  program  for 
the  IBM  PC  written  in  Modula-2  from 
James  Janney,  of  Salt  Lake  City,  Utah. 
The  program,  shown  in  Listing  Nine, 
page  88,  displays  the  status  of  the 
Caps-Lock  and  Num-Lock  keys  and 
allows  you  to  change  them.  This  is 
useful  with  keyboards  such  as  the 
KeyTronic,  whose  LEDs  get  out  of 
sync  when  you  run  programs  such  as 
CrossTalk.  Janney's  utility  rectifies 
this  situation;  the  only  other  remedy 
is  to  reboot. 

Availability 

All  the  source  code  for  articles  in  this 
issue  is  available  on  a  single  disk.  To 
order,  send  $14.95  to  Dr.  Dobb 's  Jour¬ 
nal,  501  Galveston  Dr.,  Redwood  City, 
CA  94063  or  call  (415)  366-3600  ext. 
216.  Please  specify  the  issue  number 
and  disk  format  (MS-DOS,  Macintosh, 
Kaypro). 

DDJ 

(Listings  begin  on  page  70.) 



MODULE Test;
(* declarations *)
BEGIN
  (* Reset error flags *)
  ErrorCondition1 := FALSE;
  ErrorCondition2 := FALSE;
  ErrorCondition3 := FALSE;
  (* Statements *)
  IF NOT ErrorCondition1 THEN
    (* Statements *)
    IF NOT ErrorCondition2 THEN
      (* Statements *)
      IF NOT ErrorCondition3 THEN
        (* Statements *)
      END; (* IF NOT ErrorCondition3 *)
    END; (* IF NOT ErrorCondition2 *)
  END; (* IF NOT ErrorCondition1 *)
  (* Error handling section *)
  IF ErrorCondition1 THEN (* Statements *)
  ELSIF ErrorCondition2 THEN (* Statements *)
  ELSIF ErrorCondition3 THEN (* Statements *) END;
END Test.

Table 1: Telescoped code that performs error handling

MODULE Test;
(* declarations *)
BEGIN
  (* Reset error flags *)
  ErrorCondition1 := FALSE;
  ErrorCondition2 := FALSE;
  ErrorCondition3 := FALSE;
  LOOP
    (* Statements *)
    IF ErrorCondition1 THEN EXIT END;
    (* Statements *)
    IF ErrorCondition2 THEN EXIT END;
    (* Statements *)
    IF ErrorCondition3 THEN EXIT END;
    (* Statements *)
    IF 1 > 0 THEN EXIT END;
  END; (* LOOP *)
  (* Error handling section *)
  IF ErrorCondition1 THEN (* Statements *)
  ELSIF ErrorCondition2 THEN (* Statements *)
  ELSIF ErrorCondition3 THEN (* Statements *) END;
END Test.

Table 2: Using an open loop for cleaner code




FORUM 


LETTERS 

( continued  from  page  12) 


Eratosthenes Sieve

Dear DDJ,

Regarding the Eratosthenes Sieve that accompanied the article "High-Speed Thrills" in the September 1986 issue, the program reports 1,899 primes in the range 2 to 8,191. This is incorrect. Actually, there are only 1,028 primes in that range.

I realize that the Sieve is used as a benchmark to test a system's ability to handle arrays and loops and that the actual number returned is immaterial. Still, it bothers me that people may be going around thinking there are 1,899 primes less than 8,192. Code Example 1, below, returns the correct result. It runs just a little slower than the listing you published (after all, it's filtering out more nonprimes). I've also included a loop at the end of my program that prints out the primes, just to prove that the program does find primes. This should be removed if the program is to be used as a benchmark.

Robert V. Duncan
P.O. Box 215
Pottersville, NY 12860

DDJ

#include <stdio.h>

#define TRUE  1
#define FALSE 0
#define SIZE  8191
char flags[SIZE+1];

int main(void)
{
    int i, j, count, loops;

    printf("Running . . .\n");
    for (loops = 0; loops < 10; loops++)
    {
        count = 0;
        for (i = 2; i <= SIZE; i++)
            flags[i] = TRUE;
        for (i = 2; i <= SIZE; i++)
        {
            if (flags[i])                   /* found a prime, */
            {
                for (j = 2*i; j <= SIZE; j += i)
                    flags[j] = FALSE;       /* "filter" out its multiples */
                count++;                    /* increment prime counter */
            }   /* end of if (flags[i]) */
        }   /* end of for i = 2 . . . */
    }   /* end of for loops = 0 . . . */
    printf("%d primes, %d loops\n", count, loops);
    /* let's take a look at our primes;
       zero and one aren't prime, start at 2 */
    for (i = 2; i <= SIZE; i++)
        if (flags[i])
            printf("%6d", i);
    printf("\n");
    return 0;
}

Code  Example  1:  Eratosthenes  Sieve  prime  number  program  in  C 
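One plausible source of the discrepancy, offered as an assumption because the September 1986 listing is not reproduced here: the widely circulated benchmark form of the Sieve keeps one flag per odd number, so 8,191 flags cover the odd numbers 3 through 16,383 rather than the integers 2 through 8,191. The Python sketch below (function names are illustrative) counts both ways.

```python
def count_primes_full(limit):
    """Count primes in 2..limit using one flag per integer."""
    flags = [True] * (limit + 1)
    count = 0
    for i in range(2, limit + 1):
        if flags[i]:
            for j in range(2 * i, limit + 1, i):
                flags[j] = False      # filter out multiples of i
            count += 1
    return count


def count_primes_odd_only(size):
    """Count primes where flag i stands for the odd number 2*i + 3, i in 0..size."""
    flags = [True] * (size + 1)
    count = 0
    for i in range(size + 1):
        if flags[i]:
            prime = i + i + 3
            for k in range(i + prime, size + 1, prime):
                flags[k] = False      # index k stands for an odd multiple of prime
            count += 1
    return count


full_count = count_primes_full(8191)       # integers 2..8191
odd_count = count_primes_odd_only(8190)    # odd numbers 3..16383
```

If the published program used the odd-only layout, its 1,899 would be a count of odd primes below 16,384 rather than a miscount of the primes below 8,192.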




FORUM 


VIEWPOINT 

(continued  from  page  14) 


Total  cost  $4.40 
Item  onions 
Quantity  2.5  lbs 
Unit  cost  $0.72/lb
Total  cost  $1.80 

Would  you  not  be  somewhat  puzzled 
about  the  rationale  for  that  arrange¬ 
ment?  I  imagine  that  you  would  in¬ 
stinctively  attempt  to  rearrange  the 
list  as  shown  in  Table  1,  right.  Surely 
this  is  clearer,  more  obvious,  easier  to 
read,  and  less  likely  to  hide  an  inad¬ 
vertent  error.  In  fact,  isn't  the  ar¬ 
rangement  in  Table  2,  right,  even 
better? 

Assuming  you  agree,  why  then  in 
Anderson's  Modula-2  program  are 
there  118  repetitions  in  the  following 
form? 

WITH Table68K[i] DO
  Mnemonic := "ABCD";
  Op := {15, 14, 8};
  AddrModeA := ModeA{Rx911, RegMem3, Ry02};
  AddrModeB := ModeB{};
END

At  the  very  least  we  should  expect 
that  118  repetitions  of  the  same  struc¬ 
ture  could  be  reduced  to  something 
such  as: 

Table68K[i] := "ABCD", {15, 14, 8}, {Rx911, RegMem3, Ry02}, { };

Surely  our  "higher  order"  lan¬ 
guage  should  allow  us  the  benefits  of 
parallel  construction  and  avoid  the 
need  for  excessive  repetition  of  un¬ 
necessary  words.  With  only  the 
smallest  dash  of  literary  license,  we 
really  ought  to  expect  something 
such  as  the  format  shown  in  Code  Ex¬ 
ample  2,  right.  To  me  this  tabular 
form  seems  cleaner,  much  easier  to 
comprehend,  and  much  more  likely 
to  show  up  problems.  The  parallel 
construction  encourages  comparison 
and  highlights  errors.  And  even 
without  these  advantages,  the  result 
is  at  least  7,000  characters  shorter  and 
uses  about  590  fewer  lines,  which,  if 
nothing  else,  saves  ten  pages  of 
printout. 

The  point  is  this:  Although  Modula- 
2  has  been  designed  to  "force”  on 
programmers  what  is  considered 
"best”  form,  it  does  not  allow  them  to 


be  as  clear,  clean,  succinct,  and  pre¬ 
cise  as  they  would  almost  instinctive¬ 
ly  be  without  it.  As  a  proof  without 
comment,  Code  Example  3,  below, 
offers  an  example  of  what  a  pro¬ 
grammer  would  have  done  had  he  or 
she  been  forced  to  write  the  preced¬ 
ing  code  in  assembly  language. 

Blunted  tools  produce  crude  carv¬ 
ings,  and  furry  language  fosters 
fuzzy  thinking.  So  if  my  comments 


about  style  and  form  are  cogent,  we 
ought  to  be  able  to  find  some  defi¬ 
ciencies  in  Anderson’s  code. 

There  are  a  couple  of  rather  odd 
things  about  this  code.  As  I  under¬ 
stand  it,  it  is  used  only  to  create  a  ta¬ 
ble  of  data  on  disk.  It  does  this  by 
building  up  the  entire  table  in  memo¬ 
ry,  one  record  at  a  time,  and  then 
writing  the  entire  table  out  to  disk, 
one  record  at  a  time.  I  argue  that  it 


Item        Quantity    Unit Cost    Total
potatoes    3.2 lbs     $0.65/lb     $2.08
oranges     5.0 lbs     $0.88/lb     $4.40
onions      2.5 lbs     $0.72/lb     $1.80

Table 1: A possible rearrangement of a grocery list

            Quantity    Unit Cost    Total
Item        (lbs)       ($/lb)       ($)
potatoes    3.2         0.65         2.08
oranges     5.0         0.88         4.40
onions      2.5         0.72         1.80

Table 2: An improved rearrangement of a grocery list


Table68K :=
  Name      Op                    AddrModeA                 AddrModeB
  "ABCD"    {15,14,8}             {Rx911, RegMem3, Ry02}    { }
  "ADD"     {15,14,12}            {OpM68D}                  {OpEA05y}
  "UNLK"    {14,11,10,9,6,4,3}    {Ry02}                    { }

Code Example 2: A tabular format of Anderson's code


Table68K:
    db 'ABCD '
    dw 1100000100000000b, 0000000000000111b, 0000000000000000b
    db 'ADD '
    dw 1101000000000000b, 0000001000000000b, 0010000000000000b
    db 'UNLK '
    dw 0100111001011000b, 0000000000000010b, 0000000000000000b

Code Example 3: An assembly-language version of Anderson's code




would  be  better  to  write  out  each  re¬ 
cord  as  it  is  created,  thereby  requir¬ 
ing  storage  for  only  one  record  rath¬ 
er  than  for  118  of  them.  This  would 
give  a  shorter  program,  use  less 
memory,  run  much  faster  (because 
no  array  indexing  would  be  re¬ 
quired),  and  take  no  longer  to 
compile. 

Even  odder,  Anderson's  current 
code  builds  a  data  structure  at  execu¬ 
tion  time  in  a  situation  where  all  the 
data  that  goes  into  the  structure  is 
contained  in  the  code  at  compile 
time.  This  is  a  little  oblique.  It  would 
be  more  reasonable  to  declare  the 
structure  as  an  initialized  constant. 
This  would  give  an  even  smaller  pro¬ 
gram  that  would  run  still  faster  and 
take  even  less  time  to  compile. 
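As an illustration of the initialized-constant idea, the Python sketch below declares the three rows shown in Code Example 2 as a literal and derives each 16-bit opcode word from its set of bit positions, producing one record at a time. Only the Op column is modeled; the names TABLE_68K, op_word, and records are assumptions of this sketch, not part of Anderson's program.

```python
# Each row: (mnemonic, bit positions that are 1 in the 16-bit opcode word)
TABLE_68K = (
    ("ABCD", frozenset({15, 14, 8})),
    ("ADD",  frozenset({15, 14, 12})),
    ("UNLK", frozenset({14, 11, 10, 9, 6, 4, 3})),
)


def op_word(bits):
    """Pack a set of bit positions into an integer."""
    word = 0
    for bit in bits:
        word |= 1 << bit
    return word


def records():
    """Yield one (mnemonic, word) record at a time; no second table is built."""
    for mnemonic, bits in TABLE_68K:
        yield mnemonic, op_word(bits)


for mnemonic, word in records():
    print(mnemonic, format(word, "016b"))
```

The packed words match the first dw operand of each entry in Code Example 3, so the tabular literal carries the same information as the assembly-language table.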

I  do  not  call  these  peculiarities  er¬ 
rors,  although  for  programmers  they 
are  further  from  good  programming 
than  we  would  allow  writers  to  stray 
in  their  English.  My  thesis  is  that  they 
are  more  the  fault  of  the  language 
than  of  the  programmer.  Indeed,  the 
worst  is  actually  forced:  in  Modula-2 
there  is  no  way  to  initialize  a  constant 
array. 

I  believe  that,  in  our  attempts  to  de¬ 
sign  programming  languages  that  so 
constrain  programmers  that  they  are 
"unable  to  write  bad  code,”  we  are 
removing  them  so  far  from  what  is 
actually  going  on  that  they  are  often 
not  even  conscious  of  the  absurdities 
they  are  creating. 

It  is  odd,  and  we  ought  to  reflect  on 
it,  that  in  the  case  of  Modula-2,  which 
has  been  most  heralded  as  a  step  to¬ 
ward  "improving”  our  program¬ 
ming,  we  regularly  find  the  most  bla¬ 
tant  variances  between  what  is 
considered  "approved  practice”  and 
the  intuitive  dictates  of  our  linguistic 
common  sense. 

Is  there  not  something  wrong  with 
the  direction  in  which  we  are  being 
led? 


DDJ 



FORUM 


DDJ  ON  LINE 


PROLOG  and  the 
Future  of  AI 

The  following  is  a  continuation  of  the 
excerpt  of  a  real-time  conference  that 
we  published  in  January.  The  confer¬ 
ence  was  held  by  Borland  Internation¬ 
al  on  CompuServe  on  July  26,  1986.  A 
complete  transcript  of  this  three-hour 
on-line  conference  on  AI  and  PROLOG 
can  be  found  in  DL6  of  the  Borland  SIG 
on  CompuServe  ( <GO  BOR100>KEY- 
WORDS:CONFERENCE). 

Larry  Kraft,  sysop  of  Borland  SIG: 
Our  panel  of  featured  "speakers”  to¬ 
day  includes  Borland's  president,  Phi¬ 
lippe  Kahn;  assistant  professor  Mark 
Chignell  of  USC;  and  Mike  Swaine,  edi¬ 
tor-in-chief  of  Doctor  Dobb's  Journal 
of  Software  Tools. . . .  Here’s  a  ques¬ 
tion  that  sounds  straightforward: 
What  are  theorem-proving 
algorithms? 

Philippe:  Nothing,  until  now!  The 
idea  is  that  you  could  set  up  a  system 
that  would  be  a  sort  of  auditing  trail 
through  algorithms — in  the  same 
way  as  people  have  tried  to  do  auto¬ 
matic  proofing  of  mathematical  theo¬ 
rems.  It’s  great  in  the  ivory  tower  but 
useless  in  proving  or  finding  new  the¬ 
orems.  The  four-color  problem  was 
not  solved  that  way  but  rather 
through  brute  force.  Some  people 
have  even  questioned  the  validity  of 
the  proof  because  the  algorithms  used 
could  not  be  proved  thoroughly. 

Mark:  We  should  make  it  clear  that 
we  are  not  saying  it’s  not  useful  to 
prove  theorems.  Logic  programming 
requires  theorem  proving,  of  a  sort, 
every  time  you  make  an  inference. 
It’s  just  the  attachment  of  the  auto¬ 


matic  verification  that  is  a  problem. 

Philippe:  That’s  right. 

Larry:  One  more  question  before  we 
proceed  with  questions  from  the 
floor:  How  is  CD  ROM  technology  go¬ 
ing  to  impact  AI?  And,  what’s  the  fu¬ 
ture  of  the  PC  and  how  will  that  im¬ 
pact  AI? 

Philippe:  Well,  the  biggest  limitation 
of  PCs  now  is  the  data  storage  size. 
The  bulk  of  a  knowledge  base  can  be 
read-only  because  the  machine 
seems  now  to  have  enough  read  and 
write  storage  capacity.  Distributing 
large  knowledge  bases  on  CD  ROMs  in 
my  opinion  will  greatly  enhance  the 
use  of  expert  systems  on  PCs.  Access¬ 
ing  the  information  is  another  unre¬ 
lated  question.  The  drivers  have  to  be 
optimized  to  the  type  of  data  struc¬ 
ture  used  when  mastering  the  CD 
ROM  because  keyword  searches  are 
inefficient  for  most  AI  applications. 
CD  ROMs  are  going  to  add  a  lot  to  the 
power  of  the  PC. 

Larry:  Any  comments  from  Mike  or 
Mark  before  we  conclude  the  panel 
phase  of  this  conference  and  proceed 
with  questions  from  the  floor?  OK,  I’ll 
take  questions  from  the  floor  now. 
Go  ahead  Dan. 

Dan:  Philippe,  why  would  anyone 
designing  an  expert  system  want  to 
use PROLOG rather than one of the existing tools such as 1st Class, Insight2+, or some of the more sophisticated KEE/M.1 type of tools?

Philippe:  It's  the  idea  of  control.  Bor¬ 
land  will  release — it's  no  secret — 
some  prebuilt  shells  that  work  on  top 
of  our  implementation  of  PROLOG. 
But  what  really  made  the  success  of 
products  such  as  dBASE  from  Ashton¬ 
Tate  has  been  their  programmabili¬ 
ty.  And  that  is  what  you  get  when 
you  start  with  the  underlying  imple¬ 
mentation  language.  You  have  the 
control  and  the  full  source! 

Dan:  I  understand  and  agree  on  the 
control  issue,  but  it  seems  to  me  that 
many  people  who  will  want  to  de¬ 
sign  expert  systems  are  experts 


themselves,  rather  than  designers. 
I'm  using  PROLOG  for  my  own  work, 
but  I  know  a  lot  of  people  who  aren't 
and  I  frequently  have  the  argument, 
which  is  why  I  asked  the  question. 

Philippe:  Yes,  that's  true.  That's  why 
we  are  working  on  additional  tools. 
Take  the  example  of  dBASE.  For  an  ac¬ 
counting  system,  for  example,  you 
need  both  the  expert — the  CPA — and 
the  dBASE  programmer  if  you  really 
want  to  build  something  useful.  As  a 
matter  of  fact,  it  is  the  same  thing  with 
most  spreadsheets  that  involve  pro¬ 
gramming  in  some  sense — if  you  do 
not  wish  to  end  up  with  a  monster! 

Mark:  I'm  all  for  experts  designing 
and  implementing  their  own  sys¬ 
tems.  It  may  not  be  as  tough  as  people 
think.  I  happen  to  think  that  high- 
school  algebra  is  more  difficult  to 
learn  than  the  basic  technology  of  ex¬ 
pert  systems.  I  ran  a  course  this  sum¬ 
mer  at  USC,  and  we  had  computer 
neophytes  building  rudimentary  ex¬ 
pert  systems  in  PROLOG.  A  continuum 
of  utilities  can  be  added  on  top  of  PRO¬ 
LOG  to  support  expert  system  devel¬ 
opment.  Why  limit  yourself  to  the  id¬ 
iosyncrasies  of  a  single  shell? 

Larry:  Chris  is  next.  Go  ahead,  please 
Chris. 

Chris:  With  reference  to  the  Fifth 
Generation  Computer  Project,  Terry 
Winograd  and  Fernando  Flores,  in 
their  book  Understanding  Computers 
and  Cognition  (Ablex,  1986),  said:  "The 
grandiose  goals,  then,  will  not  be 
met,  but  there  will  be  useful  spinoffs. 
In  the  long  run,  the  ambitions  for  tru¬ 
ly  intelligent  computer  systems,  as 
reflected  in  this  project  and  others 
like  it  around  the  world,  will  not  be  a 
major  factor  in  technological  devel¬ 
opment.  They  are  too  rooted  in  the 
rationalistic  tradition  and  too  depen¬ 
dent  on  its  assumptions  about  intelli¬ 
gence,  language,  and  formalization.” 
Will  the  panel  please  comment? 

Philippe:  Do  they  mean  that  the  ma¬ 
chine  is  always  going  to  remain 
dumb  but  there  will  be  some  real  use¬ 
ful  technology  that  will  come  out  of 
it?  Just  like  the  space  shuttle? 




Mark:  It’s  getting  harder  to  separate 
our  technology  from  our  intellectual 
theories.  I  happen  to  think  that  the  ra¬ 
tionalist  approach  may  have  a  chance 
in  the  future.  If  you  want  to  see  an 
earlier  view,  which  I  think  is  rele¬ 
vant,  see  the  philosopher  Leibniz. 

Chris:  Well,  Winograd  and  Flores  say: 
"Gadamer,  Heidegger,  Habermas,  and 
others  argue  that  the  goal  of  reducing 
even 'literal' meanings to truth conditions is ultimately impossible and inevitably misleading." And on page 132 they add: "First there is a danger inherent in the label 'expert system.'
When  we  talk  of  a  human  expert  we 
connote  someone  whose  depth  of  un¬ 
derstanding  serves  not  only  to  solve 
specific  well-formulated  problems, 
but  also  to  put  them  into  a  larger  con¬ 
text.  We  distinguish  between  experts 
and  ‘idiot  savants.’  Calling  a  program 
an  expert  is  misleading  in  exactly  the 
same  way  as  calling  it  ‘intelligent’  or 
saying  it  ‘understands.’  ” 

Philippe:  OK,  let's  stop  calling  these 
programs  "expert  systems”  and  call 
them  “advisory  systems.” 

JRCooper:  I’d  like  to  direct  a  ques¬ 
tion  to  Philippe.  It  would  appear  that 
LISP  is  being  considered  the  language 
of  the  past  and  that  PROLOG  is  the  lan¬ 
guage  of  choice  for  the  future.  Now, 
hybrid  languages  aside,  there  must 
be  a  pretty  good  reason  why  you  de¬ 
cided  to  choose  PROLOG  over  other  AI 
languages.  I  recall  an  interview  you 
did  for  Computer  Language  magazine 
last  year  in  which  your  "vocal  dis¬ 
dain  for  C”  was  mentioned  on  the 
same  page  as  it  was  said  you  had  no 
plans  for  an  AI  language.  The  real 
question  here  is  why  PROLOG  and  not 
LISP  or  Smalltalk? 

Philippe:  Well,  we've  already  said 
use  the  right  tool  for  the  right  pur¬ 
pose.  To  answer  the  rest  of  your  ques¬ 
tion,  I  usually  contradict  myself  a  lot. 
My  wife  hates  it!  Also,  I  don’t  disdain 
C,  but  I  don’t  think  it  is  a  good  hobbyist 
language.  I  believe  that  Pascal  and 
Modula-2  are  much  better  and  more 
readable.  In  order  to  program  well  in 
C,  you  really  need  to  know  what  you 
are  doing.  Pascal  is  more  typed  and 
structured  for  that  purpose,  and  its 


nested  structures,  in  my  opinion,  of¬ 
fer  more  readability  than  C’s  flat  pro¬ 
cedure  and  function  structure. 

JRCooper:  OK,  but  was  there  a  con¬ 
scious  decision  not  to  introduce  a  Bor¬ 
land  LISP  or  Smalltalk? 

Philippe:  Smalltalk  is  a  truly  great 
language,  and  its  only  drawback  is 
that  it’s  big  on  a  PC.  No,  there  is  anoth¬ 
er  factor  here,  which  is  competitive¬ 
ness  with  other  language  houses.  We 
felt  that  PROLOG  would  open  the 
"brave  new  world  of  AI”  to  our  users 
and  ourselves.  That  doesn't  mean 
that  we  are  not  working  on  other  lan¬ 
guages.  Furthermore,  we  wanted  to 
add  a  new  flavor  to  a  market  that  was 
becoming  boring.  I  think  that,  had 
we  not  released  Turbo  Prolog,  we 
would  not  have  all  this  stir  about  AI 
right  now! 

MFE:  In  your  experience,  is  it  diffi¬ 
cult  for  veterans  of  the  old  high-level 
languages  to  pick  up  PROLOG?  I  seem 
to  be  having  problems.  And,  what's 
the  best  way  to  learn  PROLOG? 

Mark:  My  students  who  are  familiar 
with  procedural  languages  have 
trouble  "thinking  backward”  in  PRO¬ 
LOG,  as  they  put  it.  On  the  other  hand, 
the  transfer  between  LISP  and  PRO¬ 
LOG  appears  to  be  good  because  of 
common  data  structures  and  the  use 
of  recursion.  I  would  suggest  the  fol¬ 
lowing  readings.  First,  A  PROLOG 
Primer  by  Jean  Rogers.  After  that, 
look  at  what  has  become  the  classic — 
Programming  in  PROLOG  by  Clocksin 
and  Mellish.  I  would  also  recom¬ 
mend  two  articles  in  Communica¬ 
tions  of  the  ACM,  December  1985,  and 
the  August  1985  issue  of  Byte. 

Mike:  I’d  also  like  to  modestly  recom¬ 
mend  a  DDJ,  March  1985,  article  by 
David  Cortesi  for  PROLOG  beginners. 
And  Clocksin  and  Mellish,  of  course.  I 
conducted  a  little  experiment  in 
which  I  tried  teaching  PROLOG  to  a 
programmer  and  to  a  nonprogram¬ 
mer.  The  nonprogrammer  had  much 
less  trouble  catching  on. 

MFE:  P.S. — I’m  just  glad  I  can  be 
floundering  around  with  PROLOG 
without  having  to  take  out  a  second 


mortgage!  Keep  up  the  good  work 
Borland. 

Larry:  Go  ahead  Hannu. 

Hannu:  I  think  we  need  more  hard¬ 
ware  advancements  before  we  can 
have  true  AI.  How  about  it  panel?  We 
still only have 1s and 0s?

Philippe:  You  will  probably  still 
have 1s and 0s a few years from now.
But  I  always  say  software  is  ten  years 
behind  hardware.  I  really  think  that 
it  is  now  a  software  technology  prob¬ 
lem.  A  lot  of  exciting  things  can  be 
done  without  much  new  hardware. 
It  is  a  weakness  of  programmers  in 
many  cases  to  think  that  they  are  lim¬ 
ited  by  the  hardware.  In  France  we 
say  "a  bad  worker  seems  always  to 
have  the  wrong  tools!” 

Mark:  Did  you  see  the  TI  satellite  sym¬ 
posium  recently?  I  take  great  objec¬ 
tion  to  the  idea  that  you  have  to  get 
the  best  of  everything  to  be  able  to  do 
serious  AI.  Have  you  seen  the  segment 
on  "Saturday  Night  Live”  on  the  limits 
of  the  imagination?  My  feeling  is  that 
the  main  problem  is  the  logic  and  rea¬ 
soning  itself  rather  than  the  grunt  of 
the  machine.  So  I  guess  I’m  backing  up 
Philippe’s  comment.  I  think  that  good 
AI  applications  can  be  built  on  the  AT, 
but  maybe  you  have  to  think  a  bit 
more  than  letting  a  LISP  machine  set 
all  the  defaults  for  you. 

Hannu:  I  agree  that  the  hardware  is 
way  ahead,  but  we  do  need  the  hard¬ 
ware  to  keep  up  with  the  goals  we 
are  trying  to  reach!  And  these  are  tru¬ 
ly  self-pacing — learning  and  modify¬ 
ing  codes  and  so  on. 

Mike:  New  hardware  is  always  fine, 
but  it’s  just  hardware.  A  great  chef 
can  always  use  a  new  pot,  though. 

Philippe:  Yes,  but  the  oldest  pots 
tend  to  give  the  best  flavor.  Have  you 
tried  Teflon  lately? 

Larry:  I’m  going  to  let  MikeM  in  here. 

MikeM:  Thanks.  I  was  wondering 
what  all  you  users  of  PROLOG  have 
been  developing  using  the  new  pack¬ 
age? 




Dr.  Dobb's  Journal,  February  1987 



DDJ  ON  LINE 


Mark:  So  far  I've  been  using  it  in  an 
educational  environment.  Most  of 
my  students  work  in  the  aerospace 
industry  in  Southern  California. 
They  are  building  a  variety  of  small 
expert  systems  in  such  areas  as  moni¬ 
toring  data  from  a  satellite,  develop¬ 
ing  systems  to  aid  instrument  land¬ 
ings  in  aircraft,  and  advising  safety 
engineers  on  how  to  collect  data.  We 
are  trying  to  build  a  frame  represen¬ 
tation  language,  but  it  has  proven 
tough  with  the  typing  that  Turbo 
Prolog  uses. 

Larry:  We  have  a  comment  from  Dan 
Kernan,  one  of  our  technical  support 
people  for  Turbo  Prolog. 

Dan  K.:  Strong  typing  should  not 
cause  a  constraint  on  a  frame-based 
system.  A  frame  is  generally  a  struc¬ 
ture  with  attributes  and  can  be  con¬ 
structed  easily  within  our  typed  sys¬ 
tem. 

Larry: DanS, you're up next. Go
ahead,  please. 

DanS:  My  question  is  directed  first  to 
Mark,  then  to  the  others  if  they  like. 
Can you see any micro implementation
of PROLOG being powerful
enough to sit in the background of
existing programs (databases, spreadsheets,
or other applications) and to
invoke itself by sensing when
users are in trouble? The problem
I see with today's "help key" approach
is that users almost never
know they need the help. Any comments
on what role PROLOG and micros
might play here?

Mark:  I  think  Turbo  Prolog  is  moving 
toward  the  kind  of  power  you  are 
talking  about.  The  availability  of  a 
fast  compiler  is  a  big  plus.  I  suspect 
that  the  best  configuration  may  be  to 
have  PROLOG  sitting  on  top  of  a  re¬ 
trieval  engine.  We're  looking  at  this 
at USC, and I know that Philippe has
more  to  say  on  this  topic. 

Philippe:  Well  what  you  are  really 
talking  about  is  a  "lightning”  type  of 
system  where  something  watches  in 
the  background  and  tries  to  help 
you — like  asking  “Do  you  really  mean 
this,  or  do  you  mean  these  things?”  It 


is  more  programming  style  than  an 
implementation  language.  As  I  said 
before,  before  it’s  done  it’s  often 
called  AI,  once  it  works  it’s  called  a 
program. 

JRCooper:  This  is  for  all  panelists. 
There  seems  to  be  an  interest  in  hy¬ 
brid  languages  that  to  some  small  de¬ 
gree  address  the  old  "best  tool  for  the 
given  application”  issue.  Now  we  are 
getting  LISP  systems  that  allow  for  ob¬ 
ject-oriented  programming,  Small¬ 
talk  systems  that  come  with  PROLOG- 
like  inference  engines,  and  so  on. 
Does  anyone  feel  that  there  could 
someday  be  a  language  that  provides 
a  proverbial  AI  Swiss-army  knife? 

Philippe:  There  really  should  not  be 
any  fanaticism  in  the  choice  of  one 
implementation  language  rather 
than  another.  As  a  matter  of  fact,  that 
choice  is  becoming  a  whole  new  field 
of  study:  "What  is  the  optimal  imple¬ 
mentation  language  for  a  given  appli¬ 
cation?”  In  the  same  way,  maybe 
French  is  better  for  poetry,  English  is 
better  for  oral  communications,  and 
German  for  technicalities. 

Mark:  It’s  going  to  get  to  the  point  at 
which  it's  not  clear  what  the  bound¬ 
aries  between  languages  are.  I  don't 
see  why  we  shouldn't  use  the  family 
of  languages  concept  with  a  certain 
amount  of  specialization  in  the  func¬ 
tions  of  each  language.  This  seems 
particularly  suited  to  likely  develop¬ 
ments  in  concurrent  processing  and 
parallel  machines. 

Philippe:  But  the  danger  of  not  hav¬ 
ing  specific  tools  is  that  of  Ada.  I  don't 
believe  you  can  efficiently  be  every¬ 
thing  to  everybody! 

Mike:  The  only  reason  there  are 
Swiss-army  knives  is  that  pockets  are 
of  limited  size.  That  doesn’t  apply  to 
computers,  ultimately. 

JRCooper:  No,  but  it  does  to  my  wal¬ 
let.  .  .  .  One  final  question — I  first 
started  reading  about  AI  around  the 
turn  of  the  decade.  I  was  just  getting 
into  computers  back  then,  and  at  that 
time  the  research  in  AI  seemed  to  be 
toward  literally  making  the  machine 
think  like  we  do.  There  were  all  kinds 


of  theories  about  how  our  brains 
work  and  so  on.  It  now  seems  like  AI 
has  become  more  pragmatic  and 
seems  to  be  focused  primarily  on  ex¬ 
pert  systems.  Has  the  AI  community 
dropped  its  lofty  goals  of  a  decade 
ago? 

Philippe:  Well,  you  are  right.  At  one 
time  people  thought  that  natural  lan¬ 
guage  interfaces  would  be  the  thing. 
It  turned  out,  however,  that  the  re¬ 
sulting  commercial  products — such 
as  Clout,  for  example — never  really 
captivated  a  following.  Why?  Be¬ 
cause  typically  the  people  using  a  da¬ 
tabase  system  and  querying  it  with 
the  help  of  a  natural  language  query 
system  did  not  know  how  to  type 
really  fast  and  a  short-cut  approach 
was  more  efficient.  So  as  long  as  con¬ 
tinuous  speech  recognition  is  not 
commercially  available  at  a  reason¬ 
able  price,  this  is  yet  a  dream.  Expert 
systems  are  a  reality  and  are  useful, 
however.  That  is  why  they  are  now 
the  predominant  part  of  commercial¬ 
ly  oriented  AI  applications. 

Mark:  Expert  systems  are  a  commer¬ 
cial  application  of  what  used  to  be  AI. 
But  expert  systems  work  is  limited 
and  brittle.  What  I  find  intriguing  is 
the  recent  work  in  machine  learn¬ 
ing,  where  it  looks  like  we  may  be 
getting  a  little  closer  to  understand¬ 
ing  induction,  generalization,  and 
learning  by  analogy.  This  is  a  very 
open-ended  area  and  to  my  mind  the 
truest  of  AI  domains. 

Philippe:  Oh,  I  agree  100  percent. 
This  is  the  most  exciting  part! 

Mark:  AI  has  a  business  and  a  re¬ 
search  side  to  it.  Sometimes  the  two 
tend  to  get  confused!  Expert  system 
applications  have  been  getting  all  the 
press,  but  people  haven’t  forgotten 
the  basic  goals.  There  is  interesting 
work  on  machine  learning,  and  there 
are  plenty  of  cognitive  scientists,  lin¬ 
guists,  and  so  on  who  are  using  the 
tools  and  concepts  of  AI  to  explore  the 
fundamental  issues  in  the  acquisition 
and  utilization  of  intelligence. 

DDJ 





PROGRAMMER'S  SERVICES 


THE  STATE  OF  BASIC 


Modules  in  True  BASIC 

Two  years  ago,  John  Kemeny  and 
Thomas  Kurtz  (the  creators  of  BASIC) 
and  a  group  of  young  enthusiastic  en¬ 
trepreneurs  launched  True  BASIC,  an 
implementation  of  BASIC  that  accesses 
up  to  640K  of  memory  and  provides 
structured  code  constructs  (some  even 
out-muscle  Pascal  and  Modula-2) 
while  maintaining  simple  data 
types — giving  programmers  both 
numbers  and  strings!  True  BASIC 
dared  to  introduce  structured  code 
into  a  microcomputer  BASIC  dialect.  A 
year  later  Microsoft  launched  its 
QuickBASIC  compiler,  which  extends 
the  syntax  of  BASICA  in  the  same  di¬ 
rection  as  that  of  True  BASIC.  The  sec¬ 
ond  version  of  QuickBASIC  adds  a  few 
more  extensions.  In  addition,  both 
True  BASIC  and  QuickBASIC  support 
external  libraries.  Last  November, 
Borland  International  announced 
Turbo  BASIC,  which,  not  surprisingly, 
also  has  structured  code  features. 
Then,  in  the  same  month,  Kemeny 
and  Kurtz  released  Version  2.0  of  True 
BASIC,  which  supports  modules  and 
such  features  as  the  ability  to  load  li¬ 
braries  and  modules  into  memory. 

We  asked  Dr.  John  Kemeny  about 
the  importance  of  modules  in  True 
BASIC.  His  answer  was:  “First,  you 
have  to  talk  about  modules  and  work 
spaces  together — I  didn't  realize,  at 
first,  the  importance  of  work  spaces. 
I  think  it's  these  two  features  togeth¬ 
er  that  make  True  BASIC  2.0  exciting 
for  me.  No  matter  how  good  a  lan¬ 
guage  is,  there  are  always  more  fea¬ 
tures  you  would  like  to  add.  With 
modules  and  work  spaces,  True  BA¬ 
SIC  really  becomes  an  extendable  lan¬ 
guage.  Implementing  separately 
compiled  libraries  was  step  1.  Imple¬ 
menting  modules  was  the  next  logi¬ 


cal  step.  Using  modules,  the  main 
program  can  be  truly  ignorant  of 
what  is  happening  within  the  li¬ 
brary.  Routines  that  used  to  take  20 
parameters  now  take  3,  and  code 
seems  more  focused.  Loading  mod¬ 
ules  into  a  work  space  gives  you  the 
speed  and  immediacy  that  I  always 
liked  about  BASIC.  My  functions,  im¬ 
ported  from  modules  and  libraries, 
seem  to  be  built-in — I  don’t  need  to 
declare  them.  I  can  concentrate  on 
the  code  portion  I  am  working  on 
and  take  the  rest  for  granted  in  this 
part  of  the  environment." 

True  BASIC  modules  allow  more  so¬ 
phisticated  interaction  of  routines 
and  data.  We’ll  now  describe  briefly 
the  components  of  a  module  and  give 
a  short  example. 


Modules  enable  you  to  declare 
three  scope  levels  for  variables.  First, 
the  PUBLIC  declaration  lists  the  names 
of  all  global  variables  that  can  be  ex¬ 
ported.  These  variables  can  be  ac¬ 
cessed  by  any  routine  within  the 
module  and  by  any  program  that 
uses  that  same  module.  In  contrast, 
the  SHARE  declaration  lists  the  vari¬ 
ables  that  are  accessible  only  by  the 
routines  within  the  module.  More¬ 
over,  shared  variables  are  static,  en¬ 
abling  them  to  retain  their  values  be¬ 
tween  calls  to  different  module 
routines.  Module  developers  can  thus 
use  shared  variables  to  create  and 
maintain  data  structures  that  are  in¬ 
visible  to  client  programs.  The  third 
declaration,  PRIVATE,  lists  variables 
that  are  local  to  the  routines  in  which 


MODULE Factorial

    PUBLIC Last_Fact                        ! Global variable
    PRIVATE Bad_Number                      ! Local variable
    DECLARE DEF Bad_Number                  ! Function declaration
    SHARE Fact_Array(30), Product, MAX      ! Local static variables

    ! ----- Initialize module -----

    let MAX = 30
    let Last_Fact = Bad_Number
    let Product = Bad_Number
    IF MAX > 30 THEN MAT REDIM Fact_Array(MAX)   ! Adjust array size
    let Fact_Array(1) = 1
    FOR I = 2 TO MAX
        let Fact_Array(I) = I * Fact_Array(I - 1)
    NEXT I

    ! ----- LOCAL ROUTINES -----

    DEF Bad_Number = -1.0E+200

    ! ----- EXPORTED ROUTINES -----

    DEF Fact(N)
        IF (INT(N) - N) <> 0 THEN
            let Fact = Bad_Number
        ELSE
            let Last_Fact = Product
            IF N <= MAX THEN
                let Product = Fact_Array(N)
            ELSE
                let Product = Fact_Array(MAX)
                FOR I = MAX + 1 TO N
                    let Product = I * Product
                NEXT I
            END IF
            let Fact = Product
        END IF
    END DEF

END MODULE


Code  Example  1:  Sample  True  BASIC  module 




they  appear — routines  listed  in  these 
declarations  cannot  be  exported. 

Module  initialization  is  also  avail¬ 
able  and  can  be  used  to  assign  values 
to  variables,  open  file  buffers,  exe¬ 
cute  module  routines,  and  so  on. 
Modules  are  automatically  initialized 
just  before  a  client  program  starts 
running. 
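
The three scope levels can be illustrated with a rough Python analogue.
This is a sketch, not True BASIC: Python has no PUBLIC, SHARE, or PRIVATE
keywords, so the leading underscore below only hides names by convention.
The stack module and every name in it (depth, _items, _check_not_empty,
push, pop) are hypothetical and chosen purely for illustration.

```python
# Hypothetical "stack" module illustrating the three scope levels.

depth = 0        # PUBLIC-like: a value that client code is meant to read
_items = []      # SHARE-like: persists between calls, hidden by convention


def _check_not_empty():
    """PRIVATE-like helper: not intended for use outside this module."""
    if not _items:
        raise IndexError("stack is empty")


def push(value):
    """Exported routine: store a value in the hidden list."""
    global depth
    _items.append(value)
    depth = len(_items)


def pop():
    """Exported routine: remove and return the most recent value."""
    global depth
    _check_not_empty()
    value = _items.pop()
    depth = len(_items)
    return value
```

The assignments at the top run once, when the module is first loaded,
which loosely corresponds to the module initialization described above.
Client code only needs push and pop; the list that holds the data never
appears in its own source.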

Code Example 1 shows a
simple module, Factorial, which provides
a routine to calculate factorials.
The first 30 factorials (assumed to be
the range most frequently used) are
stored in a local static array. A
factorial of up to 30 is simply recalled
from memory, and factorials for higher
numbers are calculated. The PUBLIC
variable Last_Fact returns the last
valid factorial obtained. It is
accessible to the client program. The
PRIVATE function Bad_Number is local to
the module and works by returning a
large negative number if the factorial
of a noninteger is requested. The SHARE
declaration lists the array Fact_Array,
which stores the first 30 factorials.
The scalar MAX stores the upper limit
of the factorial array. The values of
the array are assigned during the
module initialization phase. During
that same phase, the array may be
expanded if necessary. You can easily
customize the module by assigning a
value other than 30 to MAX. Notice that
function Fact calculates the factorials
of numbers greater than MAX, starting
with Fact_Array(MAX).
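
For readers without True BASIC at hand, the same table-plus-extension
idea can be sketched in Python. This is an analogue under stated
assumptions, not a translation: Python integers are exact, whereas the
listing works in floating point; treating negative arguments as invalid
and storing 0! = 1 are additions that the listing does not make; and
last_fact is updated after each valid call, whereas the listing copies
the previous Product into Last_Fact before computing the new value.

```python
MAX = 30                 # upper limit of the precomputed table
BAD_NUMBER = -1.0e200    # sentinel returned for invalid requests

# Built once when the module loads, like the initialization phase.
_fact_table = [1] * (MAX + 1)
for _i in range(2, MAX + 1):
    _fact_table[_i] = _i * _fact_table[_i - 1]

last_fact = BAD_NUMBER   # last valid factorial returned (assumed reading)


def fact(n):
    """Return n!, using the table for n <= MAX; BAD_NUMBER if n is invalid."""
    global last_fact
    if n != int(n) or n < 0:
        return BAD_NUMBER
    n = int(n)
    if n <= MAX:
        product = _fact_table[n]
    else:
        # Start from the largest stored value and multiply upward.
        product = _fact_table[MAX]
        for i in range(MAX + 1, n + 1):
            product *= i
    last_fact = product
    return product
```

Changing MAX changes how much is precomputed without changing what fact
returns, which is the customization the article describes.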

BASIC  is  back  on  its  feet.  The  lan¬ 
guage  written  off  and  belittled  by 
many  is  making  a  comeback  not  to  be 
taken  lightly. 

DDJ 





PROGRAMMER'S  SERVICES 


OF  INTEREST 


Languages 

A  new  Modula-2  compiler  for  PC-DOS 
and  MS-DOS  is  available  from  Farb- 
ware.  The  product  is  a  complete  na¬ 
tive  code  compiler,  code  generator, 
and  run-time  package.  It  implements 
the  full  Modula-2  language,  including 
REALS  and  DECIMALS,  and  a  compre¬ 
hensive  PC-DOS  system  interface.  The 
unprotected  program  sells  for  $89.95. 
Reader  Service  No.  16. 

Farbware 
1329  Gregory 
Wilmette,  IL  60091 
(312)  251-5310 

A  PROLOG  for  the  Macintosh  has  been 
announced  by  Advanced  A.I.  Sys¬ 
tems.  AAIS  PROLOG  provides  efficient 
methods  of  matching  (or  unification) 
and  rule  retrieval  and  allows  each 
section  of  code  to  be  tested  as  it's 
keyed in. Other features include Edinburgh
syntax; compatibility with
DEC 10/20 PROLOG, C-PROLOG, and
Quintus PROLOG; a package system
for multiple memory-based databases;
and a PROLOG pretty printer.
Debugging facilities include an extensive
interactive debugger, illegal-argument
checking, and unknown-function
checking. AAIS PROLOG
requires a Macintosh with at least
512K RAM and costs $150. Reader Service
No. 17.

Advanced  A.I.  Systems  Inc. 

P.O.  Box  39-0360 
Mountain  View,  CA  94039 
(415)  961-1121 

Version  2.0  of  Kyan  Software’s  Pas¬ 
cal  for  the  Apple  is  now  available. 
This  version  is  a  fully  validated  imple¬ 
mentation  of  ISO  Pascal  and  runs  on 
any  Apple  II  with  64K  RAM.  Kyan  Pas¬ 
cal  includes  a  full-screen  text  editor,  a 


native  code  compiler,  a  macro-assem¬ 
bler,  utilities,  and  several  Pascal  ex¬ 
tensions.  The  built-in  macro  assem¬ 
bler  allows  programmers  to  add  in¬ 
line  assembly-language  source  code  to 
their  Pascal  programs.  The  unprotect¬ 
ed  software  is  priced  at  $69.95.  An  ad¬ 
vanced  version  of  the  software  called 
Kyan  Pascal  Plus  is  priced  at  $99.95. 
Reader  Service  No.  18. 

Kyan  Software 
1850  Union  St.,  #183 
San  Francisco,  CA  94123 
(415)  626-2080 

Computer  Innovations  has  released 
a  new  C  compiler,  C86PLUS,  which  is 
based  on  a  technology  that  applies  ar¬ 
tificial-intelligence  techniques  to  pro¬ 
duce  highly  optimized  code.  C86PLUS 
takes  advantage  of  newer,  more  pow¬ 
erful  hardware  architectures,  such  as 
Intel's  80286  and  80386  microproces¬ 
sors.  The  program  runs  the  Sieve 
benchmark  20  percent  faster  than 
does  Microsoft  C,  Version  4.0,  and  is  70 
percent  faster  than  the  current  C86, 
Version  2.3.  C86PLUS  runs  on  the  IBM 
PC,  PC/XT,  PC/AT,  and  compatibles  and 
sells for $497. Reader Service No. 19.

Computer Innovations
980 Shrewsbury Ave.
Tinton Falls, NJ 07724
(201) 542-5920

The  Software  Factory  has  intro¬ 
duced  OPAL,  a  full-function  interpre¬ 
tive  language  for  PC  application  de¬ 
velopers.  OPAL  integrates  itself  with 
DOS  and  applications  running  under 
DOS  and  offers  a  high  degree  of  pro¬ 
gramming  capabilities.  OPAL  sells  for 
$169.  Reader  Service  No.  20. 

The  Software  Factory  Inc. 

15301  Dallas  Pkwy.,  Ste.  750  LB  44 
Dallas,  TX  75248 
(214)  490-0835 

CET  Technology  has  released  CET 
BASIC,  a  compiled  application  devel¬ 
opment  language  for  Intel-based 
Unix  and  Xenix  systems.  The  CET  BA¬ 
SIC  compiler  is  compatible  with  OA¬ 
SIS,  THEOS,  and  UX-BASIC.  CET  BASIC 
programs  can  be  intermixed  with 
programs  and  subroutines  in  other 
languages.  Additional  features  in¬ 
clude  multiuser  support  for  ISAM,  di¬ 
rect  and  sequential  files,  terminal  in¬ 


dependence,  error  trapping, 
program  chaining,  and  COBOL-like 
formatting.  The  CET  BASIC  Compiler 
sells for $695. Reader Service No. 21.

CET Technology Inc.
5405 Garden Grove Blvd., Ste. 160
Westminster, CA 92683
(714) 895-4345

PL/PC,  a  new  programming  language 
from  Creative  Computer  Software, 
is  based  on  APL  with  Modula-2  con¬ 
trol  structures.  It  offers  an  integrated, 
interactive  programming  environ¬ 
ment  and  a  spreadsheet-like  data  edi¬ 
tor.  Structured  programming  is  sup¬ 
ported,  and  English  keywords  are 
used  instead  of  APL  symbols.  Debug¬ 
ging  facilities  include  tracing,  stop¬ 
ping,  single-stepping,  timing,  and 
profiling.  PL/PC  requires  an  IBM  PC  or 
compatible  with  at  least  360K  RAM 
and  DOS  2.11  or  later.  A  demo  version 
is  available  for  $16,  a  standard  ver¬ 
sion  for  $89,  and  an  8087  version  for 
$159.  Reader  Service  No.  22. 

Creative  Computer  Software 
117  York  St. 

Sydney,  NSW  200 

Australia 

(02)  261-1611 

Turbo  BASIC  from  Borland  Interna¬ 
tional  combines  the  interactive  as¬ 
pects  of  BASIC  with  the  structured, 
modular  approach  of  Pascal.  Turbo 
BASIC  employs  the  same  develop¬ 
ment  environment  as  that  of  Turbo 
Pascal,  including  a  memory-to-mem- 
ory  compiler,  a  full-screen  editor,  an 
internal  linker  and  run-time  library, 
and  the  Microcalc  spreadsheet  com¬ 
plete  with  source  code.  Turbo  BASIC 
supports  true  recursion,  full  8087  in¬ 
tegration,  and  block-structured  pro¬ 
gramming  statements.  The  package 
runs  on  the  IBM  PC  and  compatibles 
and sells for $99.95. Reader Service
No.  23. 

Borland  International 
4585  Scotts  Valley  Dr. 

Scotts  Valley,  CA  95066 
(408)  438-8400 

DDJ 




FORUM 


SWAINE'S  FLAMES 


"You have discovered the truism,"
Dennis Allison told
me,  "that  there  are  only  two  things: 
memory  and  bandwidth.”  I  had  been 
arguing  that,  with  memory  becom¬ 
ing  cheap  and  plentiful  and  DDJ  hav¬ 
ing  made  its  name  squeezing  com¬ 
puting  power  into  scarce  memory, 
the  magazine  should  now  concen¬ 
trate  on  techniques  for  getting 
around  bandwidth  limitations. 
"Fine,”  he  said,  "but  don't  forget 
about  memory.” 

This  column  is  about  memory.  I 
started  writing  it  in  a  hotel  room  in 
Honolulu,  a  town  in  which  Mark 
Twain,  Jack  London,  and  Hunter  S. 
Thompson  all  wrote  memorable  re¬ 
ports  on  the  Hawaiian  islands;  I  had 
their  books  beside  me  as  I  turned  on 
the  television,  looking  for  news  of  the 
Kilauea  lava  flow. 

"The  king . . .  could  place  a  taboo 
upon  any  spot  or  thing  or  person  and 
it  was  death  for  any  man  to  molest 
it.” — The  Sandwich  Islands,  Mark 
Twain. 

Remember  GNU,  the  outrageous 
nonproprietary  Unix-like  operating 
system-in-progress?  Richard  Stall- 
man  says  he  is  about  to  start  the  ker¬ 
nel.  The  C  compiler  is  now  cranking 
out  68020  code. 

Richard  enjoys  defying  taboos. 

Levi  Thomas  represented  DDJ  at 
the  second  Hacker's  Conference,  a 
gathering  made  memorable  by  the 
spectacle  of  Ted  Nelson,  Jerry  Pour- 
nelle,  and  Timothy  Leary  on  the 
same  stage  and  by  the  reminiscences 
of  Chris  Espinosa  and  the  brothers 
Baum.  In  the  midst  of  it  all  Jolt  ar¬ 
rived  and  was  promptly  declared 
programming  fuel.  Jolt  is  a  soft  drink 
that  boasts  "all  the  sugar  and  twice 
the caffeine," which sounds like something
you wouldn't want to drink just
before  applying  for  a  job  at  IBM,  FMC, 


Syntex,  Lockheed,  or  any  of  the  other 
companies  now  requiring  drug  test¬ 
ing  of  job  applicants. 

And  let’s  not  forget  the  West  Coast 
Computer  Faire.  The  Advisory  Com¬ 
mittee  for  next  month's  Faire  (which 
includes Lee Felsenstein and DDJ's co-
founder Dennis Allison, first editor
Jim  Warren,  and  current  assistant  edi¬ 
tor  Levi  Thomas)  is  concerned  that  the 
Faire  has  lost  much  of  the  excitement 
that  in  the  past  has  made  it  more  than 
a  trade  show  (and,  I  suspect,  wants  to 
prevent  WCCF  from  becoming  the 
unanimous  disappointment  that  the 
PC  Faire  has  become).  The  committee 
has  plans  for  the  Faire  that  it  hopes 
will  rekindle  some  of  that  remem¬ 
bered  excitement.  One  notion:  Chi- 
nese-menu-style  hot-pepper  symbols 
next  to  advanced  topics  in  the  pro¬ 
gram  to  warn  novices.  Also:  a  keynote 
panel  on  who  really  owns  the  IBM 
standard,  a  pioneers’  panel  on  com¬ 
puters  and  the  receding  vision  of  uto¬ 
pia,  and  some  early  reports  from  the 
field  on  386  implementations.  And 
then  there's  Jerry  Pournelle's  "I'm 
mad  as  hell  and  I'm  not  going  to  take  it 
any  more”  session. 

Those  who  attended  the  December 
17,  1986,  meeting  of  the  Homebrew 
Computer  Club  at  Stanford  Universi¬ 
ty  have  something  to  remember.  The 
club's  perennial  toastmaster,  Lee  Fel¬ 
senstein,  opened  the  gathering  for 
the  last  time  that  night.  (Homebrew  is 
where  Processor  Technology  [re¬ 
member  it?]  got  its  first  orders  for  S- 
100  memory  boards  and  where  Steve 
Wozniak  showed  off  his  Apple  I.) 


And  they’ll  have  something  to  re¬ 
member  it  by — cartoonist  Larry  Gon- 
ick  produced  a  commemorative  T- 
shirt  for  the  event. 

After  the  meeting,  the  attendees  re¬ 
tired  to  their  traditional  watering 
hole,  The  Oasis,  for  peanuts,  beer, 
and  wood  carving. 

I  remember  they  fed  us  peanuts  on 
the  flight  to  the  Kona  Coast,  where  I 
hoped  to  get  as  close  as  I  could  get  to 
Kilauea. 

"Kona,  where  nobody  ever  dreams 
of  looking  at  a  thermometer,  where 
every  afternoon  there  falls  a  refresh¬ 
ing  spring  shower,  and  where  nei¬ 
ther  frost  nor  sunstroke  has  ever 
been  known  ...  a  hurricane  ...  a  fog 
.  .  .  are  meteorologically  impossi¬ 
ble  .  .  .” — My  Hawaiian  Aloha,  Jack 
London. 

"The  Kona  Coast  in  December  is  as 
close  to  hell  on  earth  as  a  half-bright 
mammal  can  get.” — The  Curse  of 
Lono,  Hunter  S.  Thompson. 

Both  London’s  and  Thompson's 
memories  were  faulty.  I  have  seen 
the  debris  of  a  terrible  hurricane  on  a 
Hawaiian  beach  and  beautiful 
weather  on  the  Kona  Coast  in  Decem¬ 
ber.  And,  both  terrible  and  beautiful, 
the  plume  of  steam  rising  from  the 
sea  where  molten  lava  was  building 
more  Hawaii.  I  suspect  I’ll  remember 
that  when  much  that  seems  urgent 
now  is  forgotten. 

My  cousin  Corbett's  last  words  as 
he  got  on  the  plane  for  Hawaii  this 
morning  were,  "When  the  Earth  gets 
weird,  the  weird  go  into  real  estate.” 


Michael  Swaine 
editor-in-chief 




Dr. Dobb's Journal of Software Tools
FOR THE PROFESSIONAL PROGRAMMER

MARCH 1987


THE BANDWIDTH BOTTLENECK:

Compressing Image Data
Squeezing Text Files
Optimizing Integer Multiplications
Webster's vs. K&R
80386 Resources


Languages:

C Text Formatter
Object-Oriented LISP
BASIC Modules and Libraries
Assembly vs. High-Level Languages


CONTENTS 


VOLUME  12,  ISSUE  3 


Compressing graphic data

Squeezing text

Fast multiplication

What's wrong with K&R

80386, assembly, and more

Object-oriented programming

Channel capacity

ARTICLES 


BANDWIDTH: Compressing Image Data with Quadtrees

by Ronald G. White

Ronald describes how a quadtree, a tree data structure that
can have four child nodes for each node, can be used in a
recursive scheme for compressing image data.

BANDWIDTH: ARC Wars: MS-DOS Archiving Utilities

by Russell Nelson

Archiving (compressing) text files saves disk space and,
when transferring files between systems, saves time.
Russell compares and contrasts several ARC programs that
are available in the MS-DOS world.

BANDWIDTH: Optimizing Integer Multiplications by
Constant Multipliers

by Robert D. Grappel

A simple and nearly optimal algorithm to speed up that
time-consuming operation, integer multiplication.


COLUMNS 


C CHEST

by Allen Holub

Allen continues the description of his nroff-like text editor,
nr. In the Flotsam and Jetsam section, he takes Kernighan
and Ritchie to task for their eccentric definitions of some
common words.

16-BIT SOFTWARE TOOLBOX

by Ray Duncan

It's the usual eclectic collection of fun and useful facts from
Ray and his readers this month, including sources of
information on the 80386 and assembly languages, a hint
about writing adapting I/O routines, and a letter from a
reader countering Ray's attacks on high-level languages.

STRUCTURED PROGRAMMING

by Namir Clemment Shammas

Namir compares three flavors of BASIC: True BASIC, BASICA,
and QuickBASIC.

ARTIFICIAL INTELLIGENCE

by Ernest R. Tello

Ernie gives us an overview of object-oriented LISP and talks
with some LISP mavens about the future of the language.




About  the  Cover 

It’s  easy  to  reduce  the  size  of  a  file 
so  you  can  transmit  it  more  quick¬ 
ly.  The  trick  is  to  do  it  in  such  a 
way  that  you  can  recreate  the  orig¬ 
inal  file  on  the  other  end. 

This  Issue 

Given  a  fixed-capacity  channel  and 
a  quantity  of  information  to  trans¬ 
fer,  how  do  you  move  more  infor¬ 
mation  down  the  channel  per  unit 
of  time  than  the  channel  will  sup¬ 
port?  That  apparently  impossible 
challenge  is  the  bandwidth  prob¬ 
lem,  a  topic  that  we  intend  to  ad¬ 
dress  in  various  ways  throughout 
1987.  The  expanded,  extended,  or 
otherwise  enlarged  memory  space 
of  today's  microcomputers,  in  con¬ 
junction  with  the  processing 
power  of  CPUs  such  as  the  80386 
and  68020,  only  makes  the  infor¬ 
mation  bottlenecks  in  computer 
systems  all  the  more  apparent. 
Two  solutions  to  the  bandwidth  di¬ 
lemma  lead  off  this  issue. 

Next  Issue 

In  the  area  of  artificial  intelligence, 


FORUM 


EDITORIAL  6 

by  Michael  Swaine 
RUNNING  LIGHT  8 

by  Nick  Turner 
ARCHIVES  8 

LETTERS  10 

by  you 

DDJ  ON  LINE  136 

SWAINE'S FLAMES 152

by  Michael  Swaine 


PROGRAMMER'S 

SERVICES 


THE  STATE  OF  BASIC:  138 
A  look  at  how  the  “new 
wave"  BASICS  support  user- 
defined  libraries. 

OF  INTEREST:  142 

New  products  out  there 
ADVERTISER  INDEX:  151 
Where  to  find  those  ads 


expert  systems  are  now  old  hat. 
The next area of growth in AI must
be  in  techniques  that  allow  the 
program  to  acquire  new  informa¬ 
tion — to  learn.  Our  annual  AI  issue 
will  include  an  implementation  of 
a  classic  expert  system,  but  it  will 
also  look  forward  with  an  example 
of  software  that  mirrors  the  struc¬ 
ture  of  the  brain  and  learns  by  ex¬ 
perience,  much  as  clusters  of  neu¬ 
rons  may  learn. 

5  bandwidth  topic 

^  entry  point 


Dr.  Dobb's  Journal,  March  1987 



EDITORIAL 


Looking  back,  I  can 
see  that  it  was 
shortly  after  I  started 
commuting  from  San¬ 
ta  Cruz  into  Silicon  Val¬ 
ley  that  I  began  to  see 
the  importance  of  the 
bandwidth  problem. 

Squeezing  in  with 
thousands  of  other 
commuters  through 
the  mountain  pass  to 
fight  my  way  across  Silicon  Valley 
and  up  crowded  Highway  101  got  me 
thinking  about  channel  capacity. 

There  are  interesting  similarities 
and  relationships  between  transpor¬ 
tation  bottlenecks  and  the  informa¬ 
tion-transmission  bottlenecks  that  I 
believe  represent  the  greatest  chal¬ 
lenge  for  programmers  in  the  next 
decade.  Just  as  data  can  be  organized 
into  packets,  commuters  can  be 
packed  up  in  trains  or  car  pools  to 
increase  their  transmission  rate  and 
decrease  the  frequency  of  collisions. 
The  vehicular  capacity  of  a  highway 
really  is  a  form  of  channel  band¬ 
width.  And  anyone  who  has  fallen 
into  a  web  of  one-way  streets  such  as 
Berkeley's  has  something  to  say 
about  network  topology. 

Relationships  between  transporta¬ 
tion  and  communication  are  impor¬ 
tant  when  you  consider  trading  one 
off  against  the  other.  At  DDJ  we  have 
been,  in  a  limited  way,  exploring  the 
benefits  of  telecommuting.  (I'm  refer¬ 
ring  to  the  exchange  of  the  costly 
movement  of  bodies  for  the  lower- 
cost  transmission  of  bits — not  driving 
with  a  telephone  at  your  ear,  which 
so  many  of  my  fellow  commuters  are 
doing  and  which  could  also  be  called 
telecommuting.)  I'm  skeptical  about 
how  closely  people  can  work  togeth¬ 
er  at  physical  long  distance,  but  I’m 
willing  to  experiment  because  the 
cost  of  transportation,  particularly 
the  cost  in  time,  is  so  great. 

That  cost  will  not  go  down,  the 
world's  transportation  bottlenecks 
are  only  going  to  get  worse,  and  there 


is  little  to  indicate  that 
anyone  is  thinking 
very  hard  about  what 
technological  fixes 
there  may  be. 

It  would  seem  that 
we  are  closer  to  solu¬ 
tions  to  communica¬ 
tions  bottlenecks:  we 
have  seen  the  develop¬ 
ment  of  communica¬ 
tions  satellites,  com¬ 
puter  networks,  and  sophisticated 
routing  software  for  voice  and  data. 
The  picture  phones  of  science  fiction 
exist  today.  (See  Swaine's  Flames, 
page  152.) 

In  contrast,  science  fiction's  visions 
of  future  transportation  are  further 
from  realization.  I  did  recently  drive 
a  car  equipped  with  an  impressive 
computerized  navigation  system 
from  Etak  Inc.,  of  Menlo  Park,  but  in¬ 
novation  like  this  in  the  area  of  con¬ 
sumer  transportation  is  rare.  Cars  are 
still  cars,  highways  are  still  high¬ 
ways,  and  traffic  is  still  a  battle¬ 
ground  of  a  thousand  conflicting  de¬ 
sires.  Maybe  society  can  no  longer 
afford  the  luxury  of  our  current  driv¬ 
ing  habits  in  constricted  channels 
like  Highway  101,  and  cars  and  high¬ 
ways  should  be  modified  to  permit 
centralized  control  and  routing.  It's 
not  such  a  radically  ambitious  notion, 
given  the  context  of  Star  Wars,  and 
it's  a  lot  more  down-to-earth. 

This  little  peninsula  with  its  run¬ 
away  population  growth  and  its 
wealth  of  technical  expertise  seems 
like  the  ideal  place  for  the  experi¬ 
ment  to  begin. 

Staccato  signals  of  constant 
information. .  . . 

These  are  the  days  of  miracle  and 
wonder 

This  is  the  long  distance  call 
— Paul  Simon 

Michael  Swaine 
editor-in-chief 


Dr.  Dobb’s  Journal  of 

Software  Tools 

FOR  THE  PROFESSIONAL  PROGRAMMER 

Editorial 

Editor-in-Chief  Michael  Swaine 
Editor  Nick  Turner 
Managing  Editor  Vince  Leone 
Assistant  Editors  Sara  Noah  Ruddy 
Levi  Thomas 

Technical  Editor  Allen  Holub 
Contributing  Editors  Ray  Duncan 
Michael  Ham 
Bela  Lubkin 
Namir  Shammas 
Ernest  R.  Tello 

Copy  Editor  Rhoda  Simmons 
Production 

Production  Manager  Bob  Wynne 

Art  Director  Michael  Hollister 
Assoc.  Art  Director  Joe  Sikoryak 
Typesetter  Jean  Aring 
Cover  Photographer  Michael  Carr 
Circulation 

Circulation  Director  Maureen  Kaminski 
Newsstand  Sales  Mgr.  Stephanie  Barber 
Book  Marketing  Mgr.  Jane  Sharninghouse 
Circulation  Coordinator  Kathleen  Shay 
Administration 
Finance  Director  Kate  Wheat 
Business  Manager  Betty  Trickett 
Accounts Payable Supv. Mayda Lopez-Quintana
Accts.  Receivable  Supv.  Laura  Di  Lazzaro 
Advertising  Director 
Robert  Horton  (415)  366-3600 
Account  Managers 
Lisa  Boudreau  (415)  366-3600 
Gary  George  (404)  897-1923 
Michele  Perkins  (317)  875-8093 
Michael  Wiener  (415)366-3600 
Cynthia  Zuck  (718)  499-9333 
Promotions/Srvcs.Mgr.  Anna  Kittleson 
Advertising  Coordinator  Charles  Shively 


M&T  Publishing  Inc. 

Chairman  of  the  Board  Otmar  Weber 
Director  C.F.  von  Quadt 
President  and  Publisher  Laird  Foshay 
Associate  Publisher  Michael  Swaine 


Dr.  Dobb’s  Journal  of  Software  Tools  (USPS  307690) 
is  published  monthly  by  M&T  Publishing  Inc.,  501  Gal¬ 
veston  Dr.,  Redwood  City,  CA  94063;  (415)  366-3600. 
Second-class  postage  paid  at  Redwood  City  and  at  ad¬ 
ditional  entry  points. 

Article  Submissions:  Send  manuscripts  and  disk 
(with  article  and  listings)  to  the  Editor. 

DDJ  on  CompuServe:  Type  GO  DDJ 
Address  Correction  Requested:  Postmaster:  Send 
Form 3579 to Dr. Dobb's Journal, P.O. Box 27809, San
Diego,  CA  92128.  ISSN  0888-3076 

Customer  Service:  For  subscription  problems  call: 
outside  CA  (800)  321-3333;  in  CA  (619)  485-9623  or  566- 
6947.  For  book/software  order  problems  call  (415)  366- 
3600. 

Subscriptions:  $29.97  per  1  year;  $56.97  for  2  years. 
Canada  and  Mexico  add  $27  per  year  airmail  or  $10 
per year surface. All other countries add $27 per year
airmail. Foreign subscriptions must be prepaid in U.S.
funds  drawn  on  a  U.S.  bank.  For  foreign  subscriptions, 
TELEX:  752-351. 

Foreign  Newsstand  Distributor:  Worldwide  Media 
Service  Inc.,  386  Park  Ave.  South,  New  York,  NY  10016; 
(212)  686-1520  TELEX:  620430  (WUI). 

Entire  contents  copyright  ©  1987  by  M&T 
Publishing,  Inc.,  unless  otherwise  noted  on 
specific  articles.  All  rights  reserved. 


People's  Computer  Company 

Dr. Dobb's Journal of Software Tools is published by M&T
Publishing  Inc.  under  license  from  People’s  Computer  Company, 
2682  Bishop  Dr.,  Suite  107,  San  Ramon,  CA  94583,  a  nonprofit 
corporation. 




FORUM 


RUNNING  LIGHT 


Several  companies 
are  feverishly 
(and  secretly)  working 
on  new  chips  called 
"neural  networks.” 

These  chips  are  de¬ 
signed  to  emulate  vari¬ 
ous  functions  of  the 
human  brain.  They 
contain  thousands  of 
standard  microcircuit 
elements  organized  in 
a  distinctly  nonstandard  way — hun¬ 
dreds  of  simulated  neurons,  each  one 


of  which  is  an  always-active  compo¬ 
nent  capable  of  making  simple  deci¬ 
sions.  Each  "neuron’’  is  connected  by 
"synapses”  to  up  to  several  hundred 
other  neurons  and  can  change  its 
own  activation  potentials  in  response 
to  signals  it  receives. 

This  could  be  the  most  important 
new  invention  in  cybernetics  since 
the  Turing  machine.  Why?  Because 
neural  networks  function  in  much 
the  same  way  as  your  own  brain  (and 
mine, I think). Once they surpass a
certain  threshold  of  complexity,  neu¬ 
ral  networks  (whether  they  are  actu¬ 
ally  implemented  in  hardware  or 
simulated  on  traditional  machines) 
begin  to  exhibit  some  interesting 
properties.  Depending  on  their  pre¬ 
cise  organization,  they  can  act  as 
spectacularly  efficient  pattern  recog¬ 
nizers,  solvers  of  multiple  simulta¬ 
neous  equations  (and  other  complex 
mathematical  problems),  learning 
machines  that  actually  reach  their 
own  conclusions  about  the  knowl¬ 
edge  presented,  and  sophisticated  de¬ 
vice  controllers — in  short,  all  the 
functions  that  are  currently  per¬ 
formed  by  neural  networks  in  your 
own  body.  A  properly  designed 
hardware  neural  network,  for  exam¬ 
ple,  could  easily  provide  a  good  solu¬ 
tion  to  a  30-city  traveling  salesman 
problem  in  just  a  few  (very  few)  mi¬ 
croseconds.  It’s  not  guaranteed  to  be 
the  best  solution,  but  it  will  be  ex¬ 
tremely  good.  And  the  solution 
would  be  reached  before  a  tradition¬ 


al  computer  had  even 
initialized  its  variables. 
Watch  these  pages  for 
more  about  this  excit¬ 
ing  new  field. 

This  issue  marks  the 
introduction  of  a  new 
theme  for  DDJ.  As  the 
speed  and  storage  of 
computers  increase 
more  and  more  rapid¬ 
ly,  it  becomes  increasingly  important 
to  be  able  to  move  large  quantities  of 
information  from  one  computer  to 
another.  We  have  advanced  fiber-op¬ 
tic  cables  and  satellite  microwave 
links,  but  we  also  have  an  exponen¬ 
tial  growth  in  the  amount  of  informa¬ 
tion  being  sent.  The  problem  is  one  of 
bandwidth — the  capacity  of  the  com¬ 
munication  channels  is  not  growing 
as  rapidly  as  the  need  to 
communicate. 

We’ll  be  looking  at  the  bandwidth 
problem  from  several  different  per¬ 
spectives.  In  this  issue  Ronald  White 
describes  graphical  quadtrees,  a  way 
to  compress  image  data,  sometimes 
dramatically.  We  also  have  a  compar¬ 
ative  review  of  four  public-domain 
archiving  programs.  Archived  files 
are  files  that  have  been  compressed 
through  the  elimination  of  redun¬ 
dant  data.  Such  files  can  be  transmit¬ 
ted  faster  over  modem  lines,  and  be¬ 
cause  of  this,  archive  programs  have 
become  popular  on  computer  bulle¬ 
tin-board  systems. 

Do  you  have  a  project  that  excites 
you?  Would  you  like  to  write  about  it 
for  DDJ ?  Your  first  step  is  to  give  us  a 
call  at  (415)  366-3600  and  ask  for  a 
copy  of  the  writers'  guidelines.  This 
wondrous  document  contains  care¬ 
fully  refined  advice  that  is  intended 
to  help  you  get  published. 

_ 

Nick  Turner 
editor 


ARCHIVES 


Call  for  Innovation 

"Think  Big,  dream  bigger.  Guest  essays, 
dream  pieces,  signposts  and  other  nefar¬ 
ious  schemes  for  advancing  this  micro 
technology  are  welcome  here,  in  addition 
to  your  hard  software  contributions  to  the 
community.” — Marlin  Ouverson,  DDJ,  Janu¬ 
ary  1982. 

WCCF 

"The  6th  West  Coast  Computer  Faire  is 
over  (sigh  of  relief  and  aching  feet).  Direc¬ 
tor  Jim  Warren's  roller  skates  have  been 
hung  up  for  another  year,  or  at  least  until 
his  rumoured  mini-faires  get  under  way. 
This  year's  was  even  bigger  than 
expected .... 

"Perhaps  the  biggest  surprise  to  com¬ 
mercial  concerns  was  the  profile  of  the  'av¬ 
erage'  attendee.  All  the  sales  people  had 
been  coached  well  in  advance  on  how  to 
spill  forth  a  technical-sounding  pitch;  none 
seemed  able  to  answer  basic  questions  like, 
'Why should I have a computer?' 'What
can I use it for?' 'I don't know anything
about  computers — where  do  I  start?' 
Would-be  consumers  wandered  from 
booth  to  booth  in  search  of  someone  who 
could  still  remember  how  to  speak  their 
lingo  and  give  a  down-to-earth  reply.” — 
Marlin  Ouverson,  DDJ,  June  1981. 

Ten  Years  Ago  in  DDJ 


"FAIRE  CONSUMES  EDITOR — DDJ  LATE 
“What  more  can  we  say?  ddj's  entrepid 
editor  involved  himself  as  chairperson  of 
the  First  West  Coast  Computer  Faire  and 
discovered  it  to  be  an  infinite  sink  of  time. 
And  he  was  spread  much  too  thin  before 
he  started.  Xeroxing  editors  failed.” 

"From  time  to  time,  over  the  past  twenty 
years  or  so,  there  have  been  predictions  that 
we  will  soon  have  inexpensive  mass  storage 
devices  capable  of  holding  the  entire  Li¬ 
brary  of  Congress  in  a  $19  desk-top  unit. 
These  chimerical  devices  are  usually  based 
on  some  far-out  technology  involving  pro¬ 
ton  resonance,  magnetic  bubbles,  or  holo¬ 
grams.  Well,  I'm  still  waiting  patiently  for 
such  a  device  to  materialize  but  I'm  not 
holding  my  breath."— Jim  Day,  DDJ,  March 
1977. 

Dr. Dobb's Journal of Computer Calisthenics & Orthodontia

Running Light Without Overbyte




FORUM 


LETTERS 


Programming  Ethics 

Dear  DDJ, 

I’ve  just  read  Allen  Holub’s  December 
Viewpoint  and  was  pleased  to  see 
him  make  his  position  known  to  the 
world.  It's  not  an  easy  thing  to  do.  I'm 
sure  he’ll  receive  letters  criticizing 
his  position. 

On  that  note,  allow  me  to  introduce 
PeaceNet,  an  international  computer 
network  dedicated  to  improving 
communication  between  people 
worldwide  on  the  issues  of  peace. 
PeaceNet  is  in  need  of  programmer 
support,  and  anything  readers  can 
do,  either  directly  or  indirectly, 
would  be  helpful.  PeaceNet  is  run¬ 
ning  on  a  Unix  system  (a  Plexus  P-60 
[68000-based]  to  be  exact)  and  is  acces¬ 
sible  through  Telenet  and  via 
direct  dial  (Palo  Alto,  Califor¬ 
nia). 

In  December,  PeaceNet  had 
more  than  700  users  and  was 
gaining  more  than  100  per 
month.  We  have  conferences 
on  many  important  issues  as 
well  as  "action  alerts”  and  ac¬ 
tivity  calendars.  Readers  inter¬ 
ested  in  finding  out  more 
should  call  (415)  486-0624. 

Corwin  Nichols 

223  Forest  Ave. 

Palo  Alto,  CA  94301 

Dear  DDJ, 

I  am  responding  to  Allen  Ho¬ 
lub’s  Viewpoint  in  DDJ,  De¬ 
cember  1986.  First,  let  me  say, 
"bravo,  Allen”;  then  let  me 
say,  "I  disagree.” 

I  say  bravo  because  Allen 
states  that  he  has  examined 
the  issues  of  working  on  de¬ 
fense  contracts  and  that  he 
feels  he  cannot  live  with  his 


conscience  if  he  performs  this  kind  of 
work.  He  also  states  that  "there  are 
people  who  have  thought  about  these 
issues  and  have  come  to  the  opposite 
conclusion.”  I  am  one  of  these  people, 
hence  I  disagree.  He  feels  these  people 
are  wrong  but  that  they  are  acting  ac¬ 
cording  to  their  beliefs  in  working  on 
defense  projects.  He  says  he  has  his 
problem  with  people  who  refuse  to 
look  at  the  issues  and  work  on  defense 
projects  anyway.  I  agree. 

I  am  one  of  the  people  who  has  ex¬ 
amined  the  issues  and  feels  that  de¬ 
fense  work  is  a  necessary  activity  in 
this  world  today.  Although  I  wish 
that  we  were  living  in  that  time 
when  “they  shall  beat  their  swords 
into  plowshares,  and  their  spears 
into  pruning  hooks:  nation  shall  not 
lift  up  sword  against  nation,  neither 
shall  they  learn  war  anymore”  (Isa¬ 
iah  2:4),  we  have  Armageddon  to  face 
between  here  and  there.  I  agree  that 
defense  work  is  a  grave-digging  activ¬ 
ity:  the  enemy's  grave.  Not  to  engage 
in  defense  is  a  grave-digging  lack  of 
activity:  our  own  grave.  I  cannot 
think  of  abstinence  from  defense 
work  as  other  than  suicide. 

Robert  J.  Brown,  III 

Elijah  Laboratories  International 


5150  W.  Copans  Rd.,  #1135 

Margate,  FL  33063 

Dear  DDJ, 

After  reading  Mr.  Holub’s  article  in 
the  December  issue  of  DDJ,  I  felt  I 
could  not  remain  silent.  I  must  dis¬ 
agree  with  Mr.  Holub’s  opinions 
about  nuclear  weapons  and  whether 
or  not  an  engineer  could  change  this. 

Assume  just  for  the  moment  that 
all  engineers  agreed  with  him.  Then 
what?  Do  we  want  a  nonengineer 
cobbling  together  our  weapons? 
Would  that  make  us  feel  safer? 

He  said  that  if  we  didn’t  design 
them,  they  wouldn’t  exist.  If  all  the 
ethical  engineers  in  this  country 
were  to  walk  out  on  their  defense 
contracts,  do  you  know  who  would 
fill  in  for  them?  The  unethical 
engineers. 

I  agree  that  we  shouldn't  sit  around 
and  wait  for  the  bombs  to  be 
dropped.  All  of  us,  independent  of 
our  profession,  should  be  much  more 
politically  aware — aware  of  the  is¬ 
sues,  who  we  vote  for,  and  what  we 
can  do.  That  is  the  tack  we  need  to 
follow  as  humans,  not  as  engineers, 
to  make  the  world  a  better,  safer 
place  to  live. 

I  had  a  course  in  engineer¬ 
ing  ethics  as  well  as  an  ad¬ 
vanced  philosophy  course  on 
ethics.  I  agree  that  these 
courses  should  be  required  of 
all  engineering  majors.  The 
"tools”  that  you  gain  from 
these  courses  are  just  as  im¬ 
portant  as  everything  else  an 
engineer  must  learn. 

Dave  Podolske 
University  of  Wisconsin 

Dear  DDJ, 

Thanks  to  Allen  Holub  for 
writing  his  article  about  his 
personal  decision  not  to  work 
on  weapons.  I  am  gladdened 
whenever  someone  of  his 
technical  prowess  speaks  up 
on  this  issue. 

He  was  brave  to  write  it, 
and  you  were  brave  to  print 
it. 

You  might  tell  your  readers 
that  there  is  an  organization, 
Computer Professionals for
Social Responsibility, that deals with
these  issues.  Its  address  is  Box  717, 
Palo  Alto,  CA  94301.  Also,  the  ACM  SIG- 
SOFT  and  SIGAS  publish  related 
information. 

Kerry  Tatlow 
1706  Charles 
Rockford,  IL  61108 

Dear  DDJ, 

Thank  you  and  congratulations  for 
running  Allen  Holub's  Viewpoint  on 
programming  ethics.  It  says  a  lot  for 
DDJ' s  position  on  the  forefront  of  soft¬ 
ware  technology  in  particular  and 
thought  in  general,  regardless  of 
where  one  stands  on  the  guns  and 
butter  debate. 

Michael  Gardner 
Wordtech  Systems,  Inc. 

P.O.  Box  1747 
Orinda,  CA  94563 

Dear  DDJ, 

I  enjoyed  reading  Allen  Holub’s  De¬ 
cember  article  concerning  the 
choices  that  engineers  and  program¬ 
mers  make  in  determining  the  effect 
of  the  work  that  they  do  upon  soci¬ 
ety.  It  appears  that  we  are  primarily 
concerned  with  the  intellectual  chal¬ 
lenge  and  financial  rewards  of  our 
technical  careers  and  that  we  seldom 
think  about  the  effect  of  our  work  on 
society  as  a  whole. 

However,  I  think  that  the  following 
is  evidence  that  people  are  also  con¬ 
cerned  with  the  moral  aspects  of  their 
work:  There  is  a  tendency  for  techni¬ 
cal  people  working  in  the  defense  in¬ 
dustry  to  be  paid  more  than  people 
doing  nondefense-related  work.  A 
large  part  of  the  reason  for  the  pay 
discrepancy  is  that  many  people  just 
don't  want  to  work  on  weapons.  In 
order  to  make  defense-industry  work 
attractive  enough  to  fill  the  positions, 
employers  are  forced  to  pay  more 
than  the  market  rate  to  overcome 
people's  natural  distaste. 

Eric  D.  Andresen 
529  Stone  Dr. 

Novato,  CA  94947 

In Search of a Sine

A  number  of  people  who  responded  to 
Richard  Campbell’s  article  "In  Search 
of  a  Sine, "  published  in  the  December 
1986  issue,  pointed  out  additional  ref¬ 
erences  for  transcendental  algo¬ 


rithms.  Following  is  a  list  taken  from 
the  letters. — eds. 

Abramowitz, M., and Stegun, I. A., eds. Handbook of Mathematical
Functions with Formulas, Graphs, and Mathematical Tables. National
Bureau of Standards Applied Mathematics Series, 55. Washington, D.C.:
U.S. Govt. Printing Office, 1964.

Acton, Forman S. Numerical Methods That Work. New York: Harper &
Row, 1970.

Cody, W., and Waite, W. Software Manual for the Elementary Functions.
Englewood Cliffs, N.J.: Prentice-Hall, 1980.

Hart, J., et al. Computer Approximations. New York: Wiley, 1968.

Hastings, Cecil, Jr. Approximations for Digital Computers. Princeton, N.J.:
Princeton Univ. Press, 1955.

Knuth, D. E. The Art of Computer Programming, Volume 2: Seminumerical
Algorithms. Reading, Mass.: Addison-Wesley, 1969.

Dear  DDJ, 

Regarding  the  article  "In  Search  of  a 
Sine,"  I  would  like  to  point  out  that 
the  Taylor's  series  expansion,  al¬ 
though  of  inestimable  value  in  doing 
analytical,  theoretical  work,  is  in 
general  of  little  value  in  computing 
numerical  approximations  of  a  func¬ 
tion  and  is  hardly  ever  used.  There  is 
a  good  reason  for  this.  When  expand¬ 
ing  a  function  into  its  Taylor's  series 
expansion,  you  choose  a  reference 
point  upon  which  to  anchor  the  ex¬ 
pansion  and  then  use  the  series  to  ap¬ 
proximate  the  function  in  the  vicini¬ 
ty  of  this  point.  If,  for  example,  you 
are  interested  in  approximating  sin(x) 
(as  Mr.  Campbell  was),  you  might 
choose  the  anchor  point  x  =  0  and 
come  up  with  the  result: 

sin(x) = x - x^3/3! + x^5/5! - ...

or, in expanded form:

sin(x) = x - 0.166667x^3 + 0.0083333x^5 - ...

It  is  evident  that  this  approximation 
produces an exact result at the reference point, x = 0.

If, however, you choose the reference point x = π/4 radians (45°), the
resultant series becomes:

sin(x) = -0.00924739 + 1.04438x - 0.0758732x^2 - 0.117851x^3 + ...

This  approximation  isn't  worth 
beans  for  x=0  but  is  exact  at  the  ref¬ 
erence  point — well,  exact  to  within 
the  accuracy  of  the  coefficients,  as 
represented,  anyway. 

The  point  is  this.  The  Taylor’s  se¬ 
ries  approximation  of  a  function — 
any  function — is  exact  at  the  refer¬ 
ence  point  about  which  it  is  expand¬ 
ed  but  gets  worse  the  farther  you  get 
from  this  point.  Furthermore,  it  often 
gets  worse  fast!  To  achieve  good  over¬ 
all  accuracy,  you  would  have  to  use 
many  series,  each  anchored  at  an  ap¬ 
propriate  point. 
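The effect of the anchor point can be checked numerically. The following is a minimal sketch in Python (not part of the original letter); the function name taylor_sin and the choice of four terms are illustrative assumptions. It evaluates the cubic Taylor polynomial of sin(x) anchored at 0 and at π/4 and prints the error at a few points.

```python
import math

def taylor_sin(x, a, terms=4):
    """Taylor polynomial of sin about anchor a, using `terms` terms
    (terms=4 means powers of (x - a) up to the cube)."""
    # The derivatives of sin repeat with period 4: sin, cos, -sin, -cos.
    derivs = [math.sin(a), math.cos(a), -math.sin(a), -math.cos(a)]
    total = 0.0
    for n in range(terms):
        total += derivs[n % 4] * (x - a) ** n / math.factorial(n)
    return total

if __name__ == "__main__":
    for x in (0.0, math.pi / 4, math.pi / 2):
        err_at_0 = abs(taylor_sin(x, 0.0) - math.sin(x))
        err_at_45 = abs(taylor_sin(x, math.pi / 4) - math.sin(x))
        print(f"x={x:.4f}  anchor 0: {err_at_0:.2e}  anchor pi/4: {err_at_45:.2e}")
```

Each series is essentially exact at its own anchor and loses accuracy as x moves away from it, which is the behavior the letter describes.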

A  better  way  was  invented  by  a 
clever  mathematician  named  Che- 
byschev  (sometimes  spelled  Tchebys- 
chev  or  variant  thereof).  Instead  of 
approximating  a  function  with  a  se¬ 
ries  of  powers  of  x,  he  whipped  up 
some  nifty  polynomials  in  x  (called 
Chebyschev  polynomials,  what  else?) 
and  approximated  the  function  with 
a  series  of  these  polynomials.  Of 
course,  all  the  powers  of  x  in  these 
polynomials  eventually  combine  to 
produce  something  looking  very 
much  like  a  Taylor's  series,  except 
that  the  coefficients  are  slightly  dif¬ 
ferent  and  they  have  the  absolutely 
lovely  property  of  distributing  and 
bounding  the  approximation  error 
over  the  range  of  the  approximation. 
Not  only  that,  the  Chebyschev 
scheme  produces  approximations  of 
comparable  error  to  Taylor's  but 
with  fewer  terms.  The  process  is  oft- 
times  called  Chebyschev  economiza¬ 
tion.  Without  going  into  the  details  of 
why  this  happens  or  how  to  do  it 
(which would require a lengthy article in itself), suffice it to say that if you
expand sin((π/2)x) in terms of Chebyschev polynomials and then simplify
it, the result is:

Sin((π/2)x) = 1.5706268x - 0.6432292x^3 + 0.0727102x^5

I  have  used  the  notation  Sin( )  to 
mean  an  approximation  of  sin( ).  This 
function  has  a  maximum  error  of 
about 0.0001 over the range -1 < x
< 1, which covers the entire first and
fourth  quadrants.  If  you  require 
greater  accuracy,  you  can  use  the 
approximation: 

Sin((π/2)x) = 1.57079631847x
- 0.64596371106x^3
+ 0.07968967928x^5
- 0.00467376557x^7
+ 0.00015148419x^9

This  approximation  is  valid  over  the 
same  range  and  has  a  maximum  er¬ 
ror,  which  occurs  at  several  places 
within  the  range,  of  about 
0.000000005. 
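The stated error bounds can be spot-checked by sampling. This is a rough sketch in Python (an assumption of this edit, not the letter writer's code); the names SIN_3_TERM, SIN_5_TERM, approx_sin_half_pi, and max_error are hypothetical, and the coefficients are copied from the two approximations above.

```python
import math

# Coefficients of x, x^3, x^5, ... taken from the letter.
SIN_3_TERM = (1.5706268, -0.6432292, 0.0727102)
SIN_5_TERM = (1.57079631847, -0.64596371106, 0.07968967928,
              -0.00467376557, 0.00015148419)

def approx_sin_half_pi(x, coeffs):
    """Approximate sin((pi/2) * x) for -1 <= x <= 1 with an odd polynomial."""
    xx = x * x
    acc = coeffs[-1]
    # Nested evaluation in powers of x^2, highest-order coefficient first.
    for c in reversed(coeffs[:-1]):
        acc = acc * xx + c
    return acc * x

def max_error(coeffs, samples=2001):
    """Largest absolute error found on an evenly spaced grid over [-1, 1]."""
    worst = 0.0
    for i in range(samples):
        x = -1.0 + 2.0 * i / (samples - 1)
        err = abs(approx_sin_half_pi(x, coeffs) - math.sin(math.pi / 2 * x))
        worst = max(worst, err)
    return worst

if __name__ == "__main__":
    print("3-term max error:", max_error(SIN_3_TERM))
    print("5-term max error:", max_error(SIN_5_TERM))
```

A sampled maximum is only a lower bound on the true maximum error, but with a fine grid it should land close to the figures quoted in the letter.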

Another  mathematician  named 
Padé generated a slick way to do the
job  using  the  ratio  of  two  polynomi¬ 
als.  I'll  say  nothing  more  about  this 
technique,  except  that  it  works  sub¬ 
limely  with  certain  types  of 
functions. 

I  would  add  one  last  thing  in  pass¬ 
ing.  You  should  never  evaluate  the 
polynomial  (for  example): 

a + bx + cx^2 + dx^3

as  it  stands.  You  should  always  ar¬ 
range  it  into  the  nested  form: 

((dx  +  c)x  +  b)x  +  a 

This  form  requires  fewer  multiply 
instructions,  thereby  executing  fast¬ 
er  and  producing  smaller  numerical 
errors. 
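The nested form generalizes to any degree. Below is a minimal Python sketch (the function names horner and naive are illustrative, not from the letter) that evaluates a polynomial both ways so the results can be compared.

```python
def horner(coeffs, x):
    """Evaluate a polynomial in nested form.

    coeffs is ordered lowest power first, e.g. [a, b, c, d] for
    a + b*x + c*x^2 + d*x^3. A degree-n polynomial costs n multiplies.
    """
    acc = coeffs[-1]
    for c in reversed(coeffs[:-1]):
        acc = acc * x + c
    return acc

def naive(coeffs, x):
    """Evaluate term by term, computing each power of x separately."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

if __name__ == "__main__":
    poly = [1.0, 2.0, 3.0, 4.0]      # 1 + 2x + 3x^2 + 4x^3
    print(horner(poly, 2.0), naive(poly, 2.0))
```

For the cubic in the letter, the nested form uses three multiplies and three adds, while the term-by-term form needs extra multiplies to build x^2 and x^3.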

Charlie  Rose 
Ball  Aerospace  Systems 
P.O.  Box  1062 
Boulder,  CO  80306 

Dear  DDJ, 

The  sine  routine  given  by  Richard 
Campbell  in  your  December  issue 
could  be  improved.  One  improve¬ 
ment  would  be  to  compute  aa=a*a 
and  then  compute  the  sine  approxi¬ 
mation  as: 

s = (((C4 * aa + C3) * aa + C2) * aa + C1) * a

using  the  same  coefficients  as  before. 
This  nested  form  of  the  polynomial 
accumulates  the  small  terms  first  and 
thus  reduces  the  errors  due  to  float¬ 
ing-point  rounding.  By  initially 
squaring  a,  you  end  up  doing  three 


additions  and  five  multiplications. 

Another  improvement  would  be  to 
use  a  ninth-order  polynomial  instead 
of  a  seventh-order  one.  The  sine 
would  then  be  computed  as: 

s = ((((C5 * aa + C4) * aa + C3) * aa + C2) * aa + C1) * a

where the coefficients are:

C1 = 1.5707963
C2 = -0.64596371
C3 = 0.079689679
C4 = -0.0046737666
C5 = 0.00015148513

The  coefficients  published  in  the 
December  DDJ  can  be  found  on  page 
203,  item  SIN  3340,  in  Computer  Ap¬ 
proximations  (J.  Hart  et  al.),  but  are 
given  to  more  precision.  The  coeffi¬ 
cients  for  the  ninth-order  polynomial 
listed  above  can  be  found  on  page  204, 
item  SIN  3341,  of  the  same  reference. 
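As a rough illustration, the ninth-order formula can be written out in Python as follows. This is a sketch, not the routine from the December article: the function name sin_quadrant is hypothetical, and the argument scaling (a in the range -1 to 1, returning sin((π/2)a)) is an assumption inferred from C1 being approximately π/2.

```python
import math

# Coefficients as listed in the letter (Hart et al., item SIN 3341).
C1 = 1.5707963
C2 = -0.64596371
C3 = 0.079689679
C4 = -0.0046737666
C5 = 0.00015148513

def sin_quadrant(a):
    """Approximate sin((pi/2) * a), assuming -1 <= a <= 1."""
    aa = a * a                       # square once, reuse in every step
    return ((((C5 * aa + C4) * aa + C3) * aa + C2) * aa + C1) * a

if __name__ == "__main__":
    for a in (0.0, 0.25, 0.5, 1.0):
        exact = math.sin(math.pi / 2 * a)
        print(f"a={a:4.2f}  approx={sin_quadrant(a):.9f}  exact={exact:.9f}")
```

With the coefficients rounded to eight significant digits as printed, the result agrees with the library sine to roughly seven or eight decimal places.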

A  third  way  to  improve  the  routine 
would  be  to  use  the  methods  outlined 
in  the  two  references  and  develop  a 
program  whose  accuracy  is  limited 
only  by  the  precision  of  the  floating¬ 
point  representation  of  the  final 
result. 

Harry  J.  Smith 

Litton  Computer  Services 

1300  Villa  St. 

Mountain View, CA 94039

Dear DDJ,

Here  is  some  feedback  on  "In  Search 
of  a  Sine.”  Some  readers  may  notice 
that the coefficients C1 through C4
are  somewhat  different  from  the  the¬ 
oretical  coefficients  for  the  Maclau- 
rin's  series  for  the  sine  function.  Af¬ 
ter  truncating  a  series  to  a  specific 
number  of  terms,  it  is  advantageous 
to  cook  the  coefficients  using  least- 
squares  curve  fitting.  Presumably  the 
values given for C1 through C4 were
derived  in  this  manner.  Rearranging 
the  sine  formula  before  doing  the  ac¬ 
tual  computation  would  result  in 
fewer  operations.  The  BASIC  rendi¬ 
tion  would  be: 

A2 = A*A

sin = (((C4 * A2 + C3) * A2 + C2) * A2 + C1) * A

and  the  32K  assembly-language  ren¬ 


dition  would  be: 

DoSin
    MOVF F3,F1
    MULF F1,F1
    MOVF -0.004362469,F5
    MULF F1,F5
    ADDF 0.07948765,F5
    MULF F1,F5
    ADDF -0.645921,F5
    MULF F1,F5
    ADDF 1.570795,F5
    MULF F5,F3
    RET 0

Because  the  coefficients  can  be 
used  only  by  the  DoSin  routine,  there 
is  no  reason  to  keep  them  in  a  table. 
Making  them  immediate  operands  is 
more  compact,  faster,  and  more 
readable.  When  coding  for  the  32K, 
you  often  find  that  in-line  coding  is 
more  compact  than  the  looping 
method.  When  using  the  32K,  you 
must  remain  alert  for  opportunities 
to  be  liberated  from  your  old  habits. 

For  higher  precision,  you  could 
further  divide  the  range  of  A.  That  is, 
if  A  is  more  than  0.5,  you  take  the  co¬ 
sine  of  (1.0  -  A)  using  the  Maclaurin’s 
series  for  the  cosine: 

A2  =  A*A 

COS = (((D4 * A2 + D3) * A2 + D2) * A2 + D1) * A2 + 1.0

In  order  to  realize  the  higher  preci¬ 
sion,  you  would  need  to  recook  the 
coefficients  based  upon  the  shorter 
range. 

Another  thing  to  keep  in  mind 
when  doing  math  routines  for  the  32K 
is  that  the  FPU  is  very  fast;  it  does  a 
double-precision  multiply  faster  than 
the  operands  can  be  moved  in  and  out 
of  memory.  Therefore,  the  old  rule  of 
thumb  about  the  floating-point  opera¬ 
tion  dominating  the  time  is  no  longer 
true. 

The  bugs  in  the  32K  alluded  to  by 
Mr.  Campbell  are  almost  certainly  a 
thing  of  the  past;  if  you  have  a  rea¬ 
sonably  mature  version  of  the  32K, 
you should not need to refrain from using any
and all addressing modes with floating-point
operands.

Neil  R.  Koozer 
Kellogg  Star  Route  Box  125 
Oakland,  OR  97462 
DDJ 




ARTICLES 


Compressing Image Data With Quadtrees


A  quadtree  is  a  tree 
data  structure  in 
which  each  node 
can  have  four  child  nodes 
under  it  (compared  to  a  bina¬ 
ry  tree  in  which  each  node 
can  have  two  children). 

Quadtrees can be used as an efficient representation of graphical
information and offer some interesting features. For a graphical
quadtree, each node represents a square area of the graphical image
and each of its four children represents one quadrant, or one fourth,
of its area. These subareas are defined by dividing the original area
in two equal halves, left and right, and dividing each of these halves
in half, top and bottom. Thus the subareas, or quadrants, are of equal
size and are also squares. The root node, called the top node,
represents the entire image and has under it four children, each
representing one fourth of the entire image. This process of division
is repeated until each child represents only a single pixel of the
original image. If the original image is not a square with the number
of pixels on each side being a power of 2, the image has to be filled
out to that size with a background color. The bottom nodes, each
representing one pixel, are leaf nodes of the tree. (Data structure
trees are upside down from actual trees because the root node is said
to be on the top and the leaf nodes on the bottom.) All the nodes in
the tree at the same distance from the root (that is, having the same
number of nodes between them and the root) are said to be on the same
level. For graphical quadtrees, the levels are numbered starting from
the bottom nodes (the leaves), which are at level 0. If the image is
N × N pixels, where N = 2^k, then the root, or top, node is at level k
and the number of levels is k + 1.

Ronald G. White, 161 S. 35th St., Boulder, CO 80303. Ronald
has an M.S. in computer science from the University of
Colorado. He is currently involved in porting software to
new graphics terminals and workstations.

To represent a graphical image, the leaves of the quadtree need to
have information about the corresponding pixel associated with them.
If the image is a color image, then the bottom nodes will have a color
value. This value can be an index into a color map, a set of RGB
values, or some other color information. The information associated
with nodes that are not at the bottom level has a less obvious
meaning. If all four children of a nonleaf node are leaf nodes and
they all have the same color, the parent node can adopt that color. In
fact, the child nodes then become redundant and can be removed from
the quadtree. With all four children removed, this node now becomes a
leaf node even though it is not a level 0 node. Removal of unnecessary
nodes is called pruning and can be repeated for successively higher
levels, moving from the leaves toward the root.
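The pruning step for four matching leaves can be sketched in C as follows. This is only an illustration: the qnode structure and the names new_leaf() and prune_uniform() are hypothetical and are not taken from Listing One.

```c
#include <stdlib.h>

/* Hypothetical minimal node for this sketch (not the Listing One node). */
struct qnode {
    struct qnode *child[4];   /* all NULL for a leaf */
    int color;
};

static int is_leaf(const struct qnode *n)
{
    return !n->child[0] && !n->child[1] && !n->child[2] && !n->child[3];
}

/* Allocate a leaf of the given color; returns NULL if malloc() fails. */
struct qnode *new_leaf(int color)
{
    struct qnode *n = malloc(sizeof *n);
    int i;

    if (n == NULL)
        return NULL;
    for (i = 0; i < 4; i++)
        n->child[i] = NULL;
    n->color = color;
    return n;
}

/* If all four children are leaves of one color, adopt that color and
   release the children.  Returns 1 if the node became a leaf, else 0. */
int prune_uniform(struct qnode *n)
{
    int i;

    if (is_leaf(n))
        return 0;
    for (i = 0; i < 4; i++)
        if (!n->child[i] || !is_leaf(n->child[i]) ||
            n->child[i]->color != n->child[0]->color)
            return 0;
    n->color = n->child[0]->color;
    for (i = 0; i < 4; i++) {
        free(n->child[i]);
        n->child[i] = NULL;
    }
    return 1;
}
```

Calling prune_uniform() on each parent after its children have been processed repeats the pruning at successively higher levels.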

If fewer than four child nodes have the same color, the parent can
still adopt the predominant color of the children. If the children
with the same color are leaf nodes, they can be removed from the tree.
The parent node does not become a leaf node in this case because there
are still child nodes under it, but you have reduced the size of the
tree by removing at least some of the child nodes. It would be
possible, even if all four child nodes were different colors, for the
parent node to adopt one of the child nodes' colors and to remove that
child node. As you will see later, in the section about locational
codes, in the end this removal does not gain anything. You can get rid
of one child node, but you then have to keep the parent node. In my
code I prefer to have the parent node adopt a color only if at least
two of the child nodes have that color.

When a graphical image is represented by an efficient form of a
quadtree (locational codes, presented in the next section, are one
way), the amount of data can be less than with some other common
representations such as simple pixel dumps or run-length encodings.
This depends on the image, of course: a very fine mesh checkerboard
pattern would not be represented efficiently by a quadtree. Images in
which large areas are the same color can be represented by small
quadtrees because higher-level nodes in the quadtree can represent
large sections of the image with no need for lower-level nodes below
them.

Another advantage of quadtrees is that if they are transmitted or
displayed starting from the top level, then each successive level
represents a closer approximation to the final image. This is
particularly useful on some newer graphics displays that support a
fast polygon fill command so that the area represented by a node can
be filled quickly by a single command. Although I don't define the
color of intermediate nodes in this way, each nonleaf node's color
could be the average color of the area it represents. Each lower level
would then provide a more accurate description of the image as it was
transmitted or displayed. The method I use defines colors for
intermediate nodes only if at least half of the final area represented
by that node will be that color. In this case, each lower level
provides additional information about some areas of the image that are
not yet accurately defined. In an interactive situation where
transmission of data is slow or costly, the user could be allowed to
stop the transmission or display as soon as the image was accurate
enough to be recognized as wrong or of no interest. With normal
scan-line display of images, much of the image, and thus a lot of
data, must be displayed before the contents of the picture can be
guessed.


Quadtrees  also  have  advantages  for  certain  types  of 
analysis  and  manipulation  of  images,  but  a  discussion  of 
these  is  beyond  the  scope  of  this  article.  For  those  of  you 
who  want  to  pursue  this  topic  further,  I’ve  provided  a  list 
of  references  at  the  end  of  this  article. 

Locational  Codes 

The quadtree representation is efficient for certain types of
processing, but it is not very efficient in terms of storage.
Locational codes are a way of indicating the position of a node in the
quadtree without actually storing the pointers from the root node to
the given node. Instead, the path from the root node to the given node
is coded as a single number. This is generally done by assigning each
direction (NW, NE, SW, or SE) from the parent to the relevant child a
single digit (for example, 0, 1, 2, and 3) and combining the digits
representing the path into a single number. For example, the path
NE-SE-SE-NW could be expressed as 1330, where the 1 represents the NE
child of the root node, the first 3 represents the SE child of that
node, the second 3 represents the SE child of the 13 node, and the 0
represents the NW child of the 133 node. Each node, then, has a unique
representation. Using a base 10 representation, however, wastes
storage space. In my code, I pack each direction value into two bits;
that is, I use a base 4 representation. Other authors recommend using
a base 5 representation so that the directions are the values 1, 2, 3,
and 4 and 0 is reserved as a beginning marker. This is useful because,
depending on the level of the node, the number of direction values
(that is, digits) will vary. In base 5, the preceding example would be
02441, where the 0 indicates the beginning of the code. In base 4,
there is no unique bit pattern to use as a beginning marker because
all four possible two-bit patterns are used as direction values.
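As a rough illustration of the two packings, the following C sketch builds the base 4 and base 5 codes for a path. The function names pack_base4() and pack_base5() are hypothetical; only the digit assignments come from the text.

```c
/* Direction digits as used in the text: NW=0, NE=1, SW=2, SE=3. */
enum { NW = 0, NE = 1, SW = 2, SE = 3 };

/* Base 4: two bits per direction, with no beginning marker. */
unsigned long pack_base4(const int *path, int n)
{
    unsigned long code = 0;
    int i;

    for (i = 0; i < n; i++)
        code = (code << 2) | (unsigned long)path[i];
    return code;
}

/* Base 5: directions become 1..4, so a leading 0 can mark the start. */
unsigned long pack_base5(const int *path, int n)
{
    unsigned long code = 0;
    int i;

    for (i = 0; i < n; i++)
        code = code * 5 + (unsigned long)(path[i] + 1);
    return code;
}
```

For the path NE-SE-SE-NW, pack_base5() yields the value written 2441 in base 5 (371 in decimal). With pack_base4(), the paths NW and NW-NW both pack to 0, which is the ambiguity that a beginning marker removes.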

The disadvantage of the base 5 scheme is that it requires multiplying
and dividing by five in order to manipulate the encoding. I originally
used a base 5 encoding but found the conversion from a quadtree path
to a locational code (and back) too slow. To speed up this process, I
switched to a base 4 encoding and replaced the multiplies and divides
with shifts and bit operations, which are faster. This representation
also needs fewer bits to store it. To solve the problem of a beginning
marker, I chose to mark the beginning with a 01 bit pattern. Because I
always put a 01 in front of the actual direction values, I can search
the bits in the locational code from left to right, two at a time, and
know that the first nonzero pair is not a valid direction value but
that the next pair is. Without this marker, it would be impossible to
determine where the direction values start.
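The left-to-right search for the marker might look like the sketch below. It assumes a locational code that fits in 32 bits; LC_BITS, lc_length(), and lc_dir() are hypothetical names, not routines from the listings.

```c
#define LC_BITS 32   /* assumed width of a locational code in this sketch */

/* Scan from the left, two bits at a time.  The first nonzero pair is the
   01 marker; every pair after it is a direction value.  Returns the
   number of direction values, or -1 if no marker is present. */
int lc_length(unsigned long code)
{
    int shift;

    for (shift = LC_BITS - 2; shift >= 0; shift -= 2)
        if ((code >> shift) & 3UL)
            return shift / 2;
    return -1;
}

/* Return direction i of a code holding n directions (i = 0 is the step
   taken from the root). */
int lc_dir(unsigned long code, int n, int i)
{
    return (int)((code >> (2 * (n - 1 - i))) & 3UL);
}
```

For the path NE-SE-SE-NW the marked code is binary 01 01 11 11 00 (hexadecimal 17C); lc_length() reports four directions, and a code of 1 (the marker alone) reports zero.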

Because pointers are not needed with locational-coded quadtrees, nodes
that only supply redundant information (that is, serve only as
placeholders in the tree) can be removed, or pruned, from this form of
the quadtree for more efficient storage or transmission. This pruning
can result in significant space savings. Consider, for example, an
image consisting of a single red pixel against a black background. In
the pointer form of the quadtree, the root node would have a color of
black because this is the predominant color of the area it represents
(though not the only color). Three of its child nodes, representing
the three quadrants not containing the red pixel, are not necessary
because the information they would hold (the color black) is already
held by their parent and all nodes under them would also have the same
information. The fourth node, on the other hand, is necessary because
somewhere at the bottom of its subtree is a node representing a single
pixel and its information is different; that is, it has a color of
red. This pattern (three nodes are unnecessary but the fourth is
needed) is repeated down through the levels of the quadtree until you
reach the node representing the red pixel. The nodes in the tree
between the root node and the bottom node do not contain useful color
information, but they are necessary to save the path from the root
node to the bottom node. With the locational code form of the
quadtree, each node is represented by a pair of numbers: the
locational code itself and a color. For the image of a single red
pixel, you need to have only two nodes, the root node and the single
bottom node; all the intermediate nodes can be thrown away.

Another advantage of some locational codes is that when the codes are
sorted into numerical order, the higher-level nodes come before
lower-level nodes. This is true for the three coding schemes already
mentioned (base 10, base 5, and my base 4 with special marker) because
each lower-level code requires an additional digit to represent the
next direction value. Sorted in normal numeric order, the higher-level
nodes, having fewer digits, will always precede the lower-level nodes.
The base 4 scheme I use preserves this feature because the 1 in front
of the actual direction values is shifted two bits left at each lower
level. My code, however, makes sorting unnecessary because it outputs
the nodes from the root node one level at a time.

Listing  One 

Listing One, page 40, is a set of routines for converting a graphical
image from pixels to a quadtree and outputting the quadtree in
locational code form. I have not provided a main routine because
initialization is likely to be application specific and possibly
system dependent. The main routine should do whatever is necessary to
make available a graphical image. Necessary tasks might involve
reading the image into an array from a file, getting information
interactively about what image or what part of the image the user
wants, or initializing the display with the image if the image is to
be taken directly from the display. The main routine also needs to
perform initialization for the output of the quadtree. This could be
opening a disk file and writing some header information to it, or it
might involve opening a communications channel of some sort. After all
this is done, the main routine calls px2quad().

The only externally accessible routine is px2quad(), which must be
passed the size of the image. If the actual size of the image is not a
square with each side equal to a power of 2, the size passed to
px2quad() is the smallest such square that the image will fit inside.
These routines assume the availability of two routines, getpix() and
putlcc(), that they can call. Besides the main routine, you must also
supply these routines. Getpix() returns the color value of a pixel at
a given x,y position, and putlcc() is called to output the locational
code and color for each node.
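A minimal pair of user-supplied routines might look like the following sketch. The argument lists are assumptions (the article gives only the routines' purposes), and the tiny in-memory image and BACKGROUND value are hypothetical stand-ins for a real image source.

```c
#include <stdio.h>

#define IMG_W 3            /* hypothetical image: 3 x 2 pixels */
#define IMG_H 2
#define BACKGROUND 0       /* assumed background color index */

static const int image[IMG_H][IMG_W] = {
    { 1, 1, 0 },
    { 1, 2, 0 }
};

/* Return the color at column x, row y.  Positions outside the real image
   read as background, which pads the image out to a power-of-2 square. */
int getpix(int x, int y)
{
    if (x < 0 || y < 0 || x >= IMG_W || y >= IMG_H)
        return BACKGROUND;
    return image[y][x];
}

/* Write one node as "code color" on a line of standard output. */
void putlcc(unsigned long locode, int color)
{
    printf("%lu %d\n", locode, color);
}
```

With this image the size passed to px2quad() would be 4, the smallest power-of-2 square that holds 3 x 2 pixels.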

The primary data structure used by the routines in Listing One,
defined at the beginning of the listing, is used for each node in the
quadtree. The first field, child, is an array for the four pointers to
child nodes. The direction value serves as a subscript into this
array, with the following correspondences: 0, 1, 2, and 3 correspond
to the directions NW, NE, SW, and SE, respectively. The second field
in the node data structure, next, is a pointer to the next node on the
linked list used during output. This linked list is explained later as
part of the explanation of the routines outtree() and outnode(). The
next field, color, is the color associated with the node. I store this
as a single value because I am using the index value into a color
table. More extensive information could be substituted for the single
value although this would complicate the code somewhat. The ntype
field is a flag that indicates what type of node this one is. The
three node types are defined in the next section of code. The ntype
field is not absolutely necessary (as is explained later), but I find
it useful. The final field in the node structure, locode, is the
locational code for this node, which is calculated as each node is
added to the quadtree.
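From this description, the node structure could be declared roughly as shown here. The field names and their order follow the text; the field types and the direction constants are assumptions, and the declaration in Listing One may differ.

```c
/* Direction values double as subscripts into child[]. */
enum { NW = 0, NE = 1, SW = 2, SE = 3 };

struct node {
    struct node *child[4];   /* pointers to the NW, NE, SW, SE children */
    struct node *next;       /* next node on the output list */
    int color;               /* index into a color table */
    int ntype;               /* node type flag */
    unsigned long locode;    /* locational code (type assumed) */
};
```

With this layout, n->child[SE] is the subtree for the lower-right quadrant of the area that n represents.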

The next section of code defines the three node types: LEAF, BLEND,
and WASH. LEAF type nodes are nodes that have no further nodes under
them. The color value of a leaf node represents the color of all the
pixels in the area defined by that node even if it is not a
bottom-level node. If, during the initial pruning of the quadtree, a
node has two or three children that are leaf nodes and that share the
same color, the parent node adopts the color of these child nodes and
these nodes are removed from the tree. This parent node is marked as a
BLEND type node to indicate that its color value is the color of
pixels in areas represented by missing child nodes and that the node
has child nodes that were not removed. All nodes that are neither LEAF
type nodes nor BLEND type nodes are marked as WASH nodes, indicating
that their only purpose is as placeholders for the pointers to child
nodes and that their own color is not relevant.

The routine px2quad() is the control routine for all the processing
necessary to build and output the quadtree. The first thing it does is
create the root node with a call to crtnode(). It then calculates the
level number of the root node from the parameter size, which the
calling routine passed to it. What this code does is find k, where
size = 2^k. (There may be a more direct way to do this.) After setting
the locational code for the root to 1 (remember that the 1 is not
actually part of the locational code but a beginning marker),
px2quad() calls addnode() to add the root node to the currently empty
quadtree. Addnode() calls crtnode() and itself recursively to create
the rest of the quadtree. When addnode() finally returns, the quadtree
has been built and the initial pruning done. Px2quad() then calls
outtree() to control the final pruning and the output of the quadtree
as locational codes and colors.
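One simple way to find k is to count how many times the size can be halved, as in this sketch. The name level_of() is hypothetical, and the power-of-2 check is an addition for safety rather than something the article describes.

```c
/* Return k such that size == 2^k, or -1 if size is not a power of 2. */
int level_of(unsigned int size)
{
    int k = 0;

    if (size == 0 || (size & (size - 1)) != 0)
        return -1;
    while (size > 1) {
        size >>= 1;
        k++;
    }
    return k;
}
```

For a 256 x 256 image this gives a root level of 8.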

The function crtnode() creates a new node, initializes it with default
values, and returns a pointer to it. I use a call to the system
routine malloc() to get enough space for one node. This step could be
a problem for several reasons. The first is the overhead associated
with making this call perhaps hundreds of thousands of times during
the creation of the quadtree. This overhead could be reduced by
getting larger chunks of memory (or even statically allocating a very
large array) and having crtnode() and relnode() maintain a list of
free nodes. This approach complicates the code, of course, but it
might be worthwhile for improved performance. The second potential
problem is the amount of memory required for a quadtree that
represents a reasonable-size image. I'll discuss this in more detail
at the end of the article. Malloc() returns a NULL pointer if no more
memory is available or something else goes wrong; crtnode() checks for
this condition. The last part of the code initializes the newly
created node. The node type is set to LEAF, indicating that this node
does not have child nodes after it. If addnode(), the next routine,
creates new nodes under this one, it will also call condense(), which,
among other things, resets the node type.
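The free-list alternative mentioned above might be sketched as follows. This is not the code in Listing One, which calls malloc() once per node; the CHUNK size, the field types, and the use of 0 to stand for LEAF are all assumptions.

```c
#include <stdlib.h>

struct node {
    struct node *child[4];
    struct node *next;       /* also links nodes on the free list */
    int color;
    int ntype;
    unsigned long locode;
};

#define CHUNK 256            /* nodes obtained per malloc() call (assumed) */

static struct node *free_list = NULL;

/* Get one chunk of nodes and thread them onto the free list.
   Returns 0 if malloc() fails. */
static int refill(void)
{
    struct node *chunk = malloc(CHUNK * sizeof *chunk);
    int i;

    if (chunk == NULL)
        return 0;
    for (i = 0; i < CHUNK; i++) {
        chunk[i].next = free_list;
        free_list = &chunk[i];
    }
    return 1;
}

/* Return an initialized node, or NULL if no memory is available. */
struct node *crtnode(void)
{
    struct node *n;
    int i;

    if (free_list == NULL && !refill())
        return NULL;
    n = free_list;
    free_list = n->next;
    for (i = 0; i < 4; i++)
        n->child[i] = NULL;
    n->next = NULL;
    n->color = 0;
    n->ntype = 0;            /* 0 stands in for LEAF in this sketch */
    n->locode = 0;
    return n;
}

/* Put an unneeded node back on the free list for reuse. */
void relnode(struct node *n)
{
    n->next = free_list;
    free_list = n;
}
```

The trade-off is that chunks are never handed back to the system in this sketch; released nodes are only recycled by later crtnode() calls.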

Addnode() adds a newly created node to the quadtree. If this node is
not at the bottom level, it creates four new child nodes under the
current one with calls to crtnode() and calculates locational codes
for these new nodes. This calculation is simple. The locational code
for the current node is available in the node data. The locational
code for a child node needs only one direction value added onto the
end of the parent's code. Because the direction value is the same as
the subscript into the array of pointers to the four child nodes, all
the code has to do is shift the parent's locational code left two bits
and add in the direction value for the child. Addnode() then calls
itself recursively for each of the new child nodes. After these four
new nodes have been added, addnode() calls the routine condense(),
which examines the four nodes under the current node to see if any of
them can be removed. If, on the other hand, the current node is at the
bottom level (that is, this node represents a single pixel of the
image), getcolor() is called to get the color of the pixel.
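The child-code calculation comes down to one shift and one OR, as this sketch shows. The function name child_code() is hypothetical.

```c
/* Locational code of the child in direction dir (0..3) of a parent:
   shift the parent's code left two bits and add the direction value. */
unsigned long child_code(unsigned long parent, int dir)
{
    return (parent << 2) | (unsigned long)(dir & 3);
}
```

Starting from a root code of 1 (the marker alone), four calls for NE, SE, SE, and NW produce hexadecimal 17C, the marker followed by the digits 1, 3, 3, 0.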

Condense() is the routine that does the initial pruning of the
quadtree and is probably the most complex routine presented here. It
first loops through the four children of the current node (addnode()
does not call condense() for leaf nodes, so the current node will
always have four children), collecting information about their colors.
What the code is looking for is a predominant color, that is, a color
shared by two or more child nodes. Nodes that are marked as type WASH
are ignored because their color is meaningless (as is explained
later). Condense() sets the current node's color to the predominant
color. If no two children have the same color, then the next section
is skipped. Otherwise, the code again loops over the four children. If
a child had a type of WASH, it can't be deleted, so it is ignored. If
a child has the same color as the predominant color, it is either
removed, if it is a leaf, or demoted to type WASH otherwise. The
reason why it can be removed if it is a leaf is that the area it
represents is included in the area represented by its parent and the
parent now has the same color; the color of the child node is
therefore redundant. If a child node is not a leaf, this means that it
has at least one child node of a different color below it and so it
cannot be deleted without losing the pointers to its children. Its
color, however, is now redundant information, and this fact is noted
by marking it as a WASH type. The final section of code in condense()
resets the node type of the current node. If all four children have
been released (because they were all the same color and now this node
has that color), then this node becomes a leaf node and is marked
LEAF. If this node has adopted a color because two or more of its
children have been removed or marked as WASH type nodes, it is marked
as a BLEND type node to indicate that it is not a leaf node but its
color is relevant information. If the node is neither a LEAF nor a
BLEND, it is marked as a WASH. This happens when none of the four
children share a common color either because they are really four
different colors or some of them are WASH nodes and have no color
information. The node type is used later when this node's parent node
is passed to condense() and this node is then one of the child nodes.
It is also used during output of the quadtree because WASH nodes are
not output.
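The search for a predominant color can be sketched on its own, apart from the node bookkeeping. In this illustration predominant() and NO_COLOR are hypothetical names, and breaking a two-against-two tie in favor of the first color found is an assumption; the article does not say how condense() handles that case.

```c
#define NO_COLOR (-1)   /* assumed marker for "no predominant color" */

/* Given the colors of four children and a flag for each saying whether
   it is a WASH node, return a color shared by two or more non-WASH
   children, or NO_COLOR if there is none.  The most common color wins;
   a tie goes to the color encountered first. */
int predominant(const int color[4], const int is_wash[4])
{
    int best = NO_COLOR, best_count = 1;
    int i, j;

    for (i = 0; i < 4; i++) {
        int count = 0;

        if (is_wash[i])
            continue;
        for (j = 0; j < 4; j++)
            if (!is_wash[j] && color[j] == color[i])
                count++;
        if (count > best_count) {
            best = color[i];
            best_count = count;
        }
    }
    return best;
}
```

A parent whose children are colored 3, 5, 3, and 7 would adopt color 3; if the first of those children were a WASH node, no color would be shared and the parent would itself become a WASH.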

Relnode() is the complement of crtnode(): it releases an unneeded
node. For efficiency, this routine could add the node to a linked list
of free nodes from which crtnode() could get space for new nodes. It
is implemented here as a call to the system routine free().

Getcolor(), called by addnode() when it reaches a leaf node at the
bottom level, is a function to return the color of the pixel
represented by a bottom-level node. The majority of the code is
concerned with converting the position of the node in the quadtree,
given as its locational code, to a column and row (x and y) pixel
position. The code does this by extracting the directions shifted into
the locational code by addnode() in the process of building the
locational code. Because this is a bottom-level node, the number of
direction values is equal to the top-level number. Getcolor() shifts
the direction value for each level, starting with the top level or
root, to the bottom bits and masks them off. As the direction code is
recovered for each level, the column and row values are shifted left
one bit and the new bottom bit is set or not depending on the
direction. This works because the direction values define two
simultaneous binary searches through the pixel space. The first
search, for column, successively splits the pixel space into left and
right halves. The second search, for row, successively splits the
pixel space into upper and lower halves. Thus each bit in the column
or row value is a direction in the binary search. Whereas the
locational code needs two bits to represent one of four directions,
the column and row values need one bit to represent one of two
directions.
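The conversion can be sketched as below. It assumes the direction numbering given earlier (NW=0, NE=1, SW=2, SE=3), so that bit 0 of a direction selects the right half and bit 1 the lower half, and it assumes row numbers increase downward. The name code_to_xy() is hypothetical.

```c
/* Convert the locational code of a bottom-level node to a column (x)
   and row (y).  toplevel is the level number of the root, which is also
   the number of direction values in the code. */
void code_to_xy(unsigned long locode, int toplevel, int *x, int *y)
{
    int level;

    *x = 0;
    *y = 0;
    for (level = toplevel; level > 0; level--) {
        /* Bring this level's direction to the bottom bits and mask. */
        int dir = (int)((locode >> (2 * (level - 1))) & 3UL);

        *x = (*x << 1) | (dir & 1);         /* 1 = right half */
        *y = (*y << 1) | ((dir >> 1) & 1);  /* 1 = lower half */
    }
}
```

In a 16 x 16 image (top level 4), the path NE-SE-SE-NW lands on column 14, row 6: the column bits are 1, 1, 1, 0 and the row bits are 0, 1, 1, 0.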

Outtree() and outnode() are the control routines for the second phase
of Listing One: outputting the quadtree as locational codes. In order
to output the nodes in a breadth-first order (that is, all nodes at
one level are output before any lower-level nodes), outtree() and
outnode() construct a linked list of nodes yet to be dealt with. As
the node at the front of the list is examined, and possibly output,
its children are added to the end of the list. The linked list serves
as a FIFO queue, which means that each level, starting from the top,
is processed before the next level is started. Outtree() initializes
the linked list by putting the root node on it and setting its next
pointer to NULL. It then enters a loop calling outnode() with the next
node on the list until the list is exhausted. Outnode() checks the
node type of the current node and outputs the locational code and
color if the node is not a WASH. During this final pruning of the
quadtree, nodes whose only function in the quadtree was to point to
lower-level nodes are dropped from the locational code form because
the pointers are no longer needed. The routine putlcc() is not defined
here because what it does could be system or application dependent.
The simplest thing for putlcc() to do is write the data to a file for
later processing or display. Putlcc() could also transmit the data to
a remote display. After putlcc() is called, the node's children, if
any, are added to the end of the list.
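The breadth-first traversal can be sketched in a single routine. To keep the example self-contained it stores the (code, color) pairs in caller-supplied arrays rather than calling putlcc(), and it folds outnode() into the loop; the node type values and field types are assumptions.

```c
#include <stddef.h>

enum { LEAF, BLEND, WASH };   /* values assumed for this sketch */

struct node {
    struct node *child[4];
    struct node *next;        /* link for the FIFO output list */
    int color;
    int ntype;
    unsigned long locode;
};

/* Visit the tree one level at a time.  Each non-WASH node's code and
   color are stored (up to max pairs).  Returns the number stored. */
int outtree(struct node *root, unsigned long *codes, int *colors, int max)
{
    struct node *head = root, *tail = root;
    int count = 0, i;

    if (root == NULL)
        return 0;
    root->next = NULL;
    while (head != NULL) {
        if (head->ntype != WASH && count < max) {
            codes[count] = head->locode;
            colors[count] = head->color;
            count++;
        }
        /* Append this node's children to the end of the list. */
        for (i = 0; i < 4; i++) {
            struct node *c = head->child[i];

            if (c != NULL) {
                c->next = NULL;
                tail->next = c;
                tail = c;
            }
        }
        head = head->next;
    }
    return count;
}
```

Because children are always appended behind every node already waiting, the list empties one level at a time without any sorting.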

Listing  Two 

Listing Two, page 44, presents a set of routines for displaying a
quadtree from a locational code form. As with the first set of
routines, no main routine is given and only a single routine needs to
be called. The main routine will have to do whatever initialization is
necessary, such as opening the file containing the quadtree and
reading in a header section or opening a communications channel. It
may also need to initialize a graphics system or at least clear the
screen. Qdisp() is the entry point for the second set of routines. It
needs to know the size of the original image, which could be read from
a file header associated with the quadtree file or supplied by the
user. These routines make calls to two externally defined routines
that you must supply. These are getnxn(), which returns the next
quadtree node as a locational code and a color, and filrec(), which
fills in a rectangle on the display with a given color.

Most of qdisp() is a loop that gets the next node as a locational code
and color by a call to getnxn(), converts the locational code to the
corner and side of the square represented, and fills in the square
with the color by a call to filrec(). Getnxn() and filrec() are
assumed to be supplied separately because they might be both system
and application dependent. Getnxn(), for instance, might be reading
the quadtree data from a file or reading it from a serial port.
Filrec() is given the upper-left corner and sides of a rectangle (node
quadrants are, of course, always square, but filrec() is presumed to
be more general) and a color, and it fills the defined rectangle on
the display with the color. If you are lucky enough to be using a
system with a graphics package that supplies such a call (or better
yet, a display that has the function available in hardware), then the
implementation of filrec() should be simple. If you are not so lucky,
then filrec() may have to loop over all the pixels in the rectangle,
setting each to the given color.

The routine square() converts a locational code to the corner and side
of the square represented by the node. It is very similar to
getcolor() in Listing One. Because the level of the node is not known,
the code must search the locational code for the beginning marker.
After finding this, the code loops, like getcolor(), over the
direction values from the root to the current node. At each iteration,
the length of the side, initialized to the original image size, is
divided by two and the corner position is adjusted according to which
quadrant is indicated.
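A conversion along these lines is sketched below. The argument list is an assumption, as are the 32-bit code width and a coordinate system with the origin at the upper left; the routine in Listing Two may be organized differently.

```c
#define LC_BITS 32   /* assumed width of a locational code in this sketch */

/* Convert a locational code to the upper-left corner (x, y) and side
   length of the square it names, for an image of size x size pixels.
   Returns 0 on success, or -1 if the code has no beginning marker. */
int square(unsigned long locode, int size, int *x, int *y, int *side)
{
    int shift;

    /* Search from the left for the first nonzero pair: the marker. */
    for (shift = LC_BITS - 2; shift >= 0; shift -= 2)
        if ((locode >> shift) & 3UL)
            break;
    if (shift < 0)
        return -1;

    *x = 0;
    *y = 0;
    *side = size;
    /* Each pair after the marker halves the side and picks a quadrant. */
    for (shift -= 2; shift >= 0; shift -= 2) {
        int dir = (int)((locode >> shift) & 3UL);

        *side /= 2;
        if (dir & 1)
            *x += *side;   /* NE or SE: right half */
        if (dir & 2)
            *y += *side;   /* SW or SE: lower half */
    }
    return 0;
}
```

For a 16 x 16 image, the code 1 (marker only) gives the whole image, the code 5 (NE) gives the 8 x 8 square whose corner is at (8, 0), and hexadecimal 17C gives the single pixel at (14, 6).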

Practical  Considerations 

Quadtrees in pointer form can use up a lot of memory. In the worst
case, in which no nodes can be released during construction, an image
of size N (= 2^k) would require more than N × N nodes. For example, an
image of 256 × 256 pixels could require more than 64K nodes. Using my
data structure for a node, this takes up more than 2 megabytes of
memory on a machine that uses 4 bytes for ints and pointers. A more
efficient structure may be needed. Using smaller fields and/or
combining fields would be one way to reduce memory needs. The node
type field could be removed with some additional processing: the type
LEAF could be deduced from the fact that all child pointers are NULL
and, because the color of a WASH node is meaningless, a special value
in the color field could indicate that a node was a WASH.
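The worst-case figure can be checked with a short calculation. The sketch below counts every node of an unpruned quadtree; the name full_nodes() is hypothetical, and the 32-byte node size used afterward assumes eight 4-byte fields (four child pointers, next, color, ntype, and locode).

```c
/* Number of nodes in an unpruned quadtree whose root is at level k:
   1 + 4 + 16 + ... + 4^k, which equals (4^(k+1) - 1) / 3. */
unsigned long full_nodes(int k)
{
    unsigned long total = 0, level_nodes = 1;
    int i;

    for (i = 0; i <= k; i++) {
        total += level_nodes;
        level_nodes *= 4;
    }
    return total;
}
```

For a 256 x 256 image (k = 8) this gives 87,381 nodes. The 65,536 bottom-level nodes alone occupy 2 megabytes at 32 bytes each, and the full tree comes to 2,796,192 bytes.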

Besides the memory problem, the time required to create the quadtree
may also prove to be a problem. Despite some efforts to speed up the
process, such as switching to a base 4 representation for the
locational codes, creating the quadtree is still very slow. One
improvement, as mentioned already, might be to change the way
crtnode() and relnode() get and release nodes. Another might be to
keep more information about the location of the current node so that
getcolor() does not have to figure this out from the locational codes.
I think, however, that major improvements will require somehow
avoiding all the hundreds (or thousands) of calls to addnode() and
condense() for sections of the image that are a single color and could
be quickly defined as a few high-level nodes.

Much to my disappointment, the display of the quadtree is not very
fast. Even on a display that supports rectangle fill, a scan-line
display of an image is faster than the quadtree display, although the
display of the quadtree is more interesting to watch. For any
reasonably interesting image, a lot of individual pixels must
ultimately be filled in to complete the image and this takes a lot of
time.

Despite  these  problems,  quadtrees  can  offer  some  ad¬ 
vantages  over  other  graphical  image  representations  and, 
in  some  cases,  may  be  the  best  choice. 

Availability 

All  the  source  code  for  articles  in  this  issue  (except  C 
Chest)  is  available  on  a  single  disk.  To  order,  send  $14.95  to 
Dr.  Dobb's  Journal,  501  Galveston  Dr.,  Redwood  City,  CA 
94063  or  call  (415)  366-3600  ext.  216.  Please  specify  the  issue 
number  and  disk  format  (MS-DOS,  Macintosh,  Kaypro). 

Bibliography 

Gargantini, I. "An Effective Way to Represent Quadtrees."
Communications of the ACM, vol. 25, no. 12 (December
1982): 905-910.

Jones, L., and Iyengar, S. "Representation of Regions as a
Forest of Quadtrees." Proceedings of the IEEE Conference
on Pattern Recognition (1981): 57-59.

Samet, H. "An Algorithm for Converting Rasters to Quadtrees."
IEEE Transactions on Pattern Analysis and Machine
Intelligence, vol. 3, no. 1 (January 1981): 93-95.

Samet, H. "Data Structures for Quadtree Approximation
and Compression." Communications of the ACM, vol. 28,
no. 9 (September 1985): 973-993.

Scott, S., and Iyengar, S. "TID - A Translation Invariant
Data Structure for Storing Images." Communications of the
ACM, vol. 29, no. 5 (May 1986): 418-429.

Witten, I. H., and Cleary, J. G. "Foretelling the Future by
Adaptive Modeling." ABACUS, vol. 3, no. 3 (Spring 1986):
16-36, 73.


(Listings  begin  on  page  40.) 


ARTICLES 


ARC  Wars:  MS-DOS 
Archiving  Utilities 


by  Russell  Nelson 


Since the dawn of computers, people have been trying to
make their computers run faster. One speedup technique is data
compression, which lets the computer operate on less data
while still accomplishing the same amount of work.

Data compression relies on the fact that most data are not
random. English language text, for example, has a known
character distribution: the letter E occurs more often than T,
T more often than O, and so on. A data compression algorithm
relies on these quirks to use fewer bits to represent the same
data. For users downloading from bulletin boards, this
approach translates into lower phone bills.
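
The skew in letter frequencies is easy to observe directly. The following is a minimal sketch that tallies letters in a string; the function names are hypothetical, and it assumes a character set in which A through Z are contiguous (as in ASCII).

```c
#include <ctype.h>

/* Tally how often each letter A-Z occurs in text, ignoring case. */
void count_letters(const char *text, unsigned long counts[26])
{
    int i;

    for (i = 0; i < 26; i++)
        counts[i] = 0;
    for (; *text != '\0'; text++) {
        int up = toupper((unsigned char)*text);
        if (up >= 'A' && up <= 'Z')
            counts[up - 'A']++;
    }
}

/* Return the most frequent letter; ties go to the earlier letter. */
char most_common_letter(const unsigned long counts[26])
{
    int i, best = 0;

    for (i = 1; i < 26; i++)
        if (counts[i] > counts[best])
            best = i;
    return (char)('A' + best);
}
```

A Huffman-style coder would use a tally like this to give the most frequent characters the shortest bit patterns.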

The old standard for data compression was a combination of
three programs: one to combine files into one library (LU,
library utility), one to squeeze this library into fewer bits
(SQ), and another to unsqueeze the library (USQ). The squeeze
program would produce a Huffman encoding [1] of its input file.

Russell Nelson, 11 Grant St., Potsdam, NY 13676. Russell is the
author of Painter's Apprentice. He holds an M.S.E.E. degree from
Clarkson University in Potsdam, New York.

Thom Henderson of System Enhancement Associates (SEA) created
ARC to provide an alternative to LU. ARC can add files to an
archive and automatically determine which of four different
compression methods to use. An archive can never be much
larger than the component files because one of the four
"compression" methods consists simply of storing the file
unaltered.
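
The "never much larger" guarantee follows from always keeping the stored form as a candidate. A minimal sketch of that selection, assuming the output size of each method is already known, is shown below; the enum values and the function name are hypothetical and do not reflect ARC's actual header codes or internals.

```c
/* Hypothetical method codes; not ARC's real header values. */
enum method { STORED, PACKED, SQUEEZED, CRUNCHED, METHOD_COUNT };

/* sizes[m] is the size method m would produce for one file.
   sizes[STORED] is the original length, so the chosen size can
   never exceed the size of the file itself. */
enum method pick_method(const unsigned long sizes[METHOD_COUNT])
{
    int m, best = STORED;

    for (m = PACKED; m < METHOD_COUNT; m++)
        if (sizes[m] < sizes[best])
            best = m;
    return (enum method)best;
}
```

Only the per-file header overhead is added on top of the chosen size, which is why the archive as a whole stays close to the size of its components in the worst case.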

In actual use, the savings are often considerable. After
automatic compression was added, ARC performed better than the
LU/SQ/USQ combination, and within seven months of its
introduction, it became the new standard for BBS files. This
could have happened just because it was more convenient, but
more probably it was because it used an improved compression
algorithm, the Lempel-Ziv [2] algorithm.

In fact, ARC became so popular that spin-offs using the same
file format appeared. Spin-offs were easy to develop because
SEA made the ARC source available. At present, four different
ARC-compatible systems and one ARC-incompatible system are in
use. This article reviews the performance of each of them on
different sets of files.


The  Programs 

Table 1, below, lists the current (as of October 1986) program
versions that comprise each of the five archiving systems. Note
that two of the systems are distributed as separate programs:
PKARC and PKXARC form one system; ARCA, ARCE, and ARCV form
another. The other three systems are ARC, ARCH, and ZOO.

The Donation column in Table 1 gives the suggested donation if
the software is shareware. Some of the software is copyrighted,
but no donation is suggested. The Cost column gives the cost of
the software including automatic updates, printed documentation,
and so on. All these programs are freely copyable.

The Authors/Vendors box gives the author's address for each
system. In addition, all the files mentioned in this article are
available from the Clarkson University Heath Users Group (CUHUG)
Fido, (315) 268-6667, 300/1,200/2,400 baud, 24 hours, as well
as from many other BBSs. ARC and PKXARC are distributed as
self-extracting .COM files; the other programs are distributed
in archive form. Look for ARC*.* and PKX*.* if the BBS has a
wildcard list function.


Program   Version   Size (bytes)   Donation   Cost

ARC       5.12      32429          $35        $50
ARCA      1.18       3796          none
ARCE      2.06       5424          none
ARCV      1.15       2063          none
ARCH      5.38      32694          none
PKARC     1.1       15972          $15        $35
PKXARC    3.2        9984          $15        $35
ZOO       1.20      29120          none

Table 1: Program versions



In addition to the archive programs discussed in this article,
the CUHUG Fido has the following related files: DARC.ARC, which
deletes from the disk any files that may be found in the
archive; XONE2.ARC, which extracts one file from an archive into
a new archive containing just that file; LZ.ARC, which contains
assembly-language source for a Lempel-Ziv compressor and
decompressor; ARCX.ARC, which contains Turbo Pascal source for
an archive extractor; and ARC44.ARC, which contains the source
for Version 4.4 of ARC.

Comparison  of  Features 

Table 2, below, lists the features that each program provides.
Most of the titles are self-explanatory; those that aren't are:

• Add files to archive: Obviously all archive systems can add
files to an archive, but not all programs in an archive system
can do so.

• Alphabetic file names: Because ARC uses a distributed
directory (the file names are not kept in a central location),
alphabetic adding means that the files in the archive must be
reordered when a new file is added and a copy of the archive
must be made. The advantage of not copying the archive is that
an archive can fill a whole floppy rather than being limited to
just half. ZOO uses a version numbering scheme to avoid copying.
ARCA simply ignores the issue and makes two copies of the file,
only the first of which is accessible.

• Damaged headers: If an archive gets munged, a file header can
be damaged. Some archivers can skip the damaged file; some just
give up.

• Extract to explicit path: Sometimes you might want to extract
a file to a subdirectory/drive other than the current one.

• Forced storing: Because data compression can take a fair bit
of time, some archivers allow you to force the files to be added
uncompressed for later compression of the entire archive.

• Freshen files already in archive: Only files that are already
in the archive, and that are newer on disk, are archived.

• List file names only: This is useful if you want to pipe the
list of file names to another program.

• Only packing and crunching: Some archivers don't bother to
squeeze or store a file.

• Update adds only newer files: Older files in the archive are
left alone.

• Wildcard archive file names: You can specify an ambiguous
archive file name using wildcards in combination with some
operations.
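
The difference between adding, updating, and freshening can be stated as a small decision rule. The sketch below reflects one reading of the definitions above (in particular, it assumes that update also adds files not yet in the archive); the names and the use of plain integer timestamps are hypothetical.

```c
enum add_mode { MODE_ADD, MODE_UPDATE, MODE_FRESHEN };

/* Decide whether the disk copy of a file should go into the archive.
   in_archive: nonzero if a file of this name is already archived.
   disk_time, arch_time: timestamps as comparable integers (assumed). */
int should_archive(enum add_mode mode, int in_archive,
                   unsigned long disk_time, unsigned long arch_time)
{
    switch (mode) {
    case MODE_ADD:       /* plain add: always take the disk copy */
        return 1;
    case MODE_UPDATE:    /* new files, plus files newer than the archived copy */
        return !in_archive || disk_time > arch_time;
    case MODE_FRESHEN:   /* only files already archived and newer on disk */
        return in_archive && disk_time > arch_time;
    }
    return 0;
}
```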


Benchmarks 

I ran benchmarks using a 5-MHz Z100 with a V20 to run MS-DOS
2.18; all the programs run under generic MS-DOS.

I stored the files and programs in a RAM disk so that physical
disk access times were not significant. The verbose listings
were redirected to a file, so console output time is not
reflected in the run time.

I have tried to use test data that is easily obtainable. ARC44
is an archive of the source of ARC, Version 4.4, and I used it
to test the ability of an archiver to cope with a file that
cannot be compressed further. The MASM benchmark is the MASM,
Version 4.0, distribution disk, and I included it to test the
compression of nontext files. The TDebug benchmark is the source
of TDebug Plus, available from Turbo Power Software; I included
it to test the compression of text files.

I tested only the most common operations: add, add to existing,
delete, list, and extract. Results of the benchmark tests are
shown in Table 3. I assume less common operations, such as
update, freshen, and move, will be roughly similar in speed.
Example 1 shows the output of verbose listings for some of the
archive systems. There is not much to say because all give the
same information and have similar run times.
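
For readers who want to check the "Compression (percent)" rows of Table 3, the figure is the fraction of the original size that was saved. The sketch below truncates toward zero, which matches the ARC rows of the table; that rounding rule is an inference from the published numbers, not something the article states, and the function name is hypothetical.

```c
/* Percentage of the original size saved, truncated toward zero.
   Returns 0 for an empty input or when nothing was saved. */
unsigned percent_saved(unsigned long original, unsigned long archived)
{
    if (original == 0 || archived >= original)
        return 0;
    return (unsigned)(((unsigned long long)(original - archived) * 100)
                      / original);
}
```

For ARC's results, percent_saved(288122, 237072) gives 17 for the MASM disk and percent_saved(289049, 116802) gives 59 for the TDebug Plus source.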


Table 2: Comparison of features

Programs compared (columns): ARC, ARCA, ARCE, ARCV, ARCH, PKARC,
PKXARC, ZOO

Features compared (rows):

Add files to archive
Alphabetic file names
Archive file compatible
Comments attached to files
Copies while modifying
Damaged headers
Delete files after adding
Encryption
Extract files to console
Extract files to printer
Extract into archive
Extract to explicit path
Forced storing
Freshen files already in archive
List file names only
Make backup of the archive file
Only packing and crunching
Replace existing files on extract
Test archive integrity
Update adds only newer files
Verbose listing of archive
Wildcard archive file names

[Per-program entries (y, n/a) not recoverable.]




Conclusions

As shown in Table 3, PKARC is the fastest archiver by a wide
margin, and PKXARC is the fastest extractor. PKARC also produced
the smallest archives in all but one instance, in which ZOO was
slightly smaller. ARCA/ARCE/ARCV, PKARC/PKXARC, and ZOO are
written in assembly language, whereas ARC and ARCH are written
in C. If you don't mind the donation, PKARC/PKXARC is the system
to use.

ZOO performed adequately. ZOO is the only explicitly
public-domain archiver I reviewed, which is its strongest point.
Its weakest point is that its archive files (.ZOO) are not
compatible with ARC-type archive files (.ARC). ZOO's author
plans to put the finished source in the public domain, so I
expect someone will convert it to use ARC-type files.

Surprisingly, not one of the archive programs includes a "rename
file in archive" command. In addition, both ARC and PKARC are
distributed in a self-extracting .COM file but provide no
facility for creating your own self-extracting .COM file.
Another nice member of the ARC family would be a "ROM" disk
driver that takes an archive file as input and produces a
read-only disk drive when installed. A ROM disk would be handy
for frequently used files.

Notes

1. D. Huffman, "A Method for Constructing Minimum Redundancy
Codes," Proceedings of the Institute of Radio Engineers, vol. 40
(May 1952): 1098-1101.

2. Terry A. Welch, "A Technique for High Performance Data
Compression," IEEE Computer, vol. 17, no. 6 (June 1984).

Authors/Vendors

ARC
Thom Henderson
System Enhancement Associates
21 New St.
Wayne, NJ 07470
For $50 you receive a program disk with printed documentation.
If you obtain ARC by other means, then you cannot use it in a
commercial environment or a government organization unless you
pay a $35 license fee. Site licenses and commercial distribution
licenses are available, as is the full program source.

ARCA, ARCE, ARCV
Vernon D. Buerg
456 Lakeshire Dr.
Daly City, CA 94015
CompuServe: 70007,1212
Data/RBBS: (415) 994-2944

ARCH
Les Satenstein
PCOM RBBS Montreal
(514) 989-9450
Given the similarity in features, run time, and results, ARCH
must be a modified copy of ARC. The ARC copyright permissions
strictly prohibit distribution of modified copies of ARC.
Nevertheless, ARCH has more features than ARC, and so I included
it in this review.

PKARC, PKXARC
Phil Katz
7032 Ardara Ave.
Glendale, WI 53209
Send comments to:
Exec-PC multiuser IBM BBS
modem: (414) 964-5160
If you find PKARC and PKXARC fast, easy, and convenient to use,
a contribution of $15 would be appreciated. With each
contribution of $35 or more, you receive free upgrades of the
next versions of PKARC and PKXARC when available, including
documentation.

ZOO
Rahul Dhesi
GEnie: DHESI
People/Link: OLS806
ARPAnet/CSnet: dhesi%bsu@csnet-relay.ARPA
UUCP: !seismo!csnet-relay.ARPA!bsu!dhesi
ZOO is in the public domain.

Archive add: ARC44, archive of source of ARC (57,728 bytes)

                        ARC       ARCA      ARCH      PKARC     ZOO
Run time (seconds)      144.66    42.42     150.27    25.86     40.41
Size (bytes)            57,728    58,014    57,728    57,728    57,728
Total size (bytes)      57,759    65,564    57,759    57,759    60,625
Stowage                 stored    packed    stored    stored    stored

Archive add: MASM 4.00 distribution disk (288,122 bytes total)

                        ARC       ARCA      ARCH      PKARC     ZOO
Run time (seconds)      648.77    122.41    653.73    92.11     124.91
Total size (bytes)      237,072   221,751   237,072   221,020   221,907
Compression (percent)   17        23        17        23        23

Archive add: TDebug Plus source (289,049 bytes total, all ASCII)

                        ARC       ARCA      ARCH      PKARC     ZOO
Run time (seconds)      373.44    86.56     364.17    65.36     82.25
Total size (bytes)      116,802   116,255   116,802   115,950   113,013
Compression (percent)   59        59        59        59        61

Archive extract: ARC44, archive of source of ARC (57,728 bytes)

                        ARC       ARCE      ARCH      PKXARC    ZOO
Run time (seconds)      136.87    51.27     139.75    50.98     n/a

Archive extract: MASM 4.00 distribution disk (288,122 bytes
total, 13,595 bytes ASCII)

                        ARC       ARCE      ARCH      PKXARC    ZOO
Run time (seconds)      244.39    58.06     252.00    46.74     69.64

Archive extract: TDebug Plus source (289,049 bytes total, all ASCII)

                        ARC       ARCE      ARCH      PKXARC    ZOO
Run time (seconds)      182.60    42.73     188.30    33.43     54.86

Table 3: Benchmark results

- arc -

Name          Length  Stowage   SF   Size now  Date       Time    CRC
ARC.H           1841  crunched  39%      1133   9 Nov 85  10:25p  F24F
ARC.M            512  packed    34%       343  22 Aug 85  12:18a  3057
LOAD.BAT        1024  crunched  49%       531  12 Nov 85  11:02a  725A
XARC.MAC        3584  crunched  41%      2142   8 Nov 85  11:37p  6B57
Total     22  117610            52%     57063
Run time: 319

- arch -

Doing ARC44.ARC:
Name          Length  Stowage   SF   Size now  Date       Time
ARC.H           1841  crunched  39%      1133   9 Nov 85  10:25p
ARC.M            512  packed    34%       343  22 Aug 85  12:18a
LOAD.BAT        1024  crunched  49%       531  12 Nov 85  11:02a
XARC.MAC        3584  crunched  41%      2142   8 Nov 85  11:37p
22 files      117610
Run time: 200

- arcv -

Archive: ARC44.ARC
Name          Length  Stowage   SF   Size now  Date       Time    CRC
ARC.H           1841  crunched  38%      1133  09 Nov 85  22:25   F24F
ARC.M            512  packed    33%       343  22 Aug 85  12:18   3057
LOAD.BAT        1024  crunched  48%       531  12 Nov 85  11:02   725A
XARC.MAC        3584  crunched  40%      2142  08 Nov 85  23:37   6B57
*total    22  117610            51%     57063
Run time: 164

- pkxarc -

PKXARC  FAST!  Archive Extract Utility  Version 3.2  9-12-86
Copyright (c) 1986 Phil Katz, All Rights Reserved. PKXARC/h for help
Searching: ARC44.ARC
Name          Length  Method    Size    Ratio  Date      Time
ARC.H           1841  crunched    1133  39%    11-09-85  22:25:16
ARC.M            512  packed       343  34%    08-22-85  12:18:44
LOAD.BAT        1024  crunched     531  49%    11-12-85  11:02:58
XARC.MAC        3584  crunched    2142  41%    11-08-85  23:37:48
0022          117610             57063  52%
Run time: 786

Example 1: Output of verbose listings



ARTICLES 


Optimizing  Integer 
Multiplications  by 
Constant  Multipliers 


by Robert D. Grappel

Some multiplications can be sped up by "unrolling" the
calculation.

Integer multiplication by a constant multiplier can occur
frequently in high-level language programs. Besides the
explicitly coded multiplications, the compiler must generate
multiplication instructions as part of each array reference.
To address an array element, the compiler forms code to multiply
the array index by the size of an array element (a constant). If
the array elements are simple data types (bytes, words, and so
on) then the multiplication is often done as a shift left
because the size of the element is a power of 2. Some of the
more powerful processors (Intel 80386, Motorola 68020, National
32016, and so on) provide scaled-indexed addressing modes that
incorporate the appropriate shift as part of the address
calculation.

If the array element is not a simple type, however, the
multiplication must be done explicitly. Multiply dimensioned
arrays require a multiplication for each index (except, perhaps,
for the last one, which can be done with a shift).

Multiplication is a time-consuming operation, even on those
processors that have multiplication instructions. Table 1 shows
the execution times, in clock cycles, for several types of
instructions, including multiplication, on some modern
microprocessors. The 68000, for example, requires about 70
clocks for a 16-bit register-to-register multiplication
instruction, compared to about 8 clocks for a 32-bit
register-to-register addition or subtraction. A 32-bit register
shift requires about 6 clocks plus an additional 2 clocks per
shift position. Clearly, the 68000 can do several adds,
subtracts, and shifts in the time it takes to perform one
multiplication.

Robert D. Grappel, 28 Buckmaster Dr., Concord, MA 01742. Robert
Grappel has a Ph.D. in solid-state physics from Ohio University.
He is currently a consultant involved in the design of new
air-traffic control systems.

Indexing an array (without artificially limiting the size to
64K) requires a 32-bit multiplication, which necessitates at
least two 16-bit multiplications and an addition on the 68000
along with some logic. Because this 32-bit multiplication is
likely to be done as a run-time subroutine, there is often an
additional setup and calling overhead, too. (The 16-bit
multiplication of the Intel 80286 is sufficient to address an
entire memory segment.) Thus, there is room for a compiler to
fabricate an optimized sequence of additions, subtractions, and
shifts in place of a multiplication on any of these processors.

Unrolling  the  Loop 

Computers multiply numbers using some variation of the
following algorithm:

1. Clear work register Rw, which becomes the product.

2. If the low-order bit of the multiplier is a 1, add the
multiplicand to Rw.

3. Shift the multiplier right one bit position.

4. If the multiplier is 0, stop (product in Rw).

5. Shift the multiplicand left one bit position.

6. Go to step 2.
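
The six steps above can be sketched directly in C. This is only an illustration of the algorithm as listed; the function name is hypothetical, and unsigned arithmetic is used so that the shifts are well defined.

```c
/* Multiply by repeated shift and add, following the six steps above. */
unsigned long shift_add_multiply(unsigned long multiplicand,
                                 unsigned long multiplier)
{
    unsigned long rw = 0;            /* step 1: clear the work register */

    for (;;) {
        if (multiplier & 1UL)        /* step 2: low-order bit set? add */
            rw += multiplicand;
        multiplier >>= 1;            /* step 3: shift multiplier right */
        if (multiplier == 0)         /* step 4: done, product in rw */
            return rw;
        multiplicand <<= 1;          /* step 5: shift multiplicand left */
    }                                /* step 6: back to step 2 */
}
```

The loop runs once per bit of the multiplier up to its highest 1 bit, which is why a constant multiplier with few 1 bits is so cheap to unroll.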

            80286    68000    68020

ADD         2        8        0-3
SUB         2        8        0-3
SHIFT n     5+n      6+2n     1-4
MUL16       21       70       21-28
MUL32       -        -        41-44

Table 1: Timing for several basic arithmetic instructions (clock cycles)

It is apparent that the computer performs multiplication as a
sequence of shifts and additions: step 2 is an addition; steps 3
and 5 are shifts. If the multiplier is a constant, the algorithm
can be "unrolled" into a sequence that includes only adds and
shifts. This sequence is called a "star-chain" sequence because
the result of each step is used immediately in the next step; no
intermediate stores are required. The sequence requires only two
registers: the original multiplicand and the work register in
which the product is formed. Consider the following examples, in
which the notation R1 indicates the multiplicand, Rw the work
register, <<= n a shift left by n bit positions, and += R1 means
add the multiplicand:

R1 * 13:

1 Rw = R1
2 Rw <<= 1
3 Rw += R1
4 Rw <<= 2
5 Rw += R1

R1 * 7:

1 Rw = R1
2 Rw <<= 1
3 Rw += R1
4 Rw <<= 1
5 Rw += R1

Note that the shifts and additions always come in pairs. Note,
also, that after the initial load there is one shift-add pair
for each remaining one bit in the multiplier. This implies that
the worst-case sequence will have as many shift-adds as the bit
width of the multiplier.

You can generate shorter sequences by using shift-subtract as
well as shift-add pairs. If the notation 2^n indicates 2 to the
power n, you can write ((2^n) - 1) to denote a binary integer
with n 1s in a row (for example, 8 - 1 = 7). Hence, the sequence
shown above can be shortened to:

R1 * 7:

1 Rw = R1
2 Rw <<= 3
3 Rw -= R1

Here one shift-subtract replaces two shift-adds. The worst case
is now alternating 1s and 0s in the multiplier, requiring at
most one-half the number of sequence steps.

A further improvement can be made in the algorithm. Some numbers
(such as 55 and 119) have a series of 1s, then a single 0, then
another series of 1s (119 = 1110111 binary). The algorithm would
generate a shift-subtract, then a shift-add by one place. Here
is the sequence for 119:

R1 * 119:

1 Rw = R1
2 Rw <<= 3
3 Rw -= R1
4 Rw <<= 1
5 Rw += R1
6 Rw <<= 3
7 Rw -= R1

    #include <stdio.h>

    /* Program to generate a "star-chain" sequence to replace
       multiplication by a positive integer constant with a
       series of add, subtract, and shift-left instructions.
       Assumes two machine "registers". Instructions are
       formed on a temporary stack, then output. A stack
       element's magnitude is the shift amount, the sign
       indicates subsequent add (plus) or subtract (minus). */

    long mult;  /* 32-bit signed constant multiplier */
    int flag, cnt, stkptr, stack[16], last_cnt, last_shift, ts;

    /* Strip low-order bits equal to one_zero from mult; return the count */
    int trim_trailing(int one_zero)
    {
        int c;
        for (c = 0; ((mult & 1) == one_zero); c++, mult >>= 1);
        return c;
    }

    int main(void)
    {
        stkptr = 0;  /* init. stack pointer */
        printf("\nenter integer multiplier ");
        if (scanf("%ld", &mult) != 1) mult = 0;  /* unreadable input */
        if (mult > 0)
        {
            last_cnt = 0;
            last_shift = trim_trailing(0);  /* cut trailing 0's */
            while (1)
            {   /* decompose "mult", build stacked instructions */
                cnt = trim_trailing(1);  /* count low-order 1's */
                if (cnt > 1)
                {   /* more than 1 bit, use shift-subtract */
                    flag = 0;
                    if (last_cnt == 1)
                        /* shift k, sub, shift 1, add -->
                           shift k+1, sub */
                        /* overwrite last entry */
                        stack[stkptr - 1] = -(cnt + 1);
                    else
                        stack[stkptr++] = -cnt;
                }
                /* will need another shift-add */
                else flag = 1;
                /* "mult" fully decomposed, time to output */
                if (mult == 0) break;
                /* count low-order zeros */
                last_cnt = trim_trailing(0) + flag;
                stack[stkptr++] = last_cnt;  /* shift-add */
            }
            /* now output code from stack */
            printf("\nRw = R1");  /* load working register */
            while (stkptr > 0)
            {
                ts = stack[--stkptr];  /* get top stack element */
                if (ts < 0) printf("\nRw <<= %d \nRw -= R1", -ts);
                else printf("\nRw <<= %d \nRw += R1", ts);
            }
            if (last_shift != 0) printf("\nRw <<= %d", last_shift);
        }
        else printf("\nRw = 0");  /* special case for mult = 0 */
        return 0;
    }

Code Example 1: The star-chain algorithm in C



Steps 2 through 5 can be combined by incrementing the shift
count in step 2 and omitting steps 4 and 5, giving the following
sequence for R1 * 119:

R1 * 119:

1 Rw = R1
2 Rw <<= 4
3 Rw -= R1
4 Rw <<= 3
5 Rw -= R1

This sequence (32-bit operands) would require about 46 clocks on
a 68000, which is faster than a single 16-bit multiplication. It
would require five words of code, as compared to the two or
three words required for a subroutine call. It seems clear that
star-chain sequences can provide a way to readily optimize
multiplication by a constant.
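
The 46-clock figure can be reproduced from the 68000 column of Table 1. The sketch below is only an estimate: the 4 clocks charged for the initial register copy do not appear in Table 1 and are an assumption, and the macro and function names are hypothetical.

```c
/* 68000 costs from Table 1; the 4-clock register copy is an assumption. */
#define CLK_COPY      4
#define CLK_ADDSUB    8
#define CLK_SHIFT(n)  (6 + 2 * (n))

/* Estimate the clocks for a star-chain sequence.
   shifts[] holds the shift count of each shift-add or shift-subtract
   pair; final_shift is a trailing unpaired shift (0 if none). */
int star_chain_clocks(const int shifts[], int pairs, int final_shift)
{
    int i, total = CLK_COPY;                 /* Rw = R1 */

    for (i = 0; i < pairs; i++)
        total += CLK_SHIFT(shifts[i]) + CLK_ADDSUB;
    if (final_shift > 0)
        total += CLK_SHIFT(final_shift);
    return total;
}
```

For the five-step sequence above, the shift counts are 4 and 3 with no trailing shift: 4 + (14 + 8) + (12 + 8) = 46 clocks.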

An  Actual  Implementation 

The C program shown in Code Example 1 implements the star-chain
algorithm. It prompts the user for the multiplier (which must be
positive) and prints out the star-chain sequence. It would be
easy to convert the program to generate code steps for use in an
optimizing compiler.

The program works in two steps: the first step builds the
sequence on a last-in, first-out stack; the second step outputs
the sequence from the stack. Note that, because the multiplier
is 32 bits long, the stack need only hold 16 elements; there is
no danger of overflow. Each stack element holds a shift-add or
shift-subtract. The encoding uses the sign of the stack element
to indicate shift-add (plus) or shift-subtract (minus). The
magnitude of the stack element is the shift count. The function
trim_trailing is used to count the number of low-order 0s or 1s
in the multiplier. Note that, as written, mult must be a global
variable because trim_trailing operates on it. The variable flag
is used to signal the shift-subtract optimization.

The program shown works only for positive multipliers, which is
always the case in array addressing. To make it handle negative
multipliers, simply call it with the absolute value of the
multiplier and then output a "negate" instruction.

The sequences that this program produces are not unique. For
example, R1 * 119 can be written:

R1 * 119:

1 Rw = R1
2 Rw <<= 3
3 Rw -= R1
4 R1 = Rw
5 Rw <<= 4
6 Rw += R1

This sequence is derived by factoring 119 = 7 x 17. Steps 1
through 3 are a multiplication by 7, and steps 4 through 6
multiply by 17. The alternate sequence here is not shorter or
faster than the one generated by the algorithm, but factoring
can yield improvements in some cases. (Note that the
multiplicand register R1 is overwritten in step 4. The
star-chain algorithm described in this article does not destroy
the multiplicand.) The problem with the factoring approach is
that it can take a great deal of time to find the factors (or it
may require a large table of factorizations).
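
Both ways of forming R1 * 119 can be checked against an ordinary multiplication. In this sketch the function names are hypothetical, unsigned arithmetic is used so that the shifts are well defined, and a temporary variable stands in for the overwritten R1 of the factored sequence.

```c
/* R1 * 119 by the star-chain sequence: two shift-subtract pairs. */
unsigned long mul119_star(unsigned long r1)
{
    unsigned long rw = r1;

    rw <<= 4;  rw -= r1;      /* 15 * R1 */
    rw <<= 3;  rw -= r1;      /* 119 * R1 */
    return rw;
}

/* R1 * 119 by factoring into 7 * 17. */
unsigned long mul119_factored(unsigned long r1)
{
    unsigned long rw = r1, t;

    rw <<= 3;  rw -= r1;      /* 7 * R1 */
    t = rw;                   /* plays the role of "R1 = Rw" */
    rw <<= 4;  rw += t;       /* 17 * (7 * R1) */
    return rw;
}
```

Both functions should agree with r1 * 119 for any input whose product fits in an unsigned long.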

DDJ 



COMPRESSING  IMAGES 

Listing One (Text begins on page 16.)

i 

/* Initialize it */
/* Listing one */
for (i = 0; i < 4; i++) {
newnode->child[i] = NULL;
/* Subroutines for converting a pixel image
}
to a quadtree and output the quadtree as
newnode->color = 0;
locational codes
newnode->ntype = LEAF;
Written by: Ronald G. White
return (newnode);
}
External routines:
static addnode(pnode, level)
px2quad - only entry point
int level;
*/
PNODE pnode;
#include <stdio.h>
/* add a new node to the quad tree
If the node is not at the bottom level, four child
/* Define structure for each node */
nodes are created and added below the current node.
Otherwise the node color is set to that of the
typedef struct qnode {
corresponding pixel.
struct qnode *child[4]; /* pointers to each child */
struct qnode *next; /* used during output */
int color;
pnode - pointer to the current node
int ntype; /* see below for types */
level - level number of the current node
int locode; /* location code */
*/
} QNODE, *PNODE;
/* Node types: */
{
int i;
#define LEAF 1 /* no children */
int newlevel;
#define BLEND 2 /* color of >2 kids */
PNODE crtnode(), newchild;
#define WASH 3 /* color irrelevant */
extern getpix(); /* return pixel color at given posn */
/* if this node is not at the bottom level,
* add four children below this node
extern putllc(); /* output location code and color */
*/
static int toplevel; /* top level of tree (root node) */
if (level > 0) {
newlevel = level - 1;
px2quad(size)
for (i = 0; i < 4; i++) {
newchild = crtnode();
int size;
pnode->child[i] = newchild;
/* entry point for these routines. Control routine to
newchild->locode = (pnode->locode << 2) + i;
create a quadtree from the pixel image and output it
addnode(newchild, newlevel);
as locational codes
}
input:
/* Remove any unnecessary children */
size - size of the image rounded up to nearest
condense(pnode);
power of two
*/
/* bottom level; get actual pixel color */
{
} else {
pnode->color = getcolor(pnode->locode);
}
PNODE crtnode(), proot;
/* Create the root node */
}
proot = crtnode();
static condense(pnode)
/* Calculate the toplevel number */
PNODE pnode;
/* examine children of the current node and
toplevel = 0;
remove any that are unnecessary
while (size > 1) {
toplevel++;
size >>= 1; /* divide by two */
}
pnode - pointer to current node
*/
/* Build the quad tree */
{
proot->locode = 1;
int colcnt[4];
addnode(proot, toplevel);
int colors[4];
/* Output it as location codes */
int i, j;
int maxclr = 0;
outtree(proot);
int nkids;
}
int childclr;
static PNODE crtnode()
PNODE pchild;
/* create a quadtree node and initialize it
/* Initialization */
output:
for (i = 0; i < 4; i++) {
colcnt[i] = 0;
returns a pointer to the node
}

( 

/*  Determine  colors  of  children  */ 

int  i; 

for  (i  =  0;  i  <  4;  i++)  ( 

PNODE  newnode; 

pchild  -  pnode->child [i] ; 

/*  Get  space  for  it  */ 

if  (pchild-> ntype  --  WASH)  ( 

/*  this  child  has  no  color  */ 

newnode  -  (PNODE)  malloc (sizeof (QNODE) ) ; 

continue; 

if  (newnode  --  NULL)  { 

i 

/*  Something  went  wrong  */ 

fprintf  (stderr. 

childclr  -  pchild->color; 

"crtnode:  malloc  failure;  unable  to  continueO) ; 

exit  (1) ; 


        /* loop through colors found so far:
           do we have a match?
           note: we'll always "break" out of
           this loop because there can be at
           most four different colors.
        */
        for (j = 0; j < 4; j++) {
            if (colcnt[j] == 0) {
                /* new color */
                colors[j] = childclr;
                colcnt[j] = 1;
                break;
            } else if (childclr == colors[j]) {
                /* existing color */
                colcnt[j]++;
                if (colcnt[j] > colcnt[maxclr]) {
                    maxclr = j;
                }
                break;
            }
        }
    }

    /* Set node color */
    pnode->color = colors[maxclr];

    /* Remove redundant children -- if more than
       one child node has the same color as the
       current node, then it contains redundant
       information. If the redundant node is a
       leaf node, it can just be removed. If it
       is not a leaf node, mark it as a WASH type
       and ignore it during output.
    */
    nkids = 4;
    if (colcnt[maxclr] > 1) {
        /* Loop through the four children */
        for (i = 0; i < 4; i++) {
            pchild = pnode->child[i];

            /* If child node is already a WASH,
               nothing else can be done to it
            */
            if (pchild->ntype == WASH) {
                continue;
            }
            childclr = pchild->color;

            /* Check for color match */
            if (childclr == pnode->color) {
                /* If child is leaf, release */
                if (pchild->ntype == LEAF) {
                    relnode(pchild);
                    pnode->child[i] = NULL;
                    nkids--;
                /* otherwise, mark it as a WASH */
                } else {
                    pchild->ntype = WASH;
                }
            }
        }
    }

    /* Reset node type -- a LEAF node has no children */
    if (nkids == 0) {
        pnode->ntype = LEAF;

    /* A BLEND node has a color that represents some
       missing children, but still has some other
       children that are a different color.
    */
    } else if (colcnt[maxclr] > 1) {
        pnode->ntype = BLEND;

    /* A WASH node is necessary in the quadtree because
       it points to existent children nodes, but will not
       be output because its information (i.e. color) is
       available either in child nodes or parent nodes.
    */
    } else {
        pnode->ntype = WASH;
    }
}

relnode(pnode)
PNODE pnode;
/* release a node

   input:
       pnode - pointer to node to release
*/
{
    free((char *) pnode);
}

static getcolor(lcode)
int lcode;
/* get the color of the pixel corresponding to a
   bottom level node whose position is given by a
   locational code

   input:
       lcode - locational code of bottom level node
   output:
       returns pixel color
*/
{
    int dir;
    int col = 0;
    int level;
    int row = 0;
    int shift;

    /* Convert node locational code to pixel row & column
       by looping through direction codes in locational
       code for each level from top to bottom
    */
    for (level = toplevel; level > 0; level--) {

        /* shift last row & col values left one bit */
        col <<= 1;
        row <<= 1;

        /* calculate the position of the direction
           code for this level and extract it
        */
        shift = (level - 1) * 2;
        dir = (lcode >> shift) & 0x3;

        /* increment the col value if quadrant is in
           right half, i.e. NE or SE child
        */
        if (dir == 1 || dir == 3) {
            col++;
        }

        /* increment the row value if quadrant is in
           bottom half, i.e. SW or SE child
        */
        if (dir == 2 || dir == 3) {
            row++;
        }
    }

    /* return pixel color */
    return (getpix(col, row));
}

static outtree(proot)
PNODE proot;
/* output the relevant nodes in the quad tree
 *
 * proot - pointer to the root node
 */
{



    PNODE outnode(), pcur, plast;

    /* Set up the linked list with root node */
    pcur = proot;
    plast = proot;
    proot->next = NULL;

    /* Output each node on the linked list in order
       until there are no more nodes on the list
    */
    while (pcur != NULL) {
        plast = outnode(pcur, plast);
        pcur = pcur->next;
    }
}

static PNODE outnode(pnode, plast)
PNODE pnode, plast;
/* output the locational code and color index for a
   node and put its children on the list

   input:
       pnode - pointer to node to output
       plast - pointer to last node on the linked list

   output:
       returns pointer to new last node on linked list
*/
{
    int i;
    PNODE pchild;

    /* If node is not a WASH, output it */
    if (pnode->ntype != WASH) {
        putlcc(pnode->locode, pnode->color);
    }

    /* Put the node's children on list */
    if (pnode->ntype != LEAF) {
        for (i = 0; i < 4; i++) {
            pchild = pnode->child[i];
            if (pchild != NULL) {
                plast->next = pchild;
                plast = pchild;
                /* crtnode() does not clear next, so keep
                   the list terminated as it grows */
                pchild->next = NULL;
            }
        }
    }

    /* Return new last pointer */
    return (plast);
}


End  Listing  One 
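The locational codes used in Listing One can be sketched in isolation. The root has code 1, and each level appends a two-bit direction code; from the tests in getcolor(), 1 and 3 are the east children and 2 and 3 the south children, which leaves 0 for NW. The function names `pixel_to_code` and `code_to_pixel` and the explicit `levels` parameter are assumptions made for this example; the listing itself keeps the depth in the static `toplevel`.

```c
/* Build the locational code of the bottom-level node covering pixel
 * (col, row) in an image of 2^levels by 2^levels pixels.
 * Direction codes: 0 = NW, 1 = NE, 2 = SW, 3 = SE. */
int pixel_to_code(int col, int row, int levels)
{
    int code = 1;    /* root node */
    int level;

    for (level = levels - 1; level >= 0; level--) {
        int dir = ((row >> level) & 1) * 2 + ((col >> level) & 1);
        code = (code << 2) + dir;    /* same step as addnode() */
    }
    return code;
}

/* Inverse mapping, following the same loop as getcolor(). */
void code_to_pixel(int code, int levels, int *col, int *row)
{
    int level;

    *col = *row = 0;
    for (level = levels; level > 0; level--) {
        int dir = (code >> ((level - 1) * 2)) & 0x3;
        *col = (*col << 1) | (dir & 1);           /* set for dir 1 or 3 */
        *row = (*row << 1) | ((dir >> 1) & 1);    /* set for dir 2 or 3 */
    }
}
```

For an 8 x 8 image (three levels), pixel (5, 2) passes through the NE, SW, and NE quadrants in turn, giving the binary code 1 01 10 01.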


Listing  Two 

/* Listing two */

/* Subroutines for displaying an image from a quadtree
   input as locational codes

   Written by: Ronald G. White

   External routines:
       qdisp - only entry point
*/

#include <stdio.h>

extern getnxn();    /* read in next node's data */
extern filrec();    /* fill rectangular region */

static int orgsize;    /* original image size */

qdisp(size)
int size;
/* main entry point for the display of a quadtree
 *
 * input:
 *     size - size of the original image
 */
{
    int lcode, color;
    int corner[2];
    int side;

    /* Make the image size global */
    orgsize = size;

    /* Read and display each node */
    while (getnxn(&lcode, &color) != EOF) {

        /* Convert loc code to corners, side of square */
        square(lcode, corner, &side);

        /* Fill in the square */
        filrec(corner[0], corner[1], side, side, color);
    }
}

square(lcode, corner, pside)
int lcode;
int corner[2];
int *pside;
/* convert quadtree locational code to corner and side
   of the square represented by the corresponding node.

   input:
       lcode - locational code for this node
   output:
       corner - upper left corner of quadrant
       pside - the size of the quadrant in pixels
*/
{
    int dir;
    int shift;

    corner[0] = corner[1] = 0;
    *pside = orgsize;

    /* Find the beginning of the code */
    for (shift = 30;
         ((lcode >> shift) & 0xff) == 0; shift -= 2);

    /* Convert node locational code to corner row &
       column by looping through direction codes in
       locational code for each level from top down.
    */
    for (shift -= 2; shift >= 0; shift -= 2) {

        /* The side of the square is reduced by a
           factor of two each level down.
        */
        *pside >>= 1;

        /* extract the direction code */
        dir = (lcode >> shift) & 0x3;

        /* increment the col value if quadrant is
           in right half, i.e. NE or SE child
        */
        if (dir == 1 || dir == 3) {
            corner[0] += *pside;
        }

        /* increment the row value if quadrant is
           in bottom half, i.e. SW or SE child
        */
        if (dir == 2 || dir == 3) {
            corner[1] += *pside;
        }
    }
}


End  Listings 




C  CHEST 


Listing  Ten  (Text  begins  on  page  96.) 


#include <stdio.h>
#include <stdarg.h>

ferr( fmt )
char *fmt;
{
    /* ferr() is used for fatal error processing. It
     * is used just like printf(). However, it exits
     * the program with a status of 1 immediately after
     * printing the message. I'm using ANSI, not UNIX
     * variable argument conventions here.
     */

    va_list args;
    va_start( args, fmt );
    vfprintf( stderr, fmt, args );
    exit( 1 );
}


End  Listing  Ten 
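For comparison with Listing Ten, here is a minimal sketch of the same idea written with a full prototype, assuming a compiler that supports `<stdarg.h>` and `vsnprintf()`. The names `ferr_ansi` and `format_msg` are hypothetical; `format_msg` exists only so the formatting step can be exercised without terminating the program.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format a message into buf the way printf() would.
 * Returns whatever vsnprintf() returns. (Hypothetical helper.) */
int format_msg(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return n;
}

/* Print a printf-style message to stderr, then exit with status 1. */
void ferr_ansi(const char *fmt, ...)
{
    va_list args;

    va_start(args, fmt);
    vfprintf(stderr, fmt, args);
    va_end(args);
    exit(1);
}
```

A call such as `ferr_ansi("can't open %s\n", name)` behaves like the listing's ferr(); the ellipsis in the prototype is what makes the use of `va_start()` well defined.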


Listing  Eleven 


#include <ascii.h>

#define min(a,b)    ((a) < (b) ? (a) : (b))
#define max(a,b)    ((a) > (b) ? (a) : (b))

typedef unsigned char UCHAR;

#ifdef DEBUG
#    define D(x) x       /* [...]                                  */
#else
#    define D(x)         /* [...] empty string.                    */
#endif

#define MAXLTRAP  100    /* [...] trap.                            */
#define MAXSTR    257    /* [...] input and output widths          */
#define MAXPAGE   511    /* [...] page number which can be g[...]
                          * command line switch                    */
#define MAXARGS   10     /* Max # of arguments in a macro          */
#define MAXNEST   10     /* [...]                                  */
#define MAXMBUF   256    /* [...]                                  */

/*----------------------------------------------------------------------
 * Special characters:
 *
 * These symbols are used internally to pass information
 * from the character-oriented input functions
 * (in nrinp.c) to the multiple-byte character processing
 * functions (in nrtext.c and nrout.c). They are all
 * two-character sequences. Of these, VMOVE, HMOVE, and
 * CH_FONT are also used in the 16-bit wide CTYPE characters
 * discussed below.
 */

#define VMOVE        ( 0xf8 )    /* Vertical motion                  */
#define HMOVE        ( 0xf9 )    /* Horizontal motion                */
#define CH_FONT      ( 0xfa )    /* Change font                      */
#define CH_ATTRIB    ( 0xfb )    /* Change attribute in current font */
#define SOFT_HYPHEN  ( 0xfc )    /* A soft hyphen goes here.         */
#define ZWIDTH       ( 0xfd )    /* Next character is zero width     */
#define UP_SPACE     ( 0xfe )    /* Unpaddable space                 */
#define LITCHAR      ( 0xff )    /* Next character is literal (it
                                  * goes to the printer unchanged).  */

/*----------------------------------------------------------------------
 * Default fonts and attributes:
 *
 * [...] which may apply to any font. PREVIOUS
 * turns off these attributes but doesn't spring a change font macro. ROMAN
 * replaces the current font with the roman font and also clears all
 * attributes.
 */

#define BOLD        'B'    /* Bold face   */
#define OVER        'O'    /* Overstrike  */
#define ITALICS     'I'    /* Italics     */
#define PREVIOUS    'P'    /* Previous    */
#define ROMAN       'R'    /* Roman       */

#define BOTH        'b'    /* Legal adjustment [...] */
#define ALT_BOTH    'n'
#define LEFT        'l'
#define RIGHT       'r'
#define CENTER      'c'




/*----------------------------------------------------------------------
 * Number registers:
 *
 * The number registers are maintained as an array of pointers to NREG
 * type objects. The array is kept sorted by register name.
 * nname is a pointer to the name, it usually points at nbuf.
 */

typedef struct _nr
{
    char    nname[3];
    int     nfmt;
    int     nval;
    int     incr_amt;
}
NREG;

/* Read/write pre-defined number registers:                              */

extern NREG *Nrpg;        /* %   Page number                             */
extern NREG *Nrdy;        /* dy  Day                                     */
extern NREG *Nrdw;        /* dw  day of the week (0 = sun, 6 = sat)      */
extern NREG *Nrh;         /* h   hour                                    */
extern NREG *Nrln;        /* ln  Current .nm line number                 */
extern NREG *Nrnl;        /* nl  Current output line number              */
extern NREG *Nrm;         /* m   minute                                  */
extern NREG *Nrmo;        /* mo  Month                                   */
extern NREG *Nrs;         /* s   second                                  */
extern NREG *Nryr;        /* yr  Year                                    */

/* Read only pre-defined number registers:                               */

extern NREG *Nrhp;        /* hp  Current horizontal place on input line  */
extern NREG *Nrdn;        /* dn  Height of most recent diversion         */
extern NREG *Nrdl;        /* dl  Width of last completed diversion       */
extern NREG *Nrargs;      /* .$  Number of args at current macro level   */
extern NREG *Nrlines;     /* .c  Number of lines read from current input */
extern NREG *Nrvplace;    /* .d  Current vert. place in current diversion */
extern NREG *Nrfont;      /* .f  Index in Fonts[] of current font        */
extern NREG *Nrindent;    /* .i  Current indent column                   */
extern NREG *Nrllen;      /* .l  Current line length                     */
extern NREG *Nrtlen;      /* .n  Length of text portion of previous line */
extern NREG *Nroffset;    /* .o  Current page offset                     */
extern NREG *Nrplen;      /* .p  Current page length                     */
extern NREG *Nrtotrap;    /* .t  Distance to next trap                   */
extern NREG *Nrfill;      /* .u  1 if in fill mode, 0 otherwise          */
extern NREG *Nrv;         /* .v  current vertical base-line spacing      */

/*----------------------------------------------------------------------
 * Number register format types. The specified character found in the
 * left-most position of the .af command's c argument signifies the
 * type. In addition, the number of characters in the arabic padded mode
 * determines the fieldwidth of the number.
 *
 * The format for arabic numbers is an ascii digit. If this digit is
 * '0' or '1' then the number is printed unpadded. If the digit is a '4'
 * it is printed in a 4 space field, right justified in the field and
 * padded with zeros. The special format READONLY is used by the read
 * only pre-defined number registers. They are always arabic format.
 */

#define ARABIC      '1'    /* 0,    1,    2,                    */
#define PADDED      '0'    /* 000,  001,  002,                  */
#define LC_ROMAN    'i'    /* 0,    i,    ii,                   */
#define UC_ROMAN    'I'    /* 0,    I,    II,                   */
#define LC_ALPHA    'a'    /* 0,    a,    b,   z, aa, ab ...    */
#define UC_ALPHA    'A'    /* 0,    A,    B,   Z, AA, AB ...    */
#define LC_ENG      'e'    /* zero, one,  two,                  */
#define UC_ENG      'E'    /* Zero, One,  Two,                  */
#define READONLY    'r'    /* Pre-defined, a[...]               */


/*----------------------------------------------------------------------
 * Default values of the pre-defined number registers
 */

#define DEF_PAGE       1       /* Page number                       */
#define DEF_WIDTH      0       /* Width of most recent diversion    */
#define DEF_HEIGHT     0       /* Height of most recent diversion   */
#define DEF_DAY        1       /* Default day                       */
#define DEF_HORIZ      1       /* Current place on input line       */
#define DEF_LINE       1       /* Output Line number                */
#define DEF_MONTH      1       /* Default month                     */
#define DEF_YEAR       1985    /* Default year                      */
#define DEF_NARGS      0       /* # Args in current macro           */
#define DEF_INLINES    0       /* # Lines read from current input   */
#define DEF_VERT       1       /* Vertical place in current diver.  */
#define DEF_FONT       0       /* Index in Fonts[] of current font  */
#define DEF_INDENT     0       /* Current indent column             */
#define DEF_LINLEN     80      /* Default line length               */
#define DEF_TEXTLEN    0       /* Len of text part of previous line */
#define DEF_OFFSET     0       /* Page offset                       */
#define DEF_PGLEN      66      /* Page length                       */
#define DEF_TOTRAP     66      /* Distance to next trap             */
#define DEF_FILL       0       /* 1 if in fill mode                 */
#define DEF_LS         1       /* Default line spacing              */

#define NUMTABS (MAXSTR + 1)    /* Largest column in which a tab can be set */

typedef int     TSTOP[ NUMTABS ];
extern  TSTOP   Tabstop;        /* The tabstop array (see nrglbls.c)        */

/*
 * Table used by command() to parse command lines:
 */

typedef char *CHARPTR;



typedef struct
{
    char        *cmd;           /* Command name                       */
    int         (*action)();    /* Subroutine to call when cmd found  */
    unsigned    type  : 3 ;     /* Command type                       */
    unsigned    inhib : 1 ;     /* 1 --> Inhibit works                */
    char        *def;           /* Default value of numeric argument  */
}
CTAB;


/*----------------------------------------------------------------------
 * Table of user defined fonts (Fonts) is made up of FONT type.
 * NUMFONTS is the maximum number of user defined fonts.
 * The "widths" array holds the character widths. Maximum
 * number of characters is MAX_CHARS_IN_FONT. Font numbers
 * must be single characters (in the range 0-9) so NUMFONTS
 * must be <= 10. The "resolution" field is the middle argument
 * from the .hd command that was in effect when the .df was
 * executed. If a space is 6 units wide and "resolution" is set to
 * 2, then sending Right_str to the printer three times will
 * move the carriage one space to the right. If "resolution" is
 * 1, then the string will have to be sent 6 times to move the
 * same amount.
 */

typedef struct
{
    UCHAR   name;          /* Font name                           */
    UCHAR   smac[3];       /* Macro to enter new font             */
    UCHAR   emac[3];       /* Macro to exit font                  */
    int     resolution;    /* Min horizontal resolution from .hd  */
    UCHAR   *left;         /* String to go left from .hd          */
    UCHAR   *right;        /* String to go right from .hd         */
    UCHAR   *widths;       /* Array of character widths.          */
}
FONT;

#define NUMFONTS            10
#define MAX_CHARS_IN_FONT   256

/*----------------------------------------------------------------------
 * Misc #defines and typedefs
 */

#define ISCMD(c)    ( (c)==Cmd_chr || (c)==Nobreak )


/*----------------------------------------------------------------------
 * #Defines to get at the value fields of the pre-defined
 * number registers (Those marked with an E are saved with a .ev
 * command):
 */

#define PAGE     (Nrpg->nval)      /* Page number                           %   */
#define WIDTH    (Nrdl->nval)      /* Width of last completed diversion     dl  */
#define HEIGHT   (Nrdn->nval)      /* Height of last completed diversion    dn  */
#define DAY      (Nrdy->nval)      /* Day                                   dy  */
#define HORIZ    (Nrhp->nval)      /* Current horiz. place on input line    hp  */
#define LINE     (Nrln->nval)      /* Current .nm line number               ln  */
#define OLINE    (Nrnl->nval)      /* Current output line number            nl  */
#define MONTH    (Nrmo->nval)      /* Month                                 mo  */
#define YEAR     (Nryr->nval)      /* Year                                  yr  */
#define NARGS    (Nrargs->nval)    /* # of args at current macro level      .$  */
#define INLINES  (Nrlines->nval)   /* # of lines read from current input    .c  */
#define VERT     (Nrvplace->nval)  /* Vert. place in current diversion      .d  */
#define CURFONT  (Nrfont->nval)    /* Currently active font (Font[i]) (E)   .f  */
#define INDENT   (Nrindent->nval)  /* Current indent column (E)             .i  */
#define LINLEN   (Nrllen->nval)    /* Current line length (E)               .l  */
#define TEXTLEN  (Nrtlen->nval)    /* Length of text part of prev out line  .n  */
#define OFFSET   (Nroffset->nval)  /* Current page offset (E)               .o  */
#define PGLEN    (Nrplen->nval)    /* Current page length                   .p  */
#define TOTRAP   (Nrtotrap->nval)  /* Distance to next trap                 .t  */
#define FILL     (Nrfill->nval)    /* 1 if in fill mode, 0 otherwise (E)    .u  */
#define LSPACE   (Nrv->nval)       /* Current line spacing (set w/.ls) (E)  .v  */
#define WEEKDAY  (Nrdw->nval)      /* day of the week                       dw  */
#define HOUR     (Nrh->nval)       /* hour                                  h   */
#define MIN      (Nrm->nval)       /* minute                                m   */
#define SEC      (Nrs->nval)       /* second                                s   */


/*----------------------------------------------------------------------
 * Global vars used by more than one module. (Those marked with an E are
 * saved with a .ev command.) Most of these are declared in nrglbls.h
 * but some are command-line switches and are found in nr.c.
 */

extern int   Adjmode       ;  /* Current adjustment mode E                        */
extern int   Adjusting     ;  /* One if adjusting lines E                         */
extern int   Cmd_chr       ;  /* Command character E                              */
extern int   Cont_ul       ;  /* Num of input lines to continuously underline E   */
extern int   Divtrap       ;  /* Location of diversion trap, -1 if none           */
extern char  Dtrap_name[]  ;  /* Macro to invoke when diversion trap reached      */
extern int   Divwidth      ;  /* Width of most of last completed diversion        */
extern int   Esc           ;  /* Current escape character E                       */
extern int   Hyphenate     ;  /* Hyphenation is enabled during filling only       */
extern int   Hyphen_chr    ;  /* Soft hyphen is \<Hyphen_chr>. default = \*       */
extern char  *Endm         ;  /* Name of macro invoked at end of input            */
extern FONT  Fonts[]       ;  /* Table of user defined fonts                      */
extern int   H_units       ;  /* Number of horizontal units / inch                */
extern int   H_space       ;  /* Number of horizontal units in a space            */
extern FILE  *Ifile        ;  /* Current input file                               */
extern char  *Ifilename    ;  /* Name of current input file                       */
extern int   Inhibit       ;  /* Inhibit text and command processing except .{.}  */
extern int   Itrap         ;  /* Lines left to current input trap -1 if none E    */
extern char  Itrap_name[]  ;  /* Macro to invoke when Itrap reaches 0 E           */
extern int   Ismacro       ;  /* 1 if ifile is a macro, 0 if a file               */
extern int   Isdiv         ;  /* 1 if ofile is a diversion, 0 if a file           */
extern int   Leader        ;  /* Current leader character E                       */
extern int   Linenum       ;  /* Current output line number                       */




extern char  **Macv        ;  /* Macros arguments for current macro level         */
extern int   Nobreak       ;  /* Nobreak character E                              */
extern int   Nospace       ;  /* Suppress spacing as per .ns command              */
extern int   No_cntl       ;  /* Don't print control characters                   */
extern int   Nm_blanks     ;  /* .nm will number blank lines too if true E        */
extern int   Nm_on         ;  /* Line numbering enabled by .nm cmd E              */
extern int   Nm_mult       ;  /* The M argument of the most recent .nm command E  */
extern char  *Nm_str       ;  /* The S argument of the most recent .nm command E  */
extern int   Nr_cpmode     ;  /* Nroff copy mode, expand \ in macro definitions   */
extern int   Num_bold      ;  /* Remaining number of input lines to do bold E     */
extern int   Num_center    ;  /* Remaining number of input lines to center E      */
extern int   Num_os        ;  /* Number of input lines to print overstruck E      */
extern int   Num_under     ;  /* Remaining number of input lines to underline E   */
extern FILE  *Ofile        ;  /* Output file descriptor                           */
extern int   Page_ch       ;  /* Translates to page number in 3 part titles       */
extern int   Plain         ;  /* Suppress all bold, underline, and overstrike     */
extern int   Ptab[]        ;  /* Proportional spacing table                       */
extern int   Quit          ;  /* Terminate nroff when set by .ex command          */
extern int   Tab           ;  /* Tab repetition character E                       */
extern int   Tabwidth      ;  /* Width of input tab stops                         */
extern int   Tabs_enabled  ;  /* Expand tabs only if true                         */
extern int   Tempin        ;  /* Temporary indent column E                        */
extern int   Title_len     ;  /* 3 part title line length (set with .lt cmd) E    */
extern int   Verbose       ;  /* Echo commands to stdout as they're executed      */
extern int   Wordstar      ;  /* Wordstar-mode output                             */
extern char  *Lmarg_str    ;  /* String used in .ml command E                     */
extern char  *Rmarg_str    ;  /* String used in .me command E                     */
extern char  *Bd_on        ;  /* Send to printer to turn bold face on             */
extern char  *Bd_off       ;  /* Send to printer to turn bold face off            */
extern char  *Ul_on        ;  /* Send to printer to turn underline on             */
extern char  *Ul_off       ;  /* Send to printer to turn underline off            */
extern char  *Os_on        ;  /* Send to printer to turn Overstrike on            */
extern char  *Os_off       ;  /* Send to printer to turn Overstrike off           */
extern int   Bold          ;  /* Boldface currently active E                      */
extern int   Over          ;  /* Overstrike currently active E                    */
extern int   Italics       ;  /* Italics currently active E                       */
extern char  *Dn_str       ;  /* Send to printer to send cursor down 1/2 line     */
extern char  *Up_str       ;  /* Send to printer to send cursor up 1/2 line       */
extern int   Vs_amt        ;  /* This many \u or \d cmds moves one line           */
extern char  *Left_str     ;  /* Send to printer to go left                       */
extern char  *Right_str    ;  /* Send to printer to go right                      */
extern int   Hs_amt        ;  /* Above moves 1/n spaces                           */


End  Listing  Eleven 
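The LC_ALPHA format in Listing Eleven counts 0, a, b, ..., z, aa, ab, and so on. The sketch below shows one way such a conversion could work; it is not the code from the nroff sources, and the function name `fmt_alpha` and its buffer convention are assumptions made for this example. It also assumes a character set in which 'a' through 'z' are contiguous, as in ASCII.

```c
#include <string.h>

/* Write n into buf (at least 16 bytes) in the lower-case alphabetic
 * format: 0, a, b, ..., z, aa, ab, ... */
void fmt_alpha(unsigned n, char *buf)
{
    char tmp[16];
    int  i = 0, j = 0;

    if (n == 0) {                 /* zero prints as the digit 0 */
        strcpy(buf, "0");
        return;
    }
    while (n > 0) {
        n--;                      /* letters stand for 1..26, not 0..25 */
        tmp[i++] = (char)('a' + n % 26);
        n /= 26;
    }
    while (i > 0)                 /* letters were produced low-order first */
        buf[j++] = tmp[--i];
    buf[j] = '\0';
}
```

The decrement inside the loop is what makes 26 come out as "z" and 27 as "aa"; without it the sequence would behave like ordinary base 26 and skip the doubled letters.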


Listing  Twelve 


/*
 * Characters are handled internally as CTYPE's rather than chars.
 * The routines in nrmap.c copy character strings into CTYPE
 * strings. The #defines in this file define the various attribute
 * bits, etc., in a CTYPE:
 *
 *    pad:    character is paddable, only used for spaces.
 *    lit:    character must be taken literally.
 *    width:  If 0, character takes no space in output. If 1, the
 *            character's width is in the currently active character
 *            width table.
 *    sh:     A soft hyphen precedes this character.
 *    os:     Overstrike attribute (character is overstruck)
 *    bold:   Boldface attribute
 *    ul:     underline attribute
 *
 *     15  14   13     12     11   10    9     8   7         0
 *    +---+---+-----+-------+----+----+------+----+-----------+
 *    | 0 |   | pad | width | sh | os | bold | ul | character |
 *    +---+---+-----+-------+----+----+------+----+-----------+
 *
 *     15   14   13   12   11                  0
 *    | 1 | vm | hm | cf | amount or font-ID   |
 *
 *    vm:     vertical motion
 *    hm:     horizontal motion
 *    cf:     change font
 */

typedef unsigned int CTYPE;

#define CHR             0x00ff    /* Character mask      */
#define UNDERLINED      0x0100    /* underlined bit      */
#define BOLDFACE        0x0200    /* boldface bit        */
#define OVERSTRIKE      0x0400    /* overstrike bit      */
#define HYPHEN          0x0800    /* soft hyphen bit     */
#define WIDTH_BIT       0x1000    /* width bit           */
#define NOPAD_BIT       0x2000    /* space is paddable   */

#define MODE_BIT        0x8000    /* Selects one of:     */
#define VM_BIT          0x4000    /*   Vertical motion   */
#define HM_BIT          0x2000    /*   Horizontal motion */
#define FONT_BIT        0x1000    /*   Change font       */

#define SET_UL(c)       ((c) |= UNDERLINED )     /* Underlined  */
#define IS_UL(c)        ((c) &  UNDERLINED )

#define SET_BD(c)       ((c) |= BOLDFACE )       /* Bold        */
#define IS_BD(c)        ((c) &  BOLDFACE )

#define SET_OS(c)       ((c) |= OVERSTRIKE )     /* Overstrike  */
#define IS_OS(c)        ((c) &  OVERSTRIKE )

#define HYPHENATE(c)    ((c) |= HYPHEN )         /* Soft hyphen */
#define UNHYPHENATE(c)  ((c) &= ~HYPHEN )
#define HAS_HYPHEN(c)   ((c) &  HYPHEN )

#define CLRWIDTH(c)     ((c) &= ~WIDTH_BIT)
#define SETWIDTH(c)     ((c) |= WIDTH_BIT)



#define HASWIDTH(c)     ( ((c) & (MODE_BIT | WIDTH_BIT)) == \
                          ((c) & (MODE_BIT | WIDTH_BIT)) )

#define CWIDTH(c)       (HASWIDTH(c) ? Fonts[CURFONT].widths[(c) & CHR] : 0)
#define SPACE_SIZE      ( Fonts[CURFONT].widths[' '] )

#define SETNOPAD(c)     ((c) |= NOPAD_BIT)
#define PADDABLE(c)     (! ((c) & NOPAD_BIT) )

#define [...]           (((c) & MODE_BIT) == 0)
#define CHAR(c)         ((c) & CHR )
#define ATTRIBUTES(c)   ( (unsigned)(c) >> 8 )

#define TO_CTYPE(c)     ( (CTYPE)(c) | WIDTH_BIT)
#define WHITE(c)        ( !((c) & MODE_BIT) && PADDABLE(c) && CHAR(c)==' ')

#define [...]           ( (int) ((UCHAR) (c)) )
#define [...]           ( (int) ((((int)(c)) << 4) >> 4) )

#define [...]           ( ((c) & (MODE_BIT | FONT_BIT)) == \
                          (MODE_BIT | FONT_BIT) )
#define VERTICAL(c)     ( ((c) & (MODE_BIT | VM_BIT)) == \
                          (MODE_BIT | VM_BIT) )
#define HORIZONTAL(c)   ( ((c) & (MODE_BIT | HM_BIT)) == \
                          (MODE_BIT | HM_BIT) )
#define ISMOTION(c)     ( VERTICAL(c) || HORIZONTAL(c) )
#define MOTION(amt)     ( ((CTYPE)(amt) & 0xfff) | (MODE_BIT | HM_BIT) )


End  Listing  Twelve 
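The CTYPE layout in Listing Twelve can be exercised with a few lines of C. The sketch below reuses the bit values shown in the listing; the helper functions `make_char`, `make_hmove`, and `motion_amount` are hypothetical names introduced for this example, and the sign extension is written arithmetically rather than with the shift pair that appears in the listing.

```c
typedef unsigned int CTYPE;

#define CHR         0x00ffu    /* character mask                  */
#define UNDERLINED  0x0100u    /* underline attribute bit         */
#define WIDTH_BIT   0x1000u    /* character has a width           */
#define MODE_BIT    0x8000u    /* set for motion/font entries     */
#define HM_BIT      0x2000u    /* horizontal motion, in mode 1    */

/* Pack a plain character with its width bit set. */
CTYPE make_char(unsigned char ch)
{
    return (CTYPE)ch | WIDTH_BIT;
}

/* Pack a horizontal motion: mode bit, hm bit, 12-bit amount. */
CTYPE make_hmove(int amt)
{
    return ((CTYPE)amt & 0x0fffu) | MODE_BIT | HM_BIT;
}

/* Recover the signed amount from the low 12 bits. */
int motion_amount(CTYPE c)
{
    int v = (int)(c & 0x0fffu);
    return (v & 0x0800) ? v - 0x1000 : v;    /* sign-extend bit 11 */
}
```

Because attributes live in the high byte, a character can be underlined with `c |= UNDERLINED` and still be recovered with `c & CHR`.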


Listing  Thirteen 

/* Length of text portion of current line. */

#define TLEN      (LINLEN - (INDENT + Tempin))    /* in spaces */
#define U_TLEN    ( TLEN * SPACE_SIZE )           /* in units  */


End  Listing  Thirteen 


Listing  Fourteen 


/*
 * (c) 1987, Allen I. Holub, All rights reserved
 * This module contains the nroff main() routine, and all
 * support for command line processing.
 */

#include <stdio.h>
#include <fcntl.h>
#include <getargs.h>
#include <bitmap.h>
#include <signal.h>
#include "nr.h"


/* Variables set by command line switches. The non-static
 * variables are used in other modules. The others are used
 * by various routines in nrtext.c
 */

static char *Pagelist = NULL ;    /* Bit map used for -o option          */
static char *Plist    = ""   ;    /* List of pages to print              */
int         Plain     = 0    ;    /* suppress bold, underline, etc.      */
int         Stop      = 0    ;    /* Stop output every N pages           */
int         No_cntl   = 0    ;    /* Don't print any control characters
                                   * except \n. Used in nrtext.c
                                   */
int         Verbose   = 0    ;    /* echo commands as they're executed   */
static int  Fpage     = 1    ;    /* Number of the first page. We can't
                                   * use the PAGE number register
                                   * because number registers don't
                                   * exist yet.
                                   */
static int  Even             ;    /* Print only even pages               */
static int  Odd              ;    /* Print only odd pages                */
static int  Unbuf            ;    /* Don't buffer the input stream       */
extern int  do_mfile()       ;    /* Defined below, processes -m         */
extern int  do_tstr()        ;    /* Defined below, processes -t         */
extern int  do_rreg()        ;    /* Defined below, processes -r         */


static ARG Argtab[] =
{
  { 'c', BOOLEAN, &No_cntl,         "don't print (C)ontrol characters"      },
  { 'd', BOOLEAN, &Odd,             "print only o(D)d pages"                },
  { 'e', BOOLEAN, &Even,            "print only (E)ven pages"               },
  { 'm', PROC,    (int *) do_mfile, "prepend (M)acro: /lib/tmac/<str>.mac"  },
  { 'n', INTEGER, &Fpage,           "(N)umber first page N"                 },
  { 'o', STRING,  (int *) &Plist,   "print (O)nly pages in list (<str>)"    },
  { 'P', BOOLEAN, &Plain,           "Suppress bold, underline, overstrike"  },
  { 'r', PROC,    (int *) do_rreg,  "set number (R)eg: -rx<num> -r(xx<num>" },
  { 's', INTEGER, &Stop,            "(S)top every n pages"                  },
  { 't', PROC,    (int *) do_tstr,  "set s(T)ring; -tx<str> -t(xx<str>"     },
  { 'u', BOOLEAN, &Unbuf,           "Don't buffer input for debugging"      },
  { 'v', BOOLEAN, &Verbose,         "(V)erbose mode, echo input commands"   }
};




Dr.  Dobb's  Journal,  March  1987 


#define ISODD(x)    ((x) & 1)
#define ISEVEN(x)   (!ISODD(x))
#define NAMESIZE    50


/*  Full path name of macro file specified with
 *  the -m command line argument. The %s will
 *  be replaced with the string following the
 *  -m on the command line. NAMESIZE should
 *  agree with the precision field. (%1.32s).
 *  A NAMESIZE sized buffer is used to hold
 *  the expanded macro file name.
 */

#define MACFILE "\\lib\\tmac\\%1.8s.mac"


get_pglist()
{
    /*  Produces a page list from str. A page list is a
     *  bit map with one bit representing each legal page.
     *  0 < N <= 512. Str is a null terminated string giving
     *  the legal pages separated by commas. The notation
     *  "N-M" may be used to print all pages between N and M.
     *  "-N" means from the beginning of the document to
     *  page N. "N-" means from page N to the end of the
     *  document. The special forms: -oe and -oo print all
     *  even pages and all odd pages respectively.
     */

    register int start, end;

    if( !(Pagelist = (char *) makebitmap( MAXPAGE )) )
    {
        err("Not enough memory to make page list\n");
        return;
    }

    if( (Even || Odd) && !*Plist )
        Plist = "-";

    while( *Plist )
    {
        start = ( *Plist == '-' ) ? 1 : stoi( &Plist );

        /*  At this point start will be set to 1 if there
         *  was a leading dash or to the number if no
         *  leading dash. "Plist" will be pointing past the
         *  number. If *Plist is a dash we are doing a range
         *  else we are setting a single page.
         */

        if( *Plist != '-' )
            end = start;
        else
        {
            Plist++;                    /* Skip the -        */
            if( !(end = stoi(&Plist)) )
                end = MAXPAGE;          /* No # following -  */
        }

        for(; start <= end; start++ )
            if(    ( !Even && !Odd )
                || ( Even  && ISEVEN(start) )
                || ( Odd   && ISODD (start) )
              )
                setbit( start, Pagelist, 1 );

        while( *Plist == ',' )
            Plist++;
    }

#ifdef DEBUG
    printf("Only the following pages will be printed: \n");

    for( start = 1; start <= MAXPAGE; start++ )
        if( testbit(start, Pagelist) )
            printf("%d, ", start );

    printf("\n");
#endif
}


ispage( n )
{
    /*  Return 1 if n is a legal page to print (is in
     *  the page list or no -o option was ever given).
     */

    return( Pagelist ? testbit(n, Pagelist) : 1 );
}
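
The page-list notation handled by get_pglist() can be tried out in isolation. The following is only a sketch in modern C under stated assumptions: parse_pages() and page_wanted() are hypothetical names, strtol() stands in for the listing's stoi(), and a byte-per-page array replaces the bit map from bitmap.h.

```c
#include <stdlib.h>
#include <string.h>

#define MAXPAGE 512             /* assumed upper bound on page numbers */

/* Mark the pages named by a spec such as "1,3-5,10-" in map[1..MAXPAGE].
 * map must have MAXPAGE + 1 elements; it is cleared first.
 */
static void parse_pages(const char *spec, unsigned char *map)
{
    memset(map, 0, MAXPAGE + 1);

    while (*spec) {
        char *endp;
        long start, end;

        if (*spec == '-') {
            start = 1;                      /* "-N": from page 1 */
        } else {
            start = strtol(spec, &endp, 10);
            if (endp == spec)
                break;                      /* not a number: stop parsing */
            spec = endp;
        }

        if (*spec != '-') {
            end = start;                    /* single page */
        } else {
            spec++;                         /* skip the dash */
            end = strtol(spec, &endp, 10);
            if (endp == spec)
                end = MAXPAGE;              /* "N-": to end of document */
            spec = endp;
        }

        if (start < 1)
            start = 1;
        if (end > MAXPAGE)
            end = MAXPAGE;
        for (; start <= end; start++)
            map[start] = 1;

        while (*spec == ',')
            spec++;
    }
}

/* Nonzero if page n was selected by the most recent parse_pages(). */
static int page_wanted(const unsigned char *map, int n)
{
    return n >= 1 && n <= MAXPAGE && map[n];
}
```

A caller would declare `unsigned char map[MAXPAGE + 1]`, call `parse_pages("1,3-5,10-", map)` once, and then test `page_wanted(map, n)` before printing page n. The even/odd filtering of the original is left out to keep the sketch short.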


do_mfile( name )
char *name;
{
    /*  This routine is called when the -m<str> argument
     *  is encountered on the command line. It is called
     *  directly by getargs.
     */

    char nbuf[NAMESIZE];        /* Macro file name */

    if( *name )
    {
        INLINES = 0;
        sprintf( nbuf, MACFILE, name );
        Ifilename = nbuf;

        if( !(Ifile = fopen(nbuf, "r")) )
        {
            err("Can't open macro file: %s\n", nbuf );
            exit( 1 );
        }

        process( Ifile, Ifilename, 0, 0 );
        fclose( Ifile );
    }
}

(Listing Fourteen, continued; text begins on page 96.)

do_tstr( str )
char *str;
{
    /*  Called by getargs() when -t is encountered.
     *  Given -tx<str> or -t(xx<str> initializes
     *  register x or xx to <str>.
     */

    static char line[ MAXSTR ];
    char        name[4];
    extern      sgetc();

    name[2] = name[1] = 0;

    if( *str )
    {
        if( (name[0] = *str++) == '(' )
        {
            if( *str && *(str + 1) )
            {
                name[0] = *str++;
                name[1] = *str++;
            }
        }

        Ifile = (FILE *) &str;
        getline( line, 0, sgetc );
        ds( name, line );
        return;
    }

    fprintf(stderr, "Illegal string name on command line");
}

do_rreg( str )
char *str;
{
    /*  Processes -r command line argument. Given
     *  -rx<str> or -r(xx<str> initializes number
     *  register x or xx to <str>.
     */

    char name[4];

    name[2] = name[1] = 0;

    if( *str )
    {
        if( (name[0] = *str++) == '(' )
        {
            if( *str && *(str + 1) )
            {
                name[0] = *str++;
                name[1] = *str++;
            }
        }

        putnreg( name, 0, atoi(str), 0, 1, 0 );
        return;
    }

    fprintf(stderr, "Illegal register name on command line");
}

onintr()
{
    /*  Treat a Ctrl-C or Ctrl-Break as if we've
     *  executed a .ex command.
     */

    signal( SIGINT, onintr );
    mac_clean();                /* Delete all macro disk files */
    exit( 1 );
}

main( argc, argv )
char **argv;
{
    /*  Initialize:  */

    df("R",                     /* default, monospaced, font      */
    init_text();                /* text module                    */
    init_nreg();                /* pre-defined number registers   */
    signal( SIGINT, onintr );   /* Treat ^C like .ex              */

    argc = getargs( argc, argv, Argtab,
                        sizeof(Argtab)/sizeof(ARG), usage );

    PAGE = Fpage;               /* Number of first page of
                                 * document as per -n argument.
                                 */

    if( *Plist || Even || Odd )
        get_pglist();

    Ofile = stdout;

    do {                        /* process a single input file */

        INLINES = 0;

        if( argc == 1 )
        {
            Ifilename = "";
            Ifile     = stdin;
        }
        else
        {
            Ifilename = *++argv;
            Ifile     = fopen( Ifilename, "r" );
            if( !Ifile )
            {
                err("Can't open input file <%s>\n", Ifilename);
                break;
            }
        }

        /*  The setvbuf call puts us into unbuffered
         *  input mode.
         */

        if( Unbuf )
            setvbuf( Ifile, NULL, _IONBF, 0 );

        process( Ifile, Ifilename, 0, 0 );
        fclose ( Ifile );

    } while( --argc > 1 && !Quit );

    brk();                      /* Flush output buffer */

    /*  Do the end macro if there is one. If we
     *  don't clear Quit before expanding the macro,
     *  getline() will return end of file and the macro
     *  won't be executed.
     */

    if( *Endm )
    {
        Quit = 0;
        expand_macro( Endm );
    }

    mac_clean();                /* Delete all macros disk files */
    exit( 0 );
}


End Listing Fourteen


Listing Fifteen

/*  Copyright (c) 1987, Allen I. Holub. All rights reserved.
 *  This module contains the routines to process individual
 *  commands. Routines are accessed via the Cmdtab (see nr.c
 *  and below).
 */

#include <stdio.h>
#include <ctype.h>
#include "nr.h"

extern CTAB Cmdtab[];
extern int  Ctabsize;
extern char *skipto(), *skipspace();    /* in tools.lib */
extern char *bsearch();


cmdcmp( matchstr, cp )
register unsigned char *matchstr;
register CTAB          *cp;
{
    /*  Comparison routine used by search called in
     *  command() below.
     */

    register unsigned int l, r;

    D( printf("comparing %2.2s ", matchstr) );
    D( printf("and %2.2s, "     , cp->cmd ) );

    l = matchstr[0];
    r = (cp->cmd)[0];

    if( l == r )
    {
        l = matchstr[1];
        r = (cp->cmd)[1];
    }

    if( isspace(l) )
        l = 0;

    D( printf("returning %d\n", l - r) );
    return( l - r );
}

/*----------------------------------------------------------*/

int numarg( s, offset )
char **s;
int  *offset;
{
    /*  Get value of a numeric argument from *s. If the
     *  number is followed by an i, inches are converted
     *  to spaces. If the number is preceded by a '+' or
     *  a '-' offset is set to 1. The argument may be an
     *  expression and spaces are ignored. However the
     *  argument must have been enclosed in double quotes
     *  for a space to be part of the argument. S is
     *  advanced past the numeric component and any
     *  trailing whitespace. Return the value of the
     *  argument.
     */

    int           error;
    extern int    getvar(), null();
    extern double parse();
    double        val = 0.0;

    if( **s )
    {
        if( **s == '+' )
        {
            *offset = 1;
            (*s)++;
        }
        else if( **s == '-' )
            *offset = 1;

        val = parse( s );
    }

    return( (int) val );
}

/*----------------------------------------------------------*/

splitfields( cur, next )
char **cur, **next;
{
    /*
     *  Split cur into two fields. Modify next to point
     *  at the beginning of the second or at end of line.
     *
     *  Leading and trailing white space around the first
     *  field is skipped; quoted arguments are recognized
     *  as being a single field, even if the quoted
     *  string contains whitespace.
     */

    register char *p;

    p = *cur;
    p = skipspace( p, Esc );

    if( *p == '"' )
    {
        *cur = ++p;
        p = skipto( '"', p, Esc );
    }
    else
    {
        *cur = p;
        p = skipto( ' ', p, Esc );
    }

    if( *p )
        *p++ = '\0';                /* Terminate current field */

    p = skipspace( p, Esc );        /* Skip to next field      */

    if( *p == '"' )                 /* strip any quotes        */
    {
        *skipto( '"', ++p, Esc ) = 0;
    }

    *next = p;
}

/*----------------------------------------------------------*/

command( first )
char *first;
{
    /*  Process a command found in "first". Do this by finding
     *  the command in Cmdtab. If the command is found then
     *  the associated subroutine (in the Cmdtab) is executed.
     *  The calling convention depends on the command type.
     *  There are 4 types:
     *
     *      type 0:  .xx <optional string>
     *      type 1:  .xx <number> <optional string>
     *      type 2:  .xx <string> <number> (optional tail)
     *      type 3:  .xx <string> <optional string>
     *
     *  A <string> is passed null terminated with leading and
     *  trailing white space or quotes stripped. A <number>
     *  is passed as an int. An <optional string> is passed
     *  without trailing white space or leading and trailing
     *  quotes stripped.
     *
     *      Type 0:  (* action)( str, dobreak );
     *                      char *str;
     *      Type 1:  (* action)( val, str, offset, dobreak )
     *      Type 2:  (* action)( val, str, offset, dobreak, tail )
     *                      double val;
     *                      char   *str;
     *                      char   *tail;
     *      Type 3:  (* action)( leftstr, rightstr, dobreak )
     *                      char *leftstr, *rightstr;
     *
     *  Note that command() will mess up the input string,
     *  putting a null after the first character. If you need
     *  to keep the string around for longer than one command,
     *  copy it somewhere safe.
     */

    extern   CTAB *search();    /* routine to search for cmd    */
    register CTAB *cmd;         /* current command              */
    char     *second;           /* points at second argument    */
    char     *p;                /* general-purpose pointer      */
    int      val    = 0;        /* value of numeric argument    */
    int      offset = 0;        /* true if val is an offset     */
    int      rval   = 0;        /* return value                 */
    int      dobreak;           /* True if a normal Cmd char,
                                 * false if a nobreak command
                                 * char.
                                 */

    dobreak = (*first++ == Cmd_chr);

    while( *first && (*first & 0x7f) == ' ' )
        first++;

    if( !*first )               /* This is a comment line */
        return 0;

    cmd = (CTAB *) bsearch( first, Cmdtab, Ctabsize,
                                    sizeof(CTAB), cmdcmp );

    if( !cmd )
    {
        /*  Command isn't in the table. See if it's a
         *  macro. Print an error message if it isn't.
         */

        if( !expand_macro( first ) )
            err(".%2s not a command or macro\n", first );

        return 0;
    }

    if( Inhibit && cmd->inhib )     /* Input is inhibited.     */
        return 0;                   /* See doif() in nrmsc.c   */
                                    /* for details.            */

    offset = 0;
    first += 2;                     /* advance past the actual */
                                    /* command.                */

    if( cmd->type == 0 )
    {
        first = *first ? skipspace(first, Esc) : cmd->def;

        if( *first == '"' )
            *skipto( '"', ++first, Esc ) = 0;

        rval = (*(cmd->action))( first, dobreak );
        goto exit;
    }

    splitfields( &first, &second );

    switch( cmd->type )
    {
    case 1:
        p    = *first ? first : cmd->def;
        val  = numarg( &p, &offset );
        rval = (*(cmd->action))( val, second, offset,
                                                dobreak );
        break;

    case 2:
        p    = *second ? second : cmd->def;
        val  = numarg( &p, &offset );
        p    = skipspace( p, Esc );
        if( *p == '"' )
            *skipto( '"', ++p, Esc ) = '\0';

        rval = (*(cmd->action))( val, first, offset,
                                                dobreak, p );
        break;

    case 3:
        rval = (*(cmd->action))( *first ? first : cmd->def,
                                        second, dobreak );
        break;

    default:
        err("** Internal Error, bad type: %d in Cmdtab\n",
                                                cmd->type );
        break;
    }

exit:
    return( cmd->inhib ? 0 : rval );
}
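
The lookup that command() performs, a binary search of a sorted table keyed on a two-character name followed by a call through a function pointer, can be sketched on its own. Everything below is an illustrative assumption rather than the listing's code: the ENTRY type, the three stub handlers, and dispatch() are hypothetical, and the single-argument handler signature is far simpler than the four calling conventions described above.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *cmd;                    /* two-character command name */
    int       (*action)(const char *arg);
} ENTRY;

/* Stub handlers; each returns a distinct code so calls can be told apart. */
static int do_br(const char *arg) { (void)arg; return 1; }
static int do_ce(const char *arg) { (void)arg; return 2; }
static int do_sp(const char *arg) { (void)arg; return 3; }

/* Must stay sorted by name for bsearch(). */
static const ENTRY Table[] = {
    { "br", do_br },
    { "ce", do_ce },
    { "sp", do_sp },
};

/* Compare the first two characters of the key with an entry's name. */
static int entry_cmp(const void *key, const void *elem)
{
    return strncmp((const char *)key, ((const ENTRY *)elem)->cmd, 2);
}

/* Run the handler for a line such as ".sp 2"; return -1 if unknown. */
static int dispatch(const char *line)
{
    const ENTRY *e;

    if (line[0] != '.' || strlen(line) < 3)
        return -1;

    e = bsearch(line + 1, Table, sizeof Table / sizeof Table[0],
                sizeof Table[0], entry_cmp);

    return e ? e->action(line + 3) : -1;
}
```

With this table, `dispatch(".sp 2")` calls do_sp() with the text after the command name, and an unknown name such as `.zz` yields -1, the point at which the original falls back to expand_macro(). One-character command names, which cmdcmp() handles by treating a trailing space as a null, are not covered here.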


End  Listing  Fifteen 


Listing  Sixteen 

#include <stdio.h>
#include <hash.h>
#include "nr.h"

/*----------------------------------------------------------
 *  NREG.C
 *  Copyright (c) 1987, Allen I. Holub. All rights reserved.
 *  This module holds routines for manipulating and accessing
 *  number registers.
 *----------------------------------------------------------
 */

static int Regnum = 0;          /* Used to print number registers */
HASH_TAB   *Nregs = 0;          /* Hash table that holds number   */
                                /* registers.                     */

init_nreg()
{
    extern NREG *putnreg();
    int         garbage;

    Nregs = maketab( 127 );

    Nrpg     = putnreg( "%" , ARABIC,   DEF_PAGE,    0,1,1 );
    Nrargs   = putnreg( ".$", READONLY, DEF_NARGS,   0,1,1 );
    Nrlines  = putnreg( ".c", READONLY, DEF_INLINES, 0,1,1 );
    Nrvplace = putnreg( ".d", READONLY, DEF_VERT,    0,1,1 );
    Nrfont   = putnreg( ".f", READONLY, DEF_FONT,    0,1,1 );
    Nrindent = putnreg( ".i", READONLY, DEF_INDENT,  0,1,1 );
    Nrllen   = putnreg( ".l", READONLY, DEF_linlen,  0,1,1 );
    Nrtlen   = putnreg( ".n", READONLY, DEF_textlen, 0,1,1 );
    Nroffset = putnreg( ".o", READONLY, DEF_offset,  0,1,1 );
    Nrplen   = putnreg( ".p", READONLY, DEF_PGLEN,   0,1,1 );
    Nrtotrap = putnreg( ".t", READONLY, DEF_TOTRAP,  0,1,1 );
    Nrfill   = putnreg( ".u", READONLY, DEF_fill,    0,1,1 );
    Nrv      = putnreg( ".v", READONLY, DEF_LS,      0,1,1 );
    Nrdl     = putnreg( "dl", ARABIC,   DEF_WIDTH,   0,1,1 );
    Nrdn     = putnreg( "dn", ARABIC,   DEF_HEIGHT,  0,1,1 );
    Nrdy     = putnreg( "dy", ARABIC,   DEF_DAY,     0,1,1 );
    Nrh      = putnreg( "h",  ARABIC,   0,           0,1,1 );
    Nrhp     = putnreg( "hp", ARABIC,   DEF_HORIZ,   0,1,1 );
    Nrln     = putnreg( "ln", ARABIC,   DEF_LINE,    0,1,1 );
    Nrnl     = putnreg( "nl", READONLY, 1,           0,1,1 );
    Nrm      = putnreg( "m",  ARABIC,   0,           0,1,1 );
    Nrmo     = putnreg( "mo", ARABIC,   DEF_MONTH,   0,1,1 );
    Nrs      = putnreg( "s",  ARABIC,   0,           0,1,1 );
    Nrdw     = putnreg( "dw", ARABIC,   0,           0,1,1 );
    Nryr     = putnreg( "yr", ARABIC,   DEF_YEAR,    0,1,1 );

    time( &HOUR,  &MIN, &SEC,  &garbage );
    date( &MONTH, &DAY, &YEAR, &WEEKDAY );

    WEEKDAY++;      /* Translate 0-6 to 1-7 for compatibility */
}


not_deletable( name )
char *name;
{
    /*  Return true if name is a pre-defined but not read
     *  only number register.
     */

    register int c1, c2;

    c1 = name[0];
    c2 = name[1];

    return(    ( c1 == '%' && c2 == 0   )
            || ( c1 == 'd' && c2 == 'h' )
            || ( c1 == 'd' && c2 == 'l' )
            || ( c1 == 'd' && c2 == 'y' )
            || ( c1 == 'h' && c2 == 'p' )
            || ( c1 == 'l' && c2 == 'n' )
            || ( c1 == 'm' && c2 == 'o' )
            || ( c1 == 'y' && c2 == 'r' )
          );
}

/*----------------------------------------------------------*/

NREG *putnreg( name, fmt, val, offset, create, incr_amt )
char *name;
int  fmt, val, offset;
{
    /*  Change the value of any field of a number
     *  register called "name." If the number register
     *  does not exist and "create" is true, then create
     *  it. Offset is treated as follows:
     *
     *      offset <  0     number register not modified
     *      offset == 0     number register  = val
     *      offset >  0     number register += val
     *
     *  If fmt is non-zero put it into the format field,
     *  else leave the nfmt field alone. Ditto with
     *  incr_amt. Number registers whose format is
     *  READONLY can't be modified. Return a pointer to
     *  the register if it was found or created or
     *  NULL if register not found and/or not created.
     */

    register NREG *pnode;

    if( *name == '\0' )
        return NULL;

    if( !(pnode = (NREG *) findsym( Nregs, name )) )
    {
        if( !create )
        {
            err("Number register doesn't exist\n");
            return NULL;
        }

        pnode = (NREG *) addsym( Nregs, name, sizeof(NREG) );
        pnode->nfmt     = ARABIC;
        pnode->nval     = 0;
        pnode->incr_amt = 1;
    }

    if( pnode->nfmt == READONLY )
    {
        err("Can't modify a read/only number register\n");
        return (NREG *) 0;
    }

    if( incr_amt )
        pnode->incr_amt = incr_amt;

    if( fmt )
        pnode->nfmt = fmt;

    if( offset >= 0 )
    {
        if( offset )
            pnode->nval += val;
        else
            pnode->nval = val;
    }

    return( pnode );
}

rm_nreg( name )
char *name;
{
    /*  Remove number register "name" if it exists.  */

    register NREG *node;

    if( !(node = (NREG *) findsym( Nregs, name )) )
        err("Can't find number register <%1.2s>\n", name );

    else if( node->nfmt == READONLY || not_deletable(name) )
        err("May not delete pre-defined number register\n");

    else
        delsym( Nregs, (BUCKET *) node );
}

/*----------------------------------------------------------*/

prnt( name, p )
char *name;
NREG *p;
{
    printf("%2s = %4d (format = %c, incr by %d)",
                    name, p->nval, p->nfmt, p->incr_amt);

    printf( (++Regnum % 2) ? "\t| " : "\r\n" );
}

pr_nregs()
{
    /*  Print out the values of all the number registers  */

    Regnum = 0;
    ptab( Nregs, prnt );
    printf("\r\nThere are %d number registers\r\n", Regnum);
}

/*----------------------------------------------------------*/

int nrtoi( p, fmt )
char *p;
int  *fmt;
{
    /*  Return the value of the number reg. variable
     *  whose name is in string. Fmt is modified to be
     *  the contents of the register's format field.
     *  If *p is a +, the number register is
     *  auto pre-incremented, if it's a -, it's pre-
     *  decremented. If it's 0, it isn't modified.
     *  Non-existent number registers evaluate to 0.
     */

    int  rval = 0;
    int  i    = 0;
    NREG *node;

    *fmt = ARABIC;              /* Default error return values */

    if( !*p )
        err("Missing number register name\n");
    else
    {
        if( *p == '-' || *p == '+' )
            i = (*p++ == '+') ? 1 : -1;

        if( node = (NREG *) findsym(Nregs, p) )
        {
            *fmt = node->nfmt;
            node->nval += ( node->incr_amt * i );
            rval = node->nval;
        }
    }

    return( rval );
}
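
The auto-increment rule in nrtoi() (a leading + or - adjusts the register by its increment before the value is read) can be exercised without the hash table. This is a hedged sketch only: REG, Regs, find_reg(), and reg_value() are hypothetical stand-ins, and a linear search over a two-entry array replaces findsym().

```c
#include <string.h>

typedef struct {
    char name[3];               /* one- or two-character register name */
    int  val;                   /* current value                       */
    int  incr;                  /* auto-increment amount               */
} REG;

/* Two illustrative registers; the names are made up for this sketch. */
static REG Regs[] = {
    { "pg",  1, 1 },
    { "ct", 10, 5 },
};

static REG *find_reg(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof Regs / sizeof Regs[0]; i++)
        if (strcmp(Regs[i].name, name) == 0)
            return &Regs[i];
    return NULL;
}

/* "+xx" pre-increments, "-xx" pre-decrements, "xx" just reads.
 * Unknown registers evaluate to 0.
 */
static int reg_value(const char *p)
{
    int dir = 0;
    REG *r;

    if (*p == '+' || *p == '-')
        dir = (*p++ == '+') ? 1 : -1;

    r = find_reg(p);
    if (!r)
        return 0;

    r->val += r->incr * dir;
    return r->val;
}
```

Reading "pg" leaves the register at 1, "+pg" steps it to 2 and returns 2, and "-ct" steps the second register down by its increment of 5.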

End  Listing  Sixteen 


Listing  Seventeen 

/*  NRGLBLS.C: Global variables used by several modules
 *  Copyright (c) 1987, Allen I. Holub.
 *
 *  Global variables used by the various nroff routines
 */

#include <stdio.h>
#include "nr.h"

/*----------------------------------------------------------
 *  Nodes for pre-defined number registers
 */

NREG    *Nrpg     ;
NREG    *Nrargs   ;
NREG    *Nrlines  ;
NREG    *Nrvplace ;
NREG    *Nrfont   ;
NREG    *Nrindent ;
NREG    *Nrllen   ;
NREG    *Nrtlen   ;
NREG    *Nroffset ;
NREG    *Nrplen   ;
NREG    *Nrtotrap ;
NREG    *Nrfill   ;
NREG    *Nrdl     ;
NREG    *Nrdn     ;
NREG    *Nrdy     ;
NREG    *Nrhp     ;
NREG    *Nrln     ;
NREG    *Nrmo     ;
NREG    *Nrnl     ;
NREG    *Nryr     ;
NREG    *Nrdw     ;
NREG    *Nrh      ;
NREG    *Nrm      ;
NREG    *Nrs      ;
NREG    *Nrv      ;

/*----------------------------------------------------------
 *  Tabstop is an array of tabstops, indexed by column number.
 *  0 means no tab stop in that column, 'R' means right
 *  adjusting, 'L' is left adjusting, 'C' is centering. Tab
 *  positions are all increments of spaces from the left
 *  margin. The leftmost column is column 1. Tabs are set
 *  and cleared by tabset() and tabclr() (in nrmsc.c). They
 *  are used in nrtext.c
 */

TSTOP   Tabstop =
{
    0,
    0,   0, 0, 0,   0, 0, 0, 0,
    'L', 0, 0, 0,   0, 0, 0, 0,
    'L', 0, 0, 0,   0, 0, 0, 0,
    'L', 0, 0, 0,   0, 0, 0, 0,
    'L', 0, 0, 0,   0, 0, 0, 0,
    'L', 0, 0, 0,   0, 0, 0, 0,
    'L', 0, 0, 0,   0, 0, 0, 0,
    'L', 0, 0, 0,   0, 0, 0, 0,
    'L', 0, 0, 0,   0, 0, 0, 0,
    'L', 0, 0, 0,   0, 0, 0, 0,
    'L', 0, 0, 0,   0, 0, 0, 0,
    'L', 0, 0, 0,   0, 0, 0, 0,
    'L', 0, 0, 0,   0, 0, 0, 0,
    'L', 0, 0, 0,   0, 0, 0, 0,
    'L', 0, 0, 0,   0, 0, 0, 0,
    'L', 0, 0, 0,   0, 0, 0, 0
};
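
A table indexed by column number, like Tabstop above, makes finding the next stop a simple forward scan. The sketch below assumes a 128-column table with index 0 unused and left-adjusting stops every eight columns starting at column 9, as the initializer suggests; Stops, set_default_stops(), and next_stop() are hypothetical names, not routines from nrtext.c.

```c
#define NCOLS 128

static char Stops[NCOLS + 1];   /* index 0 unused; columns run 1..NCOLS */

/* Put a left-adjusting stop at columns 9, 17, 25, ... */
static void set_default_stops(void)
{
    int col;

    for (col = 1; col <= NCOLS; col++)
        Stops[col] = (col > 1 && (col - 1) % 8 == 0) ? 'L' : 0;
}

/* Return the first tab stop to the right of col, or 0 if there is none. */
static int next_stop(int col)
{
    for (col++; col <= NCOLS; col++)
        if (Stops[col])
            return col;
    return 0;
}
```

After `set_default_stops()`, `next_stop(1)` is 9 and `next_stop(9)` is 17. The value stored at the stop ('L', 'R', or 'C' in the original) would then select how the text is adjusted against that column.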


/*----------------------------------------------------------*/

FONT  Fonts[NUMFONTS];  /* Font table. Fonts[0] defines the */
                        /* default, non-proportional font   */
                        /* attached to the name R, it is    */
                        /* created in main().               */


/*----------------------------------------------------------
 *  Misc. other Globals
 */

int  Adjmode      = BOTH; /* Current adjustment mode                       */
int  Adjusting    = 0;    /* One if adjusting lines. We can't just use an
                           * adjustment mode here because we have to remember
                           * the most recent adjustment mode.
                           */
int  Cmd_chr      = '.';  /* Command character                             */
int  Cont_ul      = 0;    /* Num of input lines to continuously underline  */
int  Divtrap      = -1;   /* Distance to diversion trap (-1 if none)       */
int  Divwidth     = 0;    /* Width of current diversion. This differs from */
                          /* WIDTH, which is the width of the most-recent  */
                          /* diversion.                                    */
char *Endm        = "";   /* Name of macro invoked at end of input         */
int  Esc          = '\\'; /* Current escape character                      */
int  Hyphen_chr   = '%';  /* Soft hyphen is \<Hyphen_chr>. default = \%    */
int  Hyphenate    = 0;    /* Hyphenation is enabled during filling only    */
int  Itrap        = -1;   /* Distance to input line trap, -1 if none       */
int  Isdiv        = 0;    /* 1 if Ofile is diversion, 0 if file            */
int  Ismacro      = 0;    /* 1 if Ifile is a macro, 0 if a file            */
int  Leader       = '.';  /* Current leader character                      */
char **Macv       = 0;    /* Macro arguments for current macro level       */
int  Nobreak      = '\''; /* Nobreak character                             */
int  Nospace      = 0;    /* Inhibit spacing as per .ns command            */
int  Nr_cpmode    = 0;    /* Nroff copy mode, expand \ in macro definitions*/
int  Num_bold     = 0;    /* Remaining number of input lines to print bold */
int  Num_center   = 0;    /* Remaining number of input lines to center     */
int  Num_os       = 0;    /* Number of input lines to overstrike           */
int  Num_under    = 0;    /* Remaining number of input lines to underline  */
int  Page_ch      = '%';  /* Translates to page number in 3 part titles    */
int  Quit         = 0;    /* Terminate nroff when set by .ex command       */
int  Tab          = ' ';  /* Tab repetition character                      */
int  Tabwidth     = 8;    /* Width of input tab stops                      */
int  Tabs_enabled = 1;    /* Expand tabs only if true                      */
int  Tempin       = 0;    /* Temporary indent column, (offset from INDENT) */
int  Title_len    = 80;   /* 3 part title line length (set with .lt cmd)   */
int  Wordstar     = 0;    /* Wordstar-mode output                          */
int  H_units      = 120;  /* Number of horizontal units in an inch         */
int  H_space      = 10;   /* Number of horizontal units in a space         */
int  Bold         = 0;    /* Boldface is currently active                  */
int  Over         = 0;    /* Overstrike is currently active                */
int  Italics      = 0;    /* Italics is currently active                   */

char *Bd_on  ;            /* Send to printer to turn boldface on           */
char *Bd_off ;            /* Send to printer to turn boldface off          */
char *Ul_on  ;            /* Send to printer to turn underline on          */
char *Ul_off ;            /* Send to printer to turn underline off         */
char *Os_on  ;            /* Send to printer to turn overstrike on         */
char *Os_off ;            /* Send to printer to turn overstrike off        */

char *Dn_str ;            /* Send to printer to send cursor down 1/2 line  */
char *Up_str ;            /* Send to printer to send cursor up 1/2 line    */
int  Vs_amt  = 1;         /* This many \u or \d cmds moves one line        */


char *Left_str  = "\b";   /* Send to printer to go left                    */
char *Right_str = " ";    /* Send to printer to go right                   */
int  Hs_amt     = 1;      /* Above moves 1/n spaces                        */

char *Lmarg_str ;         /* String used in .ml command                    */
char *Rmarg_str ;         /* String used in .me command                    */

char Dtrap_name[3] = {0,0,0};  /* Macro executed when Divtrap reached      */
char Itrap_name[3] = {0,0,0};  /* Macro executed when Itrap == 0           */

FILE *Ifile     = stdin;            /* Current input file                  */
FILE *Ofile     = stdout;           /* Output file descriptor              */
char *Ifilename = "standard input"; /* name of input file                  */

int  Nestlev   ;  /* Nesting level determined by .(/.)                     */
int  Nm_on     ;  /* Line numbering enabled by .nm cmd                     */
int  Nm_blanks ;  /* If true then blank lines are numbered by .nm          */
int  Nm_mult   ;  /* The M argument of the most recent .nm command         */
char *Nm_str   ;  /* The S argument of the most recent .nm command         */
int  Inhibit   ;  /* Inhibit all text and command processing except
                   * .) and .(
                   */


End  Listing  Seventeen 




Listing  Eighteen 

/*----------------------------------------------------------
 *  NRINP.C: Input and escape sequence processing. Also
 *           contains process(), the highest level input
 *           processing routine.
 *  (c) 1987, Allen I. Holub, All rights reserved
 *----------------------------------------------------------
 */

#include <stdio.h>
#include <ctype.h>
#include "nr.h"

#define ishex(c)  ( ('0'<=(c) && (c)<='9') || \
                    ('A'<=(c) && (c)<='F') )

#define tohex(c)  ( ('0'<=(c) && (c)<='9') ? (c)-'0' \
                                           : ((c)-'A') + 0x0a )

/*----------------------------------------------------------*/

static int Abort_process = 0;   /* Used by escape to tell
                                 * process to abort the
                                 * current process.
                                 */

static int New_font = 0;        /* Used by chfont() */

extern char *expandstr(char*, char*, int);  /* nrmac.c    */
extern char *cpy( char*, char* );           /* tools.lib  */

/*----------------------------------------------------------*/

gnum( inp, ifile, nextc )
int  (*inp)(), *nextc;
FILE *ifile;
{
    /*  Get a decimal number from input, the number can be
     *  given explicitly or as a number register
     *  (ie. \l\n(xx- is legal, it will draw as many '-'s
     *  as are specified in the \n(xx number register). The
     *  number is returned and *nextc is modified to hold the
     *  first nondigit.
     */

    int   c, i, sign = 1;
    UCHAR name[4];

    if( (c = (*inp)(ifile)) == Esc )
    {
        if( (c = (*inp)(ifile)) == 'n' )
        {
            gname( name, inp, ifile, 1 );
            i = nrtoi( name, &c );
            c = (*inp)(ifile);
        }
        else
            err("Must use number or number register\n");
    }
    else
    {
        if( c == '-' )
        {
            sign = -1;
            c = (*inp)(ifile);
        }
        else if( c == '+' )
            c = (*inp)(ifile);

        for( i = 0; isdigit(c); c = (*inp)(ifile) )
        {
            i = (i * 10) + (c - '0');
        }
    }

    *nextc = c;
    return i * sign;
}
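
gnum() reads its digits through a function pointer, so the same routine works whether input comes from a file or from a string in memory. The following sketch shows only that idea, under assumptions: SRC, src_getc(), and read_num() are hypothetical names, the context is passed as a void pointer instead of a FILE pointer, and the number-register branch is omitted.

```c
#include <ctype.h>

/* Hypothetical string-backed input source standing in for inp/ifile. */
typedef struct {
    const char *p;
} SRC;

/* Return the next character, or -1 once the string is exhausted. */
static int src_getc(void *ctx)
{
    SRC *s = (SRC *)ctx;

    return *s->p ? (unsigned char)*s->p++ : -1;
}

/* Read an optionally signed decimal number through get(); store the
 * first character that follows the number in *nextc.
 */
static int read_num(int (*get)(void *), void *ctx, int *nextc)
{
    int c = get(ctx);
    int n = 0;
    int sign = 1;

    if (c == '-' || c == '+') {
        if (c == '-')
            sign = -1;
        c = get(ctx);
    }

    for (; c != -1 && isdigit(c); c = get(ctx))
        n = n * 10 + (c - '0');

    *nextc = c;
    return n * sign;
}
```

Given a source holding "-42x", read_num() returns -42 and leaves 'x' in *nextc, so the caller can continue parsing from the character that ended the number without needing an unget operation.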

/*----------------------------------------------------------*/

gname( name, inp, ifile, nreg )
char *name;
int  (*inp)();
FILE *ifile;
{
    /*  Get a string or number register name from inp and
     *  put it into name. In the case of an autoincrement
     *  ( \n-(xx for example ), a leading - or + is put
     *  into the name too.
     */

    register int c;

    c = (*inp)(ifile);

    if( nreg && (c == '+' || c == '-') )
    {
        *name++ = c;
        c = (*inp)(ifile);
    }

    if( c == '(' )
    {
        *name++ = (*inp)(ifile);
        *name++ = (*inp)(ifile);
    }
    else
        *name++ = c;

    *name = '\0';
}


/*  In non-nroff copy mode, only \" and \<CR>
 *  are recognized. Everything else goes
 *  through to the output.
 */

#define get_quote(str)  if( (*inp)(ifile) != '\'' )            \
                        {                                       \
                            err("Missing quote in %s\n", str ); \
                            break;                              \
                        }


escape( tstart, target, copymode, inp, ifile, maxch )
UCHAR *tstart, **target;
int   (*inp)();
FILE  *ifile;
{
    /*
     *  Expand escape sequences, using inp() to get
     *  additional input when required. Expand at most maxch
     *  characters. Target is modified to point past the
     *  expanded string. The input character following the
     *  escape sequence is returned.
     *
     *  Tstart is the array into which characters go. *target
     *  is the current location in that array.
     *  Note that the string "\t" will actually put a tab
     *  character (unexpanded) into the input stream. An ASCII
     *  ^I or a \T, will have been expanded by getline().
     *  Copy mode is a subset of normal mode used for macro
     *  definitions. In normal copy mode (Nr_cpmode == 0) the
     *  only recognized escape sequences are \" and
     *  \<newline>. Other sequences are just copied to the
     *  target string. In nroff-compatible copy mode
     *  (Nr_cpmode != 0), the following are recognized:
     *
     *          \\  \*  \$  \n  \"  \<newline>
     *
     *  Nested \* expansions are supported and the strings
     *  can contain other escape sequences (like \nx). Note
     *  that nesting is handled recursively in that
     *  expandstr(), called below, will call escape() to
     *  expand internal escape sequences. For reasons of
     *  nroff compatibility \{ is mapped to .( and \} will
     *  cause process() to terminate immediately after doing
     *  the line on which the \} was found, as if it had
     *  seen a .)
     */


*dest++ = Esc;
*dest++ = c;
goto newchar;


/*  Cases in the following switch are expanded either in
 *  nroff-compatible copy mode or when not in copy mode
 *  of any sort (because of the goto branch in the default
 *  case of the previous switch()). They are not expanded
 *  in normal copy mode.
 */

switch( c )
{
case '.':
case '\'':
    *dest++ = LITCHAR;
    *dest++ = c;
    HORIZ++;
    goto newchar;

case '$':                       /* \$N  1 <= N <= 9 */

    /*  Expand macro arguments. The leftmost one is in
     *  Macv[0] but, for nroff compatibility we access
     *  it as \$1. \$0 can not be accessed.
     */

    if( !Macv )
    {
        err("\\$<num> can only be used in a macro\n" );
        goto newchar;
    }

    if( (i = (*inp)(ifile) - '0') < 1 || i > 9 )
    {
        err("\\$n: invalid number, 1 <= n <= 9\n");
        goto newchar;
    }

    for( bp = Macv[i-1]; *bp && --maxch >= 0; )
    {
        HORIZ++;
        *dest++ = *bp++;
    }
    goto newchar;


register int  i;            /* temporary                   */
int           c;            /* current input character     */
int           j;            /* temporary                   */
int           linechar;     /* line-drawing character      */
UCHAR         *bp;          /* general-purpose pointer     */
UCHAR         *dest;        /* Pointer to target array     */
UCHAR         name[8];      /* string or number reg name   */
static UCHAR  temp[80];     /* buffer used by itoascii()   */
                            /* to translate number         */


#ifdef DEBUG
    printf("escape: targ=0x%x, cpymode=%d, inp=0x%x, maxch=%d",
                            target, copymode, inp, maxch );
    printf("\nescape: macv = 0x%x\n", Macv);
    for( i = 10 ; Macv && --i >= 0 ; )
        printf("escape: %2d: <%2.2s>\n", i, Macv[i] );
#endif

dest = *target;

/* Cases in the following switch are expanded whether
 * or not we're in copy mode. "c" holds the character
 * following the escape character.
 */

switch( c = (*inp)(ifile) )
{
case 'n':                       /* \nx or \n(xx */
    gname( name, inp, ifile, 1 );
    i = nrtoi( name, &j );

    if( j == READONLY )
        j = ARABIC;

    i = itoascii(temp, j, i);
    if( maxch < i )
        err("Buffer too small to expand register\n");

    dest = cpy( dest, temp );
    HORIZ += i;

    goto newchar;


case '*':                       /* \*(xx or \*x */
    gname( name, inp, ifile, 0 );
    bp   = dest;
    dest = expandstr( name, dest, maxch );
    HORIZ += dest - bp;
    goto newchar;


/* Throw away input up to the newline or end of
 * file. Then delete all white space preceding the
 * comment. Use "goto exit" in order to avoid
 * getting another input character.
 */
while( (c = (*inp)(ifile)) != '\n' && c != EOF )
    ;


if( copymode )          /* we're in nroff-compatible */
{                       /* copy mode                 */
    if( c != Esc )
        *dest++ = Esc;

    *dest++ = c;
    goto newchar;
}


while( *--dest == ' ' || *dest == '\t' )
    if( dest < tstart )
        break;

++dest;
goto exit;


/* Cases in the following switch are expanded only
 * when we're not in copy mode of any sort.
 */

case '\n':              /* line continuation, just eat the \n */
    goto newchar;

default:
    if( copymode && !Nr_cpmode )


switch( c )
{
case ' ':   *dest++ = UP_SPACE;   HORIZ++;   break;
case '0':   *dest++ = ' ';        HORIZ++;   break;
case '|':   /* ignored */                    break;
case '^':   /* ignored */                    break;



Dr.  Dobb’s  Journal,  March  1987 


C  CHEST 


Listing Eighteen (Listing continued, text begins on page 96.)


case 'N':   *dest++ = '\n';                  break;

case 

:  *dest++ 

- 

HORIZ++; 

break; 

case '&':   *dest++ = LITCHAR;               break;
case 'z':   *dest++ = ZWIDTH;                break;
case 't':   *dest++ = LITCHAR;               /* fall through */
case 'T':   *dest++ = '\t';                  break;
case 'a':   *dest++ = LITCHAR;               /* fall through */
case 'A':   *dest++ = SOH;                   break;

case 'o':                       /* superimpose \o'abcd' */
    get_quote("\\o");
    while( (c = (*inp)(ifile)) != '\'' && maxch > 4 )
    {
        maxch -= 2;
        *dest++ = ZWIDTH;
        *dest++ = (c != Esc) ? c : (*inp)(ifile);
    }
    *dest++ = HMOVE;
    *dest++ = Hs_amt;
    HORIZ++;
    break;

case 'x':                       /* \x<2-hex-digits> */
    c = (*inp)(ifile);          /* c = MS digit */
    i = (*inp)(ifile);          /* i = LS digit */
    c = toupper(c);
    i = toupper(i);

    if( !ishex(c) || !ishex(i) )
        err("\\x must be followed by two hex digits\n");
    else
    {
        /* \xNN takes up space. If you need to have
         * a zero width escape sequence get to the
         * output in right-adjusted text, use the
         * .ou command or \fX mechanism.
         */
        *dest++ = (tohex(c) << 4) | tohex(i);
        HORIZ++;
    }
    break;

case '{':
    *dest++ = '.';
    *dest++ = '{';
    HORIZ += 2;

    if( (c = (*inp)(ifile)) == Esc )
        c = (*inp)(ifile);

    goto exit;

case '}':
    /* Setting Abort_process to nonzero forces
     * process() to terminate AFTER processing the
     * current line. It has the same effect as a .}
     * command at the beginning of the next line.
     * All text following the \} on the line is
     * discarded.
     */
    Abort_process = 1;
    while( (c = (*inp)(ifile)) != '\n' && c != EOF )
        ;
    goto exit;

case 'e':                       /* \e  printable version of the */
    *dest++ = LITCHAR;          /* current escape character     */
    *dest++ = Esc;
    HORIZ++;
    break;

case 'f':                       /* Change font \f(RBIOPx) */
    if( maxch < 2 )
    {
        err("No room in input buffer\n");
        break;
    }

    switch( c = (*inp)(ifile) )
    {
    case BOLD:
    case ITALICS:
    case OVER:      *dest++ = CH_ATTRIB;    break;
    default:        *dest++ = CH_FONT;      break;
    }

    *dest++ = c;
    break;

case 'r':                       /* up 1 line        */
case 'u':                       /* up 1/2 line      */
case 'd':                       /* down 1/2 line    */
    if( maxch > 2 )
    {
        *dest++ = VMOVE;
        *dest++ = ( c == 'r' ) ? -Vs_amt :
                  ( c == 'u' ) ? -max( Vs_amt/2, 1) :
                  /* c == 'd' */  max( Vs_amt/2, 1) ;
    }
    break;

case 'h':           /* \h'N'  \h'Nu'  Horizontal motion */
case 'v':           /* \v'N'  \v'Nu'  vertical motion   */
    if( maxch < 2 )
        break;

    get_quote("\\v or \\h");
    *dest++ = (c == 'v') ? VMOVE : HMOVE;
    j = c;
    i = gnum(inp, ifile, &c);
    if( c != 'u' )
        *dest++ = i * ((j == 'v') ? Vs_amt : Hs_amt);
    else
    {
        *dest++ = i;
        get_quote("\\v or \\h");
    }
    break;

case 'l':                       /* horizontal line   \l'Nc' */
case 'L':                       /* vertical line     \L'Nc' */

    /* Note that you can't use an escape sequence for
     * the line character (as in \l'10\x85'). You can
     * say:
     *      .ds li \\l'10\x85'
     *      \*(li
     * however.
     */
    get_quote("\\l'Nc' or \\L'Nc'");
    j = (c == 'l');             /* j = 1 if horizontal */
    i = gnum(inp, ifile, &c);   /* i = N in \L'Nc'     */

    if( c == '\'' )

linechar  -  j  ?  :  1  I  '  ; 

    else
    {
        linechar = c;
        get_quote("\\l or \\L");
    }

    while( --i >= 0 && --maxch > 4 )
    {
        if( !j )
            *dest++ = ZWIDTH;

        *dest++ = linechar;

        if( !j )
        {
            *dest++ = VMOVE;
            *dest++ = Vs_amt;
        }
    }

    if( maxch < 0 )
        err("line drawn by \\l or \\L is too long\n");

    break;
default:
    if( c == Hyphen_chr )           /* \%           */
        *dest++ = SOFT_HYPHEN;
    else                            /* \<any char>  */
        *dest++ = c;

    HORIZ++;

break; 

break; 


newchar:
    c = (*inp)(ifile);
exit:
    *target = dest;
    return( c );
}


/* Sgetc() is used to process a mode 2 process() call. */

sgetc( s )
UCHAR **s;
{
    return **s ? *((*s)++) : EOF;
}
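The idea behind sgetc(), a character source that reads from a string through the same function-pointer interface used for files, can be sketched in standalone form. This is only an illustration in modern C, not the listing's code: the names reader_fn, str_getc and read_line are hypothetical, and the void * source argument is an assumption made here so that one reader type could serve both strings and files.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical reader type: returns the next character or EOF. */
typedef int (*reader_fn)(void *src);

/* String source: src points at a (const char *) cursor. Returns the
 * character under the cursor and advances it, or EOF at the '\0'.
 */
static int str_getc(void *src)
{
    const char **s = src;
    return **s ? (unsigned char) *(*s)++ : EOF;
}

/* Copy one line (without its newline) from any source into buf.
 * Returns 1 if a line was read, 0 at end of input.
 */
static int read_line(char *buf, size_t size, reader_fn get, void *src)
{
    size_t n = 0;
    int    c;

    assert(size > 0);
    while ((c = get(src)) != EOF && c != '\n')
        if (n + 1 < size)               /* silently drop what won't fit */
            buf[n++] = (char) c;
    buf[n] = '\0';
    return !(c == EOF && n == 0);
}
```

A caller would keep a cursor such as `const char *cur = "text";` and pass `&cur` as the source, so the cursor advances across successive calls.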


chgfont( c )
{
    /* If New_font is non-zero a font-change
     * request is appended to the front of the
     * next input line. Chgfont() is called
     * from the routine that processes the .ft
     * request [ ft() in nrprocs.c ] and also
     * when an environment is restored (by
     * pop_env() in nrmsc.c).
     */
    New_font = c;
}

/*----------------------------------------------------------*/

int getline( target, copymode, inp )
int   (*inp)();
UCHAR *target;
{
    /* Get an input line & put it into target. Get at most
     * MAXSTR characters. The input file, Ifile, is either
     * an input file or an input macro depending on the
     * state of the global Ismacro. Return 1 on success
     * or 0 on end of file. Lines ending with <Esc><newline>
     * are continued to the next line, otherwise newline
     * terminates the line. The newline character is not
     * put into the string. Tabs (^I) are expanded to a
     * sequence of spaces (\t is expanded into a ^I by
     * escape(). This ^I will be processed by the text()
     * module). Trailing white space is stripped from the line.
     *
     * Copymode is just passed through to escape() (which
     * expands escape sequences).
     */

    register UCHAR *rp;
    register int   c;
    UCHAR          *p;



 * macro descriptor returned by mopen() (in nrmac.c)
 * according to the state of mode. The following modes
 * are recognized:
 *
 *   0   Input is from a file or stream and Ifile is
 *       a FILE pointer.
 *   1   Input is from a macro and Ifile is a macro
 *       pointer returned from mopen().
 *   2   Input is from a string and Ifile is a pointer
 *       to that string. Note that mode two commands
 *       are processed with the current file (i.e.,
 *       ifile, nifilename and nmacv are ignored).
 *
 * Process returns immediately if command() returns true.
 * This routine is extremely recursive. Be careful with
 * static variables (i.e., don't use them).
 */


UCHAR   line[MAXSTR];
UCHAR   *oiname, **omacv;
int     oinlines, oismacro;
FILE    *oifile;
int     oinhibit;
int     mgetc(), fgetc();

#ifdef DEBUG
    printf("Mode %d process call: ", mode );
    printf("file <0x%x> named <%s>, macv @ 0x%x\n",
                        nifile, nifilename, nmacv );
    printf("%s processing started (from %s, line %d)\n",
                        nifilename, Ifilename, INLINES );
#endif

oinhibit = Inhibit;


if( Quit )              /* Quit is set by the .ex command   */
    return 0;           /* pretend we've seen end of file   */

INLINES++;              /* Increment number of input lines  */

p = target;
c = (*inp)(Ifile);

while( (p - target) < MAXSTR )
{
    if( c == '\n' || c == EOF )
        break;

    if( c != Esc )
    {
        *p++ = c;
        c = (*inp)(Ifile);
    }
    else
        c = escape( target, &p, copymode, inp, Ifile,
                                    MAXSTR - (p-target) );
}

*p = 0;

if (  Tabs  enabled  ) 

Hotab(  target  );  /*  Expand  tabs  and  leaders  */ 

if( New_font && !ISCMD(*target) )
{
    /* This is a kludge but it's the most convenient
     * way to get a font change into the input stream
     * at the correct place. We can't just change
     * fonts when the .ft is executed because we

    memcpy( target+2, target, (p-target) + 1 );
    switch( New_font )
    {
    case PREVIOUS:
    case BOLD:
    case ITALICS:
    case OVER:      target[0] = CH_ATTRIB;
                    break;


oinlines  = INLINES;        /* Save program state on    */
oifile    = Ifile;          /* the stack                */
oiname    = Ifilename;
oismacro  = Ismacro;
omacv     = Macv;

INLINES   = 0;              /* Create new program state */
Ifile     = nifile;
Ifilename = nifilename;
Ismacro   = mode;
Macv      = nmacv;


if( mode == 2 )
{
    Ifile = (FILE *) &nifile;   /* Ifile is a string ptr */
    getline( line, 0, sgetc );

    if( Verbose )
        printf( "\n%s:<%s>\n", Ifilename, line );

    Inhibit   = oinhibit;
    INLINES   = oinlines;
    Ifilename = oiname;
    Ifile     = oifile;
    Ismacro   = oismacro;
    Macv      = omacv;

    if( ISCMD( *line ) )
        command( line );
    else
        text( line );
}

else
{
    while( getline(line, 0, Ismacro ? mgetc : fgetc) )
    {
        if( Verbose )
            printf( "\n%s:<%s>\n", Ifilename, line );

        if( *line == FF )
            command( ".bp" );

        else if( !ISCMD(*line) )
            text( line );


    default:        target[0] = CH_FONT;
                    break;
    }

    target[1] = New_font;
    New_font  = 0;
}

return( !(c == EOF && p == target) );


        else if( command(line) )
            break;

        if( Abort_process )
        {                               /*{*/
            Abort_process = 0;
            if( command(".}") )
                break;
        }


/*----------------------------------------------------------*/

process( nifile, nifilename, mode, nmacv )
FILE  *nifile;          /* Input file descriptor        */
char  *nifilename;      /* Name of input file           */
int   mode;             /* processing mode (see below)  */
char  **nmacv;          /* Macro arguments              */
{
    /* This routine actually does the processing of a file
     * or a macro. It is a 2nd order recursive routine. That
     * is, process() is called recursively every time the
     * input is changed (by a .so command, a macro expansion,
     * etc.). Macv is an argv-like array of arguments to
     * macros. Ifile is a pointer to either a FILE or to a


    Inhibit   = oinhibit;
    INLINES   = oinlines;
    Ifilename = oiname;
    Ifile     = oifile;
    Ismacro   = oismacro;
    Macv      = omacv;
}

#ifdef DEBUG
    printf("%s: processing done (returning to %s, line %d)\n",
                        nifilename, Ifilename, INLINES );
#endif

return 0;
}

End Listings
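
The comments in process() stress that the routine is highly recursive and that the global input state is saved in local variables before a nested input is started, then restored afterwards. A minimal standalone sketch of that save/switch/restore pattern follows. It is not the listing's code: Cur_name, Cur_line and process_sketch are hypothetical stand-ins for Ifilename, INLINES and process(), and the "nested input" is simulated by a list of names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical globals standing in for Ifilename and INLINES. */
static const char *Cur_name = "<top>";
static int         Cur_line = 0;

/* Process names[0], treating each later entry as an input nested
 * inside the previous one (as a .so request or macro call would be).
 * Returns the total number of "lines" consumed at all levels.
 */
static int process_sketch(const char **names, size_t count)
{
    const char *old_name = Cur_name;    /* save the caller's state in   */
    int         old_line = Cur_line;    /* locals, i.e. on the stack    */
    int         total;

    if (count == 0)
        return 0;

    Cur_name = names[0];                /* create the new state         */
    Cur_line = 0;

    Cur_line++;                         /* pretend one line was read    */
    total = 1 + process_sketch(names + 1, count - 1);

    /* The nested call must have put this level's state back. */
    assert(Cur_name == names[0] && Cur_line == 1);

    Cur_name = old_name;                /* restore the caller's state   */
    Cur_line = old_line;
    return total;
}
```

Because every saved value lives in an automatic variable, each level of recursion gets its own copy, which is why static locals would be unsafe here.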




16  BIT 

Listing One (Listing continued, text begins on page 110.)

/*
 * paste -- a program to attach to the lines of a file the correspond-
 *          ing lines of another file, with an optional string between
 *          them.
 *
 * Written January, 1984 by John M. Gamble
 * Updated for UNIX April, 1986
 *
 * usage:
 *      paste [-aptse] [-b <string>] [-<n>] [file1] [file2]
 *
 * options:
 *  -p          <file1> does not exist (<string> is prepended to each
 *              line.)
 *  -a          <file2> does not exist (<string> is appended to each
 *              line.)
 *  -s          Do not print <string> with lines from only one file.
 *  -t          An option to resolve the ambiguous command
 *              "paste <file>". The -t flag forces <file> to trail
 *              standard input. I.e.,
 *                "paste <file>"    is equivalent to "paste <file> <stdin>"
 *                "paste -t <file>" is equivalent to "paste <stdin> <file>".
 *  -e          Do not print <string> if both input lines are empty
 *              (i.e., that consist of no characters but '\n'.)
 *  -b <string> A string of characters to be inserted between the lines
 *              of <file1> and <file2>. The string may contain all
 *              the standard escape codes with the exception of '\0'.
 *              The string may also indicate blanks with the escape
 *              sequence '\s'.
 *  -<n>        Print <n> lines of <file1> before appending lines of
 *              <file2>. If <n> is negative (e.g., "paste --3") then
 *              <n> lines of <file2> will be printed first.
 */

#include <stdio.h>
#include <ctype.h>

/* On systems such as UNIX, if a string with blanks in it is
 * surrounded by quote marks, it is considered to be one string.
 * On other systems, the blank ends the string and the quote
 * marks are passed along with the other characters. So, while
 * on UNIX, the command 'paste -b "; do " list1 list2'
 * would set    argv[2] to "; do ",
 *              argv[3] to list1,
 *              argv[4] to list2,
 * a system like MSDOS would set
 *              argv[2] to "\";",
 *              argv[3] to "do",
 *              argv[4] to "\"",
 *              argv[5] to list1,
 *              argv[6] to list2.
 * This is easily taken care of, but it does mean that conditional
 * compilation is required by setting the switch below to either
 * zero or one, depending on your particular operating system.
 */

#define BLANK_ENDS_STR  0
#define STRINGLEN       128
#define TRUE            1
#define FALSE           0
#define isoctal(x)      ((x) >= '0' && (x) < '8')

typedef unsigned int Boolean;

char    bstring[STRINGLEN] = {'\0'};
char    *nullstr = "";
char    *strf = "%s";
char    *program_name = "paste";
char    *error_msg[] =
{
/*0*/   "usage: %s [-aptse] [-b \"string\"] [-<n>] [file1] [file2] %s\n",
/*1*/   "%s: unknown flag %s\n",
/*2*/   "%s: at least one file must exist%s\n",
/*3*/   "%s: -t flag is only valid with one file on the command line%s\n",
/*4*/   "%s: both files can't be standard input%s\n",
/*5*/   "%s: contradictory options%s\n",
/*6*/   "%s: can't open %s\n",
/*7*/   "%s: -a or -p flags are invalid with two files%s\n",
/*8*/   "%s: too many files%s\n",
/*9*/   "%s: string argument lacks closing \' or \"%s\n",
/*10*/  "%s: null character in string argument%s\n",
/*11*/  "%s: string argument too long%s\n"
};

main(argc, argv)
int     argc;
char    **argv;
{
    FILE    *fp1, *fp2, *fopen();
    Boolean prepend = FALSE, append = FALSE, trailing = FALSE;
    Boolean printempty = TRUE, printsingle = TRUE;
    int     slip = 0;
    char    *subarg;

    if (argc == 1)
        exit_error(0, nullstr);

    /* Get the flags.
     */
    while (--argc > 0 && **++argv == '-')
    {
        switch (*(*argv + 1))
        {
        case '\0':      /* Because default: won't catch this. */
            exit_error(1, *argv);
            break;

        case 'b':
            if (argc == 1)
                exit_error(0, nullstr);
            argc--;
            argv++;
#if BLANK_ENDS_STR
            strget(&argc, &argv, bstring);
#else
            strload(*argv, bstring);
#endif
            break;

        case '-':
        case '0':
        case '1':
        case '2':
        case '3':
        case '4':
        case '5':
        case '6':
        case '7':
        case '8':
        case '9':
            slip = atoi(*argv + 1);
            break;

        default:
            subarg = *argv;
            while (*++subarg)
            {
                switch (*subarg)
                {
                case 'a':
                    append = TRUE;
                    break;
                case 'e':
                    printempty = FALSE;
                    break;
                case 'p':
                    prepend = TRUE;
                    break;
                case 's':
                    printsingle = FALSE;
                    break;
                case 't':
                    trailing = TRUE;
                    break;
                default:
                    exit_error(1, *argv);
                    break;
                }
            }
            break;
        }
    }


    if (prepend && append)      /* Contradictory options. */
        exit_error(2, nullstr);

    switch (argc)       /* The number of file names on the command line. */
    {
    case 0:
        if (trailing)
            exit_error(3, nullstr);
        if (!(prepend || append))       /* Both files can't be stdin. */
            exit_error(4, nullstr);
        if (append)
            attachf(stdin, NULL, slip, printsingle, printempty);
        else
            attachf(NULL, stdin, slip, printsingle, printempty);
        break;

    case 1:
        /* Contradictory options?
         */
        if (trailing && (prepend || append))
            exit_error(5, nullstr);
        if ((fp1 = fopen(*argv, "r")) == NULL)
            exit_error(6, *argv);

        if (append)
            attachf(fp1, NULL, slip, printsingle, printempty);
        else if (prepend)
            attachf(NULL, fp1, slip, printsingle, printempty);
        else if (trailing)
            attachf(stdin, fp1, slip, printsingle, printempty);
        else
            attachf(fp1, stdin, slip, printsingle, printempty);
        fclose(fp1);
        break;

    case 2:
        if (trailing)
            exit_error(3, nullstr);
        if (prepend || append)
            exit_error(7, nullstr);
        if ((fp1 = fopen(*argv, "r")) == NULL)
            exit_error(6, *argv);
        if ((fp2 = fopen(*++argv, "r")) == NULL)
            exit_error(6, *argv);
        attachf(fp1, fp2, slip, printsingle, printempty);
        fclose(fp1);
        fclose(fp2);
        break;

    default:
        exit_error(8, nullstr);
        break;
    }

    exit(0);
}       /* End of main. */
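
The joining that main() hands off to attachf() can be pictured with a small in-memory sketch. This is not the listing's attachf(): the functions next_line and paste_sketch are hypothetical, the inputs are strings rather than FILE pointers, and the separator is always printed (the behavior without the -s and -e flags, as the option descriptions suggest).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy the next line of *p (without its newline) into buf and advance
 * *p past it. Returns 0 when *p is exhausted, 1 otherwise.
 */
static int next_line(const char **p, char *buf, size_t size)
{
    size_t len;

    if (**p == '\0')
        return 0;
    len = strcspn(*p, "\n");
    snprintf(buf, size, "%.*s", (int) len, *p);
    *p += len;
    if (**p == '\n')
        (*p)++;
    return 1;
}

/* Join corresponding lines of a and b with sep between them, writing
 * the result to out. Returns the number of output lines produced.
 */
static int paste_sketch(const char *a, const char *b, const char *sep,
                        char *out, size_t size)
{
    char   la[128], lb[128];
    size_t used = 0;
    int    lines = 0;

    assert(size > 0);
    out[0] = '\0';
    for (;;) {
        int ha = next_line(&a, la, sizeof la);
        int hb = next_line(&b, lb, sizeof lb);
        int n;

        if (!ha && !hb)
            break;                      /* both inputs exhausted */
        n = snprintf(out + used, size - used, "%s%s%s\n",
                     ha ? la : "", sep, hb ? lb : "");
        if (n < 0 || (size_t) n >= size - used) {
            out[used] = '\0';           /* drop the partial line */
            break;
        }
        used += (size_t) n;
        lines++;
    }
    return lines;
}
```

When one input runs out first, the remaining lines of the other are still emitted with the separator attached, which is the case the -s flag is described as suppressing.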


#if BLANK_ENDS_STR
/*--------------------------------------------------------------------*
 * strget -- retrieve the <string> argument from the command line.
 * If the string contains blanks, C assumes this is the end of the
 * string, and places a \0 at its end. Since WE know that it's
 * just a blank, we put one in, update the position of argv, and
 * decrement argc. Escape sequences are treated just as defined
 * in C (except \0, which is an error). One extra escape sequence
 * ('\s') exists to handle multiple blanks on a line, for even if
 * the string is enclosed in quotes the extra blanks will not be
 * passed from the command line.
 *--------------------------------------------------------------------*/
strget(pargc, pargv, bstr)
int     *pargc;
char    ***pargv;
char    *bstr;
{
    register int j;
    char    *subarg;
    Boolean st_quote = FALSE;
    Boolean st_apost = FALSE;

    subarg = **pargv;

    /* If the string is begun with a quote or an apostrophe, remember
     * so that we know when to end the string.
     */
    if ((st_quote = (*subarg == '"')) || (st_apost = (*subarg == '\'')))
        subarg++;

    for (j = 0; j < STRINGLEN; bstr++, subarg++, j++)
        /* A '"' or '\'' encountered could mean the end of a string -
         * check against st_quote or st_apost.
         */
        if ((st_quote && *subarg == '"') ||
            (st_apost && *subarg == '\''))
            break;
        else if (*subarg == '\0')       /* Blank encountered in string. */
        {
            /* If we began with a quote, we are not finished.
             */
            if (st_quote || st_apost)
            {
                /* If nothing is left on the command line,
                 * a quote mark is missing.
                 */
                if (--(*pargc) == 0)
                    exit_error(9, nullstr);

                /* Put the blank in, and point subarg
                 * to the next argv string.
                 */
                *bstr = ' ';
                subarg = *(++(*pargv)) - 1;
            }
            /* otherwise we didn't start with a quote mark - end.
             */
            else
                break;
        }
        else if (*subarg == '\\')       /* Escape sequences. */
            switch (*++subarg)



            {
            /* Nothing after the '\' - let the 'blank'
             * section handle it.
             */
            case '\0':
                bstr--;
                subarg--;
                j--;
                break;

            case '"':
                *bstr = '"';
                break;

            case '0':
            case '1':
            case '2':
            case '3':
            case '4':
            case '5':
            case '6':
            case '7':
                *bstr = (char) bit_pattern(&subarg);
                break;

            case '\\':
                *bstr = '\\';
                break;
            case 'b':
                *bstr = '\b';
                break;
            case 'f':
                *bstr = '\f';
                break;
            case 'n':
                *bstr = '\n';
                break;
            case 'r':
                *bstr = '\r';
                break;
            case 't':
                *bstr = '\t';
                break;
            case 's':
                *bstr = ' ';
                break;
            default:
                *bstr = *subarg;
                break;
            }
        else
            *bstr = *subarg;    /* No special character handling. */

    if (j == STRINGLEN)
        exit_error(11, nullstr);

    *bstr = '\0';
}       /* End of strget. */

#else
/*--------------------------------------------------------------------*
 * strload -- retrieve the <string> argument from the command line.
 * Escape sequences are treated just as defined in C (except \0,
 * which is an error). One extra escape sequence ('\s') exists in
 * order to handle multiple blanks on a line without bothering to
 * enclose the string in quotes.
 *--------------------------------------------------------------------*/
strload(subarg, bstr)
char    *subarg;
char    *bstr;
{
    extern char *nullstr;
    register int j;

    for (j = 0; *subarg && j < STRINGLEN; bstr++, subarg++, j++)
        if (*subarg == '\\')    /* Escape sequences. */
            switch (*++subarg)
            {
            case '0':
            case '1':
            case '2':
            case '3':
            case '4':
            case '5':
            case '6':
            case '7':
                *bstr = (char) bit_pattern(&subarg);
                break;
            case '\\':
                *bstr = '\\';
                break;
            case 'b':
                *bstr = '\b';
                break;
            case 'f':
                *bstr = '\f';
                break;
            case 'n':
                *bstr = '\n';
                break;
            case 'r':
                *bstr = '\r';
                break;
            case 't':
                *bstr = '\t';
                break;
            case 's':
                *bstr = ' ';
                break;
            default:
                *bstr = *subarg;
                break;
            }
        else
            *bstr = *subarg;    /* No special character handling. */

    if (j == STRINGLEN)
        exit_error(11, nullstr);

    *bstr = '\0';
}       /* End of strload. */
#endif


/*--------------------------------------------------------------------*
 * bit_pattern -- Change the \ddd format to a character symbol. It will
 * check to see if there are (at most two) other octal digits
 * present. It does not allow the return of the null character.
 * The pointer *ddd is only incremented by one for each extra
 * digit, because the pointer will be incremented again upon
 * returning from the function.
 *--------------------------------------------------------------------*/
bit_pattern(ddd)
char    **ddd;
{
    extern char *nullstr;
    int     num;

    num = **ddd - '0';  /* Num is octal, otherwise we wouldn't be here. */

    if (isoctal(*(*ddd + 1)))   /* Is the next character an octal digit? */
    {
        num = 8 * num + *++*ddd - '0';
        if (isoctal(*(*ddd + 1)))       /* How about this character? */
            num = 8 * num + *++*ddd - '0';
    }

    if (!num)
        exit_error(10, nullstr);        /* No \0 allowed. */

    return (num);
}       /* End of bit_pattern. */
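
The escape handling shared by strget(), strload() and bit_pattern() can be condensed into one bounds-checked routine. The following is a sketch in modern C under stated assumptions, not the author's code: decode_escapes and is_octal are hypothetical names, errors are reported with a -1 return instead of a call to exit_error(), and a trailing backslash is treated as an error.

```c
#include <assert.h>
#include <stddef.h>

static int is_octal(int c) { return c >= '0' && c <= '7'; }

/* Decode src into dst (capacity size, including the terminator).
 * Returns the decoded length, or -1 on overflow, on a \ddd that
 * evaluates to zero, or on a trailing backslash.
 */
static int decode_escapes(const char *src, char *dst, size_t size)
{
    size_t n = 0;

    assert(size > 0);
    while (*src) {
        int c = (unsigned char) *src++;

        if (c == '\\') {
            c = (unsigned char) *src++;
            switch (c) {
            case '\0': return -1;           /* nothing after the '\'   */
            case 'b':  c = '\b'; break;
            case 'f':  c = '\f'; break;
            case 'n':  c = '\n'; break;
            case 'r':  c = '\r'; break;
            case 't':  c = '\t'; break;
            case 's':  c = ' ';  break;     /* the extra "blank" escape */
            default:
                if (is_octal(c)) {          /* \d, \dd or \ddd          */
                    int digits = 1;

                    c -= '0';
                    while (digits < 3 && is_octal((unsigned char) *src)) {
                        c = 8 * c + (*src++ - '0');
                        digits++;
                    }
                    if ((c & 0xFF) == 0)
                        return -1;          /* no \0 allowed            */
                }
                break;          /* any other character stands for itself */
            }
        }
        if (n + 1 >= size)
            return -1;                      /* result would not fit     */
        dst[n++] = (char) c;
    }
    dst[n] = '\0';
    return (int) n;
}
```

Returning an error code rather than exiting keeps the routine usable outside a command-line tool; a caller that wants the listing's behavior can map -1 onto its own error exit.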

/*--------------------------------------------------------------------*
 * attachf -- Take the lines of <file2>, if any, and attach them
 * to the lines of <file1>, if any. Slip determines how many
 * lines of <file1> (<file2> if negative) are printed before
 * printing the lines from both files together. It is possible to
 * specify some slippage even if the -a or -p flags are present.
 * This is not an error. Attachf is smart enough to skip slip
 * in that case.
 *--------------------------------------------------------------------*/
attachf(fp1, fp2, slip, printsingle, printempty)
FILE    *fp1, *fp2;
int     slip;
Boolean printsingle, printempty;
{
    Boolean notempty;
    register int nxtc;

    /* Handle slippage, if any, up to the end of the file.
     */
    for (; slip > 0 && fp1 != NULL; slip--)
    {
        notempty = (nxtc = nextc(fp1)) != '\n';
        if (nxtc == EOF)
        {
            fp1 = NULL;
            break;
        }
        put_line(fp1);
        if (printsingle && (printempty || notempty))
            printf(strf, bstring);
        putchar('\n');
    }

    if (slip < 0)
        slip = -slip;

    for (; slip > 0 && fp2 != NULL; slip--)


if  ( (nxtc  -  n  ) 

{ 

fp2  -  NULL 
break; 

) 

if  (print sing  II  nxtc  J-  '\n')) 

printf (str 

put_line(fp2) 
putchar ('\n') 

) 

/•^Paste  the  lim  ogether. 

while  (fpl  «-  NU]  I 

not empty  -  (n i  !-  '\n'; 

if  (nxtc  —  EX 
( 

fpl  -  NULL; 
break; 


put_line  ( fpl ) ; 

if  (printerapt*  »extc(fp2)  I-  '\n') 

printf (strl 

put_line ( fp2 ) ; 

if  (nextc(fp2) 


whi  ,L) 

( 

■  tc  -  nextc(fpl))  !-  '\n'; 


■e  44  (printerapty  | |  noterapty  ) ) 
bat  ring ) ; 



STRUCTURED  PROGRAMMING 


Listing  One  (Text  begins  on  page  120. ) 

Listing 1. CHANGE.BAS Utility to search/replace text in a number of files.

1000 ' Batch Find/Replace Utility     Version 1.0      10/29/86
1005 ' IBM PC BASICA version 2 or later
1010 ' Copyright (c) 1987 Namir Clement Shammas
1020 DEFINT A-Z
1030 DIM FILENAME$(20),STRNG$(30),REPLACE(30),REPLACE$(30),L$(500)
1040 TRUE = 1
1050 FALSE = 0
1060 MAX.LINES = 500 ' Current maximum number of lines read from a file
1900 CLS
1910 T$ = "BATCH FILE FIND/REPLACE PROGRAM" : GOSUB 8000
1920 PRINT
1930 T$ = "VERSION 1.0" : GOSUB 8000
1940 PRINT : PRINT
2000 GOSUB 5000 ' Get filenames
2010 GOSUB 6000 ' Get strings
2030 FOR IFILE = 1 TO NUM.FILES
2060   GOSUB 7000 ' Read text lines from file
2070   FOR I = 1 TO NUM.STRINGS
2080     FOUND = FALSE
2090     FOR J = 1 TO NUM.LINES
2100       PTR = INSTR(L$(J),STRNG$(I))
2110       WHILE PTR > 0
2120         IF (FOUND = TRUE) THEN 2150
2130         FOUND = TRUE
2140         LPRINT "KEYWORD : ";STRNG$(I)
2150         B$ = STR$(J) + ":"
2153         OFFSET = LEN(B$)
2155         LPRINT J;":";L$(J)
2160         LPRINT SPC(PTR+OFFSET);"^"
2170         IF (REPLACE(I) = FALSE) THEN 2240
2180         FIRST$ = ""
2190         IF PTR > 1 THEN FIRST$ = MID$(L$(J),1,(PTR-1))
2200         LAST$ = ""
2210         IF (PTR+LEN(STRNG$(I))) => LEN(L$(J)) THEN 2230
2220         LAST$ = MID$(L$(J),(PTR+LEN(STRNG$(I))))
2230         L$(J) = FIRST$ + REPLACE$(I) + LAST$
2231         LPRINT "BECOMES" : LPRINT
2233         LPRINT J;":";L$(J) : LPRINT : LPRINT
2240         PTR = INSTR(PTR+1,L$(J),STRNG$(I))
2250       WEND
2260     NEXT J
2270   NEXT I
2275   GOSUB 9000 ' Write file back
2277   LPRINT : LPRINT
2280 NEXT IFILE
2290 LPRINT CHR$(140) ' FORM FEED
3000 END

5000 ' Subroutine to input filenames from the keyboard
5010 NUM.FILES = 0
5020 WHILE NUM.FILES <= 0
5030   INPUT "Enter number of files ";NUM.FILES
5040   PRINT
5050 WEND
5060 FOR I = 1 TO NUM.FILES
5070   PRINT "Enter filename # ";I;" ";
5080   INPUT FILENAME$(I) : PRINT
5090   IF FILENAME$(I) = "" THEN 5070
5100 NEXT I
5110 RETURN
6000 ' Subroutine to input search/replace strings
6010 NUM.STRINGS = 0
6020 WHILE NUM.STRINGS <= 0
6030   INPUT "Enter number of search/replace strings ";NUM.STRINGS
6040   PRINT
6050 WEND
6060 FOR I = 1 TO NUM.STRINGS
6065   REPLACE$(I) = ""
6070   PRINT : PRINT "For string # ";I
6080   INPUT "    Enter string ";STRNG$(I)
6090   INPUT "    R)eplace F)ind ";A$
6100   IF (INSTR("Rr",MID$(A$,1,1)) = 0) THEN REPLACE(I) = FALSE ELSE REPLACE(I) = TRUE
6110   IF REPLACE(I) = FALSE THEN 6125
6120   INPUT "    Enter replacement string ";REPLACE$(I)
6125   PRINT
6130 NEXT I
6140 RETURN
7000 ' Subroutine to read text lines
7003 LPRINT "PROCESSING FILE : ";FILENAME$(IFILE)
7006 OPEN "I",1,FILENAME$(IFILE)
7010 NUM.LINES = 0
7020 WHILE (NOT EOF(1)) AND (NUM.LINES < MAX.LINES)
7030   NUM.LINES = NUM.LINES + 1
7040   LINE INPUT#1,L$(NUM.LINES)
7050 WEND
7060 CLOSE #1
7070 RETURN
8000 ' Subroutine to center a message
8010 PRINT SPC(40 - LEN(T$)/2);T$
8020 RETURN
9000 ' Subroutine to write the updated file
9010 OPEN "O",1,FILENAME$(IFILE)
9020 FOR I = 1 TO NUM.LINES
9030   PRINT#1,L$(I)
9040 NEXT I
9050 CLOSE #1
9060 RETURN

End  Listing  One 
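
The inner WHILE loop of Listing One (lines 2100 through 2250) finds each occurrence of a search string in a line and splices in the replacement. The same operation is sketched below in C for comparison. This is an illustrative translation under assumptions, not the author's program: replace_all is a hypothetical name, the result goes to a separate fixed-size buffer, and the search resumes after the inserted replacement rather than at PTR+1 as line 2240 does, so a replacement that contains the search string is not matched again.

```c
#include <assert.h>
#include <string.h>

/* Replace every occurrence of find in line with repl, writing the
 * result to out (capacity size). Returns the number of replacements,
 * or -1 if find is empty or the result does not fit.
 */
static int replace_all(const char *line, const char *find,
                       const char *repl, char *out, size_t size)
{
    size_t flen, rlen, used = 0;
    int    count = 0;

    assert(line && find && repl && out);
    flen = strlen(find);
    rlen = strlen(repl);
    if (flen == 0 || size == 0)
        return -1;

    while (*line) {
        if (strncmp(line, find, flen) == 0) {
            if (used + rlen >= size)
                return -1;              /* no room for replacement */
            memcpy(out + used, repl, rlen);
            used += rlen;
            line += flen;               /* skip the matched text   */
            count++;
        } else {
            if (used + 1 >= size)
                return -1;
            out[used++] = *line++;      /* copy one ordinary char  */
        }
    }
    out[used] = '\0';
    return count;
}
```

For example, replacing "cat" with "dog" in "cat concat" yields "dog condog" with a count of 2.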


Listing  Two 


Listing 2. CHNG1.TRU, the True BASIC version of CHANGE.BAS produced by the
BASIC-Converter.


This  program  converted  from  the  Microsoft  Advanced  Basic 
language  on  the  IBM  PC  to  the  True  BASIC  language. 

Convertor  copyright  (c)  1985  by: 

True  BASIC,  Inc. 

Hanover,  NH  03755 
All  rights  reserved. 

True  BASIC  makes  no  warranty,  expressed  or  implied,  that 
this  converted  program  is  a  precise  and  accurate  equivalent 
of  the  original  BasicA  program.  This  conversion  is  provided 
only  as  an  aid  to  a  complete  conversion  by  the  owner  of  the 
program  being  converted. 


lof 

val 


err,  erl 


t  argl 


10 
11 
12 

13 

14 

15 

16 

17 

18 

19 

20 
21 
22 

23 

24  LIBRARY  "deflib" 

25  DECLARE  DEF  csrlin,  oef,  fre,  hex$,  inkeyS,  loc, 

26  DECLARE  DEF  mki$,  mks$,  cvi,  cvs,  oct$,  csr  pos, 

27 

28 DEF Eof (f)
29    IF end #f then LET eof = -1 else LET eof = 0
30 END DEF
31
32 DEF Loc (f)
33    ASK #f: record T_ARG1
34    LET t_arg1 = -int(-(t_arg1-1)/128)
35    IF t_arg1 = 0 then let loc = 1 else let loc =
36 END DEF
37
38 DEF Lof (f)
39    ASK #f: filesize T_ARG1
40    LET lof = t_arg1
41 END DEF
42
43 OPTION BASE 0

44 

1000 ! Batch Find/Replace Utility Version 1.0     10/29/86
1005 ! IBM PC BASICA version 2 or later
1010 ! Copyright (c) 1987 Namir Clement Shammas
1020 ! defint A-Z
1030 dim filename$(20), strng$(30), replace(30), replace$(30), l$(500)
1040 let true = 1
1050 let false = 0
1060 let max_lines = 500 ! Current maximum number of lines read from a file
1900 clear
1910 let t$ = "BATCH FILE FIND/REPLACE PROGRAM"
1911 gosub 8000
1920 print
1930 let t$ = "VERSION 1.0"
1931 gosub 8000
1940 print
1941 print
1945 OPEN #9 : PRINTER
2000 gosub 5000 ! Get filenames
2010 gosub 6000 ! Get strings
2030 for ifile = 1 to num_files
2060 gosub 7000 ! Read text lines from file
2070 for i = 1 to num_strings

2080 let found = false
2090 for j = 1 to num_lines
2100 let ptr = pos(l$(j),strng$(i))
2110 do while ptr > 0
2120 if (found = true) then goto 2150
2130 let found = true
2140 print #9 : "KEYWORD : ";STRNG$(I)
2150 let b$ = str$(j) &
2153 let offset = round(len(b$))
2155 print #9 : J;":";L$(J)
2160 print #9 : REPEAT$(" ",(PTR+OFFSET+1));"^" ! Manual fix on this line
2170 if (replace(i) = false) then goto 2240
2180 let first$ = ""
2190 if ptr > 1 then let first$ = (l$(j))[1:1+(ptr-1)-1]
2200 let last$ = ""
2210 if (ptr+len(strng$(i))) >= len(l$(j)) then goto 2230
2220 let last$ = (l$(j))[(ptr+len(strng$(i))):maxnum]
2230 let l$(j) = first$ & replace$(i) & last$
2231 print #9 : "BECOMES"
2232 print #9 :
2233 print #9 : J;":";L$(J)
2234 print #9 :
2235 print #9 :
2240 let ptr = pos(l$(j),strng$(i),ptr+1)
2250 loop
2260 next j
2270 next i
2275 gosub 9000 ! Write file back
2277 print #9 :
2278 print #9 :
2280 next ifile
2290 print #9 : CHR$(140) ! FORM FEED
3000 stop

5000 ! Subroutine to input filenames from the keyboard
5010 let num_files = 0
5020 do while num_files <= 0
5030 input prompt "Enter number of files ": num_files
5040 print
5050 loop
5060 for i = 1 to num_files
5070 print "Enter filename # "; i; " ";
5080 input filename$(i)
5081 print
5090 if filename$(i) = "" then goto 5070
5100 next i
5110 return

6000 ! Subroutines to input search/replace strings
6010 let num_strings = 0
6020 do while num_strings <= 0
6030 input prompt "Enter number of search/replace strings ": num_strings
6040 print
6050 loop
6060 for i = 1 to num_strings
6065 let replace$(i) = ""
6070 print
6071 print "For string # "; i
6080 input prompt "    Enter string ": strng$(i)
6090 input prompt "    R)eplace F)ind ": a$
6100 if (pos("Rr",(a$)[1:1]) = 0) then let replace(i) = false else let replace(i) = true
6110 if replace(i) = false then goto 6125
6120 input prompt "    Enter replacement string ": replace$(i)
6125 print
6130 next i
6140 return

7000 ! Subroutines to read text lines
7003 print #9 : "PROCESSING FILE : ";FILENAME$(IFILE)
7006 open #1: name filename$(ifile), access input, create old
7010 let num_lines = 0
7020 do while ((not eof(1) <> 0)) and (num_lines <= max_lines)
7030 let num_lines = num_lines+1
7040 line input #1: l$(num_lines) ! Manual fix here
7050 loop
7060 close #1
7070 return

8000 ! Subroutine to center a message
8010 print tab(csr_pos+40-len(t$)/2); t$
8020 return

9000 ! Subroutine to write the updated file
9010 open #1: name filename$(ifile), access output, create old
9015 erase #1 ! this line is added
9020 for i = 1 to num_lines
9030 print #1: l$(i)
9040 next i
9050 close #1
9060 return
9061 end

End  Listing  Two 

Listing  Three 


Listing 3. CHNG2.TRU, the True BASIC version of CHANGE.BAS that is
translated manually.

! Batch Find/Replace Utility Version 1.0     10/29/86
! IBM PC True BASIC version 1
! Copyright (c) 1987 Namir Clement Shammas

DIM FILENAME$(20),STRNG$(30),REPLACE(30),REPLACE$(30),L$(500)

LET TRUE = 1
LET FALSE = 0
LET MAX_LINES = 500 ! Current maximum number of lines read from a file
CLEAR ! Clear screen
CALL CenterText("BATCH FILE FIND/REPLACE PROGRAM")
PRINT
CALL CenterText("VERSION 1.0")
PRINT
PRINT
OPEN #9 : PRINTER
CALL GetFile(FILENAME$, NUM_FILES) ! Get filenames
CALL GetStrings(STRNG$,REPLACE$,REPLACE,NUM_STRINGS) ! Get strings
FOR IFILE = 1 TO NUM_FILES
  CALL ReadLines(L$,FILENAME$,IFILE,NUM_LINES) ! Read text lines from file
  FOR I = 1 TO NUM_STRINGS
    LET FOUND = FALSE
    FOR J = 1 TO NUM_LINES
      LET PTR = POS(L$(J),STRNG$(I))
      DO WHILE PTR > 0
        IF (FOUND = FALSE) THEN
          LET FOUND = TRUE
          PRINT #9 : "KEYWORD : ";STRNG$(I)
        END IF
        LET B$ = STR$(J) & ! Use & to concatenate strings
        LET OFFSET = LEN(B$)
        PRINT #9 : J;":";L$(J)
        PRINT #9 : REPEAT$(" ",(PTR+OFFSET+1));"^"
        IF (REPLACE(I) = TRUE) THEN
          LET FIRST$ = ""
          IF PTR > 1 THEN LET FIRST$ = L$(J)[1:(PTR-1)]
          LET LAST$ = ""
          IF (PTR+LEN(STRNG$(I))) < LEN(L$(J)) THEN
            LET LAST$ = L$(J)[(PTR+LEN(STRNG$(I))):LEN(L$(J))]
          END IF
          LET L$(J) = FIRST$ & REPLACE$(I) & LAST$
          PRINT #9 : "BECOMES"
          PRINT #9 :
          PRINT #9 : J;":";L$(J)
          PRINT #9 :
          PRINT #9 :
        END IF
        LET PTR = POS(L$(J),STRNG$(I),(PTR+1))
      LOOP
    NEXT J
  NEXT I
  CALL WriteLines(L$,FILENAME$,REPLACE,IFILE,NUM_LINES) ! Write file back
  PRINT #9 :
  PRINT #9 :
NEXT IFILE
PRINT #9 : CHR$(140) ! FORM FEED

SUB GetFile(FILENAME$(), NUM_FILES)
  ! Subroutine to input filenames from the keyboard
  LET NUM_FILES = 0
  DO WHILE NUM_FILES <= 0
    INPUT PROMPT "Enter number of files ":NUM_FILES
    PRINT
  LOOP
  FOR I = 1 TO NUM_FILES
    LET FILENAME$(I) = ""
    DO WHILE FILENAME$(I) = ""
      PRINT "Enter filename # ";I;" ";
      INPUT FILENAME$(I)
      PRINT
    LOOP
  NEXT I
END SUB

SUB GetStrings(STRNG$(),REPLACE$(),REPLACE(),NUM_STRINGS)
  ! Subroutines to input search/replace strings
  LET NUM_STRINGS = 0
  DO WHILE NUM_STRINGS <= 0
    INPUT PROMPT "Enter number of search/replace strings ":NUM_STRINGS
    PRINT
  LOOP
  FOR I = 1 TO NUM_STRINGS
    LET REPLACE$(I) = ""
    PRINT
    PRINT "For string # ";I
    INPUT PROMPT "    Enter string ":STRNG$(I)
    INPUT PROMPT "    R)eplace F)ind ":A$
    IF (POS("Rr",A$[1:1]) = 0) THEN
      LET REPLACE(I) = FALSE
    ELSE
      LET REPLACE(I) = TRUE
      INPUT PROMPT "    Enter replacement string ":REPLACE$(I)
    END IF
    PRINT
  NEXT I
END SUB

SUB ReadLines(L$(),FILENAME$(),INDEX,NUM_LINES)
  ! Subroutines to read text lines
  PRINT #9 : "PROCESSING FILE : ";FILENAME$(INDEX)
  OPEN #1 : NAME FILENAME$(INDEX), ORGANIZATION TEXT, ACCESS INPUT, CREATE OLD
  LET NUM_LINES = 0
  DO WHILE MORE #1
    LET NUM_LINES = NUM_LINES + 1
    LINE INPUT #1 : L$(NUM_LINES)
  LOOP
  CLOSE #1
END SUB

SUB CenterText(T$)
  ! Subroutine to center a message
  PRINT REPEAT$(" ",(40 - LEN(T$)/2));T$
END SUB

SUB WriteLines(L$(),FILENAME$(),INDEX,NUM_LINES)
  ! Subroutine to write the updated file
  OPEN #1 : NAME FILENAME$(INDEX), ORGANIZATION TEXT, ACCESS OUTPUT, CREATE OLD
  ERASE #1
  FOR I = 1 TO NUM_LINES
    PRINT #1 : L$(I)
  NEXT I
  CLOSE #1
END SUB

END

End Listing Three

Listing  Four 

Listing 4. CHNG1.BAS, the first QuickBASIC version of CHANGE.BAS that is
translated manually.

' Batch Find/Replace Utility Version 1.0     10/29/86
' IBM PC QuickBASIC version 2
' Copyright (c) 1987 Namir Clement Shammas
DEFINT A-Z
DIM FILENAME$(20),STRNG$(30),REPLACE(30),REPLACE$(30),L$(500)

TRUE = 1
FALSE = 0
MAX.LINES = 500 ' Current maximum number of lines read from a file
CLS
T$ = "BATCH FILE FIND/REPLACE PROGRAM" : GOSUB Center
PRINT
T$ = "VERSION 1.0" : GOSUB Center
PRINT : PRINT
GOSUB GetFile ' Get filenames
GOSUB GetStrings ' Get strings
FOR IFILE = 1 TO NUM.FILES
  GOSUB ReadLines ' Read text lines from file
  FOR I = 1 TO NUM.STRINGS
    FOUND = FALSE
    FOR J = 1 TO NUM.LINES
      PTR = INSTR(L$(J),STRNG$(I))
      WHILE PTR > 0
        IF (FOUND = FALSE) THEN
          FOUND = TRUE
          LPRINT "KEYWORD : ";STRNG$(I)
        END IF
        B$ = STR$(J) +
        OFFSET = LEN(B$)
        LPRINT J;":";L$(J)
        LPRINT SPC(PTR+OFFSET);"^"
        IF (REPLACE(I) = TRUE) THEN
          FIRST$ = ""
          IF PTR > 1 THEN FIRST$ = MID$(L$(J),1,(PTR-1))
          LAST$ = ""
          IF (PTR+LEN(STRNG$(I))) < LEN(L$(J)) THEN
            LAST$ = MID$(L$(J),(PTR+LEN(STRNG$(I))))
          END IF
          L$(J) = FIRST$ + REPLACE$(I) + LAST$
          LPRINT "BECOMES" : LPRINT
          LPRINT J;":";L$(J) : LPRINT : LPRINT
        END IF
        PTR = INSTR(PTR+1,L$(J),STRNG$(I))
      WEND
    NEXT J
  NEXT I
  GOSUB WriteLines ' Write file back
  LPRINT : LPRINT
NEXT IFILE
LPRINT CHR$(140) ' FORM FEED
END

GetFile: ' Subroutine to input filenames from the keyboard
  NUM.FILES = 0
  WHILE NUM.FILES <= 0
    INPUT "Enter number of files ";NUM.FILES
    PRINT
  WEND
  FOR I = 1 TO NUM.FILES
    FILENAME$(I) = ""
    WHILE FILENAME$(I) = ""
      PRINT "Enter filename # ";I;" ";
      INPUT FILENAME$(I) : PRINT
    WEND
  NEXT I
RETURN

GetStrings: ' Subroutines to input search/replace strings
  NUM.STRINGS = 0
  WHILE NUM.STRINGS <= 0
    INPUT "Enter number of search/replace strings ";NUM.STRINGS
    PRINT
  WEND
  FOR I = 1 TO NUM.STRINGS
    REPLACE$(I) = ""
    PRINT : PRINT "For string # ";I
    INPUT "    Enter string ";STRNG$(I)
    INPUT "    R)eplace F)ind ";A$
    IF (INSTR("Rr",MID$(A$,1,1)) = 0) THEN REPLACE(I) = FALSE ELSE REPLACE(I) = TRUE
    IF REPLACE(I) = TRUE THEN
      INPUT "    Enter replacement string ";REPLACE$(I)
    END IF
    PRINT
  NEXT I
RETURN

ReadLines: ' Subroutines to read text lines
  LPRINT "PROCESSING FILE : ";FILENAME$(IFILE)
  OPEN "I",1,FILENAME$(IFILE)
  NUM.LINES = 0
  WHILE (NOT EOF(1)) AND (NUM.LINES <= MAX.LINES)
    NUM.LINES = NUM.LINES + 1
    LINE INPUT#1,L$(NUM.LINES)
  WEND
  CLOSE #1
RETURN

Center: ' Subroutine to center a message
  PRINT SPC(40 - LEN(T$)/2);T$
RETURN

WriteLines: ' Subroutine to write the updated file
  OPEN "O",1,FILENAME$(IFILE)
  FOR I = 1 TO NUM.LINES
    PRINT#1,L$(I)
  NEXT I
  CLOSE #1
RETURN


End  Listing  Four 


Listing  Five 

Listing 5. CHNG2.BAS, the second QuickBASIC version of CHANGE.BAS that is
translated manually.

' Batch Find/Replace Utility Version 1.0     10/29/86
' IBM PC QuickBASIC version 2
' Copyright (c) 1987 Namir Clement Shammas
DEFINT A-Z
DIM FILENAME$(20),STRNG$(30),REPLACE(30),REPLACE$(30),L$(500)

TRUE = 1
FALSE = 0
MAX.LINES = 500 ' Current maximum number of lines read from a file
CLS
CALL CenterText("BATCH FILE FIND/REPLACE PROGRAM")
PRINT
CALL CenterText("VERSION 1.0")
PRINT : PRINT
CALL GetFile(FILENAME$(), NUM.FILES) ' Get filenames
CALL GetStrings(STRNG$(),REPLACE$(),REPLACE(),NUM.STRINGS) ' Get strings
FOR IFILE = 1 TO NUM.FILES
  ' Read text lines from file
  CALL ReadLines(L$(),FILENAME$(),IFILE,NUM.LINES)
  FOR I = 1 TO NUM.STRINGS
    FOUND = FALSE
    FOR J = 1 TO NUM.LINES
      PTR = INSTR(L$(J),STRNG$(I))
      WHILE PTR > 0
        IF (FOUND = FALSE) THEN
          FOUND = TRUE
          LPRINT "KEYWORD : ";STRNG$(I)
        END IF
        B$ = STR$(J) +
        OFFSET = LEN(B$)
        LPRINT J;":";L$(J)
        LPRINT SPC(PTR+OFFSET);"^"
        IF (REPLACE(I) = TRUE) THEN
          FIRST$ = ""
          IF PTR > 1 THEN FIRST$ = MID$(L$(J),1,(PTR-1))
          LAST$ = ""
          IF (PTR+LEN(STRNG$(I))) < LEN(L$(J)) THEN
            LAST$ = MID$(L$(J),(PTR+LEN(STRNG$(I))))
          END IF
          L$(J) = FIRST$ + REPLACE$(I) + LAST$
          LPRINT "BECOMES" : LPRINT
          LPRINT J;":";L$(J) : LPRINT : LPRINT
        END IF
        PTR = INSTR(PTR+1,L$(J),STRNG$(I))
      WEND
    NEXT J
  NEXT I
  ' Write file back
  CALL WriteLines(L$(),FILENAME$(),REPLACE(),IFILE,NUM.LINES)
  LPRINT : LPRINT
NEXT IFILE
LPRINT CHR$(140) ' FORM FEED
END

SUB GetFile(FILENAME$(1), NUM.FILES) STATIC
  ' Subroutine to input filenames from the keyboard
  NUM.FILES = 0
  WHILE NUM.FILES <= 0
    INPUT "Enter number of files ";NUM.FILES
    PRINT
  WEND
  FOR I = 1 TO NUM.FILES
    FILENAME$(I) = ""
    WHILE FILENAME$(I) = ""
      PRINT "Enter filename # ";I;" ";
      INPUT FILENAME$(I) : PRINT
    WEND
  NEXT I
END SUB

SUB GetStrings(STRNG$(1),REPLACE$(1),REPLACE(1),NUM.STRINGS) STATIC
  ' Subroutines to input search/replace strings
  NUM.STRINGS = 0
  WHILE NUM.STRINGS <= 0
    ' INPUT (not PRINT) so that the loop can terminate
    INPUT "Enter number search/replace strings ";NUM.STRINGS
  WEND
  FOR I = 1 TO NUM.STRINGS
    REPLACE$(I) = ""
    PRINT : PRINT "For string # ";I
    INPUT "    Enter string ";STRNG$(I)
    INPUT "    R)eplace F)ind ";A$
    IF (INSTR("Rr",MID$(A$,1,1)) = 0) THEN
      REPLACE(I) = FALSE
    ELSE
      REPLACE(I) = TRUE
      INPUT "    Enter replacement string ";REPLACE$(I)
    END IF
    PRINT
  NEXT I
END SUB

SUB ReadLines(L$(1),FILENAME$(1),INDEX,NUM.LINES) STATIC
  ' Subroutines to read text lines
  LPRINT "PROCESSING FILE : ";FILENAME$(INDEX)
  OPEN "I",1,FILENAME$(INDEX)
  NUM.LINES = 0
  WHILE (NOT EOF(1)) ' AND (NUM.LINES <= MAX.LINES)
    NUM.LINES = NUM.LINES + 1
    LINE INPUT#1,L$(NUM.LINES)
  WEND
  CLOSE #1
END SUB

SUB CenterText(T$) STATIC
  ' Subroutine to center a message
  PRINT SPC(40 - LEN(T$)/2);T$
END SUB

SUB WriteLines(L$(1),FILENAME$(1),INDEX,NUM.LINES) STATIC
  ' Subroutine to write the updated file
  OPEN "O",1,FILENAME$(INDEX)
  FOR I = 1 TO NUM.LINES
    PRINT#1,L$(I)
  NEXT I
  CLOSE #1
END SUB

End  Listings 




COLUMNS 


C  CHEST 


Nr:  A  C  Implementation  of  Nroff;  Part  2 


by Allen Holub

This month I'll continue discussing the nr text formatter that I
introduced last month. I'll present the first part of a complete users'
guide and continue it in the next column. The source-code disk contains a
complete implementation of the ms macro package. (See the end of this
column for information about the source-code disk.) Nr is as much a
programming language as it is a text formatter, and a look at a complex
macro package such as ms can show you how to program in that language.

I should preface this article by saying that I've implemented the Unix
nroff as closely as I can. Over the course of several years, I've learned
more things about nroff than I actually care to know. I do not claim,
however, to know everything there is to know about the real nroff. As a
consequence, there may be a few differences between nr and the real nroff,
introduced because I can't figure out how the real nroff works. Sorry. I
should also say that, though I've used the program presented here for
several years now and don't know about any bugs, I'm a creature of habit
and probably haven't exercised those parts of the program that have bugs
in them. In particular, when you get into the realm of fancy laser
printers and proportional spacing, nr may not work without your having to
modify the program somewhat. It works fine on the various printers I own
(HP Thinkjet, Brother HR-15, and HP Laserjet+), but these are the only
printers on which the program has been used. I've added the
proportional-spacing features very recently, so I don't have as much
confidence in that part of the program as I do in the older parts.

As I write this article, I'm looking at the code more closely than I have
for a while. As a consequence I'm noticing (and fixing) a few nroff
incompatibilities I hadn't noticed before. Two such fixes affect one of
the subroutines I discussed last month: the expression parser in parse.c.
This parser treats the 'str1'str2' expression as if it were using the C
strcmp() function. Nroff, on the other hand, evaluates this expression to
true if the two strings are equal and to false if they are not, which is
the opposite of strcmp(). A second problem with the parser is actually a
bug: it shouldn't recognize a quote as white space. To modify parse.c to
fix these problems, replace line 293 of parse.c with:

rval = !strcmp( s1, s2 );

and change line 137 to:

while( isspace(*str) )
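The first fix is easy to get backward, so here is a minimal sketch of the intended truth value. The function name strings_equal is hypothetical and is not taken from parse.c; it only illustrates why the strcmp() result has to be negated.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from parse.c): gives the 'str1'str2' test
 * the nroff meaning, 1 when the strings are equal and 0 otherwise. */
int strings_equal(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);
    /* strcmp() returns 0 for equal strings, so negate its result. */
    return !strcmp(s1, s2);
}
```

With this definition, strings_equal("abc", "abc") is true and strings_equal("abc", "abd") is false, which is the behavior the text describes for nroff.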

Nr  Users’  Guide 

It's almost impossible to describe a program as complex as nr in an
orderly fashion because there's no way to organize the material to avoid
forward references. Consequently, you'll probably have to read this guide
(and its conclusion in my next column) twice: once to get a general idea
of how the program works and a second time to fill in the details.

Nr is an almost complete implementation of the Unix nroff text formatter.
It incorporates several of troff's functions as well, and it can generate
output for most printers without any modifications to the source code.

Nr is a compiler-like text formatter. You create the input text with a
normal editor and then submit it to nr just like you'd submit a program
to a compiler. Nr formats the input and sends the resultant text to
standard output (so you have to redirect it if you don't want to display
it on the screen). You invoke nr with:

nr [-switches] files . . . [ > stream ]

You can list several files; they are just concatenated as the program
runs. The command-line switches are optional, and several of them are
position sensitive. Table 1, below, summarizes supported switches.

-c          don't print (c)ontrol characters
-d          print only o(d)d pages
-e          print only (e)ven pages
-m<str>     append (m)acro: /lib/tmac/<str>.mac
-n<num>     (n)umber first page <num>
-o<list>    print (o)nly pages in list <list>
-p          suppress bold, underline, overstrike
-r<str>     set number (r)eg: -rx<num> -r(xx<num>
-s<num>     (s)top every <num> pages
-t<str>     set s(t)ring: -tx<str> -t(xx<str>
-v          (v)erbose mode, echo input commands

Table 1: Summary of command-line arguments

The switches are:

-: print a list of legal command-line switches.

-c: map all control characters, if present, to visible characters before
they're printed. This option is particularly useful for debugging escape
sequences that are sent directly to the printer. Nonprinting characters
are output as <DD>, where DD is two hex digits.
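The mapping that -c performs can be sketched as follows. This is an illustration only, not the code nr uses: the function name visible_char is hypothetical, uppercase hex digits are an assumption (the text says only "two hex digits"), and a real formatter would presumably let newlines through unchanged.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: writes a visible form of c into dst, which must
 * hold at least 5 bytes, and returns the number of characters written.
 * Printable characters are copied; all others become <DD>. */
int visible_char(unsigned char c, char *dst)
{
    assert(dst != NULL);
    if (isprint(c)) {
        dst[0] = (char)c;
        dst[1] = '\0';
        return 1;
    }
    /* Assumption: uppercase hex, zero padded to two digits. */
    return sprintf(dst, "<%02X>", (unsigned)c);
}
```

Under these assumptions an ESC character (hex 1B) comes out as the four characters <1B>, which makes a stray printer escape sequence easy to spot in the output.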

-d: print only odd-numbered pages. This option is useful if you're
sending output to a laser printer and want two-sided output. This command
interacts with the -o and -n switches described later (for example,
nr -d -o10-20 file.nr >prn prints only odd-numbered pages in the range 11
to 19).

-e: print only even-numbered pages.

-m<name>: cause the contents of a macro file to be processed before any
of the normal input files are processed. You can think of -m as short for
/lib/tmac/name.mac. For example, the switch -ms causes nr to process the
file /lib/tmac/s.mac. If you specify several -m options, the files are
processed in order from left to right, and all macro files are processed
before any normal files are processed.
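The way a -m argument turns into a file name can be sketched like this. The function name macro_path and its buffer-size parameter are hypothetical; only the /lib/tmac/ prefix and the .mac suffix come from the description above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds "/lib/tmac/<name>.mac" in dst.
 * Returns 1 on success, 0 if dst (of the given size) is too small. */
int macro_path(const char *name, char *dst, size_t size)
{
    static const char prefix[] = "/lib/tmac/";
    static const char suffix[] = ".mac";

    assert(name != NULL && dst != NULL);
    if (strlen(prefix) + strlen(name) + strlen(suffix) + 1 > size)
        return 0;                       /* would overflow dst */
    sprintf(dst, "%s%s%s", prefix, name, suffix);
    return 1;
}
```

Called with "s" (the text after -m in -ms), the sketch produces /lib/tmac/s.mac.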

-n<num>: cause the first page to be numbered <num>. For example, -n10
causes page numbering to start at 10.

-o<list>: print only those pages in <list>. The list can take several
forms. The simplest is -o1,3,5, which prints only pages 1, 3, and 5. You
can specify ranges of pages, as in -o5-10, which prints pages 5 to 10
inclusive. The notation -o-10 means print all pages from the beginning of
the document up to and including page 10. Similarly, -o10- means print
from page 10 to the end of the document. You can combine all these forms,
as in -o-10,12,15-20,30-, which prints pages 1 to 10, page 12, pages 15
to 20, and from page 30 to the end of the document. Note that the -n
option interacts with the -o and -e options; that is, if you say -n5,
then saying -o2 won't work because there is no page 2.
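The page-list notation is compact enough that a short sketch may help. The code below is an assumption about how such a list could be tested, not the routine nr uses; page_selected and should_print are hypothetical names, and malformed lists are simply treated as selecting nothing.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if page is selected by a list such as
 * "-10,12,15-20,30-" (the text that follows -o), 0 otherwise. */
int page_selected(const char *list, int page)
{
    const char *p = list;

    assert(list != NULL);
    while (*p) {
        char *end;
        long lo = 1, hi;

        if (*p != '-') {                /* explicit first page */
            lo = strtol(p, &end, 10);
            if (end == p)
                return 0;               /* malformed list */
            p = end;
        }
        hi = lo;                        /* single page unless a range */
        if (*p == '-') {
            p++;
            if (*p >= '0' && *p <= '9') {
                hi = strtol(p, &end, 10);
                p = end;
            } else {
                hi = INT_MAX;           /* open range: to end of document */
            }
        }
        if (page >= lo && page <= hi)
            return 1;
        if (*p == ',')
            p++;
        else if (*p)
            return 0;                   /* unexpected character */
    }
    return 0;
}

/* Combines the list with the odd-pages-only filter of the -d switch. */
int should_print(const char *list, int page, int odd_only)
{
    return page_selected(list, page) && (!odd_only || page % 2 == 1);
}
```

With the list "10-20" and odd_only set, should_print() accepts pages 11, 13, 15, 17, and 19, matching the -d -o10-20 example given earlier.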

-p: generate plain output (suppress all boldface, underline, and
overstrike).

-r<str>: initialize a number register (described later). This option can
have two forms:

-rx123
-r(xx123

The first form initializes the single-character number register x to 123;
the second initializes the two-character register xx. These numbers can
be used in the document with \nx and \n(xx (see later).
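The two forms differ only in the leading parenthesis, as this sketch shows. The function name parse_register_arg is hypothetical, and plain decimal values are an assumption; the real program may accept more.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: arg is the text that follows -r, for example
 * "x123" or "(xx123". On success the register name (one or two
 * characters) is stored in name, the number in *value, and 1 is
 * returned; 0 is returned if the argument is malformed. */
int parse_register_arg(const char *arg, char name[3], long *value)
{
    size_t len = 1;                     /* single-character name */
    char *end;

    assert(arg != NULL && name != NULL && value != NULL);
    if (*arg == '(') {                  /* two-character name follows */
        arg++;
        len = 2;
    }
    if (strlen(arg) <= len)
        return 0;                       /* name with no number */
    memcpy(name, arg, len);
    name[len] = '\0';
    *value = strtol(arg + len, &end, 10);
    return end != arg + len && *end == '\0';
}
```

Given "x123" the sketch yields the name x and the value 123; given "(xx123" it yields the name xx and the same value.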

-s<num>: stop output every <num> pages. This option is useful if you have
to hand-feed paper into your printer one page at a time (use -s1 for this
purpose).

-t<str>: initialize a text string macro (works just like -r does). The
text is available inside the document using the \*x and \*(xx mechanisms,
described later. Note that you have to quote the string to get blanks
into the text:

nr "-txtext with spaces" file

-v (verbose mode): cause commands to be echoed to standard output just
before they're executed but after all the escape sequences (described
later) have been expanded. Commands that are part of macro definitions
aren't echoed. The name of the input source (a file or macro name) is
printed as well. This is a debugging option.

Input  to  Nr 

The command structure and command names nr uses are almost identical to
those that nroff uses. There are a few minor differences that I will
discuss later. Because I didn't want to create a binary intermediate
file, such as the one used by ditroff (device-independent troff), I've
added several nonstandard commands to support configuration to various
printers. Nonstandard commands are identified as such in the following
command descriptions.

One of my original intentions in writing nr was to be able to write
documents at home and then upload these to the nroff system at school for
final typesetting. Consequently, I tried to make the move as painless as
possible. At the macro level, nr is identical to nroff. I've written an
implementation of the ms macro package that's in use at UC Berkeley. If
your documents are formatted with ms, as are the overwhelming majority of
nroff and troff documents, porting to a real Unix system is trivial. The
few minor differences between the nr internal commands and the real nroff
are well documented and easily translated. I just recently transferred a
complete book from nr to the VAX at school and for the most part
experienced no difficulties. The main problem I had was with translating
macros not in the ms package to the real nroff/troff. Nr is better
documented than nroff itself. As a consequence, writing real nroff macros
can be difficult. Once you have created the equivalent macros,
translation is no problem, of course. The other problems I had were
typesetter-related. A typesetter is not a daisywheel printer, and the
differences took a few days to figure out.

Nr takes as input a normal ASCII text file that contains intermingled
text and formatting commands. Note that nr won't automatically map ASCII
to a funny daisywheel; you have to do it yourself. Nr, unlike troff,
understands the entire ASCII character set. Some of the characters (such
as \) have a special meaning to nr, however, and have to be entered in a
special way, discussed later. There's also a provision for printing
special non-ASCII characters.

Nr commands take two forms: dot commands and escape sequences. Dot
commands all start with a dot in the leftmost column. The dot is followed
by a one- or two-letter command name. All of the built-in commands have
two-letter names. You can create new commands using nr's macro
capability, however, and these can have either one- or two-letter names.
There can be any amount of white space (spaces or tabs) between the dot
and the first character of the name, which is useful inside a macro if
you want to indent the body of an .if statement. Because .if and .ie
(if . . . else) statements nest, indenting can help a great deal.

Escape sequences, the other sort of command, are text strings that are
embedded in the text itself. They all begin with a backslash (\) but are
otherwise dissimilar. You use escape sequences for such tasks as changing
fonts on the fly or expanding certain macros. The \fI escape sequence,
for example, changes the current font to italics and \fP puts it back to
the previous state. You can put a word into italics with \fIword\fP.

Expressions 

All the nr commands that take numeric arguments can also take expressions
(which are computed as the document is processed) instead of absolute
numbers. Several operators are available, shown in Table 2. All these
operators work just like their C equivalents do except that expression
evaluation doesn't terminate when the truth or falsity of an && or ||
expression is determined. Note that this is a more powerful expression
syntax than is supported by the real nroff.
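To make the precedence levels and the non-terminating && and || concrete, here is a small evaluator written as a sketch. It is not the parser from parse.c: all names are hypothetical, it handles only integers (no string comparison, no escape sequences), it uses the levels listed in Table 2, and returning 0 on division by zero is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

static const char *src;                 /* current position in the text */

static long expr(int min_level);

static void skip_ws(void)
{
    while (isspace((unsigned char)*src))
        src++;
}

/* Level 5: numbers, parentheses, unary minus, and logical NOT. */
static long primary(void)
{
    char *end;
    long v;

    skip_ws();
    if (*src == '(') {
        src++;
        v = expr(1);
        skip_ws();
        if (*src == ')')
            src++;
        return v;
    }
    if (*src == '-') { src++; return -primary(); }
    if (*src == '!') { src++; return !primary(); }
    v = strtol(src, &end, 10);
    src = end;
    return v;
}

/* Returns the level of the binary operator at src (0 if there is none)
 * and stores a one-character code for it plus its length. */
static int binary_op(char *op, int *len)
{
    skip_ws();
    *len = 1;
    *op = *src;
    switch (*src) {
    case '*': case '/': case '%': return 4;
    case '+': case '-':           return 3;
    case '<': case '>':
        if (src[1] == '=') { *op = (*src == '<') ? 'l' : 'g'; *len = 2; }
        return 2;
    case '=': if (src[1] == '=') { *op = 'e'; *len = 2; return 2; } return 0;
    case '!': if (src[1] == '=') { *op = 'n'; *len = 2; return 2; } return 0;
    case '&': if (src[1] == '&') { *op = 'a'; *len = 2; return 1; } return 0;
    case '|': if (src[1] == '|') { *op = 'o'; *len = 2; return 1; } return 0;
    }
    return 0;
}

/* Precedence climbing; every operator associates left to right. */
static long expr(int min_level)
{
    long lhs = primary();

    for (;;) {
        char op;
        int len, level = binary_op(&op, &len);
        long rhs;

        if (level == 0 || level < min_level)
            return lhs;
        src += len;
        rhs = expr(level + 1);          /* right side is always evaluated */
        switch (op) {
        case '*': lhs *= rhs; break;
        case '/': lhs = rhs ? lhs / rhs : 0; break;   /* assumption */
        case '%': lhs = rhs ? lhs % rhs : 0; break;   /* assumption */
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '<': lhs = lhs < rhs; break;
        case 'l': lhs = lhs <= rhs; break;
        case '>': lhs = lhs > rhs; break;
        case 'g': lhs = lhs >= rhs; break;
        case 'e': lhs = lhs == rhs; break;
        case 'n': lhs = lhs != rhs; break;
        case 'a': lhs = (lhs != 0) & (rhs != 0); break;  /* no short circuit */
        case 'o': lhs = (lhs != 0) | (rhs != 0); break;  /* no short circuit */
        }
    }
}

long eval_expr(const char *text)
{
    assert(text != NULL);
    src = text;
    return expr(1);
}
```

Because the right operand is parsed and computed before the operator is applied, both sides of && and || are always evaluated, which is the difference from C that the paragraph above points out.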

Be careful of strings that follow expressions on the command line.
Because white space is legal in an expression, the analyzer just scans
the input line until it finds an illegal character. If you say something
such as:

.vd <up> 1 <down>

the < that precedes down will be absorbed by the expression parser
because < is a legal character in an expression. The problem can be fixed
by putting quote marks around the strings:

.vd "<up>" 1 "<down>"

Most commands treat leading plus or minus signs specially. These signs
cause the current value associated with a command to be incremented or
decremented by the indicated amount.

For example:

.in 10 \" Set indent level to 10
.in +5 \" Increase it to 15
.in -5 \" Decrease it back to 10

The \" is a comment; all text that follows it is ignored.
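The treatment of a leading sign can be sketched as follows. The name apply_numeric_arg is hypothetical, and the sketch accepts only plain integers where nr would accept a full expression.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns the new setting for a command whose
 * current value is current. A leading + or - makes the argument
 * relative; otherwise it replaces the current value. */
int apply_numeric_arg(int current, const char *arg)
{
    long n;

    assert(arg != NULL);
    n = strtol(arg, NULL, 10);          /* strtol() accepts a leading sign */
    if (*arg == '+' || *arg == '-')
        return current + (int)n;        /* increment or decrement */
    return (int)n;                      /* absolute value */
}
```

Feeding it "10", then "+5", then "-5" reproduces the sequence 10, 15, 10 from the .in example above.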

The  real  nroff  supports  several 
unit-of-measurement  operators  that 
can  be  appended  onto  numbers 
(inches,  picas,  points,  and  so  forth).  Nr 
does  not  support  these. 

Operator    Precedence    Meaning
            Level
()          5             used for grouping
-           5             unary minus (as in -5)
!           5             logical NOT
's1's2'     5             compares two strings; evaluates to true (1) if
                          they are equal, to false (0) if they are not
*           4             multiply
/           4             divide
%           4             modulus (MOD)
+           3             addition
-           3             subtraction
<           2             less than
<=          2             less than or equal
>           2             greater than
>=          2             greater than or equal
==          2             equal
!=          2             not equal
&&          1             logical AND
||          1             logical OR

Table 2: Operators. All operators associate left to right. Higher numbers
have higher precedence.

Dot Commands

Nr supports a rich set of dot commands (90 or so). As I mentioned
earlier, all commands that take numeric arguments can be passed
expressions instead of explicit numbers. The escape sequences on the line
are expanded before the expression is evaluated, so you can use number
registers and the like in expressions (I'll discuss these in depth in a
moment). If a command argument contains any space characters, you must
enclose it in double quotes, as in the following:

.ds x "several words in a string"

Unlike the various Unix shells, the quotes are just for grouping; they do
not protect any internal escape sequences (introduced with a \) from
expansion. For example:

.sp  "(\nx  +  15)  *  \ny" 

is  treated  identically  to: 

.sp  (\nx  +  15)*\ny 

but  is  a  little  easier  to  read. 

All supported dot commands are discussed later. The commands are grouped
functionally. Don't be dismayed by their number and complexity. As I
mentioned earlier, nr is really a programming language that generates
formatted text as output rather than a compiled program. Consequently,
you hardly ever have to use the primitive commands themselves; rather,
you use subroutines (macros) that are written in terms of the primitive
commands. The advantage of a system such as this is that you can redefine
the way your text formatter works to suit your convenience.

In all the following descriptions, brackets delimit an optional argument
([arg]); in nonliteral arguments, on is a string that turns something on
and N is a number; and angle brackets are used when more than one word is
needed to describe an argument (<left str>).

Configuration 

Nr has several commands that configure it to work with specific printers.
Typically these are concentrated in a macro file that is read using the
-m command-line switch; for example, the switch -mlaser tells nr to read
the file /lib/tmac/laser.mac before processing other files. The ms macro
package I use is configured so that text is displayed properly on the
screen, provided that ANSI.SYS is installed; that is, boldface is shown
in high intensity, italic is underlined, and so forth.

The  configuration  commands  are: 

.bd on off: takes as its arguments two strings, one to turn boldface on,
the other to turn it off. The maximum length of either string is 80
characters. Use \x<two hex digits> to send control characters. For
example:

.bd \x1b[1m \x1b[0m

configures nr for ANSI.SYS. It outputs ESC[1m (0x1b is an ESC) to enable
boldface printing. ESC[0m turns it off again. If a .bd command is never
specified, or if a .bd is executed with no arguments, then boldface is
done by printing every character twice with an intervening backspace
(C<bs>C). This command is a little different from the one in nroff.

.cm [on]: enable nroff-style copy mode during macro definitions. If an
argument is present, nroff copy mode is enabled; otherwise, it's turned
off. In normal copy mode only \" and \<CR> are recognized. In nroff mode
the following are recognized:

\"  \<CR>  \n  \*  \$  \\  \.  \t  \a

Both modes are discussed in greater detail later.

.hd <left str> N <right str>: define horizontal motion. The two strings
send the printer cursor left or right by 1/N spaces. The width of a space
is taken from the currently active character-width table (it is 1 in the
default monospaced font) and can be changed with a .ss command. N
determines the minimum resolution for the space between characters in
proportional-spacing mode. All the widths in the character-width table
must be in terms of N as well.

As an example, if a space character occupies 12 units of horizontal
resolution in a specific font, N is 12 and the two strings, when sent to
the printer, move the cursor 1/12 of a space width. The character-width
tables loaded with the .df command (discussed later) contain widths that
are all in terms of these minimum, 1/12-space units. For example, if the
character-width table entry for i is 6, the character i occupies 6/12 of
the space occupied by a space. If the entry for A is 14, the character A
takes up 14/12 (1 1/6) of the space required for a space character. The
default <left str> is a single backspace character, the default
<right str> is a single space character, and the default N is 1.
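
The unit arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation, not of nr's internals; the function name `width_in_spaces` is hypothetical, and the three widths are the ones quoted in the text.

```python
from fractions import Fraction


def width_in_spaces(text, widths, units_per_space):
    """Return the width of text as a fraction of one space width.

    widths maps each character to its width in 1/N-space units, and
    units_per_space is the N given to .hd.
    """
    units = sum(widths[ch] for ch in text)
    return Fraction(units, units_per_space)


# Widths quoted in the text: a space is 12 units, i is 6, A is 14.
widths = {" ": 12, "i": 6, "A": 14}

print(width_in_spaces("i", widths, 12))   # half a space
print(width_in_spaces("A", widths, 12))   # one and one-sixth spaces
```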

.id on off: send the string specified in on to the printer to put it into
italics (underline) mode; off takes it out. The maximum length of either
string is 80 characters. Use \x<two hex digits> to send a control
character. If no arguments are present or if no .id is specified, then
underlining is used: nr prints an underscore, a backspace, and then the
character for each character.

.od on off: put the printer into overstrike mode (works like .id does). A
dash is used instead of an underscore in the default situation.

.ss N: change the width of a space in the currently active font to N; the
default N is 1.

.vd <up str> N <down str>: define vertical motion. The <up str> string
moves the printer cursor up 1/N lines; the <down str> moves it down
again.

Font  Control  and  Character 
Attributes 

Several commands are available to change the current font and to load new
fonts. Nr handles fonts a little differently from the way nroff does,
primarily because most printers handle the various highlight modes
somewhat differently from the way that phototypesetters do. All fonts
have single-letter names. Five names are reserved by nr:

R: Roman, the default font
I: italics (or underline)
B: boldface
O: overstrike
P: previous

The R font is the default font. Initially it is a monospaced
(nonproportional) font, but you can replace it with a proportional font
by using a .df command. You can change the current font with either the
.ft <font name> command or with an embedded \f<font name> escape
sequence. For example, you can put the word italics into italics with
\fIitalics\fP. Here, \fI switches into the italics font (by sending out
the string defined with the .id command, described earlier), the word
italics is printed, and \fP then switches back to the previously active
font.

The I (italics) attribute is a little weird in that it's used for both
italics and underlining. Typically you can have only one or the other in
a document, not both. If you want to have both, you should use nr's
italics and then use the line-drawing characters to underline a word when
necessary. Note that the real nroff doesn't support an O default font. Nr
is also different from nroff in that nr treats the I, B, and O fonts as
attributes rather than as actual fonts. That is, when you change to font
I, the current font stays active but nr sends whatever string was defined
with the .id command out to the printer. This way you can have a
bold-italic word by using \fI\fBword\fP. Changing to any font other than
I, B, or O disables all three attributes.

Font  commands  are: 

.bo [+-]N: put all words on the next N input lines into boldface. The
default N (used when N is missing) is 1. Note that this is not an nroff
command, though it can be simulated with a macro in nroff. If you want to
put an unspecified amount of text into boldface, use:

.bo 1000
<a bunch of text goes here>
.bo 0

.ul [N]: underline (or put into italics) words only on the next N input
lines. Only alphanumeric characters are underlined; punctuation, spaces,
and so on are not.

.cu [+-]N: underline (or italicize) words continuously on the next N
input lines. All characters are underlined, even spaces and punctuation.
For example, This is continuous underlining and this is not.

.os [N]: overstrike the next N input lines (works like .ul does). If N is
missing, 1 is used.

.df F <start> <end> <cwidths>: redefine the R font (but not the I, O, or
P fonts) or add a new font. If no arguments are present, a list of
existing fonts is printed to standard output along with the
character-width tables.

F is a font name (one character), <start> is the name of a macro to
invoke when the font is activated in the normal way (with a \fF or .ft F
command), <end> is a macro to invoke when you switch out of the font, and
<cwidths> is the name of a file that holds the character-width table
associated with the font. This file must be composed of 256 numbers, with
the numbers listed in ASCII order; that is, the first number is the width
of the character corresponding to an ASCII '\0', the second number is the
width of Ctrl-A, the 33rd number is the width of the space character
(ASCII 32), and so on. Numbers must be separated from each other by
either white space or new lines.


A sample font-width table is shown in Table 3, below. The 0s on the first
line correspond to the characters having numeric values in the range 0 to
31 (all the control characters). A space (ASCII 32) is 12 units wide, an
exclamation point (ASCII 33) uses 6 units, a double-quote mark (ASCII 34)
uses 8 units, and so on. If numbers are missing from the end of the list,
1 is assumed. A unit here must also be defined in the .hd command
described earlier. If no font-width file is specified to .df, a table is
created and all entries in it are set to 1. (This is the default for a
monospaced font.)
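
A font-width file with these rules is easy to read. The following Python sketch (the function name `parse_width_table` is hypothetical, and nr itself may well read the file differently) splits on any white space and fills missing trailing entries with 1, as the text describes.

```python
def parse_width_table(text, size=256):
    """Parse a font-width file: numbers in ASCII order, white-space separated.

    Entries missing from the end of the list default to 1, so an empty
    file yields the all-ones table of a monospaced font.
    """
    values = [int(token) for token in text.split()]
    if len(values) > size:
        raise ValueError(f"expected at most {size} widths, got {len(values)}")
    return values + [1] * (size - len(values))


# The first 35 entries of a table like Table 3: 32 control characters,
# then space, exclamation point, and double quote.
table = parse_width_table("0 " * 32 + "12 6\n8")
print(table[ord(" ")], table[ord("!")], table[ord('"')])
```

Indexing the result with a character code gives that character's width in the units defined by .hd.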

.ft F: change to font F at the beginning of the next input text line. You
can also embed font changes with a \fF escape sequence. Note that, if
font F doesn't exist, the error won't be flagged until the output
routines try to process the font change request.

Text  Filling,  Adjusting, 
and  Centering 

Nr generally fills lines; that is, it collects words from input (a word
is any space-delimited collection of characters) until it has collected
an entire output line, and then it outputs all the words on a single
line. For example:

This
is several
words.

will be collected and printed as:

This is several words.

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
12 6 8 12 10 16 14 6 6 6 10 10 6 8 6 8
10 10 10 10 10 10 10 10 10 10 6 6 10 10 10 10
16 14 12 14 14 12 12 14 14 6 10 14 12 16 14 14
12 14 14 10 12 14 12 16 14 14 12 6 8 6 10 12
10 10 10 10 10 10 8 10 10 6 6 10 6 16 10 10
10 10 8 8 8 10 10 14 10 10 10 6 6 6 10 0

Table 3: A font-width file

If hyphenation is enabled, nr will read one word too many and then try to
hyphenate the last one. If nr can insert a hyphen to squeeze more
characters onto the current line, it will do so. You can also adjust the
text in several other ways. The most common is to insert white space
between words in order to get the rightmost characters to line up (with
the words spread as evenly as possible on the line).
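
Filling and adjusting, as described above, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than nr's algorithm: the names `fill`, `adjust`, and `format_text` are hypothetical, every character is taken to be one unit wide, hyphenation is ignored, and the choice to hand leftover spaces to the leftmost gaps is a guess.

```python
def fill(words, line_length):
    """Group words into lines no longer than line_length (one space per gap)."""
    lines, current = [], []
    for word in words:
        needed = sum(len(w) for w in current) + len(current) + len(word)
        if current and needed > line_length:
            lines.append(current)
            current = []
        current.append(word)
    if current:
        lines.append(current)
    return lines


def adjust(words, line_length):
    """Pad the gaps between words so the line is exactly line_length wide."""
    if len(words) == 1:
        return words[0]
    gaps = len(words) - 1
    spare = line_length - sum(len(w) for w in words)
    base, extra = divmod(spare, gaps)
    padded = [word + " " * (base + (1 if i < extra else 0))
              for i, word in enumerate(words[:-1])]
    return "".join(padded) + words[-1]


def format_text(text, line_length, adjust_both=True):
    """Fill text into lines; adjust every line except the last."""
    lines = fill(text.split(), line_length)
    out = []
    for i, words in enumerate(lines):
        is_last = i == len(lines) - 1
        if adjust_both and not is_last:
            out.append(adjust(words, line_length))
        else:
            out.append(" ".join(words))
    return out


print(format_text("This\nis several\nwords.", 40))
```

The last line of a paragraph is left unadjusted here, which is what a forced break does to a partly filled line.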

You can force a line break (in which the contents of the fill buffer are
printed even if there aren't enough words to fill the line) in several
ways. The .br command always causes a break and leaves the cursor at the
beginning of the next output line. In addition, several other commands
(.bp, .ce, .fi, .in, .nf, .sp, and .ti) cause breaks as a side effect of
their operation. If you don't want a line to break when one of these is
executed, replace the dot that's usually used to introduce a command with
the no-break command character (the default is a backquote [`]). For
example, the .sp N command usually causes a break and then prints N blank
lines. The `sp N command, however, prints the blank lines without
flushing the fill buffer first. You can change the default no-break
character with a .c2 command.

Commands  for  controlling  filling 
and  margins  are: 

.ad [C]: turn on margin adjusting. Adjustment modes (values of C) are:

b: adjust both margins.
n: same as b.
l: adjust only the left margin, leaving a ragged-right edge, as in a
   hand-typed document.
r: adjust only the right margin, leaving a ragged-left edge. God knows
   what this mode is good for, but nroff supports it.
c: center each output line on the page.

If C is missing, then the most recently active adjustment mode is used.

.br (break): print all the words in the current fill buffer even if there
aren't enough words to fill the output line, then go to the next output
line.

.ce [N]: center the next N input lines without filling. Default N is 1.
This command causes a break.

.fi: enable line filling. The default is filling off, so a .fi command
must be specified at the top of the input. This is usually done
automatically by a macro file such as ms. This command causes a break.

.na: turn off adjusting. Turn it back on with a .ad.

.nf: disable line filling, flushing the buffer first. This command causes
a break.

Page  Control 

Nr  has  several  commands  for  page 
control: 

.bp [[+-]N]: begin page N. If N is absent, use the current page number
plus 1. Note that N is the number of the new page, not the current one,
so a footer on the current page will reflect the old number. If N has a
leading plus or minus sign, the current page number is modified by the
indicated amount. This command causes a break.

.ne N: need N lines. If there aren't that many lines on the current page,
then force a new page. The .ne command actually looks at the distance
from the current position on the output page to the next output line
trap, discussed later in the Macros, Strings, Diversions, and Traps
section. If this distance is less than N, nr skips forward to the trap.
The assumption here is that the trap will be an end-of-page trap.

.pl [+-]N: set page length to N lines.

.po [+-]N: set page offset to N spaces. The page offset is a specified
number of space characters that are printed to the left of every output
line; that is, .po defines the width of the left margin.

Changing  Special  Characters 

Certain  characters  are  special  to  nr. 
These  are: 

. (command character): introduces dot commands.

` (no-break character): also introduces commands, but a line break is not
done if that command usually forces a break.

\ (escape character): introduces an escape sequence.

You  can  change  these  characters 
with  the  following: 

.c2 [C]: change no-break character to C. If C is missing, use a backquote
(`).

.cc [C]: change command character to C. If C is missing, use a period (.).

.ec [C]: change escape character to C. If C is missing, use a backslash
(\).

.eo: disable the escape mechanism entirely (change the escape character
to nothing). You can restore it again with a .ec command.

Spacing,  Line  Length, 
and  Indenting 

You can use the commands listed in this section to change line spacing,
the current indent level, and so forth. If you use a leading plus or
minus sign in a numeric argument, the current value is modified by the
indicated amount; otherwise, the current value is changed to the
indicated value.

.in [+-]N: change the indent level to N. This indent is in addition to
the left margin, which is set up with the .po command, described earlier.
If you use both .po and .in, then the left margin is the sum of the
values given to the two commands. Generally the page offset remains
constant throughout a document, and the indent is changed with .in. This
command causes a break.

.ll [+-]N: change line length to N spaces. The line length determines how
many words are collected when line filling is enabled.

.ls [+-]N: change line spacing to N lines; 1 is single spacing, 2 is
double spacing, and so on.


.ns: inhibit the printing of blank lines (no-space mode); that is, no
blank lines will be printed until some text is encountered, a .bp N is
executed (the N is required), or a .rs is executed. This command is
useful in the top-of-page macro.

.rs: restore blank line printing when it has been turned off with a
previous .ns command.

.sp [N]: space down N lines (print N blank lines). N can be negative if
your printer supports reverse line feeds and a previous .vd command was
executed. This command causes a break. Note that a blank input line is
treated identically to a .sp 1 (it forces a break and prints a blank line
under the flushed text).

.ti N: set the temporary indent to N spaces; that is, only the next
output line will be indented by the indicated amount. This command is
useful for the first line of a paragraph. The indent level for this line
will be the sum of the indents specified in the .po and .ti commands;
that is, the .in command isn't used in the calculation. To indent
relative to the current indent level, use a leading plus or minus sign.
For example, .ti +5 causes the next line to be indented five spaces
further than the current indent level, as specified with previous .in or
.po commands. The .ti command causes a break.
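
The interplay of .po, .in, and .ti comes down to simple addition. The Python sketch below restates the rules given above; the function name `left_margin` and its argument names are hypothetical.

```python
def left_margin(page_offset, indent, temporary=None):
    """Return the number of spaces printed before the next output line.

    page_offset and indent are the current .po and .in values. temporary
    is the argument text of a pending .ti, or None if there is none.
    """
    if temporary is None:
        return page_offset + indent            # .po plus .in
    if temporary[0] in "+-":
        # A signed .ti is relative to the current indent level.
        return page_offset + indent + int(temporary)
    return page_offset + int(temporary)        # unsigned .ti ignores .in


print(left_margin(8, 5))          # ordinary line
print(left_margin(8, 5, "3"))     # after .ti 3
print(left_margin(8, 5, "+5"))    # after .ti +5
```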

Macros,  Strings,  Diversions, 
and  Traps 

Macros are the heart of nr. Without them the word processor would be so
difficult to use that it wouldn't be worth the trouble. Macros are
collections of text. When you define a macro, the text is saved by nr,
and when you expand a macro, the text is used for input. The mechanism is
identical to the #define mechanism in C. A macro name can be any length,
though for nroff compatibility I'd suggest limiting yourself to one- or
two-character names. Macros used in traps (discussed shortly) must have
one- or two-character names, however. Macro names are case sensitive. You
cannot define a macro that has the same name as a built-in dot command
(if you do, the macro will just be ignored).


A macro is defined with a .de <name> command and is expanded as if it
were a dot command whenever you precede its name with a dot in the first
column. A macro can take up to nine arguments (accessible within the
macro using \$1, \$2, and so forth). For example, a macro defined with:

.de xx
arg 1 <\$1>
arg 2 <\$2>
arg 3 <\$3>
..

is invoked with:

.xx "this is one argument" doo wha

and will print:

arg 1 <this is one argument>
arg 2 <doo>
arg 3 <wha>
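
The argument handling in this example can be sketched in Python. The names `split_args` and `expand_macro` are hypothetical, and expanding a missing argument to an empty string is an assumption borrowed from nroff, not something stated here.

```python
import re


def split_args(line):
    """Split a command line into (name, arguments).

    Double quotes group several words into one argument.
    """
    parts = re.findall(r'"([^"]*)"|(\S+)', line)
    words = [quoted if quoted or not bare else bare for quoted, bare in parts]
    return words[0].lstrip("."), words[1:]


def expand_macro(body, args):
    """Replace \\$1 through \\$9 in each body line with the matching argument."""
    def argument(match):
        index = int(match.group(1))
        return args[index - 1] if index <= len(args) else ""
    return [re.sub(r"\\\$([1-9])", argument, line) for line in body]


body = [r"arg 1 <\$1>", r"arg 2 <\$2>", r"arg 3 <\$3>"]
name, args = split_args('.xx "this is one argument" doo wha')
for line in expand_macro(body, args):
    print(line)
```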

Macros  can  also  call  other  macros 
(though  recursion  is  not  permitted). 
In  practice  they  are  used  in  the  same 
way  as  subroutines  are.  They  let  you 
take  the  nr  primitives  described  here 
and  do  something  useful  with  them. 

There are two flavors of macros: true macros and strings. A true macro is
intended to hold a collection of commands and text; a string is intended
to hold text that is expanded into a line. In practice, the only
difference is that the last line in a macro is terminated with a carriage
return whereas the last line of a string is not. Strings are defined with
a .ds or .as command. They are expanded using the \*x or \*(xx escape
sequences. The first syntax is for one-character names, and the second is
for two-character names. Strings may contain escape sequences. Note that
they are defined in normal mode, however (not in copy mode as the real
nroff does it). This means that you need to use double backslashes to get
an escape sequence into a string. For example, if you want to define a
string called #d that prints the word #define in boldface, you could use:

.ds #d \\fB#define\\fP

The string could be used later by embedding a \*(#d into the text where
you wanted the word to appear.
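
Here is a rough Python sketch of the define-then-expand cycle for strings. The names `define_string` and `expand_strings` are hypothetical. It models only the collapsing of a doubled backslash at definition time, and it expands an undefined string to nothing, which is an assumption.

```python
import re


def define_string(strings, name, text):
    """Store text under name, as .ds would.

    Only one effect of normal-mode processing is modeled: a doubled
    backslash collapses to a single one, so the stored text holds the
    escape sequence itself.
    """
    strings[name] = text.replace("\\\\", "\\")


def expand_strings(line, strings):
    """Expand \\*x (one-character name) and \\*(xx (two-character name)."""
    def lookup(match):
        name = match.group(1) or match.group(2)
        return strings.get(name, "")
    return re.sub(r"\\\*(?:\((..)|(.))", lookup, line)


strings = {}
define_string(strings, "#d", r"\\fB#define\\fP")
print(expand_strings(r"the \*(#d directive", strings))
```

The expanded line still contains \fB and \fP, which the formatter would then process as font changes.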

A diversion is a macro that's used to delay printing temporarily. This
way you can collect footnotes or a table of contents in a diversion and
then print the diversion out at the end of the document. The .di xx
command causes output to be sent to the macro called xx rather than to
the output stream. A .di without arguments will stop the redirection and
restore the previous output stream. Diversion nesting is permitted: you
can redirect to a diversion from within a diversion.

Macros and diversions are both created in copy mode, a crippled input
mode in which only two escape sequences are recognized (\" and \<CR>).
Copy mode is described in greater depth later. Nr supports two copy
modes: the one just described and an nroff-compatible copy mode that is a
little less restrictive. Small macros are stored internally, in RAM. If
the macro gets too large (greater than 256 characters), it is stored on
disk, however. The file names all take the form xxxx.mac, where xxxx is
four hex digits. The string defined in the TMP environment variable
(created by COMMAND.COM with a set command and by the shell with a setenv
command) is appended to the front of the file name, so you can use
something such as:

set TMP=d:/tmp/

to put macro files onto a RAM disk. The trailing / is necessary here.
Macro files are all deleted when nr terminates.

One of the more useful features of nr is a trap. A trap is a way to tell
nr to expand a macro automatically when a specified event occurs. For
example, you can set a trap to expand a macro at the top or bottom of
every page. You can spring a trap after a specified number of input lines
have been read or after a specified number of lines have been put into a
diversion. There's also a special trap that's sprung once, after the
entire document has been printed.

.de name [yy]: define a macro and give it the indicated name. All lines
between the .de command and the first line that begins with .. (or with
.yy, where yy is the second argument to .de) are added to the macro. If a
macro with the indicated name exists, it is destroyed. If both arguments
are missing, all currently defined macros are printed (like .pm in real
nroff does, except the contents of the macro are printed, too).

.am name [yy]: append text to an existing macro. It works like .de does
but doesn't overwrite the existing macro.

.ds name text: define a string called name, and put the indicated text
into it. If the string already exists, it is deleted.

.as name text: append text to the end of an existing string. It works
like .ds does except that it doesn't overwrite the existing string.

.di [name]: divert output to the named macro. The diversion is terminated
by a .di or .da command that has no argument. Diversions can be nested.
Normal text processing occurs in a diversion except that the page offset
isn't done. If a macro having the indicated name already exists, it is
destroyed.

.da [name]: divert text to the named macro, appending to its end rather
than overwriting it. Stop appending when a .da or .di without an argument
is encountered.

.rm name: remove the named macro or string. If the macro is on the disk,
the file is deleted.

.em name: use the named macro as the end macro. This macro will be
executed once, after all output has been processed. You can't give
arguments to the end macro.

.wh N [name]: set an output line trap. The named macro is executed
automatically, immediately after printing line N on every page. If N is
0, the trap is sprung at the top of a page (above line 1). If the name is
absent, the trap at line N is removed. If N is negative, then the trap is
set relative to the bottom of the page. (The location is determined by
looking at the page length [as set with .pl] that was in effect when the
.wh was executed.) The macro replaces any previously installed macro for
that trap position; macros do not shadow one another as in the real
nroff.

.ch name [[+-]N]: change output line trap position for the named macro to
line N. Any existing trap at that position is destroyed (nroff will
shadow the earlier trap, not destroy it). If N is absent, the trap is
removed.

.dt [+-]N name: set a diversion trap. The named macro is executed after N
lines have been written into the current diversion. Only one diversion
trap may be active.

.it [+-]N name: set an input line trap. Execute the named macro after N
lines of input have been read. Only one input line trap may be active.
A .it destroys a previous trap if one exists.
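
The replace-rather-than-shadow behavior of output line traps can be modeled with a dictionary keyed by line number. This Python sketch is illustrative only: the class name `OutputTraps`, the macro names, and the 66-line page are hypothetical, and treating a negative N as page length plus N is an assumption (the text says only that the position is relative to the bottom of the page).

```python
class OutputTraps:
    """Output line traps: at most one macro per line position."""

    def __init__(self, page_length):
        self.page_length = page_length
        self.traps = {}                      # line number -> macro name

    def _line(self, n):
        # Assumed rule: a negative N counts back from the page length that
        # is in effect when the trap is set.
        return self.page_length + n if n < 0 else n

    def wh(self, n, name=None):
        line = self._line(n)
        if name is None:
            self.traps.pop(line, None)       # no name: remove the trap
        else:
            self.traps[line] = name          # replaces, does not shadow

    def ch(self, name, n=None):
        for line in [l for l, m in self.traps.items() if m == name]:
            del self.traps[line]
        if n is not None:
            self.traps[self._line(n)] = name # destroys any trap already there

    def sprung_after(self, line):
        """Return the macro to run after printing this line, or None."""
        return self.traps.get(line)


traps = OutputTraps(page_length=66)
traps.wh(0, "hd")      # hypothetical header macro at the top of the page
traps.wh(-6, "ft")     # hypothetical footer macro near the bottom
print(traps.traps)
```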

Environments 

In a real programming language you can copy things into local variables
when you need to save them. Nr, however, only supports global variables,
and this can present a problem. A solution of sorts is the environment
mechanism. An environment is a stack. When you save an environment,
various parameters that control how nr works are pushed onto the stack.
You can then change those parameters at will. The old environment can be
popped from the stack at a later date, overwriting any changes that were
made after the previous push. The saved parameters are listed in Table 4,
page 108.

The .ev [N] command pushes various commonly used variables onto an
environment stack. Nroff supports several environments, whereas nr
supports only one. If an argument is present, the current environment is
pushed. If no argument is present, a previously saved environment is
popped from the stack. The stack can hold up to five environments. An
error message is printed if you try to push more than five.
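
The push and pop behavior of .ev can be pictured as a bounded stack of settings. The Python sketch below uses hypothetical names (`Environments`, `ev`, the setting names) and omits the parameters that are cleared after being saved; raising an error when popping an empty stack is an assumption.

```python
MAX_DEPTH = 5      # the stack holds up to five environments


class Environments:
    """A single environment stack in the style of .ev."""

    def __init__(self, **settings):
        self.current = dict(settings)
        self.stack = []

    def ev(self, arg=None):
        if arg is not None:
            # An argument is present: push a copy of the current settings.
            if len(self.stack) >= MAX_DEPTH:
                raise OverflowError("environment stack is full")
            self.stack.append(dict(self.current))
        else:
            # No argument: pop, overwriting any changes made since the push.
            if not self.stack:
                raise IndexError("no saved environment")
            self.current = self.stack.pop()


env = Environments(indent=0, line_length=65)
env.ev(1)                       # save
env.current["indent"] = 5       # change parameters at will
env.ev()                        # restore
print(env.current["indent"])
```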

Number  Registers 

Number registers are nr's global variables. They are used to hold numeric
quantities. You create number registers with a .nr command and expand
them into the text with \nx or \n(xx escape sequences. The first syntax
is for one-character names, and the second is for two-character names.
The string \nx, when found in the input, is replaced by a string
representing the contents of the indicated number register.

There are preincrement and decrement syntaxes, too: \n+x, \n+(xx, \n-x,
and \n-(xx. With these syntaxes the number register is incremented (or
decremented) by a predetermined amount before the escape sequence is
interpolated. Nonexistent number registers expand to 0. You can use
number registers both in commands and embedded in the text.

Several number registers are created and maintained by nr itself (see
Table 5, below). These hold such things as the current page number.

You can use number registers to do things such as keep track of the
current footnote number. For example, .nr fn 0 creates a number register
called fn that will hold the footnote number. You can access this
register with \n(fn, but if you use \n+(fn, then the register will be
incremented automatically before it's expanded. This process can in turn
be hidden in a string: .ds * \u\\n+(fn\d. Here the \u and \d send the
cursor up and down half a line. You need two backslashes to prevent nr
from expanding the number register at definition time. You can now expand
the string with a \** in the text, thereby both printing and incrementing
the current footnote number. The number register is incremented before
it's expanded.

Number registers can be expanded into the text in several formats. That
is, the number is just a number, but it can be expanded as an Arabic
number (with optional zero fill), as an uppercase or lowercase Roman
numeral, in outline format (a, b, c . . . z, aa, ab . . . az), or in
English words (one thousand, two hundred fifty-seven).
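
The expansion formats can be sketched as plain conversion functions. This Python version is illustrative only: the function names are hypothetical, the mode names that .af accepts are not modeled, and the English-words routine stops below one million.

```python
ONES = ("zero one two three four five six seven eight nine ten eleven "
        "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
        "nineteen").split()
TENS = "- - twenty thirty forty fifty sixty seventy eighty ninety".split()
ROMAN = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"), (100, "c"),
         (90, "xc"), (50, "l"), (40, "xl"), (10, "x"), (9, "ix"),
         (5, "v"), (4, "iv"), (1, "i")]


def to_arabic(n, width=1):
    """Arabic number, zero-filled to the given field width."""
    return str(n).zfill(width)


def to_roman(n):
    """Lowercase Roman numeral; call .upper() on the result for uppercase."""
    out = []
    for value, letters in ROMAN:
        count, n = divmod(n, value)
        out.append(letters * count)
    return "".join(out)


def to_outline(n):
    """Outline format: 1 is a, 26 is z, 27 is aa, 28 is ab, and so on."""
    out = ""
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        out = chr(ord("a") + remainder) + out
    return out


def _under_thousand(n):
    parts = []
    if n >= 100:
        parts.append(ONES[n // 100] + " hundred")
        n %= 100
    if n >= 20:
        parts.append(TENS[n // 10] + ("-" + ONES[n % 10] if n % 10 else ""))
    elif n > 0:
        parts.append(ONES[n])
    return " ".join(parts)


def to_words(n):
    """English words for 0 through 999,999."""
    if n == 0:
        return "zero"
    thousands, rest = divmod(n, 1000)
    parts = []
    if thousands:
        parts.append(_under_thousand(thousands) + " thousand")
    if rest:
        parts.append(_under_thousand(rest))
    return ", ".join(parts)


print(to_words(1257))
```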

.nr name [+-]N [[-]M]: create or modify number register name by N. For
example, .nr x 10 creates a number register called x and initializes it
to 10, and .nr x +5 increases the contents of x to 15. M is the increment
amount (the default is 1 if M is absent). When the number register is
accessed using the \n+x or \n+(xx syntax, then M is added to the register
before it is expanded. M may be negative. Unlike nroff, .nr, with no
arguments, prints a list of all currently defined number registers and
their contents.

.rr name: remove the named number register.
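
The register semantics just described fit in a small class. This Python sketch uses hypothetical names (`NumberRegisters` and its methods), and keeping a previously set increment when a later .nr omits M is an assumption.

```python
class NumberRegisters:
    """Number registers with set, relative modify, and auto-increment."""

    def __init__(self):
        self.values = {}
        self.steps = {}

    def nr(self, name, n, step=None):
        """Model .nr name N [M]; n is the argument text, such as '10' or '+5'."""
        if n[0] in "+-":
            self.values[name] = self.values.get(name, 0) + int(n)
        else:
            self.values[name] = int(n)
        if step is not None:
            self.steps[name] = step
        else:
            self.steps.setdefault(name, 1)   # default increment is 1

    def get(self, name):
        """Model \\nx and \\n(xx; nonexistent registers expand to 0."""
        return self.values.get(name, 0)

    def preincrement(self, name):
        """Model \\n+x and \\n+(xx: add M, then expand."""
        self.values[name] = self.get(name) + self.steps.get(name, 1)
        return self.values[name]

    def predecrement(self, name):
        """Model \\n-x and \\n-(xx: subtract M, then expand."""
        self.values[name] = self.get(name) - self.steps.get(name, 1)
        return self.values[name]


registers = NumberRegisters()
registers.nr("x", "10")
registers.nr("x", "+5")
registers.nr("fn", "0")
print(registers.get("x"), registers.preincrement("fn"))
```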

.af name mode: alter the expansion format of the named number register to
the indicated mode. Default is Arabic. Legal values of mode are shown in
Table 6, page 109. The leading 0s in the Arabic formats (as in the second
and third lines of Table 6) determine the field width of the number.

In my next column, I'll conclude this user's manual by describing
tabulation, control flow, hyphenation, line numbering, and more.

Availability 

The  February,  March,  and  April  1987 
C  Chests  have  been  combined  to  cre- 


The following parameters are saved:

• the unprinted contents of the fill buffer (the buffer is cleared after its contents are stored)
• the input line trap (it's cleared after being saved)
• the count associated with the .cu, .ul, .bo, or .os; all these are set to zero after being saved
• the adjustment mode (as set with .ad)
• the current font (as set with \f or .ft)
• the command character (as set with .cc)
• the escape character (as set with .ec)
• the current no-break character (as set with .c2)
• the fill status (line filling enabled [.fi] or disabled [.nf])
• the indent level (as set with .in)
• the page offset (as set with .po)
• the line spacing (as set with .ls)
• the line-numbering values (as set with .nm)
• the margin characters (.me and .lm)
• the tab stops and the tab and leader expansion characters
• the line length (.ll)
• the temporary indent (.ti)
• the three-part title length (.tl)

Table 4: The contents of an environment


%    current output page number
dl   width of (maximum line length of any line in) the most recently completed diversion
dn   height of (number of lines in) the most recently completed diversion
dy   day when execution started (1-31)
h    hour when execution started (1-24)
hp   current horizontal place on input line
ln   current line number used by the .nm command for line numbering
m    minute when execution started (1-59)
nl   current output line number (used by .nm)
mo   month when execution started (1-12)
s    second when execution started (1-59)
wd   day of the week when execution started (1-7; 1 is Sunday)
yr   year (for example, 1987)
.$   number of arguments to the current macro
.c   number of lines read from current input file
.d   vertical place in current diversion (distance from line 1)
.f   currently active font (can be stored and then passed to .ft later on)
.i   current indent column (as set with .in)
.l   current line length (as set with .ll)
.n   length of the text part of previous output line
.o   current page offset (as set with .po)
.p   current page length (as set with .pl)
.t   distance to next trap (in lines) (very large if there's no trap)
.u   1 if in fill mode, 0 otherwise
.v   current line spacing (as set with .ls)

Table 5: Predefined number registers




Dr.  Dobb’s  Journal,  March  1987 



Mode   Register Expands As

1      1, 2, 3, 4, ...
01     01, 02, 03, 04, ...
001    001, 002, 003, 004, ...
i      i, ii, iii, iv, v, vi, vii, ...
I      I, II, III, IV, V, VI, VII, ...
a      a, b, c, ... z, aa, ab, ac ... az, ba, bb, ...
A      A, B, C, ... Z, AA, AB, AC ... AZ, BA, BB, ...
e      one, two, three, four, five, six, ...
E      One, Two, Three, Four, Five, Six, ...

Table 6: Number register output formats


ate  Nr:  An  Nroff-Like  Text  Processor 
for  MS-DOS.  This  reprint  is  available 
with  a  source-code  disk  for  $29.95. 
Send  prepaid  orders  to  M&T  Books, 
501  Galveston  Dr.,  Redwood  City,  CA 
94063  or  call  (415)  366-3600,  extension 
216.  Please  add  $2.25  for  shipping  and 
handling  ($5  for  foreign  orders). 

DDJ 

(Listings  begin  on  page  48.) 



Definitions, Declarations, and Casts

Kernighan  and  Ritchie,  for  reasons 
unknown  to  myself,  use  the  terms 
declaration  and  definition  in  a  special 
way.  Unfortunately,  the  way  they 
use  these  words  is  the  inverse  of  the 
way  in  which  every  other  program¬ 
mer  thinks  of  them.  A  declaration  is 
an  announcement  (at  least  according 
to  Webster's).  Consequently  K  &  R 
use  the  word  declaration  to  mean 
that  you  are  announcing  the  pres¬ 
ence  of  a  variable  to  the  compiler. 
You  aren’t  allocating  space  for  that 
variable;  you’re  just  announcing  its 
presence  somewhere  in  some  mod¬ 
ule  in  your  program.  The  linker  will 
find  the  actual  variable  when  the 
modules  are  linked.  An  extern  state¬ 
ment  is  used  to  declare  a  variable  in  K 
&  R’s  sense  of  the  word. 

On  the  other  hand,  Webster's  says 
that  to  define  an  object  is  to  "fix  or 
mark  the  limits”  of  that  object,  to  allo¬ 
cate  space  for  the  object.  So  a  variable 
definition  in  C  is  what  actually  allo¬ 
cates  space  for  a  variable.  This  usage  is 
backward  from  the  normal  usage, 
thus  the  confusion.  A  declaration  is  al¬ 
ways  implicit  in  a  definition— when 
you  allocate  space  for  an  object  (de¬ 
fine  it),  you  also  tell  the  compiler  that 
the  object  exists  somewhere  (here). 
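
The distinction can be seen in a few lines of C. The names total and bump are invented for this sketch; only the extern lines are declarations in K & R's sense, while the lines that allocate storage or supply a body are definitions.

```c
/* Declaration: announces that an int named total exists somewhere.
 * No storage is allocated by this line. */
extern int total;

/* Declaration of a function that is defined further down. */
extern int bump(int amount);

/* Definition: allocates storage for total (and implicitly declares it). */
int total = 0;

/* Definition of bump(): this is where the function's code lives. */
int bump(int amount)
{
    total += amount;
    return total;
}
```

In a multi-module program the extern lines would typically sit in a header shared by every module, and the definitions would appear in exactly one module.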

The  declaration/definition  conun¬ 
drum  can  cause  problems.  A  particu¬ 
larly  nasty  one  is  brought  about  by 
implicit  declarations  of  subroutines. 
If  you  use  a  subroutine  that  hasn’t 
been  previously  declared  (with  ei¬ 
ther  an  extern  statement  or  a  real  def¬ 
inition),  the  compiler  assumes  that 
the  subroutine  returns  an  int.  The 
problem  arises  when  you  then  use 




the  cast  operator  in  conjunction  with 
the  implicit  declaration. 

A  cast  operator  temporarily 
changes  the  type  of  a  specific  object. 
It  is  formed  by  writing  a  variable  dec¬ 
laration  of  the  required  type,  sur¬ 
rounding  the  declaration  with  pa¬ 
rentheses,  and  then  removing  the 
name  and  semicolon.  For  example, 
you’d  declare  a  character  pointer 
with: 

char *Dostoevski;

You  change  the  declaration  to  a  cast 
by  surrounding  the  foregoing  with 
parentheses: 

(char *Dostoevski;)

and  removing  the  name  and 
semicolon: 

(char  *) 

You  can  now  convert  an  object  to  a 
character  pointer  by  preceding  its 
use  with  the  cast. 

An example: you've defined an int-size variable called baton and want to
pass it to a subroutine called runner(), which expects a double-size
argument. You can force an int-to-double type conversion with a cast:

runner( (double) baton );
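
A self-contained version of that call might look like the following. The body of runner() and the wrapper pass_baton() are hypothetical, added only so the fragment compiles on its own.

```c
/* Hypothetical stand-in for the runner() routine in the text. */
double runner(double distance)
{
    return distance / 2.0;
}

double pass_baton(void)
{
    int baton = 7;

    /* The cast converts the int value to double before the call. */
    return runner((double) baton);
}
```

With an ANSI-style prototype in scope, as here, the compiler would convert the argument even without the cast; the explicit cast matters in pre-ANSI C, where arguments are passed without being checked against the parameter types.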

The  definition/declaration  prob¬ 
lem  arises  when  you  try  to  use  a  cast 
to  change  the  type  of  an  object  that 
was  implicitly  declared  as  type  int. 
For  example,  the  following  will  not 
work  as  expected  in  the  8086  medi¬ 
um  or  large  models: 


struct building *tourist;
tourist = (struct building *)
              malloc( sizeof(struct building) );

You had intended to convert the character pointer returned from
malloc() into a building pointer. The compiler doesn't know that
malloc() returns a character pointer, however. It assumes that malloc()
returns an int because there's no preceding extern statement. Pointers
and ints are different sizes in the 8086 medium or large models. (An int
is probably 16 bits wide, and a pointer is probably 32 bits wide.)
Because the compiler thinks that malloc() returns an int, it truncates
the 32-bit pointer down to 16 bits, the size of an int. Only now will it
look at the cast operator, converting the int back to a pointer.
Unfortunately, the precision that you lost when the variable was
truncated is still lost. That is, the upper 16 bits of the pointer are
lost forever, converted to 0s.
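
The loss can be imitated on any machine with fixed-width integers. This sketch is a simulation only: uint32_t stands in for a 32-bit pointer and uint16_t for a 16-bit int, and the function name is invented. What a particular 8086 compiler actually leaves in the upper half may differ.

```c
#include <stdint.h>

/* Simulate a 32-bit pointer value being returned through a 16-bit int
 * and then converted back to a 32-bit pointer by a cast. */
uint32_t through_16_bit_int(uint32_t far_pointer)
{
    uint16_t as_int = (uint16_t) far_pointer;  /* upper 16 bits are dropped */

    return (uint32_t) as_int;                  /* widening supplies zeros only */
}
```

Feeding in 0x12345678 gives back 0x00005678: the cast restores the width but not the information.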

You can fix the problem by telling the compiler that malloc() indeed
returns a pointer of some sort. Use either:

extern char *malloc();
tourist = (struct building *)
              malloc( . . . );

or:

extern struct building *malloc();
tourist = malloc( . . . );

In the first example, you're converting a character pointer to a
building pointer. As both of these pointers are the same width, no
precision is lost.
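
In ANSI C and later, the usual way to get the declaration is to include <stdlib.h>, which declares malloc() as returning a generic pointer. That goes beyond what the text above assumes, so treat the following as a sketch of the modern equivalent; the floors member and the new_building() helper are hypothetical.

```c
#include <stdlib.h>   /* declares malloc(), so no implicit int is assumed */

struct building {
    int floors;       /* hypothetical member, for illustration only */
};

/* Allocate one building; returns NULL if the allocation fails. */
struct building *new_building(int floors)
{
    struct building *tourist = malloc(sizeof *tourist);

    if (tourist != NULL)
        tourist->floors = floors;
    return tourist;
}
```

Because the declaration is in scope, the full-width pointer reaches the assignment intact in every memory model.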




16-BIT SOFTWARE TOOLBOX

by Ray Duncan

80386 Resources

Attendees of the November 1986 Comdex in Las Vegas found themselves
deluged by Intel 80386 hype and hysteria from vendors and press alike.
I'll be discussing this interesting new supermicro at length in these
pages after Santa brings me an 80386-based machine or accelerator board
to play with; in the meantime, here are some helpful sources of
information:

80386 Programmer's Reference Manual. About 350 pages. Intel order
number 230985-001.

This manual covers architecture, memory management, memory protection,
multitasking, input/output, exception and interrupt handling, debugging
support, virtual 8086 mode, and mixing 16-bit and 32-bit code, and it
has a full reference section on the individual instructions. It's a
must-have for would-be 80386 programmers.

80386 Hardware Reference Manual. Intel order number 231732-001.

This book covers internal architecture and pipelining, local bus
interface, coprocessor interface, and memory cache. It's for
hardware-knowledgeable types only.

Introduction to the 80386. Intel order number 231252-001.

A nice readable overview of the 80386 in its native 32-bit processing
mode and its support for paging, memory protection, and multitasking.
It also includes a discussion of upward compatibility from 8086, 286
protected mode, and the virtual 86 mode.

80386:  A  Collection  of  Article  Reprints. 
60  pages.  Intel  order  number  231737- 
001. 


A  compilation  of  recent  feature  ar¬ 
ticles  from  Electronic  Design,  IEEE  Mi¬ 
cro,  Computer  Systems,  and  Tech 
Notes. 

The  80386:  A  High  Performance  Work¬ 
station  Microprocessor.  Intel  order 
number  231776-001. 

An  evaluation  of  the  throughput  of 
the  80386  and  comparisons  with 
other  popular  processors.  It  includes 
the  C  source  code  for  the  Dhrystone 
and  Whetstone  benchmarks. 

80386  High  Performance  32-Bit  Micro¬ 
processor  with  Integrated  Memory 
Management.  Product  data  sheet  dat¬ 
ed  April  1986.  131  pages.  Intel  order 
number  231630-002. 

A  very  terse  summary  of  the  hard¬ 
ware  reference  and  programmer’s 
reference  mentioned  earlier. 

You  can  order  all  the  above  from 
Intel  Literature  Sales,  P.O.  Box  58130, 
Santa  Clara,  CA  95052-8130;  (800)  548- 
4725.  Intel's  telephone  order  service 
is  courteous,  and  delivery  is  prompt. 
The  Intel  publication  catalog,  order 
number  210620-010,  is  free  for  the 
asking. 

80386  Un-Resources 

Murray,  William  H.,  III,  and  Pappas,
Chris  H.  80386/80286  Assembly  Lan¬ 
guage  Programming.  Berkeley,  Calif.: 
Osborne/McGraw-Hill,  1986.  548 
pages  with  index.  ISBN  0-07-881217-8. 

This  is  the  80386  reference  book 
not  to  buy;  it  is  a  sad  example  of  a 
publisher’s  unscrupulous  attempt  to 
cash  in  on  a  new  technology.  Mur¬ 
ray’s  book  is  essentially  about  8086 
programming  with  a  few  nods  to  the 


additional  instructions  and  protected 
mode  of  the  80286,  and  it  makes  only 
token  references  to  the  80386.  The 
few  program  fragments  that  illus¬ 
trate  the  80386's  32-bit  instructions 
would  never  run  if  assembled  in  cur¬ 
rent  environments  because  they 
don’t  include  the  32-bit  override  byte. 
Some  of  the  more  interesting  features 
of  the  386,  such  as  caching,  pipelined 
instruction  execution,  segments  up  to 
4  gigabytes  in  length,  and  bit  instruc¬ 
tions,  are  not  covered  at  all. 

Assembly-Language Resources

The November/December 1986 issue of Programmer's Journal contains two
articles that 16-Bit Toolbox readers will find especially useful. M.
Steven Baker has contributed an explanation of Terminate and Stay
Resident utilities that includes discussion of the In-DOS flag (int 21h,
function 34h) and the Multiplex Interrupt (int 2fh). George Defenbaugh
has written an article on "Parents, Children, Redirection, and Piping"
that discusses the MS-DOS DUP and CDUP functions (int 21h, functions 45h
and 46h).

The  Byte  Information  Exchange 
(BIX)  has  an  exceptionally  active  and 
useful  conference  called  MS-DOS  Se¬ 
crets.  This  conference  already  con¬ 
tains  nearly  a  thousand  messages 
about  undocumented  MS-DOS  inter¬ 
rupts,  TSR  techniques,  MS-DOS  bugs 
and  work-arounds,  and  the  like.  If 
you  are  a  serious  MS-DOS  program¬ 
mer,  you  will  find  the  cost  of  a  BIX 
account  more  than  justified  by  this 
conference  alone. 

William  Claff  was  kind  enough  to 
send  me  copies  of  the  first  eight  is¬ 
sues  of  his  monthly  newsletter,  PC 
Tech  Report.  These  issues  cover  such 
topics  as  the  ASSUME  and  GROUP  dir¬ 
ectives,  making  .EXE  files  resident,  de¬ 
vice  driver  templates,  8087  program¬ 
ming,  and  a  complete  critical  error 



(int 24h) handler. The more recent newsletters range from 6 to 11 pages
in length and have a heavy emphasis on working source code.
Subscriptions cost $18 per year. Contact Mr. Claff at 7 Roberts Rd.,
Wellesley, MA 02181; (617) 235-9505.

Call  for  Papers 

The  Waite  Group,  a  San  Francisco- 
based  computer  book  developer  and 
publisher,  is  looking  for  contributing 
authors  for  a  new  book  on  MS-DOS  en¬ 
titled  The  MS-DOS  Papers. 

The  news  release  from  the  Waite 
Group  says:  "The  MS-DOS  Papers  will 
be  a  collection  of  learning  tutorials 
written  by  a  broad  range  of  MS-DOS 
experts,  gurus,  wizards,  and  spokes¬ 
persons.  The  MS-DOS  Papers  will  pro¬ 
vide  insightful  information  on  the 
MS-DOS  operating  system,  revealing 
the  more  hidden  and  obscure  truths 
about  MS-DOS  in  an  interesting,  easy 
to  read  Waite  Group  format.  Its  con¬ 
tributed  nature  allows  us  to  include 
subjects  that  might  not  support  a  sep¬ 
arate  book  as  well  as  subjects  that  are 
on  the  cutting  edge  of  MS-DOS  technol¬ 
ogy.  The  audience  level  is  intermedi¬ 
ate  to  advanced  businesspeople,  pro¬ 
grammers,  and  anyone  who  wants 
the  most  up-to-date  information 
about  this  popular  operating  system. 
Examples  are  given  in  both  MS-C  and 
MASM. 

"The  book  will  consist  of  three 
types  of  contributions: 

•  Tutorials  on  topics  that  have  never 
been  adequately  discussed  in  the  lit¬ 
erature.  These  include  inside  BIOS, 
tips  and  undocumented  secrets,  stay 
resident  programming,  advanced 
MASM  programming,  and  debugging 
as  well  as  new  concepts  arising  in  MS- 
DOS,  such  as  protected  mode  MS-DOS 
and  CD  ROMs. 

•  Issue  papers  by  experts  in  a  particu¬ 
lar  area  of  MS-DOS.  These  will  discuss 
past  controversies,  the  future  of  MS- 
DOS,  and  so  on. 

•  Case-history  papers,  which  will  tell 
the  bottom  line  about  real  MS-DOS 
machines,  projects,  and  software 
tools.” 

For more information, contact Mitchell Waite at one of the following
electronic mail addresses:

BIX: mwaite

The WELL: mitch

Usenet: {lll-lcc, hplabs}!well!mitch

A  Nifty  Tool 

Cruise  Control  is  a  Terminate  and 
Stay  Resident  (TSR)  utility  for  IBM  PCs 
and  compatibles  that  eliminates  cur¬ 
sor  runon,  the  term  the  utility's  au¬ 
thor  uses  for  the  behavior  of  pro¬ 
grams  that  cannot  process  keystrokes 
as  fast  as  the  keyboard’s  auto-repeat 
rate.  When  you  are  using  such  a  pro¬ 


gram  and  hold  down  a  key,  the  key¬ 
strokes  pile  up  in  the  type-ahead 
buffer  until  it  is  full  and  then  you 
hear  a  beep  that  tells  you  to  release 
the  key.  The  piled-up  keystrokes  are 
then  processed  until  the  buffer  is 
empty,  so  you  frequently  tab  or  scroll 
much  farther  than  you  intended  to. 
Lotus  1-2-3,  WordStar,  and  Microsoft 
Word  are  three  commonly  used  pro¬ 
grams  that  have  this  problem  on  old¬ 
er  8088  or  8086-based  PCs. 

The  primary  effect  of  Cruise  Con¬ 
trol  is  that  it  monitors  the  type-ahead 
buffer  and  dynamically  adjusts  the 


Name

PASTE: horizontally concatenate two files

Synopsis

paste [-paste] [-b <string>] [-<n>] [file1] [file2]

Description

PASTE will append to the lines of <file1> the corresponding lines of <file2>, with an
optional string between them. PASTE writes to standard output.

The following flags are recognized by PASTE:

-p     <file1> does not exist (<string> is prepended to each line).
-a     <file2> does not exist (<string> is appended to each line).
-s     Do not print <string> with lines from only one file.
-t     Resolve the ambiguous command paste <file>. The -t flag forces <file> to
       trail standard input; that is, paste <file> is equivalent to paste <file>
       <stdin>, and paste -t <file> is equivalent to paste <stdin> <file>.
-e     Do not print <string> if both input lines are empty (contain no characters but \n).
-b     Indicates that a string of characters follows. The string is inserted between each
       line of <file1> and <file2>. The string can contain all the standard escape codes
       with the exception of '\0'. The escape sequence '\s' is also known to represent a
       blank. Blanks may also be embedded in a string by enclosing the string in quotes.
-<n>   Print n lines of <file1> before appending lines of <file2>. If n is negative (for
       example, paste -3), then n lines of <file2> will be printed first.

Bugs

On some systems, you'll have to use an escape sequence to represent capital
letters in string. Also, a quoted string with multiple blanks can have them reduced to single
blanks on systems that do not recognize quote marks as special; use the escape
sequences '\s' or '\ '.

As of this writing, the standard escape sequences are:

\b     backspace
\f     form feed
\n     new line
\r     carriage return
\t     tab
\0     null character (not allowed in string argument)
\\     literal backslash
\"     literal quote mark
\'     literal apostrophe
\ddd   bit pattern, consisting of 1-3 octal digits

Escape sequences special to PASTE:

\s     space

A backslash followed by any other character merely represents that character.

Author

John M. Gamble, January 1984

Table 1: Instructions for using the PASTE utility




keyboard  auto-repeat  rate  to  match 
the  program's  capability  to  process 
the  keystrokes.  This  means  that  you 
never  tab,  page,  or  scroll  past  your 
desired  destination.  For  those  pro¬ 
grams  that  can  handle  it,  the  appar¬ 
ent  speed  of  many  keys  (such  as  the 
arrow  or  page  keys)  is  drastically  in¬ 
creased. 

It  sounds  like  a  simple  concept,  but 
the  difference  in  the  behavior  of 
your  computer  and  favorite  editor 
with  Cruise  Control  installed  is  dra¬ 
matic.  I  have  used  it  with  both  Micro¬ 
soft  Word  and  MicroPro's  WordStar 
with  excellent  results.  Cruise  Control 
also  offers  a  few  nifty  fringe  benefits, 
such  as  an  automatic  screen  dimmer 
after  a  configurable  time  delay,  on¬ 
line  help,  and  a  date  and  time  stamp¬ 
er  with  configurable  formats.  The 
vendor  claims  that  the  utility  is  com¬ 
patible  with  most  other  RAM-resident 
programs;  it  worked  fine  for  me  with 
both  SideKick  and  ProCED. 

You  can  obtain  Cruise  Control 
from  Revolution  Software  Inc.,  715 
Rte.  10  E,  Randolph,  NJ,  07869;  (201) 
366-4445. 

Programming  Pearl 
of  the  Month 

Richard  Rodman,  of  Falls  Church, 
Virginia,  writes:  "Here's  a  helpful 
hint  for  programmers  attempting  to 
write  adapting  I/O  routines. 

"The  IBM  PC  data  bus  is  not  pulled 
up.  If  you  try  to  read  a  data  port  to  see 
if  the  board  is  or  is  not  installed,  and 
the  board  is  not  installed,  you  may 
get  a  false  indication  because  the 
floating  bus  still  contains  the  last  data 
byte  that  was  fetched  by  the  CPU. 

"To  correct  this  problem,  you  need 
to  ensure  that  the  bus  contains  a  pat¬ 
tern  with  as  many  bits  set  to  1  as  pos¬ 
sible.  One  method  of  doing  this  is 
shown  below: 

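        clc             ; six 1-byte clc opcodes (0f8h) ahead of the read
        clc
        clc
        clc
        clc
        clc
        in      al,dx   ; read the port being probed
        clc             ; six more clc opcodes after the read
        clc
        clc
        clc
        clc
        clc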


"The  8088's  instruction  prefetch 
queue is 6 bytes long. The six clc instructions (opcode 0f8h) on each
side of the in instruction allow the bus to float at a value of 0f8h
for unimplemented hardware.

"The  real  solution,  of  course, 
would  have  been  a  terminated  bus. 
Unfortunately,  the  IBM  PC  was  a 
quick  and  dirty  design.” 

The  PASTE  Utility 

John  Gamble  of  West  Lafayette,  Indi¬ 
ana,  has  sent  in  a  useful  program 
called  PASTE  that  appends  the  lines  of 
one  file  to  the  end  of  the  lines  of  an¬ 


other  file  and  writes  the  resulting 
lines  to  the  standard  output  device. 
PASTE  can  be  used  to  horizontally 
concatenate  tables  or  columns  of  in¬ 
formation  that  have  been  edited  sep¬ 
arately.  The  program  optionally  pre¬ 
pends,  appends,  or  inserts  a  string 
into  the  newly  generated  lines.  Table 
1,  page  112,  contains  instructions  for 
using  the  program,  and  the  pro¬ 
gram's  source  code  accompanies  this 
column  as  Listing  One,  page  78.  I  have
tested  the  program  before  publica¬ 
tion  with  Microsoft  C,  Version  4.0, 
and  MS-DOS,  Version  3.1. 
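
The core operation, joining one line from each file with an optional string between them, can be sketched as follows. This is not John Gamble's Listing One and implements none of the flags in Table 1; paste_lines() is a hypothetical helper written only to illustrate the idea.

```c
#include <stdio.h>
#include <string.h>

/* Join one line from each file with sep between them.
 * Trailing newlines on the inputs are dropped; a or b may be NULL
 * when that file has run out of lines. Returns out. */
char *paste_lines(const char *a, const char *sep, const char *b,
                  char *out, size_t size)
{
    size_t la = a ? strcspn(a, "\n") : 0;   /* length without newline */
    size_t lb = b ? strcspn(b, "\n") : 0;

    snprintf(out, size, "%.*s%s%.*s\n",
             (int) la, a ? a : "", sep, (int) lb, b ? b : "");
    return out;
}
```

A driver would read a line from each input with fgets(), pass both to paste_lines(), and write the result to standard output until both inputs are exhausted.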

John writes: "I have found this program useful for creating command
files  on  the  fly  for  systems  such  as 
VMS  or  MS-DOS  by  taking  a  directory 
listing  as  an  input  file  and  attaching 
strings  to  the  beginning  or  end  of  the 
lines.  The  Unix  shell  can  do  this  by 
itself,  but  there  are  still  some  tricks 
that  can  be  played  with  PASTE.  For 
example,  the  command 

paste  -a  -b  '\n'  afile.txt 

will  double-space  the  lines  in 
afile.txt. 

“I  think  that,  if  the  program  needs 
any  improvement,  it  is  in  its  method 
of  input/output — it  is  done  character 
by  character.  I  really  ought  to  have 
made  it  more  efficient,  but  I  fell  foul 
of  the  'good  enough'  syndrome  and
lost  interest. 

"[After  writing  this  program]  I 
learned  that  there  is  a  similar  Unix 
System  V  utility,  also  called  PASTE.  It 
appends  the  lines  of  one  file  to  anoth¬ 
er,  too,  but  it  automatically  inserts  a 
tab  between  the  lines  and  will  accept 
more  than  two  files  on  the  command 
line — I  think  it  is  meant  more  for 
nroff  text  processing.” 

Semester  Final 

Larry  Heberlein,  of  Maryville,  Mary¬ 
land,  submits  this  little  tidbit: 

"Final  Exam:  Algorithm  Design  101 
Extra-Credit  Question 
You  attempt  a  search  and  replace  op¬ 
eration  using  a  commercial  word 
processor — the  latest  version  of 
Word  from  Microsoft,  the  world's 
largest  microcomputer  software 
house. . .  .  You  load  a  20K  file  into  a 
PC  with  640K  RAM.  With  the  program 
and  file  loaded,  400K  of  memory  are 
free.  You  attempt  to  replace  every 
carriage  return  in  the  file  with  a 
space.  Less  than  a  quarter  of  the  way 
through  the  file,  the  operation  aborts 
with  the  error  message  'insufficient 
memory.'  You  observe  that  this  hap¬ 
pens  reliably,  in  any  file,  on  any  re¬ 
placement,  with  a  sufficiently  large 
number  of  occurrences. 

"An  A  for  the  course  goes  to  any 
student  who  turns  in  a  replacement 
algorithm  so  bad  that  it  can't  succeed 
in  memory  20  times  the  size  of  the 
data.” 


Assembly  vs. 

High-Level  Languages 

Charles  Lyall  of  Kingman,  Alberta, 
writes:  "I  couldn't  let  your  invitation 
in  the  July  1986  issue  of  DDJ  to  discuss 
the  assembly-language  vs.  high-level- 
language  issue  go  unanswered. 

"I  am  an  EDP  consultant  who 
hacked  his  first  piece  of  code  in  1963 
on  an  IBM  1620.  Even  in  those  days  we 
were  arguing  the  relative  merits  of 
assembly  vs.  higher-level  languages. 
Now we have several fourth-generation languages that help to spice up
the debate even more.

"I  must  take  issue  with  your  state¬ 
ment  that  'It  doesn't  take  me  more 
than  an  hour  or  two  to  write  a  pro¬ 
gram  the  size  of  TEE  from  scratch  in 
assembly  language.  .  .  .’  The  state¬ 
ment  is  false!  Oh,  I  am  quite  sure  that 
you  could  write  it  in  Microsoft  MASM 
for  the  IBM  PC.  Could  you  write  it  in 
two  hours  in  assembly  language  for 
the  VAX?  No,  then  how  about  a  Data 
General  NOVA?  A  UNIVAC  1100  per¬ 
chance?  I  think  not.  I  can  keep  two 
assembly  languages  in  my  head  at 
one  time,  but  that  is  it.  I  doubt  if  you 
are  truly  fluent  in  more  than  two  as¬ 
sembly  languages  either. 

"What  you  really  meant  is  that  you 
can  write  TEE  in  assembly  language 
for  one  particular  machine  in  two 
hours.  Mr.  Gary  Woodman  can  write 
the  equivalent  program  in  a  few  min¬ 
utes  for  every  machine  that  has  a  C 
compiler. I suggest that the C compiler is infinitely more productive.

"To quote you again: 'For me, the
benefits  of  the  superior  performance 
and  compactness  of  an  assembly-lan¬ 
guage  program  almost  always  out¬ 
weigh  all  other  considerations  for 
utility  programs  I  am  going  to  run 
more  than  once.’  I  submit  that  this  is 
illogical.  Let  us  put  some  numbers  on 
it,  Ray.  Suppose  Gary  Woodman’s 


program  runs  in  10  seconds  and  your 
program  runs  in  1  second.  But  you 
took  two  hours  to  write  your  pro¬ 
gram,  and  Gary  probably  took  15 
minutes.  That's  a  difference  of  105 
minutes,  or  6,300  seconds,  of  coding 
time.  At  a  9-second  advantage  per 
run,  you  are  going  to  have  to  run  the 
little  turkey  700  times  in  order  to 
break  even  on  total  elapsed  time. 

"It  is  not  necessarily  always  true 
that  assembly-language  programs 
run  faster  and  take  less  space  than 
equivalent  high-level-language  pro¬ 
grams.  A  case  in  point  is  a  little  rou¬ 
tine  available  to  strip  the  high-order 
bit  from  WordStar  files.  It  is  written 
in  assembly  language  and  handles  its 
input  and  output  one  character  at  a 
time,  and  it  uses  DOS  to  redirect  both 
input  and  output.  It  is  slowwww!  I 
wrote  a  C  program  that  reads  in  16K 
of  input,  strips  the  bit  off  using  regis¬ 
ter  variables  for  my  pointers  to  make 
things  trot  along,  and  then  writes  the 
buffer.  Now  that  moves.  True,  my 
routine  is  much  bigger,  but  in  this 
case  I  will  cheerfully  trade  size  for 
speed.  It  took  only  a  few  minutes  to 
write,  too. 
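
The buffered approach the letter describes might look roughly like this. It is a reconstruction from the description, not Mr. Lyall's program; the function names and the use of stdio are assumptions.

```c
#include <stdio.h>

#define BUFSIZE (16 * 1024)   /* 16K, as in the letter */

/* Clear bit 7 of every byte in buf[0..n-1]. */
void strip_high_bits(unsigned char *buf, size_t n)
{
    register unsigned char *p = buf;
    register unsigned char *end = buf + n;

    while (p < end)
        *p++ &= 0x7F;
}

/* Copy in to out one buffer at a time, stripping as we go.
 * Returns 0 on success, -1 on a write error. */
int strip_stream(FILE *in, FILE *out)
{
    static unsigned char buf[BUFSIZE];
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        strip_high_bits(buf, n);
        if (fwrite(buf, 1, n, out) != n)
            return -1;
    }
    return 0;
}
```

On MS-DOS both streams would need to be opened in binary mode so that the C library does not translate line endings or stop at a Ctrl-Z byte.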

"Your  emphasis  on  performance 
and  compactness  is  not  without  mer¬ 
it,  and  I  can  argue  your  side  of  the 
debate,  too.  The  microcomputers  we 
have  had  to  deal  with  in  the  last  ten 
years  have  been  characterized  by 
very  limited  memory,  poor  CPU  per¬ 
formance,  and  expensive  slow  back¬ 
ing  storage.  Under  these  circum¬ 
stances  you  can  raise  a  heck  of  a  good 
defense  for  your  position.  But  this  sit¬ 
uation  is  coming  to  an  end.  A  half 
megabyte  of  memory  is  now  com¬ 
mon.  Processors  such  as  the  80286 
can  almost  get  out  of  their  own  way, 
and  the  latest  generation  of  68000 
chips  are  quite  peppy.  The  80386  ma¬ 
chines  will  probably  accept  16  mega¬ 
bytes  of  directly  addressable  memo¬ 
ry  (a  24-bit  memory  bus)  and  be  two 
or  three  times  as  fast  as  the  AT  ma¬ 
chines.  [The  80386  can  actually  ad¬ 
dress  4  gigabytes  of  physical  memory 
and  some  70  terabytes  of  virtual  mem¬ 
ory. — Ray] 

"Look  what  happens  now  to  the 
numbers  I  gave  you  a  few  para¬ 
graphs  back.  If  the  machine  is  three 
times  as  fast,  then  your  utility  will 
run  in  0.3  seconds  and  Gary's  will  run 
in  3.3  seconds.  It  will  now  take  2,100 
executions  of  the  utility  before  you 



break  even!  The  faster  a  machine  is, 
the  less  benefit  assembly-language 
code  is  on  that  machine. 
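
The break-even arithmetic in the letter reduces to one division. The helper below is a hypothetical illustration using the letter's own figures (105 extra minutes of coding, run times of 10 seconds and 1 second).

```c
/* Number of runs needed before the time saved per run repays the
 * extra time spent writing the faster program. Times are in seconds. */
double break_even_runs(double extra_coding_time,
                       double slow_run, double fast_run)
{
    return extra_coding_time / (slow_run - fast_run);
}
```

With 6,300 seconds of extra coding time and a 9-second advantage per run, the result is 700 runs. On a machine three times as fast, the per-run advantage shrinks to 3 seconds and the break-even point rises to about 2,100 runs, as the letter says.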

"A  similar  argument  can  be  made 
about  storage.  The  cheaper  mass  stor¬ 
age  and  memory  become,  the  less  ad¬ 
vantageous  compact  code  becomes. 
Accountants  call  it  a  rise  in  the  oppor¬ 
tunity  cost.  In  other  words,  by  writ¬ 
ing  a  utility  in  assembly  language, 
you  forego  the  other  utilities  that  you 
could  have  written  if  you  had  used  a 
more  productive  language.  Against 
this  you  balance  benefits  of  program 
size  and  run  time,  which  become  less 
and  less  significant  as  computer 
speeds rise and primary and secondary storage costs decline.

"As a high priest, to use Jerry Pournelle's epithet, I use assembly
language to answer two classes of problems: when I can't describe the
procedure  in  a  high-level  language 
and  when  a  critical  small  portion  of 
the  program  runs  too  slowly. 

"With  the  advent  of  true  fourth- 
generation  languages,  your  position 
is  going  to  be  even  harder  to  defend.  I 


am  currently  designing  a  system  us¬ 
ing  Powerhouse,  a  fourth-generation 
language  for  superminis.  In  less  than 
three  hours,  I  can  generate  a  proce¬ 
dure  to  paint  a  screen  with  a  form, 
accept  any  number  of  fields  from 
that  screen,  and  update  a  record  or 
create  a  new  record  in  a  file  that  has 
three  indexes.  The  generated  proce¬ 
dure  is  about  8K  long.  On  a  VAX  ma¬ 
chine,  the  run  time  is  sufficiently 
close  to  zero  that  it  isn't  material.  I 
have  no  doubt  that  a  good  macro  as¬ 
sembler  programmer  could  write  a 
similar  routine  in  about  a  week.  The 
resulting  routine  costs  the  client  $150 
if  I  write  it  in  three  hours.  The  same 
routine  written  in  assembly  language 
is  not  worth  a  penny  more.  The  Pow¬ 
erhouse  code,  which  is  nonproce¬ 
dural,  will  be  trivial  to  maintain. 

"Interestingly  enough,  my  compa¬ 
ny  has  encountered  a  problem  with  a 
communications  handler,  and  the 
communications  expert,  a  red-hot 
macro  programmer,  intends  to  solve 
that  one  in  FORTRAN.  Certainly  he 
could  solve  it  in  assembly  language, 
but  he  can  do  it  cheaper  in  FORTRAN. 
On  a  16-megabyte  VAX  780,  the  run 
time and space disadvantages are
immaterial.

"In  the  long  run,  we  spend  three 
times as much time and effort on
code maintenance as we do on the
initial  design  and  coding.  To  me,  the 
benefits  of  clarity  and  simplicity  in 
code  and  ease  of  maintenance  out¬ 
weigh  all  other  considerations  about 
90  percent  of  the  time.  The  other  10 
percent  of  the  time,  we  are  dealing 
with  the  nasty  bits  that  shouldn’t  be 
discussed  in  a  family  magazine. 

"In  conclusion,  my  position  is  that 
languages  are  tools  to  be  used  to  solve 
problems  and  they  are  not  ends  in 
themselves.  They  have  merits  only  to 
the  extent  that  they  help  us  meet  our 
objectives.  Two  of  these  objectives 
are  speed  and  compactness.  Other 
objectives  are  clarity,  simplicity,  self¬ 
documentation,  maintainability,  pro¬ 
grammer  productivity  in  lines  per 
day,  and  so  on. 

"A  person  who  uses  only  one  pro¬ 
gramming  language  puts  me  in  mind 
of  the  man  whose  only  tool  is  a  ham¬ 
mer.  All  his  problems  look  like  nails.” 

Thanks,  Charles,  for  a  beautifully 
written,  educational,  and  witty  let¬ 
ter.  I  wouldn't  want  you  to  go  away 
believing  that  my  only  tool  is  a  ham¬ 
mer;  I  use  Forth,  C,  and  even  BASIC  (I 
hope  that  at  least  one  of  those  meets 
your  criteria  for  a  high-level  lan¬ 
guage).  But  I  feel  most  at  home  with 
assembly  language,  and  contrary  to 
your  assertion  that  it  is  not  possible  to 
be  fluent  in  more  than  two  assembly 
languages,  I  consider  myself  quite 
fluent  in  8080,  Z80,  80x86,  PDP-11,  and 
Raytheon  703  assembly  language.  I 
can  also  get  by  well  enough  in  6502, 
8051,  8096,  68000,  and  1802  assembler 
language.  But  I  admit  to  total  igno¬ 
rance  of  the  UNIVAC,  Data  General, 
and  VAX  that  you  mentioned! 

Availability 

All  the  source  code  for  articles  in  this 
issue  (except  for  C  Chest)  is  available 
on  a  single  disk.  To  order,  send  $14.95 
to  Dr.  Dobb's  Journal,  501  Galveston 
Dr.,  Redwood  City,  CA  94063  or  call 
(415)  366-3600  ext.  216.  Please  specify 
the  issue  number  and  format  (MS-DOS, 
Macintosh,  Kaypro). 

DDJ 


(Listing  begins  on  page  78.) 



Dr.  Dobb's  Journal,  March  1987 



COLUMNS 


STRUCTURED  PROGRAMMING 


In  this  issue,  I  discuss  the  differ¬ 
ences  between  the  IBM  PC  BASICA 
and  the  new  True  BASIC  and  Quick¬ 
BASIC.  I  also  include  a  short  utility 
program  to  show  interdialect  transla¬ 
tion  and  discuss  several  differences 
in  these  BASIC  dialects. 

In most cases, BASIC is the language that microcomputer users learn
first, and in the IBM PC world, the implementation they encounter is
Microsoft's BASICA. BASIC has been judged inadequate for large software
projects, difficult to maintain, and lacking many new programming
concepts. With the advent of structured languages, such as Turbo Pascal,
programmers have been given a taste of better techniques, and the myth
that BASIC will always be the language has been dispelled.

New  BASIC  Dialects 

The  wheel  of  progress  has  not  spared 
BASIC  from  change,  however.  Two 
years  ago,  the  original  authors  of  BA¬ 
SIC  (Kemeny  and  Kurtz)  launched 
True  BASIC,  a  more  structured  imple¬ 
mentation  that  is  close  to  the  new 
proposed  ANSI  BASIC.  Almost  simulta¬ 
neously,  Microsoft  launched  a  new 
BASIC  compiler  version,  QuickBASIC, 
and  in  mid-1986,  it  introduced  Ver¬ 
sion  2.0,  which  includes  a  versatile 
environment.  QuickBASIC  is  not  just 
another  compiler  for  BASICA — it 
brings  with  it  a  new  syntax,  similar 
in  many  instances  to  that  of  True 
BASIC. 

BASIC: Quo Vadis?

by Namir Clement Shammas

QuickBASIC does not require line numbers; instead, you place
alphanumeric labels in your programs to direct branching. True BASIC,
however, requires the entire program either to have line numbers or to
have none at all. If you use GOTO or GOSUB in True BASIC, you need the
line numbers; otherwise, they are not mandatory. To my disappointment,
True BASIC does not support labels.

Both  True  BASIC  and  QuickBASIC 
support  more  structured  program 
code.  Multiline  functions  and  subrou¬ 
tines  (with  argument  lists)  enable  you 
to  create  more  modular  code  that  is 
easier  to  maintain  and  enhance.  The 
new  dialects  also  implement  external 
libraries  with  the  added  notion  that 
not  all  variables  are  global,  which  re¬ 
sembles  many  features  in  FORTRAN. 

What  about  translation  between 
BASICA  and  True  BASIC  or  QuickBA¬ 
SIC?  As  you  may  expect,  because  Mi¬ 
crosoft  wrote  both  BASICA  and  Quick¬ 
BASIC,  the  two  dialects  have  many 
built-in  functions  and  statements  in 
common.  As  a  rule  of  thumb,  the  as¬ 
pects  of  BASICA  not  available  to  the 
QuickBASIC  compiler,  such  as  CHAIN 
MERGE,  are  related  to  the  interpreter 
features.  In  general,  BASICA  is  up¬ 
ward  compatible  with  QuickBASIC. 

Translating  programs  from  BASICA 
to  True  BASIC  requires  more  work. 
True  BASIC  Inc.  sells  a  BASIC  converter 
to  handle  many  systematic  conver¬ 
sion  steps.  Like  the  Logitech  Transla¬ 
tor  I  discussed  in  my  last  column,  the 
BASIC  converter  does  not  translate 
100  percent  of  the  BASICA  code.  Later 
I  will  present  a  sample  BASICA  pro¬ 
gram  and  its  translated  versions  in 
True  BASIC  and  QuickBASIC.  To  pave 
the  way,  I  will  first  discuss  several 
differences  between  the  three 
dialects. 

Similarities  and  Differences 

Concerning data types, BASICA and QuickBASIC support an identical set
of strings, integers, and single- and double-precision reals. True BASIC
supports even simpler types: strings and numbers. The distinction
between integers and reals is context-sensitive: if a number has no
fractional part, it is stored as an integer; otherwise, it is stored as
a real. String manipulation follows a different syntax in True BASIC.
For example, when you extract and assign substrings in BASICA, you use
something such as MID$(L$,FIRST%,LONG%). In True BASIC this is written
as L$[FIRST:(LONG+FIRST-1)]. The square brackets and the colon inside
them specify the first and last characters (as opposed to the number of
characters in BASICA). True BASIC uses the ampersand to concatenate
strings and names several string functions differently.
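
The index arithmetic behind the substring translation can be sketched in
Python. This is an illustration only, not code from the listings: the
function names are hypothetical, and Python's zero-based slices merely
stand in for the two BASIC notations.

```python
def mid_basica(text, first, length):
    """Mimic BASICA MID$(text, first, length): 1-based start plus a count."""
    return text[first - 1:first - 1 + length]


def substring_true_basic(text, first, last):
    """Mimic True BASIC text[first:last]: 1-based first and last, inclusive."""
    return text[first - 1:last]


def mid_to_bounds(first, length):
    """Turn MID$ arguments into the (first, last) pair True BASIC expects."""
    return first, first + length - 1


line = "QUICKBASIC"
first, last = mid_to_bounds(6, 5)
print(mid_basica(line, 6, 5))
print(substring_true_basic(line, first, last))
```

Both calls select the same five characters; only the meaning of the
second number changes, from a count to a final position.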

True  BASIC  supports  matrix  opera¬ 
tors,  functions,  and  I/O  procedures. 
While  hand-translating  BASICA  pro¬ 
grams  that  perform  matrix  opera¬ 
tions,  you  can  substitute  blocks  of 
code  lines  with  single  MAT 
statements. 

Loop constructs in BASICA and QuickBASIC are identical. I look forward
to seeing more powerful REPEAT...UNTIL-like loops in the next version
of QuickBASIC. True BASIC offers a variety of loops that include the
FOR...NEXT, DO, DO WHILE, and the double-test DO WHILE...LOOP WHILE
loops. If you perform manual translation, many BASICA logical loops
with GOTOs can be rewritten using any of the True BASIC loops, which
enhances readability.

Decision-making constructs in QuickBASIC and True BASIC are superior
to, and far more readable than, those in BASICA. QuickBASIC and True
BASIC support multiline IF...THEN...ELSE...END IF constructs and even
allow for ELSEIF clauses. The ON GOTO and ON GOSUB statements use
labels with QuickBASIC. QuickBASIC does not support the CASE statement,
whereas True BASIC does. Translating BASICA IF...THEN...ELSE statements
enables you to use the clearer multiline version in the other dialects.
Gone are the frustrating branchings in the THEN or ELSE clause that
breathe chaos into your program. The ON GOTO/GOSUB statements are
easily translated to the superior SELECT CASE statement in True BASIC.

How  many  times  have  you  felt  the 
limitations  of  BASICA  in  defining 
functions?  How  many  times  did  you 
have  to  use  a  subroutine  to  simulate 
multiline  functions?  QuickBASIC  and 
True  BASIC  make  these  painful  mem¬ 
ories  a  thing  of  the  past.  Now  func¬ 
tion  definitions  can  extend  over  nu¬ 
merous  lines  and  freely  use  loops  and 
decision-making  constructs.  Where¬ 
as  True  BASIC  supports  recursive 
functions,  QuickBASIC  in  its  current 
version  does  not. 

Regarding  subroutines,  both 
QuickBASIC  and  True  BASIC  support 
the  GOSUB  <label  or  line  number> 
and  CALL  <subname>  forms.  The 
called  subroutines  take  optional  ar¬ 
gument  lists.  Both  QuickBASIC  and 
True  BASIC  provide  functions  that  re¬ 
turn  the  lower  and  upper  bounds  of 
arrays.  These  functions  are  vital  for 
writing  general-purpose  routines 
that  manipulate  arrays  and  matrices. 

Both  QuickBASIC  and  True  BASIC 
support  external  libraries  of  rou¬ 
tines.  At  the  time  of  writing  this  col¬ 
umn,  True  BASIC  Inc.  announced 
True  BASIC,  Version  2.  One  of  its  high¬ 
lights  is  the  introduction  of  modules! 
I  will  discuss  True  BASIC  modules  in 
my  next  column,  once  I  obtain  more 
information  on  the  exact  syntax  and 
features.  I  will  also  discuss  any  as¬ 
pects  of  similarity  between  library 
modules  in  True  BASIC  2.0  and 
Modula-2.  Although  BASICA  does  not 
support  explicit  libraries,  you  may 
want  to  consider  creating  external  li¬ 
braries  that  contain  your  favorite 
and  frequently  used  routines. 

BASICA programs that use low-level features (DEF SEG, VARPTR) translate
easily into QuickBASIC. True BASIC does not support such
machine-specific statements, however, because they make programs less
portable to other machines. For the same reason, the valuable SHELL( )
statement found in both BASICA and QuickBASIC has no similar
implementation in True BASIC. The Developer's Toolkit, offered by True
BASIC Inc., does provide several low-level access routines for the IBM
PC implementation.


High-resolution  graphics  is  another 
area  in  which  BASICA  programs  need 
more  effort  to  be  converted  into  True 
BASIC.  I  think  that  True  BASIC’s  built- 
in  graphics  features  are  superior  to 
those  of  BASICA.  For  example,  True 
BASIC  supports  the  PICTURE  type  of 
routines,  special  kinds  of  subroutines 
that  make  animation  of  objects  easy. 

File I/O is very similar in BASICA and QuickBASIC. True BASIC uses a
slightly different syntax and organization, which means additional
editing of converted BASICA programs. The LPRINT statement in BASICA is
not supported by True BASIC. Instead, you must open a buffer for the
printer (for example, OPEN #<Buf_Num>: PRINTER) and then send all the
printer output using "PRINT #<Buf_Num>:" statements, similar to file
output. In translating BASICA programs, you must insert the OPEN
statement and replace every LPRINT with "PRINT #<Buf_Num>:".
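
The LPRINT substitution is mechanical enough to sketch in a few lines of
Python. This is a hypothetical helper, not part of the True BASIC
converter: the channel number 9 and the function name are assumptions,
and the naive pattern would also rewrite the word LPRINT inside a string
literal or a remark, so the output still needs a human eye.

```python
import re

PRINTER_CHANNEL = 9  # assumed channel number; any unused channel would do


def convert_lprint(lines, channel=PRINTER_CHANNEL):
    """Rewrite LPRINT statements as channel output and prepend the OPEN."""
    pattern = re.compile(r"\bLPRINT\b", re.IGNORECASE)
    converted = [pattern.sub(f"PRINT #{channel}:", line) for line in lines]
    if converted != list(lines):
        # At least one LPRINT was found, so the printer buffer must be opened.
        converted.insert(0, f"OPEN #{channel}: PRINTER")
    return converted


program = ['LPRINT "Total"; N', "PRINT X"]
for statement in convert_lprint(program):
    print(statement)
```

A program with no LPRINT statements passes through unchanged, with no
OPEN added.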

Error handling in BASICA and QuickBASIC is also similar, both using the
ON ERROR GOTO and RESUME statements. QuickBASIC uses labels to direct
the program flow to error handling sections. True BASIC uses a
different and more structured mechanism: a WHEN ERROR IN...USE...END
WHEN construct. The code section suspected of generating errors is
located in the WHEN clause and the exception handling code in the USE
clause. Enclosing the suspected code in the WHEN clause makes the
extent of error trapping immediately visible.
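
For readers more at home in Python, the shape of the True BASIC
construct is close to a try/except block. The following is an analogy
only, not True BASIC code, and the function name is hypothetical.

```python
def safe_ratio(numerator, denominator):
    """Divide, falling back to 0.0 when the division fails."""
    try:
        # Plays the role of the WHEN clause: the code suspected of failing.
        return numerator / denominator
    except ZeroDivisionError:
        # Plays the role of the USE clause: the exception handling code.
        return 0.0


print(safe_ratio(6, 3))
print(safe_ratio(1, 0))
```

As in the True BASIC form, the protected region is delimited in the
source text itself rather than by a jump to a distant handler.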

Interdialect  Translation 

Listing  One,  page  88,  presents  a  BA¬ 
SICA  utility  program.  The  user  types 
in  the  number  and  names  of  data 
files  containing  text.  This  is  followed 
by  several  search  strings,  with  the 
options  of  simply  locating  or  replac¬ 
ing  strings  with  others.  The  entire  set 
of  strings  is  used  in  text  manipulation 
with  each  file.  The  program  prints 
the  text  lines  found  or  altered  and 
writes  back  the  text  files  to  update 
them.  The  replace  mechanism  is 
fully  automatic  and  has  no  query 
option. 

Using  the  BASIC  converter  from 
True  BASIC  Inc.,  I  converted  the  BA¬ 
SICA  program.  Listing  Two,  page  89, 
shows  the  True  BASIC  version  after 
manual editing that was needed to
make the program function. The
converter  inserts  several  lines  at  the 
beginning  of  the  original  BASICA  list¬ 
ing.  These  include  the  use  of  the  def- 
lib.tru  library,  which  contains  True 
BASIC  functions  that  clone  certain  BA¬ 
SICA  functions,  listed  in  lines  25  and 
26.  Three  author  functions  are  de¬ 
fined  within  the  converted  program. 
The EOF( ) function is used by the utility.
The OPTION BASE 0 is also used
and  does  not  conflict  with  my  pro¬ 
gram.  Notice  the  following  changes 
made  either  by  the  converter  or  by 
hand  coding: 

1.  The  original  DEFINT  declaration  is 
rendered  passive  by  converting  it 
into  a  comment. 

2.  Each  dot  character  used  in  the 
name  of  a  BASICA  variable  is  replaced 
with  two  underscore  characters. 

3.  BASICA  program  lines  containing 
multiple  statements  are  broken 
down  into  single  statements  per  line 
in  True  BASIC. 

4. I inserted line 1945 to open a buffer
for printer output. All BASICA
LPRINT statements were flagged by
the converter. I changed each LPRINT
into "print #9:".

5. The BASICA SPC( ) function is replaced
by REPEAT$(" ", <number>)
to produce the same effect. Using
TAB( ) is another alternative.

6.  The  converter  moved  the  BASICA 
END  statement  from  line  3000  to  the 
very  end  and  replaced  it  with  a  stop. 
In  True  BASIC,  there  must  be  one  and 
only  one  end  statement  at  the  end  of 
the  program.  If  I  manually  replace 
the  stop  with  end,  all  the  subsequent 
subroutines  become  external  (the 
current  end  location  makes  them  in¬ 
ternal).  The  difference  between  in¬ 
ternal  and  external  subroutines  is  in 
the  scope  of  variables.  Internal  rou¬ 
tines  access  all  the  variables  of  the 
main  program,  but  external  routines 
do  not.  Library  files  containing  noth¬ 
ing  but  external  routines  must  begin 
with  the  keyword  EXTERNAL. 

7.  I  edited  the  OPEN  statements  for 
file  I/O  to  add  the  create  old  clause, 
which  indicates  that  the  file  must  al¬ 
ready  exist. 

8. I added the erase #1 in line 9015.
This erases the contents of the file before
I write back to it. Unlike BASICA,
True BASIC does not allow you to
overwrite existing text, so you must
erase a file before updating its
contents.

9.  Each  assignment  statement  in 
True  BASIC  begins  with  the  keyword 
let.  It  is  mandatory  in  Version  1,  but 
the  new  Version  2  enables  you  to  is¬ 
sue  a  directive  and  make  the  let 
optional. 
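
Change 2 in the list above, the renaming of dotted variables, lends
itself to mechanical treatment. The Python sketch below is not the
converter's actual code; the function name and the pattern are
assumptions. It renames only identifiers that begin with a letter, so
numeric constants such as 3.14 are untouched, and it skips anything
inside double quotes.

```python
import re

# A BASICA-style name: a letter followed by letters, digits, or dots.
IDENTIFIER = re.compile(r"\b[A-Za-z][A-Za-z0-9.]*")


def rename_dotted(statement):
    """Replace each dot in a variable name with two underscores."""
    parts = statement.split('"')       # odd-numbered parts lie inside quotes
    for i in range(0, len(parts), 2):  # even-numbered parts are program text
        parts[i] = IDENTIFIER.sub(
            lambda match: match.group().replace(".", "__"), parts[i]
        )
    return '"'.join(parts)


print(rename_dotted('FILE.NAME$ = "A.TXT"'))
print(rename_dotted("AREA = 3.14 * R.MAX"))
```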

Listing  Three,  page  91,  shows  a 
True  BASIC  version  that  differs  from 
that  in  Listing  Two  in  the  following 
ways: 


1.  No  line  numbers  are  used. 

2. Some of the tests in the IF...THEN
constructs  have  been  reversed  to 
make  use  of  the  multiline  THEN  and 
ELSE  clauses  and  to  bypass  the  need 
for  line  numbers. 

3. Subroutine calls are used instead
of GOSUB. I have deliberately used
argument lists to give a sense of
structured code. I could have made the
subroutines parameterless and had their
code access global variables.

4. The BASICA NOT EOF( ) test used in
detecting the end of file is replaced
with the True BASIC MORE #1 function,
which performs the same task.

Listing  Four,  page  92,  shows  the 
first  QuickBASIC  version  of  the  BASICA 
utility.  I  wrote  it  to  demonstrate  the 
following  QuickBASIC  features: 

1.  No  line  numbers. 

2. The GOSUB statements are followed
by alphanumeric labels. The
corresponding labels are located at
the start of each subroutine. This
QuickBASIC version looks slightly more
modular than its parent BASICA version.

Listing  Five,  page  94,  shows  the  sec¬ 
ond,  more  structured  QuickBASIC 
version. The GOSUB statements are
replaced by subroutine CALLs. The
argument  lists  of  the  subroutines  are 
identical  to  those  of  True  BASIC  in 
Listing  Three.  Listings  Three  and 
Five  show  strong  similarities  be¬ 
tween  QuickBASIC  and  True  BASIC 
with  respect  to  program  segmenta¬ 
tion.  This  gives  you  a  feeling  that 
both  QuickBASIC  and  True  BASIC  real¬ 
ly  promote  more  structured  code. 
Compared  to  Pascal,  these  BASIC  dia¬ 
lects  retain  simple  data  types  with 
the  declaration  of  variables  limited  to 
arrays.  Compared  to  FORTRAN,  they 
represent  a  true  challenge  because 
they  offer  many  of  FORTRAN  IV  and 
FORTRAN-77’s  features. 

I  have  focused  on  the  one-way 
translation  of  programs  written  in 
BASICA  to  QuickBASIC  and  True  BASIC. 
In  a  future  column,  I  will  look  at  the 
two-way  translation  between  Quick¬ 
BASIC  and  True  BASIC  programs.  I  also 
plan  to  look  at  Better  BASIC,  another 
"new  wave”  BASIC  dialect,  which  I 
have  not  discussed  this  time  because 
of  space  limitations. 

Availability 

All  the  source  code  for  articles  in  this 
issue  (except  C  Chest)  is  available  on  a 
single  disk.  To  order,  send  $14.95  to 
Dr.  Dobb's  Journal,  501  Galveston  Dr., 
Redwood  City,  CA  94063  or  call  (415) 
366-3600  ext.  216.  Please  specify  the 
issue  number  and  format  (MS-DOS, 
Macintosh, Kaypro).

DDJ 

(Listings  begin  on  page  88.) 



COLUMNS 


ARTIFICIAL  INTELLIGENCE 


Object-Oriented  Programming 


The  theme  for  my  next  few  col¬ 
umns  will  be  object-oriented 
programming  in  AI.  This  is  a  rather 
vast  but  very  hot  topic,  and  I'll  ap¬ 
proach  it  from  a  few  different  van¬ 
tage  points.  As  mentioned  in  the  pre¬ 
vious  column,  object-oriented 
languages  represent  a  programming 
paradigm,  which  is  a  much  more  sig¬ 
nificant  development  than  just  an¬ 
other  new  programming  language  or 
programming  technique.  Most  pro¬ 
gramming  languages  to  date  have 
been  for  programming  in  a  single 
paradigm,  that  of  procedural  pro¬ 
gramming.  Because  all  programmers 
already  know  this  paradigm,  I  will 
concentrate  on  the  newer  ones. 

by Ernest R. Tello

As I see it, object-oriented programming is not a direction that is
entirely new and without precedent in computer science; rather, it
takes various developments in programming languages to their next
logical step, for reasons of clarity, modularity, and programming
efficiency. In one sense, you can think of object-oriented programming
as the programming paradigm that takes structured programming to its
natural logical conclusion. In structured programming, variables can be
local to a particular procedure and these procedures typically pass
arguments such as strings and numbers between them. With
object-oriented systems all this is taken much further. Variables are
no longer local just to procedures. The main building blocks are now
objects (protected areas of system memory), which can have both local
variables and local procedures. Moreover, the building blocks do not
communicate with one another just by passing arguments. The procedures
themselves, usually called methods, which are local to objects, are
actually the messages that are sent and received by objects. In this
respect, objects resemble smaller computers within the host computer,
each with its own data and code areas.

Most  object-oriented  systems  have 
at  least  two  different  types  of  objects: 
classes and instances. Classes may
be logically related to one
another such that one might be the
subclass or superclass of another.
Generally  speaking,  the  superclass  is 
the  more  abstract  class  and  the  sub¬ 
class  is  the  more  specific.  So,  for  ex¬ 
ample,  if  you  created  the  class  Furni¬ 
ture,  then  you  could  create  the  class 
Chair  as  the  subclass  of  Furniture  and 
Desk-Chair  as  a  subclass  of  Chair.  In 
this  example,  Furniture  would  be  the 
superclass  of  Chair,  which  is  in  turn 
the  superclass  of  Desk-Chair. 
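
The furniture hierarchy can be written down directly in any language
with classes. The sketch below uses Python purely for illustration;
Desk-Chair becomes DeskChair because hyphens are not legal in Python
names.

```python
class Furniture:
    """The most abstract class in the example."""


class Chair(Furniture):
    """A subclass of Furniture and the superclass of DeskChair."""


class DeskChair(Chair):
    """The most specific class in the example."""


print(issubclass(DeskChair, Furniture))
print([cls.__name__ for cls in DeskChair.__mro__])
```

The second line of output lists the chain from the most specific class
up to the most abstract, ending with Python's built-in root class.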

Object-oriented  systems  have  at 
least  three  obvious  advantages.  One 
very  nice  one  is  that,  once  you  have 
written  the  code  for  a  class,  you  can 
have  as  many  instances  of  that  class 
present  in  the  system  at  the  same 
time  as  memory  will  allow.  A  class  is 
simply  a  template  on  which  each  in¬ 
stance  is  modeled  and  provided  with 
its  own  area  of  memory  that  cannot 
be  accessed  by  any  other  object  ex¬ 
cept  by  using  the  object’s  own  local 
methods.  So,  for  example,  this  means 
that,  in  an  object-oriented  system, 
you  can  have  as  many  graphics  pens, 
windows,  editors,  interpreters,  and 
so  on  as  you  like  copresent  without 
any  fear  that  they  will  interfere  with 
one  another.  The  second  advantage  is 
that,  through  the  mechanism  of  in¬ 
heritance,  subclasses  automatically 
share all the variables and methods
of their superclasses. This means that
you  can  write  greater  and  greater 
specializations  of  functions  just  by 
adding  the  part  that  is  unique — the 
rest  is  inherited  automatically.  The 
third  immediate  advantage  is  that 
you  can  provide  a  uniform  interface 
over  the  widest  possible  range  of  ob¬ 
ject  types  because  you  can  use  the 
same  name  for  methods  of  different 
objects  that  have  to  be  implemented 
differently,  and  this  action  can  re¬ 
main  invisible  to  the  user.  So,  for  ex¬ 
ample,  you  might  create  different 
classes  for  a  variety  of  different  geo¬ 
metric  polyhedra.  Then,  for  each 
separate  class,  you  would  define  the 
methods  volume  and  surface  area. 
The  actual  formulas  and  their  imple¬ 
mentations  would  vary,  but  the  call¬ 
ing  names  would  all  be  the  same. 
Then  you  could  say  Tetrahedron-1 
volume  or  Cube-3  volume,  and  in 
each  case  methods  would  be  invoked 
that  returned  the  value  of  the  object's 
volume. 
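
A minimal Python sketch of the polyhedra example follows. The class
layout is an assumption made for illustration, and the tetrahedron is
taken to be regular so that a single edge length determines it.

```python
import math


class Cube:
    def __init__(self, edge):
        self.edge = edge

    def volume(self):
        return self.edge ** 3

    def surface_area(self):
        return 6 * self.edge ** 2


class Tetrahedron:
    """A regular tetrahedron, described by its edge length."""

    def __init__(self, edge):
        self.edge = edge

    def volume(self):
        return self.edge ** 3 / (6 * math.sqrt(2))

    def surface_area(self):
        return math.sqrt(3) * self.edge ** 2


# The caller uses one name; each object supplies its own formula.
for solid in (Cube(2), Tetrahedron(2)):
    print(type(solid).__name__, solid.volume(), solid.surface_area())
```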

Some  people  say  that  the  key  ad¬ 
vantage  of  object-oriented  program¬ 
ming  is  the  ability  to  reuse  code  for 
many  different  programs.  But  in  it¬ 
self  this  is  not  significantly  different 
from  library  functions.  The  real  dif¬ 
ference  is  an  improved  ability  to  han¬ 
dle  complexity  in  a  transparent  man¬ 
ner.  An  advantage  of  object-oriented 
programming  that  is  not  necessarily 
immediately  obvious — but  which  ex¬ 
perienced  programmers  who  have 
worked  with  these  systems  will  near¬ 
ly  always  testify  to — is  that  object-ori¬ 
ented  languages  give  you  more  lever¬ 
age  in  working  on  very  large 
programs.  This  does  not  come  for 
free,  though,  and  it's  not  guaranteed. 
Factoring  a  large  program  into  the 
right  parts  is  a  large  part  of  the  battle. 
It  is  also  necessary  to  learn  the  right 
techniques  for  managing  the  code 
and making life easy for the members
of a programming team. Object-oriented
systems are usually diffuse,
with parts of applications being
dispersed among a large number of
classes  and  subclasses.  To  program 
efficiently  with  such  a  system,  it  is 
essential  to  have  the  proper  tools  and 
an  effective  method  for  keeping  the 
application  well  focused  and  well 
organized. 

It  is  important  to  point  out  that  ob¬ 
ject-oriented  programming  cannot 
be  regarded  as  an  easy  thing  to  pick 
up  rapidly,  as  it  is  a  totally  different 
paradigm  from  the  one  to  which 
most  programmers  are  accustomed. 
As  programming  approaches  go,  it  is 
a  knowledge-intensive  one.  In  other 
words,  the  readily  available  modular 
code is useful only provided that
programmers  know  what  they  have 
available  to  them  and  how  it  may  be 
best  used.  Many  programmers  resist 
learning  new  languages,  not  to  speak 
of  new  programming  paradigms,  so 
it  is  important  to  spell  out  again  in 
clear-cut,  pragmatic  terms  just  what 
the  real  advantages  are  to  program¬ 
ming  with  objects.  As  I  see  it,  there 
are  four  main  ones: 

1.  standard  calling  conventions  for  a 
broad  range  of  operations  that  exhib¬ 
it  differences  in  behavior,  as  do  varia¬ 
tions  on  a  theme 

2.  a  means  of  managing  very  large 
programming  projects  by  breaking 
up  large  problems  into  smaller,  inde¬ 
pendently  functioning,  highly  visible 
parts 

3.  a  truly  modular  programming  en¬ 
vironment  in  which  redundancies  in 
coding  are  kept  to  an  absolute 
minimum 

4.  the  ability  to  spawn  multiple  in¬ 
stances  of  a  given  function  or  object 
from  the  same  code  without  the 
codes  for  the  instances  interfering 
with  one  another 

Object-Oriented LISP: An Overview

Although  Smalltalk  was  the  first  true 
general-purpose  object-oriented  lan¬ 
guage,  and  implementations  such  as 
Digitalk’s  Smalltalk/V  are  specifically 
aimed  at  AI  applications,  the  main 
uses  of  this  programming  paradigm 
so  far  in  AI  have  been  with  object- 
oriented LISP. The reason for this is
probably that people in the AI
field are most familiar with LISP.

The  most  interesting  things  that 
have  occurred  with  object-oriented 
LISP  are  some  of  the  clear  innovations 
it  has  made  in  object-oriented  pro¬ 
gramming  generally.  Not  only  did 
LISP  have  little  difficulty  absorbing 
the  object-oriented  paradigm  but  also 
it  introduced  some  important  inno¬ 
vations  to  this  programming  ap¬ 
proach  as  it  did  so.  I  would  like  to  dis¬ 
cuss  three  innovations  in 
particular — mixins,  method  combi¬ 
nation,  and  multimethods.  The  mixin 
feature  is  the  LISP  version  of  multiple 
inheritance,  which  consists  of  the 
ability  to  create  a  new  class  that  in¬ 
herits  from  more  than  just  a  single 
superclass.  In  effect,  it  means  the 
ability  to  build  an  object  hierarchy 
that  can  be  a  network  or  tangled  hi¬ 
erarchy  rather  than  just  a  simple 
tree. Although multiple inheritance
is theoretically present in the latest
release of Smalltalk-80, it was an
afterthought in this language and
comes nowhere close to the
readily usable and trouble-free
operation of mixins in object-oriented
LISP.
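
Python's multiple inheritance gives a rough feel for what a mixin does,
although it is not the LISP mechanism itself. In this sketch, with
hypothetical class names, two small classes each contribute one
capability and a third class inherits from both as well as from its
main superclass.

```python
class Window:
    def __init__(self):
        self.parts = ["frame"]


class BorderMixin:
    """Contributes one capability; not meant to be instantiated alone."""

    def add_border(self):
        self.parts.append("border")


class TitleMixin:
    """Contributes a second, independent capability."""

    def add_title(self, text):
        self.parts.append("title:" + text)


class DialogWindow(BorderMixin, TitleMixin, Window):
    """Inherits from three superclasses, so the hierarchy is not a tree."""


dialog = DialogWindow()
dialog.add_border()
dialog.add_title("Save")
print(dialog.parts)
```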

Method  combination  is  a  bit  more 
difficult  to  explain  than  mixins  but  is 
no  less  important.  The  first  appear¬ 
ance  of  user-defined  method  combi¬ 
nation  in  LISP  was  with  Symbolics' 
Flavors  system.  This  system  does  not 
just  copy  the  approach  to  method 
combination  used  in  Smalltalk  but  in¬ 
troduces  a  new  approach.  The  Fla¬ 
vors  implementation  lays  stress  on 
the  order  in  which  components  are 
combined  to  produce  a  flavor.  This  is 
particularly  true  with  methods,  the 
procedures  that  are  local  to  flavors. 
The  heart  of  the  Flavors  system  is  the 
way  the  methods  of  various  compo¬ 
nents  are  combined.  The  problem  is 
this: if you define a flavor that inherits
from several other flavors or classes,
each of which has its own specialized
version of the same
message, then how will the method
for this new flavor be constructed?
The  Flavors  system  offers  a  variety  of 
ways  for  combining  methods  and 
even  provides  for  user-defined  meth¬ 
od  combinations.  It  is  designed  so  that, 
if  you  want  to,  you  can  define  entire¬ 
ly  new  ways  of  combining  methods. 

The default for method combination
in Flavors is to ignore all but the
latest implementation of the method,
meaning the one defined in the most
specific of the flavors from which
the new flavor will inherit. If you
decide to define an entirely new
method for the new flavor, then all the
others  will  naturally  be  overridden. 
The  general  format  for  the  more 
complex  types  of  method  combina¬ 
tion  in  Flavors  is  for  one  flavor  to  be 
selected  to  provide  the  primary 
method  and  for  any  other  flavors  to 
provide  what  are  called  daemon 
methods.  The  primary  method  has 
control  of  handling  the  main  func¬ 
tion  associated  with  the  message, 
whereas  the  daemon  methods  are  re¬ 
sponsible  for  subsidiary  tasks. 

Flavors  has  two  kinds  of  daemon 
methods — before  and  after.  The  ter¬ 
minology  is  derived  from  the  order 
in  which  the  method  functions  are 
called.  The  basic  way  that  combined 
methods  work  is  that  they  first  call  all 
the  before  methods,  then  the  prima¬ 
ry  method,  and  finally  all  the  after 
methods.  Each  of  these  component 
methods  is  passed  the  same  argu¬ 
ments  in  turn  as  were  passed  to  the 
combined  method.  Only  the  values 
returned  by  the  primary  method 
will  be  returned  by  the  combined 
method,  however.  All  values  re¬ 
turned  by  the  daemon  methods  are 
ignored.  If  there  is  more  than  one  be¬ 
fore  method,  then  the  before  meth¬ 
ods  are  called  in  the  same  order  as 
that  in  which  the  flavors  are  com¬ 
bined,  whereas  after  methods  are 
called  in  reverse  order. 
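
The calling order described above can be simulated in a few lines of
Python. This is a sketch of the ordering rule only, not of how Flavors
is implemented: the combine function, the dictionary layout, and the
choice of the first component that offers a primary method are all
assumptions made for illustration.

```python
def combine(components):
    """Build a combined method from components listed in combination order.

    Each component is a dict that may hold 'before', 'primary', and
    'after' callables.
    """
    befores = [c["before"] for c in components if "before" in c]
    afters = [c["after"] for c in reversed(components) if "after" in c]
    primary = next(c["primary"] for c in components if "primary" in c)

    def combined(*args):
        for method in befores:   # before daemons, in combination order
            method(*args)
        result = primary(*args)  # only this value is returned
        for method in afters:    # after daemons, in reverse order
            method(*args)
        return result

    return combined


log = []


def open_door(name):
    log.append("open " + name)
    return "opened " + name


lockable = {
    "before": lambda name: log.append("unlock " + name),
    "after": lambda name: log.append("lock " + name),
}
audited = {
    "before": lambda name: log.append("note start"),
    "after": lambda name: log.append("note end"),
}
door = {"primary": open_door}

send_open = combine([lockable, audited, door])
result = send_open("vault")
print(result)
print(log)
```

The log shows the nesting that results: the component combined first
runs its before daemon first and its after daemon last, wrapping
everything in between.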

What  is  the  point  of  these  method 
combinations?  They  can  have  a  vari¬ 
ety  of  different  uses,  but  one  of  the 
most  obvious  is  to  provide  an  addi¬ 
tional  type  of  modularity  that  cap¬ 
tures  the  whole  spirit  of  the  object- 
oriented  approach.  With  method 
combination,  if  you  can’t  find  all  you 
want  for  a  flavor  method  in  any  of 
the  flavors  from  which  it  will  be  in¬ 
heriting,  yet  part  of  what  you  want  is 
available,  then  method  combination 
can  often  save  you  from  having  to  re¬ 
write  the  entire  function  from 
scratch.  What  you  can  do  is  select  a 
method  to  inherit  that  can  serve  as 
your  primary  method.  Then  you  just 
write  the  before  and  after  methods 
that  can  be  added  to  this  primary 
method  to  produce  the  desired  re¬ 
sult.  Naturally  this  won’t  be  possible 
in  all  cases.  It  will  work  only  in  situa¬ 
tions  in  which  the  function  desired 




Dr.  Dobb's  Journal,  March  1987 


can  be  combined  from  several  sepa¬ 
rate  functions.  You  will  find  that  this 
applies  to  a  surprisingly  large  num¬ 
ber  of  cases,  however. 

Multimethods,  a  capability  first 
made  available  in  CommonLoops, 
could  well  be  the  most  important 
contribution  object-oriented  LISP  has 
made  so  far  to  object-oriented  pro¬ 
gramming.  Multimethods  are  func¬ 
tions  that  can  be  considered  as  mes¬ 
sages  to  any  number  of  types  of 
objects.  Prior  to  the  development  of 
multimethods,  object-oriented  LISP 
made  a  distinction  between  the  ob¬ 
ject  to  which  a  message  was  sent  and 
the  arguments  to  the  message  proce¬ 
dure.  So,  in  the  expression: 

(send  Rectangle  draw-at  10  40) 

the  class  Rectangle  is  solely  responsi¬ 
ble  for  recognizing  the  message  and 
is  distinguished  from  the  numbers  10 
and  40,  which  are  arguments  to  the 
draw-at  message.  This  is  somewhat 
artificial,  for  in  Smalltalk  numbers 
are  treated  as  instances  of  the  class 
Number  and  arithmetical  operations 
such  as  multiplication  are  considered 
as  messages  to  the  numbers  to  multi¬ 
ply  themselves.  Multimethods  take 
this  even  further  by  considering  the 
class  to  which  a  message  is  sent  as  just 
another  one  of  the  arguments.  In 
principle  any  number  of  arguments 
can  be  passed  to  a  multimethod,  and 
each  of  the  arguments  is  an  instance 
of  its  own  class.  So  a  multimethod  is 
really  a  message  to  an  indefinite 
number  of  objects,  with  the  method 
combination  required  to  complete 
this  message  determined  by  the  actu¬ 
al  arguments  used. 
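A rough sketch of the idea, in Python rather than CommonLoops: the method to run is chosen from the classes of all the arguments, not from a single receiver. The names MultiMethod, draw_at, Shape, Rectangle, Screen, and Plotter are hypothetical, and the rule used to pick the most specific method (compare arguments from the left by distance up the class hierarchy) is an assumption of this sketch, not a statement of how CommonLoops resolves ties.

```python
class MultiMethod:
    """Toy multimethod: dispatches on the classes of all arguments."""

    def __init__(self, name):
        self.name = name
        self.methods = {}  # tuple of classes -> implementation

    def register(self, *classes):
        def decorator(func):
            self.methods[classes] = func
            return func
        return decorator

    def __call__(self, *args):
        # A method is applicable if every argument is an instance of the
        # class declared in the corresponding position.
        applicable = [
            classes for classes in self.methods
            if len(classes) == len(args)
            and all(isinstance(arg, cls) for arg, cls in zip(args, classes))
        ]
        if not applicable:
            raise TypeError(f"no applicable method for {self.name}")

        def specificity(classes):
            # Distance of each declared class from the argument's own class;
            # smaller is more specific. Works for ordinary classes only.
            return tuple(type(arg).__mro__.index(cls)
                         for arg, cls in zip(args, classes))

        best = min(applicable, key=specificity)
        return self.methods[best](*args)


class Shape: pass
class Rectangle(Shape): pass
class Screen: pass
class Plotter: pass


draw_at = MultiMethod("draw-at")


@draw_at.register(Shape, Screen, int, int)
def shape_on_screen(shape, device, x, y):
    return f"generic shape on screen at ({x}, {y})"


@draw_at.register(Rectangle, Screen, int, int)
def rectangle_on_screen(shape, device, x, y):
    return f"rectangle on screen at ({x}, {y})"


@draw_at.register(Rectangle, Plotter, int, int)
def rectangle_on_plotter(shape, device, x, y):
    return f"rectangle on plotter at ({x}, {y})"


print(draw_at(Rectangle(), Screen(), 10, 40))
print(draw_at(Rectangle(), Plotter(), 10, 40))
```

Here the shape, the output device, and the two numbers are all just arguments, and each is an instance of its own class.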

One  "problem”  with  object-orient¬ 
ed  LISP  is  that  it  is  just  too  popular  for 
its  own  good.  It  has  become  a  favorite 
tool  not  only  for  programming  AI  ap¬ 
plications  but  also  for  systems  pro¬ 
gramming  and  developing  user  inter¬ 
faces.  The  result  of  this  diversity  is  a 
somewhat  conflicting  set  of  require¬ 
ments  for  users  of  different  types. 
Systems  programmers  and  those  de¬ 
veloping  user  interfaces  and  ad¬ 
vanced  graphics  applications  are 
usually  interested  in  high-perform¬ 
ance,  bug-free  code.  AI  researchers 
are  willing  to  trade  some  perform¬ 
ance  for  greater  flexibility  and 
generality. 

What are some of the primary considerations in using object-oriented
LISP for AI purposes? As I have already suggested, certainly one of the
most important is the dynamic behavior of class systems. This refers to
the ability of an object system to change dynamically to reflect
changing circumstances in the world. In many AI programs it is of
considerable importance that the system be able to update itself
automatically to a greater or lesser degree. It is helpful to consider
what this implies.

A  minimum  condition  for  such  an 
automatic  system  is  to  keep  a  run¬ 
ning  tally  of  all  the  system’s  current 
objects  of  various  kinds  in  a  form  that 
is  easily  accessible.  In  LISP  this  gener¬ 
ally  means  maintaining  a  list  of  such 
items  and  being  able  to  update  the  list 
as  necessary.  More  specifically,  it  is 
necessary  to  be  able  to  access  at  any 
given  time  all  the  instances  that  are 
currently  alive  and  know  their  class¬ 
es.  If  the  system  does  not  already  do 
this  in  some  way,  then  it  is  essential 
that  it  at  least  support  the  minimum 
functions  that  would  allow  these  fea¬ 
tures  to  be  implemented. 

Closely related to this requirement is the ability to write functions
that can create new objects with names that are determined only at run
time. This may sound trivial, but it is not. In LISP it is easy to
create new objects with names the programmer specifies in the code; it
is not nearly as straightforward to write functions that automatically
create objects when needed, with names that have to be chosen at that
moment. Compared with this, “uncreating” objects is relatively trivial.
If there is no function corresponding to remob in use, then it is still
always possible to make a new object with the same name as the
uncreated one and set it equal to nil.
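The two requirements above (a running tally of live instances with their classes, and objects named only at run time) might be sketched as follows. This is a Python illustration, not LISP: ObjectRegistry, its method names, and the naming scheme of class name plus serial number are all hypothetical choices made for the example.

```python
import itertools


class ObjectRegistry:
    """Keeps every live object under a name generated at run time."""

    def __init__(self):
        self._objects = {}                  # name -> object
        self._serial = itertools.count(1)   # source of fresh name suffixes

    def create(self, cls, *args, **kwargs):
        """Instantiate cls and file it under a freshly generated name."""
        name = f"{cls.__name__.lower()}-{next(self._serial)}"
        self._objects[name] = cls(*args, **kwargs)
        return name

    def lookup(self, name):
        return self._objects[name]

    def instances_of(self, cls):
        """Names of all live objects that are instances of cls."""
        return sorted(name for name, obj in self._objects.items()
                      if isinstance(obj, cls))

    def uncreate(self, name):
        """Forget an object; a missing name is silently ignored."""
        self._objects.pop(name, None)


class Desk: pass
class Chair: pass


registry = ObjectRegistry()
first_desk = registry.create(Desk)
only_chair = registry.create(Chair)
second_desk = registry.create(Desk)

print(registry.instances_of(Desk))
registry.uncreate(first_desk)
print(registry.instances_of(Desk))
```

As the text notes, removal is the easy half; the registry exists mainly so that creation and lookup by run-time name are possible at all.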

Such functions are necessary for creating what are known as composite
objects, for example. These are objects of a complex structure that
contain other objects as parts. So, for example, a desk object could be
described as a composite object composed of a top, legs, and drawers.
In such a case, the components might well be instances of classes in
their own right. To create a composite object automatically, therefore,
would involve naming and creating all those instances that are parts of
the composite.
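The desk example can be sketched as a small part-of hierarchy. This is a Python illustration under stated assumptions: the classes Part, Desk, Top, Leg, and Drawer, the function make_desk, and the slash-separated part names are invented for the example and do not come from any of the LISP systems discussed.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    """A named object that may contain other parts."""
    name: str
    parts: list = field(default_factory=list)

    def add(self, part):
        self.parts.append(part)
        return part

    def walk(self, depth=0):
        """Yield (depth, name) for this part and everything beneath it."""
        yield depth, self.name
        for part in self.parts:
            yield from part.walk(depth + 1)


# Each kind of component is a class in its own right.
class Desk(Part): pass
class Top(Part): pass
class Leg(Part): pass
class Drawer(Part): pass


def make_desk(name, legs=4, drawers=3):
    """Create a desk and, automatically, every named part it contains."""
    desk = Desk(name)
    desk.add(Top(f"{name}/top"))
    for i in range(1, legs + 1):
        desk.add(Leg(f"{name}/leg-{i}"))
    for i in range(1, drawers + 1):
        desk.add(Drawer(f"{name}/drawer-{i}"))
    return desk


desk = make_desk("desk-1")
for depth, name in desk.walk():
    print("  " * depth + name)
```

The class hierarchy (Desk is a Part) and the part-of hierarchy (this desk has these legs) are separate structures over the same objects.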

In  a  sense,  the  parts  of  a  composite 
object  can  form  another  hierarchy 
that  can  exist  alongside  the  abstrac¬ 
tion  hierarchy  of  classes  containing 
the  main  composite  object.  The  com¬ 
posite  object  approach  seems  to  have 
real  limits  as  far  as  creating  very 
large  hierarchies  is  concerned,  how¬ 
ever.  It  is  difficult  to  imagine  creating 
very  large  systems  such  as  spacecraft 
in  any  degree  of  detail.  Are  such  sys¬ 
tems  necessary?  If  you  want  to  be 
able  eventually  to  create  deep  sys¬ 
tems  for  diagnosing  problems  and 
predicting  various  consequences  in 
emergency  situations,  or  even  for 
failure-mode  analysis  for  design  pur¬ 
poses,  then  such  systems  appear  to  be 
indispensable. 

Many  LISP  programmers  are  anx¬ 
ious  for  an  object-oriented  extension 
to  Common  LISP  and  wish  there  were 
already  such  a  standard  for  the  dia¬ 
lect.  At  the  present  time  several  LISP 
vendors  have  developed  their  own 
proposed  extensions,  which  they 
hope  will  be  adopted  as  the  standard 
object-oriented  extension  to  Com¬ 
mon  LISP. 

In  the  Spotlight 

I  asked  some  experts  in  object-orient¬ 
ed  LISP  both  what  they  wanted  to  see 
happen  and  what  they  thought 
would  happen  in  the  quest  to  devel¬ 
op  an  agreed-upon  standard  for  ob¬ 
ject-oriented  programming  in  Com¬ 
mon  LISP.  Currently,  there  is  an 
organized  effort  to  develop  such  a 
standard  with  formal  ANSI  recogni¬ 
tion.  Toward  that  end,  a  committee 
has  been  formed  to  draft  a  proposed 
standard  that  will  then  be  made 
available  to  the  programming  com¬ 
munity  for  its  feedback  so  that  a  true 
“community  standard”  might 
emerge. The members of this object-oriented working group include Dan
Bobrow and Gregor Kiczales of Xerox; David Moon, Dan Weinreb, and Sonya
Keene of Symbolics; Richard Gabriel and Linda DeMichiel of Lucid; Jim
Kempf of Hewlett-Packard; and Patrick Dussud of Texas Instruments.

Gabriel is the president of Lucid, a vendor of Common LISP for a
variety of different machines, and author of the book Performance and
Evaluation of Lisp Systems (MIT Press, 1985). I spoke to him regarding
the developing standard for an object-oriented extension to Common
LISP. Gabriel sees his own role as a kind of “buffer zone” to help
mediate the potentially inflammatory relations between Xerox, advocate
of CommonLoops, and Symbolics, advocate of New Flavors. Although
Gabriel is not optimistic that the many subtle problems in formulating
an adequate standard will be solved rapidly, he feels that a reasonable
standard is emerging that has many of the features of New Flavors “on
the surface” while utilizing much of CommonLoops “underneath.”

I  asked  Gabriel  about  some  of  the 
difficult  issues  involved  in  formulat¬ 
ing  the  new  standard,  such  as  the  is¬ 
sue  of  dealing  somehow  with  the  fact 
that  readable  code  is  often  not  effi¬ 
cient  and,  conversely,  efficient  code 
is  often  not  readable.  Gabriel  admits 
that  this  is  a  pervasive  problem  that 
will  not  go  away  easily.  One  difficul¬ 
ty  is  that  often  programmers  who 
know  the  details  of  a  given  applica¬ 
tion  can  find  various  ways  to  exploit 
certain  idiosyncrasies  of  the  way  it  is 
coded  to  improve  performance  con¬ 
siderably.  Tricks  of  this  kind  are  obvi¬ 
ously  specific  to  the  particular  appli¬ 
cation  and  are  often  opaque  to 
another  programmer  reading  the 
code.  The  main  reason  for  this,  ac¬ 
cording  to  Gabriel,  is  that  “most  effi¬ 
cient  code  uses  side  effects  to  a  con¬ 
siderable  degree.”  Gabriel  thinks  that 
advanced  knowledge-based  systems 
for  program  optimization  will  ulti¬ 
mately  be  necessary  to  solve  this 
problem.  Another  possibility  is  paral¬ 
lel  LISP  because,  as  Gabriel  points  out, 
"the  programs  with  the  least  side  ef¬ 
fects  run  fastest  on  parallel  ma¬ 
chines.”  Gabriel  is  currently  working 
on  Qlisp,  a  parallel  LISP  language, 
with  John  McCarthy,  the  inventor  of 
LISP. 

Another problem specific to object-oriented systems is that currently
instances in most systems are strictly subservient to classes in that
an instance object is always an instance of one and only one predefined
class, which is used as a kind of template for creating the instance.
This has a built-in bias for the abstract over the concrete, which
could put a limit on the type of object-oriented AI programs that can
be written. For example, it might be highly desirable in certain AI
programs that an object be created that is not initially an instance of
any particular class but that later on might be. One way around this
would be to first make the object an instance of a neutral holder
class, such as Object in Smalltalk or t in CommonLoops, and have a
provision for changing the object's parent class at a later time.
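The workaround described here (start in a neutral holder class, change the class later) can be illustrated in Python, which happens to allow an ordinary object's class to be reassigned. The names Unclassified, Bird, and reclassify are hypothetical, and nothing in this sketch describes how Smalltalk or CommonLoops actually provide the facility.

```python
class Unclassified:
    """Neutral holder class: carries slots but no specific behavior."""

    def __init__(self, **slots):
        self.__dict__.update(slots)


class Bird(Unclassified):
    def describe(self):
        return f"{self.name} is a bird"


def reclassify(obj, new_class):
    """Change the class of an existing object, keeping its slots."""
    obj.__class__ = new_class   # permitted in Python for ordinary classes
    return obj


thing = Unclassified(name="Tweety")
had_behavior_before = hasattr(thing, "describe")

reclassify(thing, Bird)
print(thing.describe())
```

The object keeps its identity and its slot values; only the behavior it inherits changes.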

Another  person  I  asked  for  com¬ 
ments  about  the  object-oriented  ex¬ 
tension  to  Common  LISP  was  Gerry 
Barber,  chief  scientist  at  Gold  Hill 
Computers,  the  main  vendor  for  a 
subset  of  Common  LISP  on  MS-DOS  ma¬ 
chines.  The  main  issues  Barber  and 
his  group  at  Gold  Hill  seem  con¬ 
cerned  about  are  that  the  standard 
make  use  of  those  features  that  are 
well  understood  and  trouble-free.  In 
this  respect,  he  has  some  doubts 
about  the  metaobject  protocol  that 
forms  the  heart  of  CommonLoops. 
Barber  is  apparently  not  as  confident 
as  members  of  the  standards  work¬ 
ing  committee  are  that  the  metaclass 
approach  that  works  so  well  in 
Smalltalk  and  CommonLoops  is  well- 
tested  enough  in  the  LISP  environ¬ 
ment  to  find  a  place  in  the  standard. 
He  agrees  that  it  would  be  desirable 
to  have  a  system  that  has  an  inherent 
flexibility  and  generality,  but  "it  is 
important,”  he  says,  "to  find  the  right 
generality  and  the  right  flexibility.” 
Barber  sums  up  the  outlook  at  Gold 
Hill  as  follows:  "Our  strategy  is  to  rely 
on  things  that  have  been  shown  not 
to  have  problems.”  Regardless  of 
whether  the  new  standard  is  ready 
in  time,  Gold  Hill  plans  to  release  its 
own  object-oriented  extension  to 
Common  LISP — a  system  that  will 
probably  closely  resemble  the  Fla¬ 
vors  system  from  Symbolics — early 
in  1987. 

The third person I spoke to on the same topic was Dan Bobrow of the
Intelligent Systems Laboratory at Xerox PARC, considered by many to be
the foremost authority on object-oriented programming in AI. Compared
with Gabriel and Barber, Bobrow is the most optimistic concerning the
emerging standard for objects in Common LISP. He doesn't feel that what
has been reached so far is simply a compromise between the Xerox and
Symbolics proposals. He feels that what is happening is a genuine
synthesis of the best ideas that are around right now in
object-oriented LISP. To him, the atmosphere is not one of tension
between competing proposals but rather a true professional
collaboration, much in the same spirit that produced the original
Common LISP standard.

I  also  asked  Bobrow  what  some  of 
the  limitations  might  be  with  the 
emerging  standard.  Here  again,  his 
answers  were  positive.  For  example, 
I  raised  some  of  the  issues  I  had  dis¬ 
cussed  with  Gabriel,  such  as  the  need 
in  advanced  AI  applications  to  be  able 
to  have  the  same  object  simulta¬ 
neously  present  in  several  different 
hierarchies.  Here  he  felt  certain  that 
the  new  standard  did  not  rule  out  the 
ability  to  program  such  applications. 

I  also  raised  the  issue  of  efficiency 
and  trouble-free  operation  vs.  adapt¬ 
ability  and  flexibility.  Once  again  he 
was  optimistic  and  indicated  that  it 
would  indeed  be  possible  to  include 
different  compilation  options  for  dif¬ 
ferent  uses  of  objects.  This  would 
mean,  for  example,  that  those  pro¬ 
grammers  using  objects  for  systems 
programming  and  user  interfaces, 
who  are  primarily  interested  in  pro¬ 
ducing  fast,  unmodifiable,  trouble- 
free  code,  could  use  one  compilation 
and  those  who  use  objects  in  AI  pro¬ 
grams  that  need  to  have  greater  flexi¬ 
bility  and  the  ability  to  modify  them¬ 
selves  dynamically  could  use  a 
different  compilation  option.  In  this 
way,  both  types  of  users  could  be 
satisfied. 

In  my  next  column  I’ll  continue  the 
theme  of  object-oriented  AI  by  focus¬ 
ing  on  some  specific  implementa¬ 
tions  of  object-oriented  languages. 

Bibliography 

Bobrow,  D.,  et  al.  "Merging  LISP  and 
Object-Oriented  Programming.” 
OOPSLA  '86  Proceedings. 

Moon,  D.  "Object-Oriented  Program¬ 
ming  with  Flavors.”  OOPSLA  '86 
Proceedings. 




FORUM 


DDJ  ON  LINE 


#: 8534 SO/General/DDJ office

05-Dec-86  17:31:07 
Sb:  Screen  Weirdness! 

Fm:  Chris  Johnston 

71505,1752 
To:  All 

I  know  that  this  is  going  to  sound 
crazy,  but  I  was  sitting  at  my  desk  at 
work,  talking  to  a  coworker  who 
was  running  Microsoft  Word  on  a 
6-MHz  IBM  PC/AT  with  a  CGA.  I  was 
finishing  my  lunch,  crunching  on 
one  of  those  big  hard  pretzels,  when 
I  noticed  that  every  time  I  crunched 
the  screen  image  moved!  Nothing 
else  did,  just  the  image  on  the  screen, 
which  seemed  to  get  smaller  along 
the  vertical  axis,  as  if  the  tube  lost 
power  momentarily.  This  is  not  a 
real  change  in  the  image  because 
you  have  to  be  crunching  to  see  the 
effect,  but  everybody  else  who  tried 
it  saw  the  same  thing.  Our  assump¬ 
tion  is  that  the  vibrations  set  up 
when  you  crunch  are  some  multiple 
of  the  vertical  frame  rate,  causing  an 
apparent  change  in  the  screen.  On 
some  machines  I  can  see  the  vertical 
retrace  line  when  I  try  this.  It  doesn’t 
seem  to  happen  on  a  TV  set. 

Is  anybody  interested  in  trying 
this  out  and  seeing  if  it  is  a  universal 
phenomenon?  We’ve  been  talking 
about  trying  to  measure  the  'crunch 
frequencies’  with  an  accelerometer 
but  haven’t  tried  it  yet. 

P.S.  It  works  with  hard  cookies, 
too. 

*** There is a reply: 8538

#: 8538

Sb:  Screen  Weirdness! 


Fm:  Ray  Duncan  [DDJFORUM] 
76703,4265 

It’s  not  crazy.  I've  been  noticing  the 
same  thing  for  about  a  year  in  my  of¬ 
fice.  The  funny  thing  is,  I  never  see  it 
on  my  own  screen,  but  I  can  easily 
see  it  on  the  screen  of  the  guy  who 
sits  on  the  other  side  of  the  room.  I 
always assumed it was 'real' and
wondered  why  he  never  said  any¬ 
thing  about  it,  and  I  never  really 
could  dream  up  a  reason  for  it. 

*** There is a reply: 8549

#: 8549

Sb:  Screen  Weirdness! 

Fm:  Pete  Becker  76347,3151 
I'd  guess  it’s  physiological.  Maybe 
the  vibration  from  the  crunch 
shakes  the  retina  or  something.  Such 
an  effect  would  be  more  pro¬ 
nounced  on  distant  objects:  because 
they  produce  a  smaller  image,  an 
identical  displacement  is  proportion¬ 
ately  much  larger. 

*** There is a reply: 8552

#: 8552

Sb:  Screen  Weirdness! 

Fm:  Chris  Johnston 

71505,1752 

That  makes  a  lot  of  sense.  We  have 
an  HP  spectrum  analyzer  and  one  of 
those accelerometer-hammer thingies that you use for measuring the
vibration  frequencies  of  mechanical 
parts.  You  give  the  item  a  whack  to 
excite  it,  and  then  it  senses  the  vibra¬ 
tions  that  result.  We  have  been 
thinking  about  trying  the  pretzel 
crunch  test  with  the  accelerometer 
pressed  firmly  against  a  cheekbone 
to  see  what  happens.  My  bet  is  the 
resultant  frequencies  are  some  mul¬ 
tiple  of  the  60-Hz  vertical  frame  rate. 

*** There is a reply: 8554

#: 8554

Sb:  Screen  Weirdness! 

Fm:  Pete  Becker  76347,3151 
There probably aren't any well-defined resultant frequencies, just a
bunch of garbage, which, of course, consists of all frequencies. I
suppose a true controlled experiment would consist of putting the
accelerometer against your cheekbone and hitting your jaw with a hammer
so you have reproducible conditions. Any volunteers?

*** There are replies: 8555, 8603

#: 8603

Sb:  Screen  Weirdness! 

Fm:  Chris  Johnston 

71505,1752 

No, I think I will skip the 'hit your jaw with a hammer' part. Somebody
at work did suggest putting strain gauges on the pretzel! I'll try to
remember to fire up the accelerometer/frequency analysis system and try
out the less painful part of the test. I'll let you know!

#:  8555 

Sb:  Screen  Weirdness! 

Fm:  jhon  Stanley  73765,1026 
The  wiggle  is  not  psychological — it 
really  exists.  It  is  due  to  the  vibration 
of  the  muscles  of  the  eye,  which  try 
to  correct  themselves  and  therefore 
move  the  eye.  It  is  so  common  that 
the  brain  automatically  corrects  for 
it.  It  can  do  a  great  job  correcting 
those  things  that  do  not  move  but  has 
trouble  with  moving  things — like,  for 
example,  a  band  of  light  sweeping 
down  the  face  of  a  CRT.  The  region 
of  the  CRT  lit  at  any  given  time  is 
small,  maybe  %—  1-inch  tall.  If  your 
eye  moves  up  while  the  band  moves 
down,  the  screen  will  appear  taller. 
Likewise,  move  the  eye  down  and 
the  picture  gets  smaller  (somewhat 
like  moving  an  original  on  the  copy 
machine  while  the  machine  scans  it). 
So,  vibrate,  and  the  picture  wiggles. 

I  have  shown  this  to  many  people, 
and  they  are  always  amazed.  They 
don't  believe  it  until  they  try  it 
themselves. 


*** There are replies: 8564, 8604

#: 8604

Sb:  Screen  Weirdness! 

Fm: Chris Johnston

71505,1752

As I understand it, the eye wiggles a little all the time because
nerves that are continuously stimulated shut down after a little while.
The places where the receptors see light and dark alternate are near an
edge. At a low level, we have an automatic edge discriminator built
into the system.

#: 8564

Sb:  Screen  Weirdness! 

Fm:  Pete  Becker  76347,3151 
To:  jhon  Stanley  73765,1026 
I  said  'physiological,'  not  psychologi¬ 
cal!  Your  explanation  sounds  good. 

#:  8637 

Sb:  Screen  Weirdness! 

Fm:  jhon  Stanley  73765,1026 
The  same  thing  happens,  I  have  no¬ 
ticed  recently,  with  LED  clocks.  The 
digits  are  scanned,  as  in  a  CRT,  so  the 
numbers  also  wiggle  when  you 
chew. 


*** There is a reply: 8639

#: 8639

Sb:  Screen  Weirdness! 

Fm:  Pete  Becker  76347,3151 
Synchronicity! I was at my sister's
this  afternoon,  and  she  called  me 
into  the  kitchen  and  asked  me  to 
click  my  teeth  together  while  look¬ 
ing  at  the  clock  on  her  microwave 
oven.  Of  course,  I  wouldn’t  do  any¬ 
thing  so  undignified,  but  I  gave  her 
your  explanation. 

*** There is a reply: 8696

#: 8696

Sb:  Screen  Weirdness! 

Fm:  jhon  Stanley  73765,1026 
I told a friend of mine at work to go brthththththphphphphtttt at the
19-inch monitor we use because it would do something neat. He gave me
this funny look and a half-hearted brrththt. I told him to try again.
He did, and the look on his face told me he saw the effect. Ain't
science wonderful?



PROGRAMMER'S  SERVICES 


THE  STATE  OF  BASIC 


More  BASIC  Modules 
and  Libraries 

One of the shortcomings of the early versions of microcomputer BASIC
was the absence of formal libraries of reusable code. The lack of
support for multiline functions, callable BASIC subroutines, and local
variables weakened any attempts to “simulate” BASIC libraries. The
implementors of “new wave” BASICs, a phrase you will hear often, have
recognized the need for supporting user-defined libraries. The
solutions given by these new BASICs vary in flexibility and in syntax,
however. Let's take a look.

QuickBASIC lets you create libraries in two steps. First, you write the
source code for a library and compile it into .OBJ form. The next step
is to “build” an .EXE library file by using BUILDLIB.EXE, which is
supplied with QuickBASIC. BUILDLIB.EXE is able to take one or more
library object files and combine them into a bigger library. To avoid
confusion between libraries in .OBJ and .EXE files, you can refer to
the former as sublibraries. QuickBASIC permits your BASIC programs to
use one library only. The default library is USERLIB.EXE, but you can
instruct BUILDLIB.EXE to create libraries with other names. You must
specify these library names when invoking QuickBASIC from the DOS
command line.

The  preceding  discussion  gives  the 
impression  that  QuickBASIC  supports 
one  library  at  a  time.  The  good  news 
is  that  you  can  expand  and  update 
your  .EXE  library  files  by  including 
new  or  modified  sublibraries.  To  do 
this,  you  must  store  the  .OBJ  subli¬ 
brary  files  because  you  may  rebuild  a 
library  periodically. 


Example  1,  below,  shows  a  short 
subroutine  that  calculates  the  square 
root  of  a  number  using  Newton's 
method.  The  library  body  has  no  for¬ 
mal  declaration.  Functions  in  Quick¬ 
BASIC  libraries  are  local  to  the  library, 
so  I  have  used  a  callable  subroutine 
instead.  A  client  QuickBASIC  program 
need  not  make  any  special  declara¬ 
tions  to  use  the  library  subroutines; 
the  burden  falls  on  the  program  au¬ 
thor  to  document  external  subroutine 
calls.  The  scheme  of  calling  library 
subroutines  in  QuickBASIC  offers  a 
good  degree  of  language  extension. 

In True BASIC, library files begin with the keyword EXTERNAL. Unlike
QuickBASIC, all the subroutines and functions are accessible to the
client program. Local variables are not automatically shared with other
library routines or client programs; data flows through argument lists
and data files. Unlike QuickBASIC, True BASIC enables your application
program to use multiple libraries. You use the syntax LIBRARY <library
names list> to indicate the files containing the sought libraries.
Functions imported from libraries must be declared in DECLARE DEF
<function names list> statements. This feature of True BASIC permits
you to maintain libraries in a more independent way than you can in
QuickBASIC. In addition, libraries can call other libraries in True
BASIC, a valuable feature for modular software development.

Example  2,  below,  shows  the 
square  root  function  in  a  True  BASIC 
library.  The  library/module  loading 
feature  enables  you  to  do  away  with 
explicit  LIBRARY  and  DECLARE  DEF 
statements  related  to  the  loaded  li¬ 
braries.  In  that  respect,  True  BASIC 
also  offers  a  vehicle  for  language 
extension. 

SUB SQROOT(X, S) STATIC
   ' Returns in S the square root of X, or -1 if X is negative
   IF X >= 0 THEN
      ACCR = 1.0E-8                    ' absolute tolerance
      S = X / 2                        ' initial guess
      WHILE ABS(S * S - X) > ACCR
         S = (X / S + S) / 2           ' Newton's update
      WEND
   ELSE
      S = -1                           ' result for a negative argument
   END IF
END SUB

Example 1: QuickBASIC library subroutine to compute the square root
using Newton's method

EXTERNAL  ! declaration needed to define a True BASIC library
DEF FNSQRT(X)
   IF X >= 0 THEN
      LET Accr = 1.0E-8                ! absolute tolerance
      LET S = X / 2                    ! initial guess
      DO WHILE ABS(S * S - X) > Accr
         LET S = (X / S + S) / 2       ! Newton's update
      LOOP
      LET FNSQRT = S
   ELSE
      LET FNSQRT = -1                  ! result for a negative argument
   END IF
END DEF

Example 2: True BASIC library function to compute the square root using
Newton's method

BetterBASIC supports modules, so much so that the implementation itself
is modular. A customizable configuration text file is used to list the
modules that are loaded into memory to provide your BASIC applications
with additional routines. Some of the libraries are used to make
BetterBASIC compatible with BASICA.

To create a module in BetterBASIC, you create a new workspace in which
you define local and exported routines as well as module
initialization. You use a PUBLIC declaration as an export list; any
routine not listed is strictly local. Creating a module in BetterBASIC
is an interactive process. It involves a MAKE MODULE <name> command in
which BetterBASIC requests you to verify your PUBLIC declaration and
MAIN code (used to initialize the module). An affirmative answer puts
your module in memory and makes its functions accessible as an
extension of the language. Information is passed to module routines via
argument lists, data files, or the use of pointers.

Example 3, below, shows a module that exports the BetterBASIC version
of my square root function. Notice that BetterBASIC requires line
numbers in some portions of the code. The declarations of variables are
similar to those of Pascal (more about this in a future column).
BetterBASIC uses a reserved identifier RESULT instead of the function
name to return the result of a function. Also notice that the function
arguments are not listed immediately after the function name but on the
line that follows the function name declaration.

The implementation of libraries in the new wave BASICs, among other new
features, offers an enhanced level of software engineering. The
presence of software libraries acknowledges the following:

•  the need for reliable software building blocks
•  the shortening of development time by reusing existing routines
•  support for structured and more systematic program development



Vendors

BetterBASIC
Summit Software
106 Access Rd.
Norwood, MA 02062
(617) 769-7966
$199
Reader Service Number 50

QuickBASIC
Microsoft Corp.
16011 N.E. 36th Way
Redmond, WA 98052
(206) 882-8088
$99
Reader Service Number 51

True BASIC
True BASIC Inc.
39 S. Main St.
Hanover, NH 03755
(603) 643-3882
(call for prices)
Reader Service Number 52


PUBLIC: SQROOT

REAL FUNCTION: SQROOT
REAL ARG: X
REAL: S, Accr
10 Accr = 1.0E-8
20 S = X / 2
30 WHILE ABS(S * S - X) > Accr DO
40 S = (X / S + S) / 2
50 REPEAT
60 RESULT = S
END FUNCTION

Example 3: BetterBASIC library function to compute the square root
using Newton's method
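All three listings apply the same update, S = (X / S + S) / 2, starting from S = X / 2 and stopping when S * S is within 1.0E-8 of X. For comparison, here is the same iteration as a Python sketch. Two things in it are my assumptions and do not appear in the listings: the stopping test is relative to X rather than absolute, because a fixed absolute tolerance can be unreachable in finite-precision arithmetic when X is large, and an iteration cap guards against a loop that never ends. The names newton_sqrt, rel_tol, and max_iter are hypothetical.

```python
def newton_sqrt(x, rel_tol=1e-12, max_iter=2000):
    """Square root by Newton's method; returns -1.0 for a negative argument."""
    if x < 0:
        return -1.0            # same sentinel as Examples 1 and 2
    if x == 0:
        return 0.0             # avoids dividing by a zero guess
    s = x / 2                  # same initial guess as the listings
    for _ in range(max_iter):
        if abs(s * s - x) <= rel_tol * x:   # relative test (an assumption)
            break
        s = (x / s + s) / 2    # Newton's update
    return s


for value in (2.0, 0.25, 1e20, -4.0):
    print(value, newton_sqrt(value))
```

The sentinel of -1 for negative input follows Examples 1 and 2; Example 3 as printed does not test for a negative argument.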




PROGRAMMER'S  SERVICES 


OF  INTEREST 


Languages 

Pecan Software Systems' implementation of UCSD Pascal for the Apple
IIGS is source-compatible with Apple Pascal. Extensions beyond Apple
Pascal are offered in areas of multitasking, dynamic memory management,
extended-precision arithmetic, and separate compilation. The software
is ProDOS compatible and utilizes the new features of the Apple IIGS,
including extended memory, sound, and graphics. The Power System
Professional Pak is available for $199.95. Reader Service No. 16.

Pecan  Software  Systems  Inc. 

1410  39th  St. 

Brooklyn,  NY  11218 
(718)  851-3100 

Hard  Disks/Utilities 

Storage  Dimensions  has  introduced 
the  AT160F,  a  320-megabyte,  high- 
performance,  internal,  dual  hard¬ 
disk  drive  for  IBM  PC/ATs  and  com¬ 
patibles.  The  AT160F  reduces  access 
time,  breaks  the  32-megabyte  DOS 
barrier,  and  surpasses  ROM  BIOS  max¬ 
imum  storage  restrictions.  It  is  fully 
compatible  with  the  IBM  PC/AT  and 
its  standard  Western  Digital  control¬ 
ler  card.  Price  for  the  AT160F  ranges 
from  $5,995  to  $9,995,  depending  on 
storage  capacity,  number  of  drives, 
and  the  inclusion  of  controller.  Read¬ 
er  Service  No.  17. 

Storage  Dimensions 
14127  Capri  Dr. 

Los  Gatos,  CA  95030 
(408)  370-3304 

The Storage Products Division of Fujitsu America has announced a
high-performance, 5¼-inch, optical disk drive with a 600-megabyte
formatted capacity. The M2505A WORM (write once, read many) drive
utilizes a two-laser head design for fast throughput and interlink
tracking to improve data reliability. The drive has a positioning time
of 100 milliseconds, a rotational speed of 1,800 RPM, a transfer rate
of 124K per second, and utilizes the Enhanced Storage Device Interface.
The M2505A with M1080A controller is priced at $3,500, and media for
the 5¼-inch drive is priced at $100. Reader Service No. 18.

Fujitsu  America 
3055  Orchard  Dr. 

San  Jose,  CA  95134 
(408)  946-8777 

Design Software has released DSBACKUP+ and DSRECOVER, hard-disk backup and protection utilities for the IBM PC, PC/AT, PC/XT, and compatibles. DSBACKUP+'s features include five-minute backup of a 10-megabyte hard disk, verification of data while backing up or restoring, data compression for up to 40 percent more data on each disk or cartridge, multiple volumes to allow backup and restore from more than one drive at a time, and the ability to back up only those files that have been changed since the last backup. DSRECOVER's features include undeletes in one step, views of all deleted files, and the ability to reconstruct original formatting when a hard disk is reformatted. DSBACKUP+ is priced at $79.95, and DSRECOVER is priced at $49.95. Reader Service No. 19.

Design  Software 
1275  W.  Roosevelt  Rd. 

West  Chicago,  IL  60185 

(800)  231-3088 

In  IL,  HI,  AK  (312)  231-4540 

SunDog Software Corp. has announced Squish, a 40K resident file-compression program that compresses files on both hard and floppy disks. When the files are used, expansion of data takes place spontaneously in memory rather than on disk, and no advance planning is needed to use the files. All programs that use standard DOS functions for reading and writing can use "squished" files. Squish is also compatible with other resident programs. The program is available for IBM PCs, PC/XTs, PC/ATs, and compatibles using DOS 2.0 and costs $79. Reader Service No. 20.

SunDog Software Corp.

264  Court  St. 

Brooklyn,  NY  11231 
(718)  855-9141 

Fail-Safe from CSSL is a fault-tolerant system that allows IBM PCs and compatibles to continue operating even after catastrophic hard-disk failures. It is the first level of a multitiered system that comes in three configurations. The other configurations are DFT (Disk Fault Tolerant), a software and half-card version, and DFT II, a hardware-only version built around firmware and a controller card. Each configuration contains solutions to the most common problems found in personal computer system failures. Fail-Safe requires DOS 2.0 or later and 24K RAM. The single-unit PC version is available for $395. DFT, which is configured for a network linking up to 15 PCs, is available for $595. Reader Service No. 21.

CSSL  Inc. 

909  Electric  Ave. 

Seal  Beach,  CA  90740 
(213)  493-2471 

Rabbit Industries has introduced the MagicDrive, a quick and powerful hard-disk drive for Macintosh computers that have at least 512K RAM. It is available in 20-, 30-, 65-, and 235-megabyte versions and includes such features as automatic error detection and correction, daisy-chaining, automatic head parking, print spooling, password security, and backup utilities. Prices range from $699 to $3,399, depending on the version. Reader Service No. 22.

Rabbit  Industries 

4505  Spicewood  Springs  Rd.,  Ste.  304 
Austin,  TX  78759 
(512)  343-0781 

Tools 

Flash-Up Windows from The Software Bottling Company is a memory-resident utility for creating, controlling, and managing menus and help windows for DOS, BASIC, Pascal, C, COBOL, FORTRAN, dBASE, and most other software. It lets programs control windows, allows you to define windows, assigns windows to keys, and acts as a window macro enhancer by letting you send commands to running programs. Flash-Up Windows sells for $90. Reader Service No. 23.

The  Software  Bottling  Company 
6600  Long  Island  Expwy. 

Maspeth,  NY  11378 
(718)  458-3700 

A general-purpose set of development tools and C function libraries called Real-Tools is available from Pioneering Controls Technologies. Real-Tools comprises a screen-management system, windowing capabilities, user-defined graphics, and assorted utilities and library functions. It is priced at $99 for binary, $299 for library source, and $399 for complete source. Reader Service No. 24.

Pioneering  Controls  Technologies  Inc. 
510  Bering  Dr.,  Ste.  300 
Houston,  TX  77057 
(713)  266-8649 

Csharp PC Drivers Package is a library of C language support routines for data acquisition and control hardware on the IBM PC, PC/AT, and compatibles from Systems Guild. It includes support for the Metrabyte Dash8 and Dash16 analog I/O boards, the Data Translation DT2801 and DT2808 analog I/O output boards, and the IBM PC DMA controller. Csharp PC Drivers Package can be used with the following C compilers: Microsoft 3.0 and 4.0, Lattice 2.15 and 3.10, and C86 from Computer Innovations. A special version of the product is available for use with Rational Systems' Instant-C incremental compiler. A source license for the Csharp PC Drivers Package costs $195. Reader Service No. 25.

Systems  Guild 
P.O.  Box  1085 
Kendall  Square  Station 
Cambridge,  MA  02142 
(617)  451-8479 

O88 is an optimizer compatible with the C Ware Corp.'s DeSmet C88 compiler. The product, introduced by Key Software Products, can run stand-alone or installed as an automatic part of the compilation process. In minimal 8088 mode, O88 typically eliminates 4 to 13 percent of the instructions and simplifies 7 to 12 percent of those that remain. IBM PC/AT or compatible users can use 80188 mode to eliminate another 5 to 9 percent of the instructions. Programs that make heavy use of an 8087 or 80287 floating-point chip can use 8087 mode to achieve significant performance improvements. O88 is available for $49. Reader Service No. 26.

Key  Software  Products 
440  Ninth  Ave. 

Menlo  Park,  CA  94025 
(415)  364-9847 

Six new programming toolkits for use with Kyan Pascal are available from Kyan Software. The toolkits save programmers time and help them add state-of-the-art graphics and other features to their Kyan Pascal programs. The toolkits run on Apple IIs with Kyan Pascal and are priced from $29.95 to $149.95. Reader Service No. 27.

Kyan  Software 
1850  Union  St.,  #183 
San  Francisco,  CA  94123 
(415)  626-2080 

Greenleaf Software has released DataWindows, a windows and data entry library for C language programmers. DataWindows' features include overlaid windows with screen management, transaction-oriented data entry, and more than 135 functions. Portions of the program's object code can be used in other programs without royalty obligations. DataWindows sells for $225, and the source code is available for an additional $225. Reader Service No. 28.

Greenleaf Software
1411  LeMay  Dr.,  Ste.  101 
Carrollton,  TX  75007 
(214)  631-0811 

Cytek has released three new packages to enhance its Multi-C multitasking library. Multi-Comm is a full-featured communications library that supports high-speed, interrupt-driven data transfers, multiple device types in asynchronous or synchronous mode, and background communications by Multi-C tasks. Multi-Windows is a window development package for creating pop-up windows. Multi-Forms works with Multi-Windows to produce data entry and display screens. Source code is supplied for all hardware-dependent functions, allowing them to be used with any compiler-supported computer, including MS-DOS and ROM-based systems. Multi-Comm and Multi-Forms are priced at $149 each, and Multi-Windows sells for $295. Reader Service No. 29.

Cytek  Inc. 

805  Turnpike  St.,  Unit  202 
North  Andover,  MA  01845 
(617)  687-8086 

The C 386 Compiler and RLL 386 Relocation, Linkage and Library Tools package from Intel Corp. are designed to help speed development of both embedded and on-target 80386 application software. Both packages support all the 80386's features, capabilities, and operating modes. The compiler produces object code that is compatible with Intel's other 80386 languages. The RLL 386 tools package allows programmers to design protected multiuser and multitasking operating systems. The C 386 Compiler and RLL 386 package sell for $900 and $600, respectively. Reader Service No. 30.

Intel  Corp. 

Literature  Dept.  W338 
3065  Bowers  Ave. 

Santa  Clara,  CA  95051 
(408)  987-8080 

Graphics 

Microfield Graphics has introduced T8, a single-board graphics system for the IBM PC/AT, RT/PC, and desktop computers based on the Intel 80386. Based on a dual-processor architecture with 64-bit internal memory interface, the T8 is designed to meet the graphics requirements of high-end CAD/CAM, CAE, and mapping applications. Prices vary according to configuration. Reader Service No. 31.

Microfield Graphics Inc.

8285  S.W.  Nimbus  Ave.,  Ste.  161 
Beaverton,  OR  97005 
(503)  626-9393 

NSI Logic has introduced a half-size, enhanced graphics adapter called SMART (Single Monitor Adapter Technology) EGA. The adapter is compatible with any IBM PC software and operates in all the popular display modes on any standard EGA color monitor. It costs $549. Reader Service No. 32.

NSI  Logic  Inc. 

Cedar  Hill  Business  Park 
257-B  Cedar  Hill  Rd. 

Marlboro,  MA  01752 
(617)  460-0717 

Editor 

UniPress Software has released a Unix-oriented text editor called vi-PLUS that has some features not found in standard vi, such as multiple windows, an interactive interface to Unix, and extensibility through macros. It is available for many computer systems running Unix, Xenix, Ultrix, and other Unix derivatives. The PC version sells for $645. Reader Service No. 33.

UniPress has also released C-macs, a program editor for C programmers that is built on top of the company's Emacs editor. C-macs checks and balances parentheses and braces and permits programmers to define an indenting style. The editor can run make while an edit session is underway and maintains "tags" of all system components. The PC version of C-macs costs $645. Reader Service No. 34.

UniPress Software Inc.

2025  Lincoln  Hwy. 

Edison,  NJ  08817 
(201)  985-8000 


DDJ 




FORUM 


SWAINE'S  FLAMES 


A computer system that encapsulates the knowledge of experts (including heuristic decision processes) for retrieval by the inexpert but naturally intelligent is called an expert system. There are many expert systems in existence throughout the world today. When a British health association recently recommended that each health district in Great Britain avail itself of a (presumably human) expert on AIDS, it discovered that there were fewer AIDS experts than health districts in Britain. Enter the AIDS expert system.

At Warwick University in Great Britain, a team of computer scientists under the direction of Dr. Roger Brittain is scanning more than 100 articles a month on AIDS, building a knowledge base that will help doctors counsel AIDS patients and serve as a research and diagnostic tool. The project builds on a prototype system for rabies patients and is expected to cost some £20,000 and result in a piece of software that can run on a mainframe or microcomputer.

Despite some recent press grumblings about expert systems technology not being the miracle the same press had made it out to be, expert systems are useful tools in just such situations as AIDS diagnosis; in fact, many of the fundamental expert systems techniques were developed in a medical diagnostic framework called MYCIN. (We have received and expect to publish next month a MYCIN-like expert system.) The Warwick project sounds practical and may actually make a contribution to putting the brakes on the AIDS epidemic. That's great, and because I try to maintain a positive mental attitude in this column, I won't suggest that the three major American television networks are making a counterbalancing contribution to the spread of the disease with their refusal to carry condom advertising.


I learned about the AIDS expert system in a news item in the Christmas 1986 issue of a British weekly called New Scientist. This periodical is worth the time of any scientifically curious and naturally intelligent person on either side of the Atlantic. Erstwhile DDJ resident intern Dave Cortesi and I have shared a deep fondness for New Scientist for years (something like our fondness for Jon Bentley's Programming Pearls column in IEEE Software and our respect for at least the intentions of the best science fiction, this last being what Dave is currently writing). New Scientist's snipes at the British government are often (to me) funny and its humor is largely (to me) incomprehensible, but everything else is gold. There is more to think about in six weeks of New Scientist than in twelve months of Scientific American.

A videophone system for the deaf that handles the bandwidth problem by abstracting essential expressive and gestural cues into an animated cartoon of the caller is something I've followed off and on for years; the latest word on this University of Essex project appeared in the October 23 issue of New Scientist. The November 27 issue talked about progress toward standardization on a Unix application interface, which could be the biggest boost for Unix since Bell Labs gave it to the universities. The December 18 issue had brief items on Texas Instruments' trenching technology for 3-D 4-megabit memory chips and on European research into sixth generation computers (see also "Sixth Generation Computers" by Richard Grigonis in the May 1984 DDJ). It's a good magazine.


January brought one of the most enjoyable of the computer trade shows: Macworld Expo. The atmosphere this year was particularly charged, and there were products and announcements significant enough to support the electricity.

Steve Jasik showed off his debugger MacNosy Part Two: The Debugger. His slogan: Beyond Discipline and Into Bondage. The Interface Builder from Expertelligence looked interesting: it lets you put a Mac interface on LISP programs. The Developers' Toolkit panel, with moderator Scott Knaster (manager of the Developer Technical Support Group at Apple), MicroPhone author Dennis Brothers, Jim Friedlander of Apple, and David Intersimone of Borland, talked encouragingly about MacApp and APDA, the Apple Programmer's and Developer's Association.


One East Coast writer out for the Expo was sometime Rolling Stone writer Steven Levy, who some think was the model for the John Travolta character in the less-than-perfect movie Perfect. Levy definitely was one of the founders of the Lunch Bunch, a group of technology writers who ate hamburgers together on two coasts. The Lunch Bunch has served up at least one book and has now spun off a dinner group gourmandizing in Silicon Valley under the label Nerd's Night Out. January's menu called for a discussion of Apple's new machines, but the most knowledgeable sources decided that they couldn't talk about that and canceled. How Apple has changed.


Michael  Swaine 
editor-in-chief 




BUILDING BETTER BRAINS: Artificial Neural Networks

Four Mac PROLOGS Reviewed

Quick Expert Systems

Educating Programmers

Languages:
C Version of Nroff
Object Oriented LISPs
New BASIC Subroutines

APRIL 1987


CONTENTS 


VOLUME  12,  ISSUE  4 


ARTICLES 


Artificial nerves
Expert systems in BASIC
Macintosh PROLOGS
Nroff in C
Forth tools, people tools
LISP extensions
Macintosh 2
Educating programmers
Programming PROLOG logic


AI: An Artificial Neural Network Experiment
by Robert Jay Brown
The greatest challenge in artificial intelligence is the learning problem: how to make programs that can learn from experience. One paradigm that shows promise is the neural network approach, which is both a natural approach to the learning problem and a radical departure in thinking about programming.

AI: MYCIN-Like Expert Systems
by Richard W. Grigonis
How to build a backward-chaining expert system without an "AI language."

REVIEWS 


AI: Four PROLOGS for the Macintosh
by Dan L. Pierson
Dan compares four PROLOG packages for the Mac and describes various dialects and features of the language.


COLUMNS 


C CHEST
by Allen Holub
Allen describes some of the features of his nroff-like text editor, nr. In the Flotsam and Jetsam section, he presents a solution to the problem of managing globals in large C programs.

STRUCTURED PROGRAMMING
by Michael Ham
Michael talks about people skills for programmers and discusses a solution to the common problem of collecting a number from the keyboard.

ARTIFICIAL INTELLIGENCE
by Ernest R. Tello
Ernie takes a look at the features of three object-oriented extensions to Common LISP and proposes a language standard based on these features.


FORUM 


EDITORIAL
by Michael Swaine

RUNNING LIGHT
by Nick Turner

ARCHIVES

LETTERS
by you

VIEWPOINT
by Allen Holub

DDJ ON LINE

SWAINE'S FLAMES
by Michael Swaine


PROGRAMMER'S 

SERVICES 


THE STATE OF BASIC:
A look at efficient subroutines in the new BASICs

OF INTEREST:
New products out there

ADVERTISER INDEX:
Where to find those ads




About  the  Cover 

Neural  network  programming 
takes  the  human  nervous  system 
as  its  model  for  program  structure. 
Better  brains?  No,  but  brainlike 
programs,  yes. 

This  Issue 

Our third annual Artificial Intelligence issue reflects the state of AI today. LISP is now common; PROLOG, which virtually didn't exist on micros in 1985, is everywhere; expert systems are just tools; and the cutting edge of AI work is the learning problem.

Next  Issue 

For much of its short lifespan (about 30 years), computer music has been primarily an ivory-tower pursuit: if you wanted to do serious work in the field of computer-generated music you looked to the university research labs. Today, music algorithms also originate in the homes and workshops of people like you, people who program music applications on computers such as the Amiga, Atari ST, and Macintosh.

Our May feature article includes a brief history of computers in music and focuses on some recent developments in MIDI programming, sampling, transient-oriented synthesis methods, and programs that compose and collaborate on original music. We'll also have an article about how to design a software-based music recorder using MIDI.



Dr.  Dobb's  Journal,  April  1987 



FORUM 


EDITORIAL 


Premise: Apple did several things right with the Mac 2.

First, note that the new generation of 32-bit personal computers will beckon to the computer-literate masses, not just to programmers and power users. As Bob Enyart wrote in the January 27 issue of PC Week, "the less sophisticated a user is, the more dependent on very high-powered I/O and intelligent user interfaces he will be." Performance will sell well.

Second, note that performance in the next generation machines is rapidly going to become bus-bandwidth-bound, particularly if those less-sophisticated users are demanding such memory-hungry features as high-powered I/O. (Color graphics and voice synthesis were among the features that Enyart alluded to.)

Don't take my word for it: "In the next few years, the bus bandwidth to the main memory is going to be the issue which determines performance" (Hal Hardenbergh, The Programmer's Journal, September/October 1986).

And in terms of the 68020 specifically, "memory access time is one of the major obstacles to increasing the performance of the 68020 beyond what it already provides. Although it is possible to increase the clock frequency, the performance benefit, unless accompanied by decreased system access time, will be of diminished value" (Doug MacGregor and Jon Rubinstein, IEEE Micro, December 1985).

The disk drive data-transfer rate will also be critical. It's already possible to see the crippling effect of slow drives on fast processors. The Compaq Deskpro 386, which is just a fast micro after all, works as well as it does partly because it doesn't have a slow disk drive.


Commercial disk drive technology has been a five-megabit-per-second world, and five megs will bore a 386 or 68K unmercifully. Two faster drive interfaces currently getting attention are the SCSI (Small Computer Systems Interface) and the ESDI (Enhanced Small Disk Interface). SCSI will do up to 12 megabits; ESDI will do up to 24 megabits. SCSI is smart and cheap (a very low-cost do-it-yourself SCSI drive for the Mac was presented in the September 1985 issue of DDJ) and is becoming something of a standard.
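
To put those interface rates in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the quoted figures are raw peak rates in megabits per second and ignores seek time, rotational latency, and controller overhead; the function name transfer_seconds and the decimal one-megabyte size are illustrative assumptions, not figures from any drive's specifications.

```python
def transfer_seconds(size_bytes, megabits_per_second):
    """Ideal time to move size_bytes at a raw interface rate (no overhead)."""
    bits = size_bytes * 8
    return bits / (megabits_per_second * 1_000_000)


ONE_MEGABYTE = 1_000_000  # assumption: a decimal megabyte, for round numbers

# Peak rates quoted in the editorial, in megabits per second.
RATES = {
    "conventional drive": 5,
    "SCSI (up to)": 12,
    "ESDI (up to)": 24,
}

for name, rate in RATES.items():
    seconds = transfer_seconds(ONE_MEGABYTE, rate)
    print(f"{name}: {seconds:.2f} s per megabyte")
```

Under these idealized assumptions a megabyte takes about 1.6 seconds at five megabits per second and about a third of a second at 24, which is the kind of gap a 32-bit processor would notice.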

In the light of this, consider Apple's choices of Nubus and an SCSI disk interface: an open-architecture, relatively low-cost, relatively high-bandwidth bus and a relatively low-cost, relatively high-bandwidth disk interface. Power, dare I say, to the people.

Of course, they left the CPU burdened with driving the screen. But I only said they did some things right.



Michael  Swaine 
editor-in-chief 


Dr. Dobb's Journal of Software Tools


Editorial

Editor-in-Chief: Michael Swaine
Managing Editor: Vince Leone
Assistant Editors: Sara Noah Ruddy, Levi Thomas
Technical Editor: Allen Holub
Consulting Editor: Nick Turner
Contributing Editors: Ray Duncan, Michael Ham, Bela Lubkin, Namir Shammas, Ernest R. Tello
Copy Editor: Rhoda Simmons

Production

Production Manager: Bob Wynne
Art Director: Michael Hollister
Assoc. Art Director: Joe Sikoryak
Typesetter: Jean Aring
Technical Illustrator: Frank Pollifrone
Cover Illustrators: Frank Pollifrone, Mark Schroeder

Circulation

Circulation Director: Maureen Kaminski
Newsstand Sales Mgr.: Stephanie Barber
Book Marketing Mgr.: Jane Shaminghouse
Circulation Coordinator: Kathleen Shay

Administration

Finance Director: Kate Wheat
Business Manager: Betty Trickett
Accounts Payable Supv.: Mayda Lopez-Quintana
Accts. Receivable Supv.: Laura Di Lazzaro
Account Managers:
Lisa Boudreau (415) 366-3600
Gary George (404) 897-1923
Michael Wiener (415) 366-3600
Cynthia Zuck (718) 499-9333
Promotions/Srvcs. Mgr.: Anna Kittleson
Advertising Coordinator: Charles Shively

M&T Publishing Inc.

Chairman of the Board: Otmar Weber
Director: C.F. von Quadt
President and Publisher: Laird Foshay
Associate Publisher: Michael Swaine

Dr. Dobb's Journal of Software Tools (USPS 307690) is published monthly by M&T Publishing Inc., 501 Galveston Dr., Redwood City, CA 94063; (415) 366-3600. Second-class postage paid at Redwood City and at additional entry points.

Article  Submissions:  Send  manuscripts  and  disk 
(with  article  and  listings)  to  the  Editor. 

DDJ  on  CompuServe:  Type  GO  DDJ 
Address  Correction  Requested:  Postmaster:  Send 
Form  3579  to  Dr.  Dobb's  Journal,  P.O.  Box  27809,  San 
Diego,  CA  92128.  ISSN  0888-3076 

Customer Service: For subscription problems call: outside CA (800) 321-3333; in CA (619) 485-9623 or 566-6947. For book/software order problems call (415) 366-3600.

Subscriptions:  $29.97  per  1  year;  $56.97  for  2  years. 
Canada  and  Mexico  add  $27  per  year  airmail  or  $10 
per  year  surface.  All  other  countries  add  $27  per  year 
airmail.  Foreign  subscriptions  must  be  prepaid  in  U.S. 
funds  drawn  on  a  U.S.  bank.  For  foreign  subscriptions, 
TELEX:  752-351. 

Foreign  Newsstand  Distributor:  Worldwide  Media 
Service  Inc.,  386  Park  Ave.  South,  New  York,  NY  10016; 
(212)  686-1520  TELEX:  620430  (WUI). 

Entire contents copyright © 1987 by M&T Publishing, Inc., unless otherwise noted on specific articles. All rights reserved.

People's Computer Company

Dr. Dobb's Journal of Software Tools is published by M&T Publishing Inc. under license from People's Computer Company, 2682 Bishop Dr., Suite 107, San Ramon, CA 94583, a nonprofit corporation.




FORUM 


RUNNING  LIGHT 


Like many of you, I'm a habitual hacker. But, also like most of you, I get burned out from time to time. When I joined DDJ last February, I was more than ready to stop programming for a while. I had just finished working on a large Macintosh project for a company that promptly went out of business. This kind of thing happens to freelance programmers once in a while, I fear. You put a lot of sweat, time, and craftsmanship into a project that ends up never seeing the light of day. And, often as not, you end up never seeing the paycheck for all that wasted effort either. These are the events that lead to programmer burnout, and I had it bad. When the opportunity to edit DDJ came along, I decided to take a break from programming and explore the world of magazine editing . . . frankly, I needed a change of pace; at DDJ I could talk and write about programming and get paid for it! An exciting possibility to be sure.

Now, after almost a year of talking about programming and reading about other people's programming projects, I'm itchy to get back into the thick of things again. So, starting next month, I'll be stepping aside as editor and taking on the role of consulting editor. This way I get more time to program and still keep my hand in around here. (Sometimes you can have your cake and eat it too!) You will also hear from me from time to time in the Right to Assemble column.

In the meantime, this is our annual AI issue, with a feature article on a program that can be taught to recognize ASCII characters in graphic form. SILOAM represents the beginnings of the new field I mentioned recently: neural networks. There was a brief flurry of interest in such networks several years back, with the invention of various simple neural circuits that could be simulated relatively easily. But it amounted to little, partly because of limited hardware and partly because the models were not sophisticated enough. Now the field is heating up again, especially as we gain the power to simulate somewhat larger nets. As is typical of new fields, theories abound, and there's a lot of work to be done. We'd like to follow the developments closely; if you're working in the field, why not give us a call or send a letter?

In upcoming issues we'll be resurrecting the Professional Programmer department in our Programmer's Services section. The department will deal with various topics pertinent to those of you who are making your living as programmers, topics such as software copyright laws, products for producing firmware, and how to get your software published. We'll also have interviews with some well-known programmers and hear about their projects and their preferred tools for software development. If you have suggestions about what you'd like to see in the Pro Pro department, send a note to our assistant editor Levi Thomas. You can also reach Levi on CompuServe (ID number 76703,4060).

Who knows, we may even write something about how to avoid programmer burnout, if we can find someone who knows the trick to it. . . .

It’s  been  a  heck  of  a  lot  of  fun,  folks. 
Stay  tuned! 

Nick  Turner 
editor 


ARCHIVES 


The  Expanding  Mag 

"Once  again,  we've  published  our  largest 
issue  ever!  Our  expansion  is  exciting;  but  at 
the  same  time,  we  are  conscious  that  bigger 
is not always better. In this case, however,
our  increase  in  size  will  allow  us  options  we 
haven’t  had  in  the  past.  With  more  editorial 
pages,  we  have  the  opportunity  to  present 
more  variety  each  month,  even  when  we 
run  larger  pieces,  as  we  have  in  this  issue 
and  in  the  last.” — editorial,  Reynold  Wig¬ 
gins,  DDJ,  August  1983. 

Predictions  in  AI 

"The  biggest  revolution  will  come  in 
terms  of  software.  ‘Actor-based’  languages 
such  as  Smalltalk  will  allow  programs  to 
alter themselves to the user's wishes. If various
present-day languages were embedded
in such an actor language, then strange
new  customized  languages  for  specific 
purposes  could  be  generated  on  demand, 
merely  by  the  user  having  a  'conversation’ 
with the larger host AI/actor program." —
"5th Generation Computers," Richard Grigonis,
DDJ, December 1982.

"So  what  have  the  Japanese  selected  as 
the  lingua  franca  of  their  AI  research?  Not 
lisp,  but  PROLOG,  an  obscure  language  de¬ 
veloped  by  the  French  and  ‘polished’  by 
the  British.  PROLOG  is  the  fundamental  er¬ 
ror  in  the  otherwise  sound  Japanese  Fifth 
Generation  Project." — "And  Still  More  Fifth 
Generation  Computers, "  Richard  Grigonis, 
DDJ,  August  1983. 

Ten  Years  Ago  in  DDJ 

"We  (the  Professional  Users  Group)  are 
not  particularly  interested  in  organization¬ 
al  clap-trap  but  rather  in  dissemination  of 
both  hard  data  and  innovative  fantasy,  re¬ 
garding  technical  features,  of  small  com¬ 
puter  systems,  both  hardware  and  soft¬ 
ware.  As  an  example  of  innovative  fantasy, 
let  me  tell  you  about  Dick  Maus’s  project:  to 
regard  the  game  of  'life’  as  a  subset  of  what 
Dick  calls  'Superlife.'  This  algorithm  will 
treat  growth  and  decline  phenomena  in 
cell  arrays.  He  sees  applications  for  his  sim¬ 
ulation  studies  in  such  diverse  fields  as  auto 
traffic  flow,  interaction  of  arrays  of  busi¬ 
nesses  in  a  given  market,  wave  phenome¬ 
na  analysis,  and  nerve  impulse  transmis¬ 
sion  in  neural  tissue,  just  for  starters.” — 
William  J.  Schenker,  M.D.,  letter  to  DDJ,  April 
1977. 

". . . and a more-than-usually confused
potpourri  of  interesting  tidbits.” — last  en¬ 
try  in  table  of  contents,  DDJ,  April  1977. 



Dr.  Dobb's  Journal,  April  1987 


FORUM 


LETTERS 


Software  Gap 

Dear  DDJ, 

I'm  responding  to  the  two  “name 
withheld" letters in the January 1986
issue. 

To  the  person  whose  managers 
have  business  backgrounds  and  don't 
understand  software:  I'd  say  turn¬ 
about  is  fair  play.  You  know  software 
and  have  strongly  held  standards — 
witness  your  letter  and  your  DDJ  sub¬ 
scription.  You  were  interviewed  for 
your  current  job  and  were  deemed 
acceptable.  In  an  interview,  you  have 
rights,  too.  Interview  the  person  in¬ 
terviewing  you.  No  need  to  be  overly 
aggressive  about  it — there's  always  a 
line  to  walk — but  see  if  you  have 
common  ground.  Without 
that  common  ground,  you're 
going  to  be  unhappy.  A  good 
manager  will  recognize  this, 
and  you  don’t  want  to  work 
for  a  bad  one. 

To  the  person  concerned 
about  sloppy  code:  You  men¬ 
tioned  “it  is  sometimes  diffi¬ 
cult  to  determine  when  to 
stop  coding  and  start  stub¬ 
bing.”  True  enough,  unless 
you  have  a  plan.  Let  me  sug¬ 
gest  one.  I  like  to  add  code  to  a 
running  program.  It's  more 
enjoyable  because  the  feed¬ 
back  is  right  there — it  shows 
on  the  screen,  the  device  runs, 
whatever.  Debugging  is  sim¬ 
pler — you  can  see.  Feedback 
is  the  crucial  element  in 
debugging. 

So  my  suggestion:  Write  the 
main  routine  and  enough  I/O 
to  get  some  feedback.  Stub  ev¬ 
erything  else.  Get  that  run¬ 
ning.  So  it’s  too  easy/simple — 


who  cares?  You  have  a  place  to  stand, 
and  the  CPU  is  your  lever.  Now  add 
code  that  makes  the  program  do 
something  simple.  Then  something 
else.  Keep  that  feedback  there.  Other¬ 
wise  the  job  can  become  work. 

I  think  it's  neat  that  I  can  go  some¬ 
where  five  days  a  week  to  have  fun 
and  actually  get  paid  for  it. 

Joe  Osgood 

14930  Hartland  St. 

Van  Nuys,  CA  91405 

More  Searching  for  a  Sine 

Dear  DDJ, 

Richard  A.  Campbell’s  use  of  a  Tay¬ 
lor's  series  approximation  for  the 
sine  [“In  Search  of  a  Sine,”  December 
1986]  is  the  wrong  approach  to  the 
problem.  The  question  that  should  be 
asked  is  “What  is  the  best  finite  poly¬ 
nomial  approximation  to  a  given 
function?”  The  answers  have  long 
been  known  and  are  to  be  found  in 
the area of Chebyshev polynomials.
Mainframe  programmers  have  done 
a  considerable  amount  of  research 
on  this  topic.  Approximations  for  Dig¬ 
ital  Computers  by  Cecil  H.  Hastings  et 
al.  (Princeton  University  Press,  1955) 
lists  the  sine  function  as  well  as  a 
wide  variety  of  common  and  esoteric 


functions  as  part  of  a  research  study 
by  the  RAND  Corp. 

If you consider a polynomial approximation
of three terms for the
sine, the values of C1 = 1.5706268, C2
= -0.6432292, and C3 = 0.0727102
are derived from approximating theory,
whereas Campbell's Taylor's series
coefficients are C1 = 1.570795, C2
= -0.645921, and C3 = 0.07948765.
For  a  polynomial  approximation  of 
four  terms,  the  coefficients  are  the 
same  for  the  approximating  theory 
and  the  finite  Taylor’s  series 
expansion. 
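
The two three-term coefficient sets quoted above can be compared numerically. The following is a minimal sketch, not taken from the letter or from Campbell's article. It assumes the polynomial has the form C1*x + C2*x^3 + C3*x^5 and approximates sin(pi*x/2) for x between -1 and 1. That form is an assumption, since the letter gives only the coefficients. The names `APPROX_THEORY`, `TAYLOR_AS_QUOTED`, `sine_poly`, and `max_abs_error` are invented for the illustration.

```python
import math

# Coefficients exactly as quoted in the letter.
APPROX_THEORY = (1.5706268, -0.6432292, 0.0727102)
TAYLOR_AS_QUOTED = (1.570795, -0.645921, 0.07948765)


def sine_poly(x, coeffs):
    """Evaluate C1*x + C2*x**3 + C3*x**5 (assumed form) by Horner's rule."""
    c1, c2, c3 = coeffs
    x2 = x * x
    return x * (c1 + x2 * (c2 + x2 * c3))


def max_abs_error(coeffs, samples=1001):
    """Largest |polynomial - sin(pi*x/2)| over evenly spaced x in [-1, 1]."""
    worst = 0.0
    for i in range(samples):
        x = -1.0 + 2.0 * i / (samples - 1)
        err = abs(sine_poly(x, coeffs) - math.sin(math.pi * x / 2))
        worst = max(worst, err)
    return worst


if __name__ == "__main__":
    print("approximation-theory coefficients:", max_abs_error(APPROX_THEORY))
    print("Taylor coefficients as quoted:    ", max_abs_error(TAYLOR_AS_QUOTED))
```

Under that assumed form, the first set spreads its error across the whole interval, while the truncated series is very accurate near zero and worst at the ends of the range.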

Hopefully  this  will  help  Campbell 
and  others  in  their  investigations  in 
this  area.  Too  many  implementations 
of  computer  languages  ignore  the 
vast  amount  of  work  that  has  gone 
before.  The  implementation  of  the 
sine  and  other  well-known  functions 
points  this  out. 

I  should  also  point  out  that  Radio 
Shack  offers  a  BASIC  program  that 
yields  15-digit  accuracy  on  a  4K  ma¬ 
chine  for  a  wide  variety  of  functions 
(Level  II  Double-Precision  Subroutine 
Program,  catalog  number  26-1704). 

Douglas  Ingalls 

Ithaca  College 

Ithaca,  New  York  14850 


Dear  DDJ, 

In  December  1986  you  ran  an 
article  in  which  Richard 
Campbell  recounted  his  diffi¬ 
culty  in  locating  a  suitable  ap¬ 
proximation  for  the  sine 
function. 

Maybe  I  can  make  his  day.  I 
faced  a  similar  problem  a  few 
years  ago,  and  in  my  searches 
I  came  across  a  book  that  con¬ 
sists  almost  entirely  of  ap¬ 
proximations — polynomial 
approximations,  recursive  ap¬ 
proximations  (!),  infinite  se¬ 
ries,  and  infinite  products — 
not  to  mention  all  the  useful 
identities,  integrals,  and  de¬ 
rivatives  thereof.  Would  you 
believe  a  trig  table  with  values 
for  the  sine  and  cosine  func¬ 
tions  to  23  significant  places? 
(Great  for  checking  up  on  that 
library  approximation  you're 
using!)  And  more. 

[Cartoon caption: "Okay — I understand that it can see the difference
between a cat and a dog — but does it know if their
intentions are hostile?"]

I'm talking about the Handbook of Mathematical Functions,
edited by Milton Abramowitz and Irene
A. Stegun, New York: Dover, 1965,
originally  published  by  the  National 
Bureau  of  Standards  of  all  people.  Af¬ 
ter  looking  at  dozens  and  dozens  of 
mathematics  and  computer  hand¬ 
books,  searching  in  vain  for  an  ap¬ 
proximation  of  the  gamma  function, 
I  finally  came  across  this  gem  and 
had  a  rapturous  religious  experience 
on  the  spot.  Now  my  copy  resides 
next  to  Knuth,  K  &  R,  and  other  sa¬ 
cred  writings  of  computer  lore.  I 
don't  know  how  many  times  it  has 
saved  my  derriere. 

Your  local  college  library  almost 
certainly  has  a  copy — go  take  a  look, 
and  see  if  it  isn't  everything  I  say  it  is. 

William  Zeitler 

9  Pajaro  Way 

Salinas,  CA  93901 

Feedback 

Dear  DDJ, 

You  asked  for  input  in  the  January 
issue,  so  here  it  is.  First  the  complaint. 
At  the  bottom  of  page  16,  you  state: 
"you  can  move  up  from  the  68008, 
which  is  roughly  comparable  to  a 
fast  Z80  in  power.”  That  is  a  gross 
misrepresentation — the  68008  is 
about  four  times  faster  than  the  6809, 
and  the  6809  runs  rings  around  the 
Z80  (see  the  benchmarks  published  in 
Interface  Age,  April  1983).  I  agree  that 
there  have  been  several  poor  imple¬ 
mentations  of  the  68008  (Hazelwood) 
that  ran  very  slowly;  however,  a 
good  hardware  design  (Peripheral 
Technology)  will  really  fly. 

In  some  instances  the  68008  is  fast¬ 
er  than  the  68000.  A  hard-disk  driver 
where  the  sector  buffer  must  be  read 
and  placed  in  memory  is  one  such  in¬ 
stance  (see  Example  1,  right). 

In  the  Right  to  Assemble  column, 
the  numbers  project  is  super,  and  I 
am  sure  it  will  fill  a  large  void  in  com¬ 
puter  science  literature.  I  wrote  a  56- 
digit  calculator  for  the  6809  in  assem¬ 
bly  language,  and  I  could  not  find 
much  help  in  any  magazines  or 
books.  I  had  the  same  decision  to 
make  on  the  transcendental  func¬ 
tions.  The  precision  on  the  polynomi¬ 
al  approximation  was  at  the  most  10 
digits  and  I  needed  56,  so  I  had  to  use 
the  series  expansion.  The  arctan  se¬ 


ries  would  not  converge  around  1.5, 
so  I  devised  a  simple  solution — I  ap¬ 
plied  the  half-angle  formula  three 
times  and  then  multiplied  the  results 
by  8.  This  was  so  successful  that  I 
needed  only  24  elements  in  the  series 
for  56  digits  of  accuracy. 
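
The argument-reduction idea described above can be sketched with Python's standard `decimal` module. This is an illustration, not the CALC56 code, which is not shown in the letter. It assumes the "half-angle formula" is tan(a/2) = tan(a) / (1 + sqrt(1 + tan(a)^2)), and the function name, guard digits, and stopping rule are choices made here. The number of terms it uses therefore need not match the figure quoted in the letter.

```python
from decimal import Decimal, getcontext


def arctan(x, digits=56, halvings=3):
    """Return (arctan(x), terms_used), aiming at roughly `digits` places."""
    # Note: this changes the global decimal context (a simplification).
    getcontext().prec = digits + 10
    t = Decimal(x)
    # Half-angle step, applied `halvings` times to shrink the argument.
    for _ in range(halvings):
        t = t / (1 + (1 + t * t).sqrt())
    # Alternating series t - t^3/3 + t^5/5 - ... on the reduced argument.
    cutoff = Decimal(10) ** -(digits + 5)
    total = Decimal(0)
    power = t                      # holds t**(2n+1)
    t_squared = t * t
    n = 0
    while abs(power) > cutoff:
        term = power / (2 * n + 1)
        total += term if n % 2 == 0 else -term
        power *= t_squared
        n += 1
    # Undo the halvings: three halvings means multiplying by 8.
    return total * (2 ** halvings), n


if __name__ == "__main__":
    value, terms = arctan(1)
    print("pi is about", 4 * value)
    print("series terms used:", terms)
```

The reduction matters because the series converges quickly only when the argument is small. Each halving costs one square root and one division.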

If  I  may  brag  a  little,  my  CALC56 
program  calculates  Pi  (4*ATN(1))  in  14 
seconds,  accurate  to  55  places.  I  need¬ 
ed  four  expansion  series  (sine,  expo¬ 
nent  of  e,  natural  log,  and  arctan)  in 
my  calculator;  all  other  functions 
could  be  derived  from  them.  The 
source  code  is  in  6809  assembly  lan¬ 
guage  and  could  easily  be  transport¬ 
ed  to  the  68000.  If  you  want  to  see  this 
code,  I  would  be  willing  to  share.  It 
would  also  be  very  interesting  to  see 


if  you  have  better  algorithms  than  I 
have.  If  you  need  any  constants  cal¬ 
culated  to  high  accuracy,  just  let  me 
know. 

Dan  Farnsworth 

3646  Lantana  Rd. 

Lantana,  FL  33462-2299 

DDJ 


In 68000 code you would do this (A1 points to the destination):

        MOVE.W  #255,D1        counter             8
        LEA     DCA,A0         DC addr            12
LOOP    MOVE.B  (A0),(A1)+     move byte          12
        DBF     D1,LOOP        loop               14     6656
        RTS                                       16
                                                         6692

In the 68008 you make the hardware so that the disk controller
addresses occupy 4 bytes consecutively:

        MOVEQ.L #63,D1         counter             8
        LEA     DCA,A0         DC addr            24
LOOP    MOVE.L  (A0),(A1)+     move long word     40
        DBF     D1,LOOP        loop               26     4158
        RTS                                       32
                                                         4222

If memory is available you can do this:

        LEA     DCA,A0         DC addr            24
        MOVE.L  (A0),(A1)+     move long          40
        MOVE.L  (A0),(A1)+     move long          40
          .
          .
        MOVE.L  (A0),(A1)+     64 2-byte          40
                               instructions
                                                         2616

Example  1:  Sometimes  the  68008  is  faster  than  the  68000. 




FORUM 


VIEWPOINT 


Education  and 
Programming 

There  is  a  popular  misconception 
(popular  among  educators  and  lay¬ 
men  alike)  that  programming  and 
mathematics  are  somehow  related. 
As  a  consequence  an  undue  amount 
of  math  is  forced  down  the  gullets  of 
unwilling  undergraduates  in  every 
computer  science  program  in  the 
country.  I  feel  that  not  only  is  this 
needless  torture  but  it  also  can  be 
counterproductive. 

To  my  mind,  a  good,  traditional, 
liberal-arts  education — one  that  in¬ 
cludes  a  little  math  but  that  also  in¬ 
cludes  things  such  as  English  compo¬ 
sition,  history,  and  Latin — is  better 
preparation  for  a  programming  ca¬ 
reer  than  a  math-  or  science-oriented 
education.  Many  of  the  worst  pro¬ 
grams  I've  seen  have  been  written  by 
mathematicians  and  physicists  and 
some  of  the  best  have  been  written 
by  people  with  degrees  in  English 
and  Russian  Literature.  How  is  this 
possible  if  programming  is  as 
grounded  in  math  as  some  would 


by  Allen  Holub 


have  us  believe?  More  to  the  point, 
many  potentially  excellent  program¬ 
mers  are  forced  out  of  computer  sci¬ 
ence  programs  every  year  because 
they  don’t  have  an  aptitude  for  math¬ 
ematics.  The  other  side  of  the  prob¬ 
lem  is  that  many  programmers  who 
do  graduate  are  not  fully  in  touch 
with  the  world  around  them  because 
of  the  unwarranted  dominance  of 
technical  training  to  the  detriment  of 
nontechnical  but  equally  valuable 
kinds  of  knowledge. 

Before  continuing,  I  need  to  clarify 


my  terms.  There’s  a  difference  be¬ 
tween  programming  and  computer 
science,  between  the  writing  of  pro¬ 
grams  and  the  study  of  them.  Com¬ 
puter  science  has  a  great  deal  to  do 
with  mathematics;  programming 
does  not.  I’m  also  differentiating  be¬ 
tween  engineering  and  program¬ 
ming.  You  obviously  need  to  under¬ 
stand  mathematics  to  write  a 
program  that  does  mathematical 
things. 

On  the  other  hand,  the  creation  of  a 
computer  program,  regardless  of  the 
function  of  that  program,  is  a  differ¬ 
ent  process  from  designing  the  math- 
related  algorithms  that  the  program 
implements.  In  any  event,  most  pro¬ 
grams  don't  implement  mathematical 
algorithms,  and  if  they  do,  the  algo¬ 
rithms  are  often  designed  by  someone 
other  than  the  programmer.  How 
much  calculus  is  there  in  a  word  pro¬ 
cessor  or  a  database  application? 

The  skills  you  need  to  solve  a  math 
problem  are  virtually  useless  when  it 
comes  to  writing  a  program.  At  the 
undergraduate  level,  at  least,  you 
solve  a  math  problem  by  repetitively 
applying  a  set  of  memorized  rules  to 
an  equation  until  one  or  more  of 
those  rules  does  something  useful.  If 
you  tried  to  write  a  program  this 
way,  you'd  never  get  beyond  the 
main ( )  module.  On  the  other  hand, 
the  process  of  writing  an  essay  is  al¬ 
most  identical  to  that  of  writing  a 
program.  You  start  both  with  an  out¬ 
line  of  some  sort  (what  is  a  structure 
chart  or  a  Warnier-Orr  diagram  if  not 
an  outline?),  you  have  to  organize 
both  in  the  same  way  (a  topic  para¬ 
graph  is  a  main (  )  subroutine,  the 
routines  that  main ( )  calls  are  sections 
in  the  essay,  and  so  forth),  and  the 
stepwise  refinement  of  a  program 
mirrors  the  development  of  an  idea 
within  an  essay. 

The  study  of  a  language,  especially 
a  classical  language  such  as  Latin,  is 
also  useful  to  programmers.  It's  not 
the  Latin  itself  that's  important — 
though  a  knowledge  of  Latin  certain¬ 
ly  does  help  improve  your  English — 
but  rather  it's  the  tools  that  you  need 
to  learn  Latin  that  are  important.  You 
are  really  teaching  yourself  how  to 
understand  a  large  and  complex  sys¬ 
tem,  and  once  you've  understood 


how  to  learn  the  language,  you  can 
apply  these  same  techniques  to  any 
complex  system,  such  as  a  computer 
program. 

These  problems  can  be  solved,  to 
some  extent,  by  a  restructuring  of  the 
educational  process.  I'm  arguing 
here,  not  for  the  restructuring  of 
computer  science  programs  as  such, 
but  rather  for  the  creation  of  a  new 
academic  discipline  entirely — one 
that's  concerned  primarily  with  pro¬ 
gramming.  A  little  math  is  obviously 
required — basic  algebra,  a  little  Bool¬ 
ean  and  linear  algebra,  some  set  the¬ 
ory,  and  (a  very  little)  calculus  are  all 
that's  needed,  however.  These  topics 
could  be  covered  quite  adequately  in 
a  well-designed  one-semester  course, 
and  as  I  said  earlier,  a  little  math  is 
just  as  much  a  part  of  a  liberal-arts 
education  as  is  English.  If  students 
were  required  to  take  a  full  year  of 
English  composition  rather  than  cal¬ 
culus,  not  only  would  we  have  bet¬ 
ter-written  programs,  but  we'd  have 
readable  documentation  for  them  as 
well.  The  study  of  computer  science 
as  such  could  then  be  moved  to  the 
graduate  level,  as  is  fitting  for  an  es¬ 
sentially  academic  discipline.  A  good 
parallel  is  the  way  that  you'd  earn  a 
medical  degree — a  bachelor's  degree 
in  a  related  but  more  general  field  is 
required  before  you  can  enter  a  grad¬ 
uate-level  M.D.  program.  Similarly,  a 
degree  in  programming  would  be  a 
prerequisite  for  a  degree  in  comput¬ 
er  science. 

In  the  long  run  this  sort  of  restruc¬ 
turing  would  give  us  both  better  pro¬ 
grammers  and  better  computer 
scientists. 

DDJ 


ARTICLES 


An  Artificial 
Neural  Network 
Experiment 


by  Robert  Jay  Brown 


This  article  describes 
an  experimental 
computer  program 
that  could  serve  as  one  com¬ 
ponent  of  a  computer  vision 
system.  The  program  simu¬ 
lates  an  artificial  neural  net¬ 
work  and  acts  as  a  learning 
machine.  It's  implemented 
as  a  committee  network  of  threshold  logic  units  that  func¬ 
tions  as  a  trainable  pattern  recognizer  for  visual  images 
composed  of  a  rectangular  array  of  pixels.  Each  pixel  con¬ 
tains  a  single  number  representing  the  gray-scale  value  of 
that  region  of  the  field  of  view. 

A  robot  vision  system  consists  of  many  components: 

•  image  capture,  in  which  the  image  is  converted  from 
light  or  other  radiation  to  an  analog  electrical  signal 
•  image  digitization  and  signal  processing,  in  which  the 
image  is  divided  into  a  set  of  picture  elements,  or  pixels, 
each  of  which  is  assigned  a  value  representing  the  bright¬ 
ness  and/or  color  of  that  part  of  the  image 
•  region,  edge,  and  boundary  detection,  in  which  the  dis¬ 
tinct  visual  elements  of  the  image  are  separated 
•  image  scaling  and  alignment,  in  which  the  digitized  im¬ 
age  is  rotated,  translated,  and  otherwise  transformed  to 
place  it  into  some  standard  position  and  size 
•  image  recognition,  in  which  the  preprocessed  image  is 
submitted  to  a  pattern  recognition  algorithm  to  allow  the 
machine  to  recognize  what  the  image  represents. 

This  article  is  about  image  recognition. 


Robert  Jay  Brown,  5225 N.  W.  Ct.,  Margate,  FL  33063.  Robert 
is  a  graduate  student  at  Florida  Atlantic  University.  He  is 
currently  a  consultant  involved  in  designing  electronic  sur¬ 
veillance  intercept  and  cryptography  systems. 


Neurophysiology 

SILOAM  (simple  image  learn¬ 
ing  on  adaptive  machinery) 
operates  by  modeling  one 
possible  organization  of  the 
actual  neural  structure  of 
the  brain  on  a  computer. 
Consider  a  single  nerve  cell 
(see  Figure  1,  page  18).  Dur¬ 
ing  World  War  II,  a  physiologist  and  a  mathematician 
worked  together  to  try  to  create  a  model  of  the  brain 
based  on  anatomical  and  physiological  experimental 
findings.  The  team  of  McCulloch  and  Pitts  started  by  mod¬ 
eling  the  single  nerve  cell,  or  neuron.  Their  model  is 
known  today  as  the  McCulloch-Pitts  neuron  and  is  depict¬ 
ed  in  Figure  2,  page  18. 

Microscopic  studies  reveal  that  the  nerve  cell  is  com¬ 
posed  of  a  cell  body,  or  cyton;  many  input  fibers,  or  den¬ 
drites;  and  a  single  output  fiber,  or  axon,  which  branches 
to  send  signals  to  the  dendrites  of  other  nerve  cells.  Physio¬ 
logical  studies  in  which  scientists  selectively  stimulate  sets 
of  dendrites  while  observing  the  axon  yield  the  following 
result:  some  dendrites  tend  to  cause  the  neuron  to  produce 
an  output  (the  neuron  “fires”),  and  other  dendrites  tend  to 
prevent the neuron from firing. The dendrites that encourage
firing are called "excitatory" inputs, and those that
discourage firing are called "inhibitory" inputs.

The cell either fires or does not fire. No particular
inhibitory input is tied to canceling any particular excitatory
input. Each input has a "weighting factor" associated with
it: excitatory inputs have a positive weight, and inhibitory
inputs  have  a  negative  weight.  The  neuron  combines  all 
these  inputs  (probably  within  the  cyton)  to  produce  the 
output.  Thus  it  appears  that  the  neuron  adds  up  all  the 
weights  associated  with  inputs  that  are  stimulated,  and  if 
the  result  exceeds  a  certain  threshold,  then  the  neuron 
fires. 
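
The firing rule just described can be written in a few lines. This is a minimal sketch, not code from SILOAM. The function name and the example weights and threshold are made up for illustration.

```python
def neuron_fires(weights, stimulated, threshold):
    """True when the summed weights of the stimulated inputs exceed the threshold."""
    total = sum(w for w, on in zip(weights, stimulated) if on)
    return total > threshold


# Two excitatory inputs (+1 each) and one inhibitory input (-2).
# These values are invented for the example.
WEIGHTS = [1, 1, -2]
THRESHOLD = 1

if __name__ == "__main__":
    # Both excitatory inputs stimulated: sum is 2, above the threshold.
    print(neuron_fires(WEIGHTS, [True, True, False], THRESHOLD))
    # Inhibitory input stimulated as well: sum drops to 0.
    print(neuron_fires(WEIGHTS, [True, True, True], THRESHOLD))
```

Note that the inhibitory input cancels no specific excitatory input. It only lowers the total.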




Threshold  Logic 

This  model  of  a  neuron  (Figure  3,  page  18)  is  called  a 
threshold  logic  unit  (TLU).  It  is  not  difficult  to  construct  a 
TLU  out  of  readily  available  hardware  components:  a 
Schmitt trigger connected in series after an operational
amplifier  wired  to  operate  as  an  analog  summer  will  suf¬ 
fice  (see  Figure  4,  page  19).  The  weights  are  the  gains  de¬ 
termined  by  the  proportionality  factors  of  the  scaling  re¬ 
sistors  at  the  plus  or  minus  inputs  to  the  op-amp. 
Adjustable  weights  can  be  realized  by  using  potentiome¬ 
ters  with  the  wiper  connected  to  the  stimulus  input,  with 
one  end  of  the  resistance  element  connected  to  the  plus 
input  of  the  op-amp  and  the  other  end  connected  to  the 
minus  input.  I  chose  to  simulate  the  operation  of  a  TLU 
with  software.  It's  cheaper,  and  as  you  shall  see,  it  allows 
the  program  to  adjust  its  own  weighting  factors,  which  is 
necessary  if  the  TLU  is  to  be  trained  automatically  rather 
than "tweaked" into alignment by a skilled technician.

The  TLU  is  the  physical  embodiment  of  a  linear  equa¬ 
tion  formed  by  setting  an  inner  product,  or  metric  (mea¬ 
sure  of  distance),  to  0.  The  solution  set  for  the  equation 
thus  formed  defines  a  cleaving  plane  that  acts  as  the 
boundary  between  two  half-spaces  in  the  weight  space  of 
all  possible  weight  sets  that  could  make  up  the  weights 
for  the  input  of  a  TLU.  The  cleaving  planes,  being  the 
solution  to  a  homogeneous  linear  equation,  must  pass 
through  the  origin.  When  the  equality  is  changed  to  an 
inequality,  the  sign  of  the  inequality  indicates  on  which 
side  of  the  cleaving  plane  the  weight  point  lies. 

SILOAM  contains  many  TLUs.  Each  has  as  many  inputs 
as  there  are  pixels  in  the  graphical  array  that  is  being 
examined  by  the  network.  In  addition,  each  TLU  has  an 
additional  input  that  is  always  excited.  This  extra  input 
provides a "reference point," or "bias" (analogous to the
DC  component  of  a  Fourier  series),  to  set  the  threshold 
that  determines  the  firing  point  of  the  TLU.  This  input  is 
necessary  to  homogenize  the  linear  equation  formed  by 


the  inner  product  used  to  compute  the  metric.  Without  it, 
an input of all 0s would be degenerate and the training
algorithm  would  fail  to  converge. 
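
The role of the always-excited input can be illustrated as follows. This is a sketch under stated assumptions, not SILOAM's code. The helper names are hypothetical, the extra input is given the constant value 1, and a result of exactly zero is counted as firing.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def augment(pattern):
    """Append the always-excited input (constant 1) to a pixel vector."""
    return list(pattern) + [1]


def fires_with_threshold(weights, pattern, threshold):
    """Explicit-threshold form: compare the weighted sum with the threshold."""
    return dot(weights, pattern) >= threshold


def fires_homogeneous(aug_weights, pattern):
    """Homogeneous form: the threshold is folded in as the last weight."""
    return dot(aug_weights, augment(pattern)) >= 0


if __name__ == "__main__":
    weights, threshold = [2, -1, 3], 2
    aug_weights = weights + [-threshold]
    for pattern in ([0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]):
        print(pattern,
              fires_with_threshold(weights, pattern, threshold),
              fires_homogeneous(aug_weights, pattern))
```

Without the extra input, an all-zero pattern gives a dot product of zero whatever the weights are, so no adjustment of the weights could change the answer. With it, the last weight alone decides that case.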

Pattern-Space  Geometry 

The  task  of  computing  the  output  of  a  TLU  is  performed 
by  the  vector  operation  of  taking  the  “dot  product”  of  the 
stimulus  vector  and  the  weight  vector  associated  with  the 
TLU.  The  sign  of  the  result  determines  the  output:  a  posi¬ 
tive  sign  means  the  nerve  has  fired;  a  negative  sign  means 
it  hasn't.  Each  weight  vector  can  be  interpreted  as  a  point 
in  hyperspace,  or  weight  space.  If  you  look  at  the  dot 
product  formed  between  the  input  pattern  vector  and 
the  TLU  weight  vector,  you  see  that,  if  the  elements  of  the 
pattern  vector  are  viewed  as  constants  and  the  weight 
vector  elements  are  viewed  as  variables,  then  if  the  dot 
product  is  set  equal  to  0,  this  dot  product  forms  a  linear 
homogeneous  equation.  The  situation  in  3-space  is: 

ax + by + cz = 0

Here a, b, and c are the components of the weight vector, and x,
y,  and  z  are  the  components  of  the  pattern  to  be  recog¬ 
nized.  Remember  that  z  is  set  to  a  constant  value  of  1.  The 
solution  set  defines  a  plane  that  passes  through  the  origin 
(see  Figure  5,  page  19).  The  plane  forms  a  pattern  surface 
that  cleaves  weight  space  into  two  half-spaces:  this  places 
one  set  of  possible  TLU  weights  on  one  side  of  the  pattern 
plane  and  one  set  on  the  other  side.  Any  given  weight 
point  will  be  on  either  the  negative  or  the  positive  side  of 
the  pattern  plane.  (The  two's  complement  arithmetic  of 
the  computer  makes  it  convenient  to  consider  a  point  lying 
on  the  plane  itself  to  be  on  the  positive  side  of  the  plane.) 

The  absolute  value  of  the  dot  product  is  proportional  to 
the  perpendicular  distance  from  the  weight  point  to  the 
pattern  plane.  Thus  a  given  pattern  hyperplane  divides 
weight hyperspace into two half-hyperspaces. The dot




product  of  the  pattern  vector  with  the  weight  point  re¬ 
turns  the  distance  from  the  weight  point  to  the  pattern 
plane.  The  weight  points  are  defined  by  the  weights  for 
each  of  the  inputs  to  the  TLU;  the  coordinates  of  the 
weight  point  are  just  the  values  of  the  weights  for  the  TLU. 

By  this  convention,  you  can  visualize  each  TLU  as  being 
represented  by  a  point  in  this  weight  space.  The  dot  prod¬ 
uct  of  the  weight  point  with  the  augmented  pattern  vec¬ 
tor  is  the  perpendicular  distance  from  the  weight  point  to 
the  pattern  plane.  If  this  quantity  is  positive,  then  the  TLU 
"recognizes”  the  pattern;  if  it  is  negative,  then  the  TLU 
does  not  recognize  the  pattern. 
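
The geometric reading above can be checked with a small sketch. It is not from SILOAM, and the function names are hypothetical. It assumes the pattern is augmented with a constant 1. Dividing the dot product by the length of the augmented pattern vector turns "proportional to the distance" into the actual perpendicular distance.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def signed_distance(weight_point, pattern):
    """Signed perpendicular distance from the weight point to the pattern plane."""
    aug = list(pattern) + [1]          # augmented pattern vector
    return dot(weight_point, aug) / math.sqrt(dot(aug, aug))


def recognizes(weight_point, pattern):
    """A point lying on the plane itself is counted as being on the positive side."""
    return signed_distance(weight_point, pattern) >= 0


if __name__ == "__main__":
    pattern = [2, 2]                   # augmented to [2, 2, 1], length 3
    for weight_point in ([3, 0, 0], [0, 0, -3], [1, -1, 0]):
        print(weight_point,
              signed_distance(weight_point, pattern),
              recognizes(weight_point, pattern))
```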

The  TLU  is  a  pattern  dichotomizer:  its  corresponding 
weight  point  in  weight  space  is  on  the  positive  side  of 
some  pattern  hyperplanes  and  on  the  negative  side  of 
others.  Thus  it  divides  all  pattern  planes  in  weight  space 
into  two  classes,  or  sets:  those  it  is  on  the  positive  side  of, 
which  it  recognizes,  and  those  it  is  on  the  negative  side  of, 
which  it  does  not  recognize. 

Artificial  Neural  Networks 

Now,  how  about  more  difficult  cases,  in  which  a  simple 


linear  function  cannot  separate  the  categories?  I  will  first 
explore  a  yes/no  decision  from  a  single  network  and  will 
use  one  possible  network  of  TLUs,  called  the  committee 
network  (see  Figure  6,  page  19).  The  committee  network 
is  a  type  of  layered  machine.  A  committee  of  TLUs  is  com¬ 
posed  of  an  odd  number  of  TLUs,  each  presented  with  the 
same  input  pattern  vector.  Each  TLU  in  the  committee 
decides  whether  the  pattern  is  one  that  it  recognizes,  and 
it  casts  a  vote  accordingly.  A  chairman  TLU  then  counts 
the  votes.  The  chairman's  inputs  are  the  outputs  of  the 
committee  members.  Its  weights  are  fixed  at  1  for  all  its 
inputs,  so  it  simply  functions  as  a  vote  counter  for  the  rest 
of  the  committee.  By  training  multiple  networks  inde¬ 
pendently,  you  can  increase  the  number  of  recognizable 
classes  to  any  power  of  2,  as  shown  in  Figure  7,  page  20. 
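
A committee vote can be sketched as follows. This is an illustration, not SILOAM's code. It assumes each member outputs +1 for yes and -1 for no, so that a chairman with all weights fixed at 1 reduces to a sum of votes. The `classify` function shows one possible way of reading several independently trained committees as the bits of a class number. Figure 7 is not reproduced here, so that encoding is an assumption.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def tlu_vote(weights, aug_pattern):
    """+1 (yes) when the dot product is zero or positive, otherwise -1 (no)."""
    return 1 if dot(weights, aug_pattern) >= 0 else -1


def committee_decision(committee, aug_pattern):
    """Chairman TLU: every weight is 1, so it simply sums the members' votes."""
    tally = sum(tlu_vote(w, aug_pattern) for w in committee)
    return 1 if tally > 0 else -1      # odd committee size, so the tally is never 0


def classify(networks, aug_pattern):
    """Read k yes/no committees as a k-bit class number in 0 .. 2**k - 1."""
    code = 0
    for committee in networks:
        bit = 1 if committee_decision(committee, aug_pattern) > 0 else 0
        code = (code << 1) | bit
    return code


if __name__ == "__main__":
    x = [1, 0, 1]                                   # augmented pattern
    yes_committee = [[1, 0, 0], [0, 1, -2], [0, 0, 1]]
    no_committee = [[-1, 0, 0], [0, 0, -1], [1, 1, 1]]
    print(committee_decision(yes_committee, x))
    print(committee_decision(no_committee, x))
    print(classify([yes_committee, no_committee], x))
```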

I  have  developed  a  democratic  machine:  how  can  such 
a  thing  work?  I’ll  now  show  how  a  democratic  committee 
with  a  majority-rule  voting  system  can  make  a  substantial 
improvement  to  my  pattern  recognizer.  The  divisions 
formed  by  each  of  the  TLUs  in  the  committee  can  occur  at 
different  places.  Through  proper  training  (which  I'll  de¬ 
scribe  later),  the  voting  TLUs  in  the  committee  can  be 
made  to  form  point  clusters  in  weight  space  such  that  a 
majority  of  TLU  weight  points  will  always  be  on  the  prop¬ 
er  side  of  every  presented  pattern  hyperplane.  Thus  you 


Figure  1:  A  nerve  cell,  or  neuron 


Figure 2: McCulloch-Pitts neuron

Figure 3: The threshold logic unit




Inputs 


Figure  4:  Simplified  hardware  implementation  of  a  threshold  logic  unit  with  adjustable  weights 


Dot product:
0 = <a,b,c> · <x,y,z>

Expanded:
0 = ax + by + cz


Figure 5: The pattern plane as a linear homogeneous equation derived from a dot product


Pattern input

Decision output


Figure 6: A committee network of threshold logic units



can  select  the  weight  points  such  that  a  majority  of  TLUs 
will  vote  yes,  and  this  defines  the  set  of  patterns  that  the 
committee  as  a  whole  will  recognize. 

To  train  such  a  committee,  you  present  it  with  a  pattern 
vector  and  observe  the  result.  If  the  committee  returns 
the  correct  answer,  you  present  another  pattern;  if  not, 
you  must  correct  it.  You  do  this  by  adjusting  the  weights 
to  produce  a  more  favorable  vote  (somewhat  equivalent 
to  lobbying  in  legislative  processes,  where  the  TLU  plays 
the  part  of  the  politician).  This  does  not  ensure  that  the 
correct decision will be obtained the next time the committee
sees this pattern, but the vote will be closer. Because
you insist on an odd number of TLUs in a committee
(exclusive of the chairman), you can never have a tie.
What you do is convert one TLU at a time to a more "enlightened"
view. If you repeat this process enough times, the
committee will return a favorable decision. When this
occurs,  you  say  that  the  network  has  been  trained  to  rec¬ 
ognize  the  pattern. 

The  Training  Algorithm 

How do you go about finding the correct TLU to adjust,
and how do you perform the adjustment? You pick the
TLU that voted incorrectly and that was the least sure of
its vote. This means that you pick the TLU that had the
wrong sign for its dot product and that had a dot-product
magnitude that was the least of all TLUs with the wrong
sign. This corresponds to selecting the weight point
closest to the pattern hyperplane and on the wrong side of it.
Now you know which TLU to work on, but how do you
adjust the weights to produce the desired effect? You
move the weight point for the selected TLU along the
perpendicular from the weight point to the pattern
hyperplane, toward the hyperplane and through to the
other side of it, thereby changing the TLU's
classification of the pattern.
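
The selection rule can be sketched as follows. Again this is only an illustration under stated assumptions: pick_tlu_to_adjust is a hypothetical name, votes are encoded as +1/-1, a zero dot product counts as a no vote, and ties between equally unsure TLUs are broken by taking the lowest index.

```python
def dot(w, x):
    """Dot product of a weight point and a pattern vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def pick_tlu_to_adjust(committee, pattern, desired):
    """Return the index of the wrong-voting TLU that was least sure.

    'Least sure' means the smallest dot-product magnitude among the
    TLUs whose vote (+1 or -1) differs from the desired vote.
    Returns None when no TLU voted incorrectly.
    """
    wrong = []
    for index, weights in enumerate(committee):
        d = dot(weights, pattern)
        vote = 1 if d > 0 else -1
        if vote != desired:
            wrong.append((abs(d), index))
    return min(wrong)[1] if wrong else None


# Hypothetical committee: TLU 0 votes yes, TLUs 1 and 2 vote no.
committee = [(1, -1, 0), (-3, 1, 0), (-1, 0, 0)]
pattern = (1, 0, 1)
print(pick_tlu_to_adjust(committee, pattern, desired=1))
```

Here TLUs 1 and 2 both vote against the desired answer, with dot products of -3 and -1; TLU 2 is closer to the pattern hyperplane, so it is the one chosen for adjustment.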

You actually move the weight point by an amount
determined by a constant, the correction fraction, times the
distance from the weight point to the pattern hyperplane.
This constant must lie between 1 and 2 for the training
algorithm to converge. If it is greater than 1, then the
weight point will move to the other side of the pattern
hyperplane; if it is less than 1, then the weight point will
move toward the pattern hyperplane but not through it.
In the latter case, the training algorithm will not converge
and training will never be accomplished because the weight
point will always be on the wrong side of the pattern
plane even though it gets constantly closer to it. This
technique is called fractional correction. If the distance moved
is the least integer such that the pattern plane will be
crossed, this choice results in a training strategy known as
absolute correction. The simplest technique is constant
correction (also called fixed-increment correction), in which
a constant distance is always moved. Such strategies allow
for the use of integer arithmetic, resulting in faster
execution and simpler hardware.
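
A small numerical sketch may make the role of the correction fraction concrete. It is one reading of the description above, not code from the listing: the weight point w is moved along the normal to the pattern hyperplane (the direction of the pattern vector x) by the fraction lam times its distance from the plane, which works out to w' = w - lam * (w·x / x·x) * x. The names fractional_correction and lam are assumptions, and the sketch assumes x is not the zero vector.

```python
def dot(w, x):
    """Dot product of a weight point and a pattern vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def fractional_correction(w, x, lam):
    """Move weight point w toward the hyperplane w.x = 0 by lam times
    its perpendicular distance from that plane (x must be nonzero)."""
    scale = lam * dot(w, x) / dot(x, x)
    return [wi - scale * xi for wi, xi in zip(w, x)]


w = [2.0, -1.0, 0.5]   # hypothetical weight point, currently voting yes
x = [1.0, 0.0, 1.0]    # pattern it should reject

crossed = fractional_correction(w, x, lam=1.5)   # fraction between 1 and 2
short = fractional_correction(w, x, lam=0.5)     # fraction below 1

print(dot(w, x), dot(crossed, x), dot(short, x))
```

After the move the new dot product is (1 - lam) times the old one, so a fraction above 1 flips the sign of the TLU's vote while a fraction below 1 only shrinks the dot product without changing its sign.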



Fixed-increment correction with binary images using
8-bit signed integer weights lends itself to cheap
parallelism. A separate Intel 80C51 microcomputer on a chip
could be used for each TLU, taking weight points out of an
array in on-board ROM. Using six committees of seven
voting TLUs each, plus a single 80C51 to count all the votes
and act as central control, results in 6 × 7 + 1 = 43 80C51
processor chips. A system such as this should be able to
perform real-time optical text scanning of printed
literature. Using surface-mount or hybrid packaging methods,
the device might be as portable as a Walkman. A speech
synthesizer could serve as an output device, receiving
ASCII text from the pattern recognizer. Voilà: a reading
device for the visually handicapped. To this end, I have
examined the performance of an 8-bit integer network with
fixed-increment correction with encouraging results.
The TMS-320C25 digital signal processor from Texas
Instruments may prove less expensive than the 80C51
array. Although a single 320 costs more than a single 80C51,
it is much faster, and the overall system cost may be less.

The  Experiment 

There are actually three versions of SILOAM: a
floating-point version, a 16-bit integer version, and an 8-bit integer
version (see Listing One, page 56). The symbol ELTYPE is
defined on the compiler invocation line and determines
the type definition for an element of a weight point
vector. You select the pattern presentation order for training
with the -o option and select initial conditions for the
weight points with the -r option. The correction method
is selected with the -a, -i, and -f options, which specify
absolute, fixed-increment, or fractional correction,
respectively. You select the level of detail for logging with
the -l option: -l0 displays only final results; -l3 displays
the most detail.

I ran the program on a small pattern file, representing a
binary image of each of the numerals 0-9 (Table 1, below).
It has also successfully learned the entire uppercase
alphabet. The alphabet pattern file was generated by rasterizing
characters from the Hershey character database of the
National Bureau of Standards. The entire ASCII character set
was generated in this fashion as a high-resolution
dot-matrix raster and was taught to SILOAM. The output produced
by a run of the program with the high-resolution sample
character set is shown in Example 1, below.


Table 1: Binary images of numerals


It is interesting that a network of only one TLU per
committee could learn all of the binary images of the
character sets. When this is the case, the pattern set is said
to be "linearly separable." A pattern file of random
analog pixel values was generated, comprising 100 images. In
this case, a single TLU could not learn the pattern set;
three TLUs were required, and different training
methods produced radically different results. Fixed increment
performed quite poorly, absolute correction did
somewhat better, but fractional correction did the best. Values
of the correction fraction closer to 2 seemed to perform
better and resulted in faster convergence. In fact,
convergence was even achieved with values greater than 2,
although when they got up to about 2.5, convergence
failed.

Limitations 

SILOAM can actually get confused and forget things it has
already learned in the process of trying to learn new
things. In Learning Machines, Nilsson shows, however,
that the training procedure will converge to the desired
result, given that you choose a suitable distance to move
the weight point and that you do not exceed the capacity
of the machine (related to the number of TLUs per
committee and the number of patterns to be recognized). This
result is known as the Fundamental Training Theorem,
or the Perceptron Convergence Theorem.

Despite the impressive performance of this simple
network, it has some serious theoretical shortcomings. For
example, it cannot learn a simple exclusive-OR function.


Invoked By: ascii -il -tl
element type is int

initialising

mean of the radii: 31.749012
standard deviation: 0.001284

training completed in 5771 seconds.

number of committees: 7
number of tlus total: 7
number of elements: 7063
number of connections: 6302

number of passes thru file: 78
number of patterns in file: 87
number of mis-recognitions: 1190
number of tlu adjustments: 1190
maximum element magnitude: 49.000000

mean of the radii: 83.3047640
standard deviation: 0.582814


Example 1: Output of SILOAM with high-resolution character set



This deficiency can be shown with a geometric proof.
Because there are two variables input to an XOR, a 2 × 1
input pattern is needed. Together with the threshold, which
is treated as one more weight, this results in three
dimensions for the pattern space. Because there are four
combinations of the inputs, there are four pattern planes. All four
planes pass through the origin. If you imagine a sphere
about the origin, these four planes intersect the sphere in
four great circles, each of which intersects the others.
If you place the south pole of the sphere on
the origin of a 2-D graph and project the surface of the
sphere onto the plane (as in a polar projection of the
globe), four intersecting circles result, one centered in
each quadrant of the plane. Their common intersection is
around the origin of the graph. If you count the number
of distinct regions into which these circles divide the
plane, you get 14, but there are 16 possible combinations
of accepting and rejecting 4 distinct patterns composed of
2 independent variables. The exclusive-OR and the
exclusive-NOR, or equivalence relation, are these missing
regions. You can use a similar argument in higher
dimensions to show that all such networks are likewise
deficient in not being able to learn all possible pattern sets
with which they could be confronted.
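
The count of 14 can be checked by brute force. The sketch below is an illustration rather than a proof: it tries every single TLU with two input weights and a threshold weight drawn from a small integer range (the range -2 to 2 is an assumption chosen for the example) and records which of the 16 truth tables each one produces. The names INPUTS, tlu_truth_table, and realizable_functions are hypothetical.

```python
from itertools import product

# The four input combinations of a two-input function, in a fixed order.
INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def tlu_truth_table(w1, w2, w0):
    """Truth table of one TLU; w0 is the threshold treated as a weight."""
    return tuple(1 if w1 * x1 + w2 * x2 + w0 > 0 else 0 for x1, x2 in INPUTS)


def realizable_functions(weight_range=range(-2, 3)):
    """All truth tables some single TLU in the weight range can produce."""
    return {tlu_truth_table(*w) for w in product(weight_range, repeat=3)}


all_functions = set(product((0, 1), repeat=4))
realizable = realizable_functions()
missing = sorted(all_functions - realizable)

print(len(realizable), "of", len(all_functions), "functions realizable")
print("missing truth tables:", missing)
```

The two truth tables that never appear are (0, 1, 1, 0) and (1, 0, 0, 1), which are the exclusive-OR and the exclusive-NOR. A search over a finite range cannot by itself rule out larger weights; the geometric argument above is what shows that no weights at all will do.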

In Parallel Distributed Processing, Rumelhart et al.
show, however, that other network topologies, neural
activation functions, and training algorithms (especially
the gradient-descent training method) are capable of
producing all possible Boolean switching functions. I
highly recommend their book for anyone interested in
pursuing the study of artificial neural networks in depth.

Topics  for  Further  Investigation 

One idea that needs to be explored is recognizing patterns
over time, that is, sequences of patterns. One approach might
be to incorporate some sort of feedback into the network
to provide a memory capability. The output bit vector
could be concatenated with the pattern input vector to
provide an input to the network that was a function not
only of the current input but also of the previous output.

Additional exploration could be done with nonbinary
pattern elements. Remember that the recognition
process does not require the elements of the pattern vector to
be 1s and 0s; any real values will still satisfy the vector
geometrical constraints and should be recognizable with
essentially the same algorithm.

Geoffrey Hinton discussed some of his work with
artificial neural networks at the AAAI-86 conference in
Philadelphia last August (see bibliography). He showed how a
real-valued, differentiable activation function could be
trained by the method of gradient backflow to form its
own independently developed internal abstractions. His
experiment involved learning two similar family trees.
The network formed, on its own and without being
taught explicitly by the examples, the abstract concepts
of generation and various family relations. It was also
able to generalize its experience to determine the
relations between individuals it had not been previously
introduced to. It got better than three out of four new
problems correct. Hinton showed that a statistical correlating
recognizer would fail this test and that true
generalization of induced abstractions was being performed.

Neurophysiology Revisited

What can you learn about the natural mind based on a
model neural network? Psychiatrists treat mental illness
with drugs, such as phenothiazines and lithium, and with
electroshock therapy. Electroshock therapy is assumed to
destroy connections, thereby altering weights by setting
them to zero. Phenothiazines affect the release of
neurotransmitters such as serotonin, norepinephrine, and
dopamine. A change in these would have the same effect as
altering the weights at the inputs to the neuron. Lithium is
metabolized by the body's electrochemistry in the same
manner as is sodium, but it behaves differently in nerve
conduction and firing potential. Thus lithium acts as an
inert placeholder for the neuroactive sodium in the
sodium-potassium complex. The presence of lithium would
affect the threshold of the neuron, but this is just another
weight in my model. Thus these psychoactive drugs are
trying to counteract a medical problem that is manifested
in a perturbation of the weights of the neuron. If the drugs
are underprescribed, the desired effect will not be
achieved, and if they are overprescribed, the weights may
be totally scrambled, resulting in a worsening of
symptoms.

Hardware  Implementations 

Recently, threshold logic has been receiving a lot of
attention from the press. The front page of the Electronic
Engineering Times carried an article on February 3, 1986,
entitled "Neural Research Yields Computer That Can Learn."
This article described research into a speech learning
program based on Hopfield networks that apparently
makes use of time feedback techniques similar to those
outlined here.

The next week an article appeared in the same
publication that touted threshold logic as the key to optical
computers. This article gave some details on how
electro-optical technology can implement threshold logic gates with
an enormous number of inputs.

The week after that, an article appeared giving details
of a gallium arsenide TLU that uses the analog addition of
the brightness of light waves to perform the summing
operation, so that the device is essentially a threshold
detector with a photoresistor for an input. This TLU
implementation does not increase in complexity no matter
how many inputs it receives: they are just lights shining
on its single photoresistor. The output of this device is a
single solid-state laser. This last article also described a
holographic optical interconnect scheme that is very
interesting. The output lasers reflect off a hologram placed
over the chip. The hologram acts as a phased array
reflector that directs each output laser's light only to those
photoresistors that are supposed to be connected to that
output. In this way, the logic signals travel at the speed of
light without wasting any chip real estate on signal
interconnect lines. The logic can be placed closer together, and
a 3-D medium is available for interconnect wiring instead
of the 2-D masks of current-day chips. Gallium arsenide
chips are now available that operate at speeds of 20 GHz.
With more closely spaced circuits and more flexible
design rules for interconnects, a tremendous increase in
speed should be gained from these new devices.

This kind of hardware is just what is needed to take
these learning systems from the several-seconds-per-iteration
speed zone into the picoseconds-per-iteration arena.
You can certainly expect to see more threshold logic
learning systems in the future if this hardware
implementation effort succeeds.

Availability 

All the source code for articles in this issue (except for C
Chest) is available on a single disk. To order, send $14.95 to
Dr. Dobb's Journal, 501 Galveston Dr., Redwood City, CA
94063 or call (415) 366-3600 ext. 216. Please specify the issue
number and format (MS-DOS, Macintosh, Kaypro).

Bibliography 

Brown, Chappell. "Neural Research Yields Computer
That Can Learn." Electronic Engineering Times (February
3, 1986).

Brown, Chappell. "Threshold Logic Gates: Key to Making
Optical Computers Reality?" Electronic Engineering Times
(February 10, 1986).

Brown, Chappell. "Researchers See Lightwaves at End of
VLSI Tunnel." Electronic Engineering Times (February 17,
1986).

Buzan, Tony, and Dixon, Terence. The Evolving Brain.
New York: Holt, Rinehart & Winston, 1977.

Hinton, Geoffrey E. "Learning Distributed
Representations of Concepts." Proceedings of the Eighth Annual
Conference of the Cognitive Science Society. Amherst, Mass.:
Lawrence Erlbaum Assoc., August 1986.

Hopfield, J. J. "Neural Networks and Physical Systems
with Emergent Computational Abilities." Proceedings of
the National Academy of Sciences 79. Washington, D.C.,
1982.

Masland, Richard H. "The Functional Architecture of the
Retina." Scientific American, vol. 255, no. 6 (1986).

McCulloch, W., and Pitts, W. "A Logical Calculus of the
Ideas Immanent in Nervous Activity." Bulletin of
Mathematical Biophysics 5 (1943).

Miller, Richard K. Optical Computers: The Next Frontier in
Computing. Madison, Ga.: SEAI Technical Publications,
1986.

Nilsson, Nils J. Learning Machines: Foundations of
Trainable Pattern-Classifying Systems. New York: McGraw-Hill,
1965.

Rumelhart, David E.; McClelland, James L.; and the PDP
Research Group. Parallel Distributed Processing.
Cambridge, Mass.: MIT Press, 1986.

Schwartz, Tom J. "IBM Research Yields Artificial Neural
Net Workstation." Electronic Engineering Times
(September 1, 1986).

Silbar, Margaret L. "In Quest of a Human-like Robot."
Analog 88 (November 1971).

DDJ 

(Listing  begins  on  page  56.) 



REVIEWS


Four PROLOGs for the Macintosh

by Dan L. Pierson


The growth in popularity of PROLOG has been vividly
reflected in the products available for the Macintosh.
Four PROLOGs have been released since January 1986;
others have been announced or are rumored. This review
covers the four PROLOGs that were available for the Mac
as of November 1986: PROLOG/m, Version 1.10a; AAIS
Prolog M-1.10; MacPROLOG, Version 1.0a; and
ExperProlog-II, Version 2.32.

The Products

PROLOG/m

PROLOG/m from Chalcedony Software is a basic Edinburgh
PROLOG interpreter. It and the IBM PC version, PROLOG/i,
are the successors to the several-year-old PROLOG/V. The
package includes a 184-page paperback Language Reference
and Tutorial; a 30-page paperback manual; and a bootable,
unprotected disk that contains the PROLOG/m program, the
PROLOG library, a file of release notes, and a folder of
example programs. Chalcedony also sells three packages
of extension programs: the TOOLBOX, the TOYBOX, and
NFL-Xpert. This review includes the TOOLBOX, a collection
of 58 utility programs. Some of the TOOLBOX programs,
such as setof and bagof, are standard parts of other
PROLOGs. Other TOOLBOX programs, such as avl_tree and
h_list, are useful tools that are not part of any other
PROLOG. The TOOLBOX consists of a 72-page paperback book
that contains the complete source of all the programs,
explanations, and examples of use and a disk containing the
program sources.


Dan L. Pierson, 10 Fort Meadow Dr., Hudson, MA 01749.
Dan was one of the developers of the first release of VAX
Common LISP. He is currently developing software for Unix.


AAIS Prolog

AAIS Prolog is the newest of the products. The first
shipments for the Mac were in September 1986 (there was
an earlier Unix version), and the first update was shipped
in late November. AAIS is a large, elaborate
Edinburgh-based system. The AAIS Prolog package includes a
134-page, letter-size manual; the primer Prolog Programming
for Artificial Intelligence by Ivan Bratko; a bootable,
unprotected disk containing the PROLOG program and
library; and a disk with a few example programs.

The November update to AAIS Prolog was supposed to
include support for Macintosh graphics. Much to my
surprise this support took the form of general,
user-extendable support for both the entire Macintosh ROM and
any additional programs in code resources. This means that
anyone with access to an assembler or compiler that can
produce code resources can extend AAIS Prolog simply by
adding the resource to a copy of the PROLOG file using
Apple's ResEdit and writing a small PROLOG program to
load the resource and define the new predicates. Of course,
this is not novice programming: the price for AAIS
Prolog's flexibility is that graphics and window programming
are much more complicated than in a more limited,
higher-level environment.

MacPROLOG

MacPROLOG is micro-PROLOG, the first dialect of PROLOG
for microcomputers. Micro-PROLOG and its big sister
sigma-PROLOG are currently available for MS-DOS, CP/M-86,
CP/M-80, the Apple II, the Commodore 64, most Unix
machines, VAX/VMS, and Data General machines running AOS.
MacPROLOG is a complete, professional development system
with many special features. The MacPROLOG package
includes a PC-style ring binder containing a 61-page users'
guide and a large reference manual; the primer
micro-PROLOG by Clark and McCabe; an unbootable, unprotected
disk containing the PROLOG program, the Generic (all three
syntaxes) programming environment, a file of release
notes in two formats (MacWrite and text), and a folder of
example programs; a disk containing the Standard and
Edinburgh programming environments; and a disk containing
the Standard and Edinburgh run-time systems.

Two of the most impressive features of MacPROLOG are its
support for three syntaxes with automatic conversion
between them and its extensive support for the Mac
interface. Predicates exist to perform simply and easily
almost any task involving windows, menus, and dialogs.
For example:

(SCROLL_MENU
    ["please select some" fruit]
    [apples pears peaches]
    [apples] _s)

displays a modal dialog box containing the prompt "please
select some fruit," a scrolling list of the fruit with
apples highlighted, and an OK button. When the user clicks
OK, the variable _s is unified with the selected fruit.
Using the above facilities, a run-time system, and
user-defined error handlers, an application can hide the
underlying PROLOG completely. Given all this power, it's
surprising that MacPROLOG provides absolutely no support
for Macintosh graphics.

ExperProlog-II

ExperProlog-II is a polished, professional production of
Alain Colmerauer's next-generation PROLOG.
ExperTelligence advertises compatible versions for VAX/VMS
and the IBM PC; the reference manual also mentions
Hewlett-Packard (HP-150, HP-1000, HP-9000) and expands the
IBM PC to include all MS-DOS machines. The package
includes an attractive vinyl ring binder containing a
171-page reference manual and 132-page Macintosh users'
manual; the primer PROLOG by Francis Giannesini et al.; a
bootable, unprotected disk containing the PROLOG program,
an initial saved system, and a demo program; and a disk
containing example programs and the Lisa Pascal programs,
object files, and link commands for a sample extension to
Prolog-II.

ExperProlog-II provides extensive, fairly high-level
support for Macintosh graphics and mouse input but very
little menu support (extend one menu) and no support for
dialogs or applications that hide the underlying PROLOG.
User extensions to PROLOG could probably do most of this,
but extensions currently require the Lisa Workshop, an
uncommon, expensive, and obsolete development
environment for the Mac.

User Interfaces

All four PROLOGs provide a basic Macintosh environment
with a menu bar, desk accessory support, and an initial
window. Table 1 summarizes the user interface features of
the four products. The main interaction window is called a
query window because you interact with a PROLOG system
by asking it questions. The first three entries in the table
refer to features of this query window. The next section
refers to the program editor, if any. Only the AAIS editor
is suitable for editing non-PROLOG text; the other editors
are too closely tied to the PROLOG evaluation mechanism.
MacPROLOG includes a Find Definition menu option that
makes the correct window visible and positions the cursor
at the start of the requested definition. Most of the
PROLOGs report errors by a combination of an alert dialog,
positioning the cursor, and highlighting the erroneous
text. The method for interrupting a long-running program
varies from none in Prolog-II to using the Mac's
Programmer's Switch to invoke a full debugger breakpoint
in AAIS Prolog.

Debugger

All the PROLOGs provide some sort of debugger. The
traditional PROLOG debugger is called a box debugger
because it is based on viewing a PROLOG procedure as a box
with two entries (call and redo) and two exits (fail and
exit). With a full box debugger, tracing and breakpoints
(called spy points) are individually controlled for each
entry and exit in the box for each procedure. Individual
control of the entries and exits is done by leashing or
unleashing the interpreter for each type of entry or exit.
PROLOG/m provides full leashing and unleashing but few
other commands. Prolog-II provides only an uncontrolled
trace. The other two PROLOGs provide box debuggers
without full leashing; the AAIS Prolog debugger provides a
particularly rich selection of other options.
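
The four ports of the box model can be illustrated with a short Python sketch. This is only an analogy, not how any of the four products is implemented: a Python generator stands in for a PROLOG procedure that may yield several solutions on backtracking, and the names box and log are hypothetical. A real debugger would also let you leash or unleash each port individually, which this sketch does not attempt.

```python
def box(name, solutions, log):
    """Wrap a stream of solutions in the four ports of the box model.

    call: the procedure is entered for the first time
    exit: the procedure succeeds with a solution
    redo: backtracking re-enters the procedure for another solution
    fail: no (more) solutions remain
    """
    log.append(("call", name))
    for solution in solutions:
        log.append(("exit", name))
        yield solution
        # Execution resumes here only when the caller asks for more.
        log.append(("redo", name))
    log.append(("fail", name))


log = []
answers = list(box("color/1", ["red", "green"], log))

for port, name in log:
    print(port, name)
```

Asking for every solution of a two-solution procedure passes through the ports in the order call, exit, redo, exit, redo, fail.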

The last row in Table 1 shows how much free memory
remained on a 512K Macintosh after starting the basic,
fully loaded system. I couldn't determine this number for
PROLOG/m. The other three PROLOGs provide options to
remove features and save some memory. I didn't have time
to reconfigure each PROLOG to see how much memory can
be saved if all the optional features are stripped out.

The Dialects of PROLOG

PROLOG is more than merely another language that has
never suffered the pangs of standardization; it is a
language whose primary implementors seem never to have
considered syntactic similarity, let alone compatibility,
a goal. The four PROLOGs in this review support five
distinct syntaxes, only one of which is merely a
significant extension of another.

                      PROLOG/m    AAIS Prolog          MacPROLOG            Prolog-II

Queries
  Scrolling           no          yes                  dialogs (1)          yes
  Editable            yes         yes                  yes                  yes (2)
  First/all           choice (3)  pause (4)            choice (5)           all
Editor                no (6)      yes                  yes                  yes
  Multifile           -           yes                  yes                  no
  Can query           -           yes                  yes                  no
  Search              -           yes                  yes                  yes
  Replace             -           yes                  yes                  no
  Balance parens      -           no                   option               auto
  Find definition     -           no                   yes                  no
Errors
  Report              message     dialogs              dialogs              dialogs
  Cursor              no          yes                  yes                  yes
  Highlight           no          yes (7)              yes                  yes
Interrupt             command-    Programmer's Switch  Programmer's Switch  no (8)
Debugger              box         box (9)              box (10)             trace
Free memory (bytes)   ?           84,524               81,850               78,960

1. Multiple query dialogs can be open at a time; there is a default scrolling output window.
2. Special Edit Goals window.
3. Global menu/command choice.
4. Pause after each solution; type for next solution.
5. Elaborate choices in each query dialog.
6. Can use an editor desk accessory.
7. Highlight removed when dialog is exited; dialog sometimes covers highlight.
8. Programmer's Switch interrupt is recoverable but leaves the screen messed up and rudely aborts the execution state.
9. No leash/unleash options but lots of stack display and execution modification commands.
10. Only interpreted predicates can be traced; all predicates can have spy points set on them.

Table 1: User interface


The smallest elements of a PROLOG program are clauses. A
clause consists of a functor and zero or more arguments.
The number of arguments of a clause is the arity of the
clause. A clause of zero arity is an atom. Clauses are the
elements of rules. A rule consists of a head (the left-hand
side) and a body (the right-hand side). The head of a rule
consists of a single clause; the body consists of zero or
more clauses. The head of a rule is true if and only if all
the clauses in the body are true. A rule with no body is
always true and is known as a fact. A set of rules whose
heads all have the same functor and arity forms a
procedure. A procedure defines the meaning of a particular
functor and arity, called a predicate. Because the same
functor can have different meanings with different
arities, predicates are typically referred to by both
functor and arity, as in plus/3 for plus with three
arguments.
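
The functor/arity naming convention can be made concrete with a small data-structure sketch in Python. The class name Clause and its indicator property are hypothetical names chosen for illustration; they follow this article's terminology and do not correspond to the internals of any of the four products.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    """A functor with zero or more arguments, in the article's sense."""
    functor: str
    args: tuple = ()

    @property
    def arity(self):
        """The number of arguments."""
        return len(self.args)

    @property
    def indicator(self):
        """The functor/arity name by which a predicate is referred to."""
        return f"{self.functor}/{self.arity}"


three = Clause("plus", (1, 2, "Sum"))
two = Clause("plus", (1, "Next"))
atom = Clause("apples")

print(three.indicator, two.indicator, atom.indicator)
```

The two plus clauses share a functor but name different predicates because their arities differ, and a clause with no arguments has arity zero.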

Edinburgh Syntax

The most common PROLOG syntax is called Edinburgh
syntax after the Prolog-10 and C-Prolog systems developed
at the University of Edinburgh. This syntax is used in the
well-known PROLOG primer Programming in Prolog, the
excellent new intermediate book The Art of Prolog, and
many other books and papers on the language. Edinburgh is
the only syntax that supports a wide selection of infix
operators. All of the products except Prolog-II support
the Edinburgh syntax, either natively or as an option.

A  typical  Edinburgh  procedure 
looks  like  this: 

reverse([],[]).
reverse([X|Xs],Zs) :-
    reverse(Xs,Ys),
    append(Ys,[X],Zs).

This  is  the  famous  "naive  reverse" 
procedure  for  reversing  a  list.  It 
reads:  the  null  list  is  its  own  reverse; 
to  reverse  a  nonnull  list,  concatenate 
the  reverse  of  the  tail  of  the  list  and 
the  head  of  the  list. 
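
For readers more at home in a procedural language, the same recursion can be sketched in Python. This is only a paraphrase of the reading given above, with hypothetical function names; unlike the PROLOG procedure it runs in one direction only and does no unification or backtracking.

```python
def append(xs, ys):
    """Concatenate two lists, standing in for PROLOG's append/3."""
    return xs + ys


def reverse(xs):
    """Naive reverse: reverse the tail, then append the head to it."""
    if not xs:                       # the null list is its own reverse
        return []
    head, tail = xs[0], xs[1:]
    return append(reverse(tail), [head])


print(reverse([1, 2, 3]))
```

The procedure is called naive because every level of recursion performs a full append, so the work grows with the square of the list length.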

In Edinburgh syntax, clauses are expressed as
functor[(arg1, ...)], and rules are clause :- clause
[, clause ...]., where the commas between clauses mean
and. Lists are expressed as comma-separated terms within
square brackets; the head and tail of a list are indicated
by a vertical bar. Thus [X|Xs] means the list with head X
and tail Xs. Symbols start with a lowercase letter,
whereas variables start with an uppercase letter.
Arithmetic expressions are expressed as X1 is X + 1. Note
that X and X1 are different variables; once a PROLOG
variable is bound to a value, it cannot be changed.
Operators such as is and + can be freely defined with
arbitrary precedence, associativity, and meaning. There
are two types of comments: end-of-line comments starting
with '%' and comment text bracketed by nonnesting '/*' and
'*/' as in C. MacPROLOG's Edinburgh mode does not permit
the end-of-line comment syntax.

The History of PROLOG

Interest in using formal logic as a programming language dates back to
research in automatic theorem provers in the early 1950s. Robinson's
1965 paper1 provided the necessary groundwork for a practical logic
programming language. Hewitt's PLANNER,2 although later recognized as
a failure, was the first logic-based programming system. Cooperative
research by Alain Colmerauer and Robert Kowalski resulted in the
creation of the first PROLOG interpreter in the early 1970s.

PROLOG research and development continued in Europe during the 1970s.
The two principal research groups were Colmerauer's group at the
University of Marseille-Aix and the University of Edinburgh group,
which included Robert Kowalski and David Warren. Warren was
responsible for the next major breakthrough in PROLOG. His Prolog-10
compiler, the first high-performance PROLOG system, did much to dispel
the belief that logic programming languages had to be horribly
inefficient.

The Japanese gave PROLOG its next big boost when they decided to use
PROLOG instead of LISP as the basis of their Fifth Generation Project.

Though still controversial (see Carl Hewitt's attack in the premier
issue of AI Expert magazine3), PROLOG is gaining force as the second
major language for AI applications. Serious implementations of all the
major dialects are available for most computers and operating systems
from micros to mainframes. New language features are appearing to
address many of the perceived shortcomings of basic PROLOG. Whether
you believe in the virtues of PROLOG or not, the language is maturing
as a real, long-term option in the programming world.

PROLOG is an important language to understand because it requires a
new way of thinking about programming problems. PROLOG is a
declarative, side-effect-free language with a predefined processing
loop. The absence of side effects in "pure" PROLOG makes the language
a serious candidate for parallel processing. Indeed, several
concurrent languages based on PROLOG, such as Concurrent Prolog,4 have
already appeared.

AAIS Prolog's native syntax is an extension of Edinburgh syntax that
supports AAIS' new features and data types. AAIS Prolog is about as
similar to Edinburgh PROLOG as Common LISP is to its ancestor MacLisp.

The naive reverse procedure is exactly the same in AAIS and Edinburgh
PROLOGS. Among the syntactic changes in AAIS Prolog are that c' is a
character instead of a small integer (but can automatically be treated
as an integer when needed), "cat" is a string instead of a list of
small integers, and foo:append is the symbol append in the package
foo. Also, packages are a type of module; programs in one package will
not see symbols of another package unless the program's package
inherits from the other package.

Standard Syntax

MacPROLOG's native (Standard) syntax is LISP-like; everything is fully
parenthesized and stored as lists. The naive reverse procedure is
expressed in Standard syntax as:

((reverse () ()))
((reverse (_X|_Xs) _Zs)
    (reverse _Xs _Ys)
    (APPEND _Ys (_X) _Zs))


Dr. Dobb's Journal, April 1987


Clauses  are  expressed  as  (functor 
arg  [arg  . . .  ])  and  rules  are  (clause 
[clause  ...]).  Lists  are  a  parenthesized 
sequence  of  terms,  and  a  rule  is  a  list. 
As  in  Edinburgh  syntax,  a  vertical  bar 
separates  the  head  and  tail  of  a  list. 
Symbols  start  with  any  alphabetic 
character.  Variables  are  distinguished 
by  a  leading  underscore.  Arithmetic 
expressions are expressed as (+ _y 1
_y1); infix operators are not available.


Multi-element  structures  with  com¬ 
pact  storage  similar  to  Edinburgh 
clauses  are  available  as  tuples  ex¬ 
pressed  as  <a_b  1>;  special  functions 
are  provided  to  compose  and  access 
tuples.  Comments  are  bracketed  with 
nonnesting '/*' and '*/'. This syntax is
alleged  to  make  it  much  easier  to  write 
programs  that  manipulate  other 
programs. 

Simple  Syntax 

MacPROLOG  also  supports  Simple,  an 
English-like  syntax  for  beginning 


PROLOG students. It is used in several primers, including
micro-PROLOG. The naive reverse procedure in Simple syntax is:

(_X) reverse (_X)
(_X _Y|_Xs) reverse _Zs
    if (_Y|_Xs) reverse _Ys
    and append (_Ys (_X) _Zs)

Clauses  are  expressed  as  functor 
(arg [arg ...]) or arg functor or arg1
functor  arg2.  Rules  are  built  of  clauses, 
conditions,  and  conjunctions.  In  the 
preceding  example,  and  is  a  conjunc¬ 
tion  and  if  is  a  condition.  Lists,  sym¬ 
bols,  variables,  and  comments  are  the 
same  as  in  Standard  syntax.  Simple 
arithmetic  expressions  are  as  in  Stand¬ 
ard  syntax,  but  complex  expressions 
such  as: 

(SUM  (X  +  Y  /  5)  Z  XI) 
are  possible. 

A New Model

The naive reverse procedure in Prolog-II is:

reverse(nil, nil) -> ;
reverse(X.X-tail, Z) ->
    reverse(X-tail, Y)
    append(Y, X.nil, Z);

Clauses are still expressed as functor(arg[, arg ...]), but rules are
now clause -> clause[ clause];. The -> ; is required even for a fact.
Lists are expressed as a series of terms separated by '.' and ending
in nil. List head and tail are expressed as X.X-tail. Symbols
start  with  at  least  two  alphabetic  char¬ 
acters.  The  rest  of  the  symbol  can  con¬ 
tain  dashes  but  not  underscores.  Sever¬ 
al  extended  characters  in  the  Mac’s 
character  set  are  treated  as  alphabetic, 
so symbols such as Ægis are legal.
Variables  start  with  one  alphabetic 
character,  optionally  followed  by  any 
number  of  digits,  optionally  followed 
by  any  number  of  apostrophes,  op¬ 
tionally  followed  by  a  dash  and  any 
symbol.  This  allows  such  names  as  y, 
yf,  X",  and  X-the-first  but  disallows 
such  mnemonic  names  as  Premise 
and  Denial.  Arithmetic  expressions 
are  expressed  as  val(add(X,l),Xl). 
Comments are strings surrounded by
". Comments cannot span lines and
cannot  appear  within  the  rules  defin¬ 
ing  a  single  function  (for  example,  an 




end-of-line  comment  after  the  first 
rule  of  naive  reverse  would  be  illegal). 
Program  lines  indented  with  tab  char¬ 
acters  cause  syntax  errors,  such  as  "A 
SIMPLE  TERM  IS  EXPECTED,"  with  the 
cursor  positioned  near  the  tab  but 
nothing  highlighted. 

Prolog-II  terminology  can  confuse 
programmers  who  are  used  to  earlier 
versions  of  PROLOG.  Where  other  PRO¬ 
LOGS  refer  to  "rules”  and  "unifica¬ 
tion,”  Prolog-II  talks  about  "trees”  and 
"deletion.”  There  is  a  reason  for  the 
change  in  terminology.  Prolog-II  is 
based  on  a  new,  expanded,  theoreti¬ 
cal  model,  which  adds  the  concept  of 
inequalities  between  trees  to  the  earli¬ 
er  PROLOGS’  equalities.  This  feature, 
encapsulated  in  the  built-in  rule  dif, 
adds  a  great  deal  of  power  to  Prolog-II. 

Prolog-II's  other  new  features  in¬ 
clude  infinite  trees  and  error  han¬ 
dling.  Infinite  trees,  which  consist  of 
otherwise  finite  trees  with  loops, 
provide  direct  PROLOG  support  for 
the  directed  graph  structures  found 
in  such  applications  as  finite-state  au¬ 
tomata  and  grammars.  Error  han¬ 
dling  in  Prolog-II  is  based  on  the 
primitive  block,  which  provides  a 
simple  form  of  signal  handling.  All 
Prolog-II’s  standard  run-time  errors 
signal with block-exit so that a program
can handle the error. Previous
PROLOGS  either  exit  fatally  on  error 
or  treat  errors  as  failure;  neither  ap¬ 
proach  is  really  adequate  for  build¬ 
ing  production  programs. 
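
The difference between treating an error as failure and signalling it can be sketched, as a loose analogy only, with Python exceptions. Everything named here (BlockExit, block, safe_divide, the "divide-by-zero" label) is hypothetical and chosen for illustration; the sketch does not reproduce the exact semantics of Prolog-II's block and block-exit.

```python
class BlockExit(Exception):
    """Hypothetical stand-in for a block-exit signal carrying a label."""

    def __init__(self, label):
        super().__init__(label)
        self.label = label


def block(label, goal, handler):
    """Run goal(); if it signals BlockExit with this label, run handler."""
    try:
        return goal()
    except BlockExit as signal:
        if signal.label != label:
            raise  # not ours: leave it for an enclosing block
        return handler(signal.label)


def safe_divide(x, y):
    """Signal the error explicitly instead of quietly failing."""
    if y == 0:
        raise BlockExit("divide-by-zero")
    return x / y


result = block("divide-by-zero",
               lambda: safe_divide(1, 0),
               lambda label: "recovered from " + label)
print(result)
```

The point of the design is that the caller, not the run-time system, decides what an error means: it can recover, retry, or pass the signal outward.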

Prolog-II  is  missing  some  "inessen¬ 
tial”  features  of  earlier  PROLOGS.  Inte¬ 
gers  cannot  be  negative,  though  real 
numbers  can  be.  Several  layers  of  syn¬ 
tactic  sugar  have  been  removed.  The 
Prolog-II  list  syntax  resembles  the  ba¬ 
sic  dot  syntax  that  earlier  PROLOGS 
and  LISP  hide  with  a  more  readable 
list  notation.  The  omission  of  opera¬ 
tors  makes  many  programs  that  ma¬ 
nipulate  symbolic  structures  less 
readable.  The  variable  naming  rules 
and  the  restriction  on  placement  of 
comments  interact  to  require  a  coding 
style  in  which  initial  block  comments 
and  mnemonic  rule  names  are  the 
main  aids  to  creating  readable  pro¬ 
grams.  The  sample  programs  in  PRO¬ 
LOG  and  the  Prolog-II  manual  use 
sparse  one-line  header  comments  and 
one-  or  two-character  variable  names. 

Prolog-II  is  a  different  language 
from  the  other  PROLOGS.  The  syntax 
differs  from  Edinburgh  and  Standard 


in almost every respect: the terminology is different, and many of the
language features are different, new, or missing. Converting an
Edinburgh PROLOG program to Prolog-II is on the same order of
difficulty as converting a C program to Modula-2. Prolog-II is a new
language with powerful new features. It is theoretically more powerful
than older PROLOGS,

Table 2: Data types

Table 3: Input/output extensions

                      PROLOG/m   AAIS Prolog   MacPROLOG   Prolog-II
Multistream           no         yes           yes         no
Random                no         yes           yes         no
Backtrackable input   no         yes           no          no
Pretty printing       no         yes           no          no
Tree drawing          no         no            no          yes

PROLOG/m 

AAIS 

Prolog 

MacPROLOG 
Standard  Edinburgh 

Prolog-II 

Cut 

I 

! 

/ 

I 

/ 

Backtracking 

yes 

yes 

yes 

yes 

yes 

Negation  by  failure 

not 

not 

NOT 

not 

_ i 

Disjunction 

;  :  ;  ■ 

; 

OR 

— 

If 

-> 

-> 

IF 

-> 

default 

Else 

;  orl 

* 

2 

2 

Iteration  over 

repeat 

repeat 

— 

repeat 

— 

Solutions 

forall3 

_ 

FORALL 

_ 

_ 

Lists 

— 

foreach 

foreachlist 

MAP 

map 

— 

Numbers 

— 

for 

_ 

_ 

_ 

Collection 

bagof3 

bagof 

BAGOF 

bagof 

— 

setof3 

setof 

SETOF 

ISALL 

setof 

findall 

list-of 

Delayed  evaluation 

— 

— 

— 

— 

freeze 

— 

— 

TOHOLLOW 

TOGROUND 

dif 

Inequality 

'  - 

— 

; — 

— 

dif 

Signals 

— 

— 

4 

4 

block 

block-exit 

Dynamic  invocation 

call 

call 

apply 

call 

— 

Metavariables 

1 .  Example  in  the  primer— 

2.  Part  of  the  //predicate. 

3.  In  the  TOOLBOX. 

X 

not  normally  defined. 

X 

_X5 

X 

4.  Program-defined  error  handlers  for  each  error  number. 

5.  The  MacPROLOG  metavariable  facility  is  exceptionally  powerful. 

Table  4:  Control  features 


Table 2: Data types

                 PROLOG/m   AAIS Prolog   MacPROLOG   Prolog-II
Atoms            yes        yes           yes         yes
Clauses          yes        yes           yes         yes
Rules            yes        yes           yes         yes
Lists            yes        yes           yes         yes
Characters       no         yes           no          yes
Integers         yes        yes           yes         yes (1)
  # of bits      16         32            28 (2)      24
Floats           yes        yes           yes         yes
  # of bits      64         64            64          64 (3)
Strings          no         yes           no          yes
Arrays           no         no            no          yes
I/O streams      no         yes           no          no
Buffers          no         no            no          yes
Infinite trees   no         no            no          yes

1. Nonnegative integers only.
2. Documentation states range is -99,999,999 to +99,999,999.
3. Inferred from external procedure conventions.



though  I  feel  it  is  less  readable. 

Features

Tables 2-6, pages 37 and 38, summarize the language features of the
different PROLOGS. A few entries need more explanation.

Table 5: Side-effect features

                        PROLOG/m   AAIS Prolog   MacPROLOG         Prolog-II
Database modification   yes        yes           yes (1)           yes
Rule access             yes        yes           yes               yes
Keyed database          no         yes (2)       no                no
Assignment              no         set_global    remember          assign
Read variable           no         get_global    recall, default   val
Deletion                no         no            forget            no
Properties              no         no            yes               no

1. To a separate interpreted database.
2. Imperfect emulation of the Prolog-10 feature because of the
   different (hashed) internal structure of the AAIS database.

Table 6: Other features

                  PROLOG/m   AAIS Prolog   MacPROLOG   Prolog-II
Modules           no         packages      no (1)      worlds
Grammar           no         yes           no          no
Full char set     no         no            yes (2)     partial (3)
Mac interface     no         yes (4)       yes         a little
Graphics          no         yes (4)       no          yes
Other toolbox     no         yes (4)       no          no
Other language    no         yes (5)       no          yes (6)
Click icon        yes        yes           yes         yes
Saves system      no         no            yes         yes
Run-time system   no         no            yes (7)     no

1. The primer mentions a module system in other versions of
   micro-Prolog.
2. No choice of font; extended chars are not alphanumerics.
3. Special font (enhanced Monaco) provided.
4. Direct, complete support for the Mac Toolbox.
5. Like Toolbox support, but for arbitrary code resources.
6. Requires Lisa Workshop.
7. Special license required for redistribution.

Backtracking

Backtrackable input (Table 3) involves one of the traditional problems
in PROLOG design: how to reconcile the side-effect-free, backtracking
world of logic programming with the side-effect- and state-based
external world. The PROLOG language compromises by making I/O
operations nonbacktrackable. When foo(S) fails in:

try(S) :- read(S), foo(S).

PROLOG will not backtrack and retry read(S). Although this strategy is
usually fine, it does make some loops harder to write (what if you
wanted to keep reading clauses until you

Table 7: Benchmark results (minutes:seconds)

                 PROLOG/m   AAIS     MacPROLOG   MacPROLOG   MacPROLOG   Prolog-II
                            Prolog   interp      comp        opt
Reverse 30       11.8       0.7      2:13.1*     2.1         1.4         1.8
LIPS             42         709      -           236         354         276
Reverse failed   50         60       30          90          240         50
Map coloring     4:12.0     14.3     4:05.6      8.0         6.2         40.0
  With dif                                                               7.7
Graph connect    4.2        0.7      4.8         1.4         1.3
Sieve 100        26.2       3.3      2:48.4*     10.7        10.7        5.7

* Ran out of memory after listed time.


found  one  that  satisfied  foo(S)?).  AAIS 
Prolog  provides  an  additional  set  of 
input  predicates  that  retry  when 
backtracked  to. 
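
The loop that is awkward to write with nonbacktrackable input, reading terms until one satisfies a test, can be sketched in Python. This is an analogy rather than PROLOG: the helper names read_terms and try_until are hypothetical, and a generator stands in for an input predicate that produces the next term each time it is retried.

```python
import io


def read_terms(stream):
    """Yield one stripped, non-empty line at a time; asking for the
    next item plays the role of retrying the read."""
    for line in stream:
        term = line.strip()
        if term:
            yield term


def try_until(stream, foo):
    """Return the first term that satisfies foo, or None at end of input."""
    for term in read_terms(stream):
        if foo(term):
            return term
    return None


source = io.StringIO("apple\nbanana\ncherry\n")
print(try_until(source, lambda s: s.startswith("b")))
```

Each rejected term is consumed for good, which is the side effect that makes input hard to reconcile with backtracking in the first place.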

Control Structures

Table 4 has a lot of information crammed into it. Each row represents
a control structure. The entries in each row are the name(s) of the
control structure in that PROLOG; a dash means the control structure
isn't supported in that product, and an empty space means that this
row is a continuation of the preceding one because some product has
more than one distinct predicate for that control structure. Negation
by failure is the weak analogue of true negation used in PROLOG.
Dynamic invocation is the ability to execute a clause bound to a
variable. Metavariables are an extension of dynamic invocation: most
PROLOGS treat a variable found where a clause would be expected as
implicit dynamic invocation. MacPROLOG also allows metavariables in
other positions, such as a metavariable as all or part of the body of
a clause.

Side Effects

PROLOG is mainly a side-effect-free language; good PROLOG programs
make minimal use of side effects. But many programs can't be written
without some side effects (see Table 5). The first two items refer to
the earliest form of PROLOG side effects. PROLOG rules and facts are
stored together in a memory database. The earliest PROLOGS provided
special predicates to access and modify this database in order to
bootstrap up the PROLOG environment. Of course, programmers began to
use these predicates for other reasons such as including
self-modifying code. Although excessive use of the database predicates
is slow, hard to understand, and generally bad form, some use of them
has true advantages in both speed and power. Such powerful predicates
as setof would not be possible without database modification.

Books

Bratko, Ivan. Prolog Programming for Artificial Intelligence. Reading,
Mass.: Addison-Wesley, 1986. Paperback, 423 pages.

This is a new book that will be included as part of AAIS Prolog by the
time you read this review. My copy hasn't arrived yet, so I can't say
anything more about it.

Clark, K. L., and McCabe, F. G. micro-PROLOG. Englewood Cliffs, N.J.:
Prentice-Hall, 1984. Paperback, 401 pages.

Although there are several micro-PROLOG primers, this is the official
one. It is also the only one I know of that goes beyond elementary use
of the Simple syntax to teach the possibilities of the whole language.
The last section of the book, Applications of micro-PROLOG, consists
of individual articles by different authors, including a
critical-path-analysis program; two chapters on search, pruning, and
game playing; and, you guessed it, the obligatory expert system
example.

Clocksin, W. F., and Mellish, C. S. Programming in Prolog. 1st ed.
Berlin: Springer-Verlag, 1981. Paperback, 279 pages.

This is the classic PROLOG text. The first PROLOG textbook, it is one
of the main reasons for the "Edinburgh standard." This standard is
really the core PROLOG that Clocksin and Mellish derived from the
existing Edinburgh PROLOG systems for this text. Although it is of
great historic importance, later and better PROLOG texts are now
available.

Giannesini, Francis; Kanoui, Henry; Pasero, Robert; and van Caneghem,
Michel. PROLOG. Reading, Mass.: Addison-Wesley, 1986. Paperback, 260
pages.

This is the Prolog-II primer. It is clearly written and quite
readable. The last chapter includes a good deal of material on parsing
natural languages and compiling grammars as well as the obligatory
expert system example. If you're interested in Prolog-II, this is the
best introduction you're likely to get. If you buy ExperProlog-II, you
get it as part of the package.

Sterling, Leon, and Shapiro, Ehud. The Art of Prolog. Cambridge,
Mass.: MIT Press, 1986. Hardback, 427 pages.

This is the book I am using to learn PROLOG. It is an excellent first
text for experienced programmers who don't want to bother with a
primer. It is also an excellent classroom text. The book is divided
into four roughly equal parts: Logic Programming, The Prolog Language,
Advanced Prolog Programming Techniques, and Applications.

The first part, Logic Programming, is one of the book's greatest
strengths and weaknesses at the same time. Its goal is to teach the
ideas and techniques of logic programming before getting into the
quirky details and hackery of a specific language. The problem is that
it is full of logic programs that look exactly like the PROLOG
programs in the book. Many of the logic programs such as:

plus(0,X,X) <-
    natural_number(X).
plus(s(X),Y,s(Z)) <-
    plus(X,Y,Z).
natural_number(0).
natural_number(s(X)) <-
    natural_number(X).

cannot execute in PROLOG. Novices studying alone are likely to enter
these programs and become very confused when they don't work.

Advanced Prolog Programming Techniques develops examples such as
Eliza, a parser for grammar rules, an alpha-beta pruning search, a
PROLOG tracer, and the obligatory simple expert system shell.

The Applications section of the book consists of the chapters "Game
Playing Programs" (Mastermind, Nim, and Kalah), "A Credit Evaluation
Expert System," "An Equation Solver," and "A Compiler" (to an abstract
assembly language).

The second main flaw in this book is the number of typos, especially
in the programs. I suspect that this may be partly a result of the new
process used to produce this book. The authors and assistants produced
the book using TEX and sent the camera-ready copy to MIT Press, which
published it without further ado. I'm sure this new process is faster
and cheaper, and it certainly produced an attractive book, but I
wonder if the increased speed has dropped too many proofreading steps.

All in all, I highly recommend this book to anyone who is familiar
enough with programming in general to avoid the pitfalls and to PROLOG
beginners looking for a more advanced book to follow their primer.

Other  Features 

Grammar (Table 6) refers to definite
clause  grammar  rules.  This  special 
syntax  for  defining  a  useful  subset  of 
natural-language  grammars  is  de¬ 
scribed  in  Programming  in  Prolog 
and  The  Art  of  Prolog. 

None  of  the  PROLOGS  can  create 
true  stand-alone  applications.  Click 
icon  means  that  a  program  icon  can 
be  double-clicked  to  automatically 
start  the  PROLOG  with  that  program 
loaded.  All  the  PROLOGS  permit  the 
loaded  program  to  take  control  as  it 
starts  for  a  poor-man's  application. 

Vendors 

PROLOG/m 

Chalcedony  Software  Inc. 

5580  La  Jolla  Blvd.,  Ste.  126 
La  Jolla,  CA  92307 
(619)  483-8517 
$99.95,  PROLOG/m 
$29.95,  TOOLBOX 
$29.95,  TOYBOX 
$49.95,  NFL-X-pert 
Reader  Service  Number  27 

AAIS  Prolog 

Advanced  A. I.  Systems  Inc. 

P.O.  Box  39-0360 
Mountain  View,  CA  94039 
(415)  961-1121 
$150 

Reader  Service  Number  28 

MacPROLOG

Programming  Logic  Systems  Inc. 

31  Crescent  Dr. 

Milford,  CT  06460 

(203)  877-7988 

$295,  MacPROLOG 

$100,  optimizing  compiler 

Reader  Service  Number  29 

ExperProlog-II 

ExperTelligence  Inc. 

559  San  Ysidro  Rd. 

Santa  Barbara,  CA  93108 

(805)  969-7871 

$495 

Reader  Service  Number  30 


Prolog-II allows the entire state of the
PROLOG  to  be  saved  in  binary  form  so 
that  a  development  or  application 
can  be  restarted  quickly.  MacPROLOG 
provides  two  minimal  run-time  sys¬ 
tems,  but  the  run-time  system,  Mac¬ 
PROLOG  itself,  and  your  compiled 
code  must  all  be  on  the  disk  in  order 
to  run  the  program. 

The  Benchmarks 

The most common metric of PROLOG performance is LIPS, logical
inferences per second. This is no more useful than MIPS, but I've done
it anyway. The second row of Table 7, page 38, is a LIPS figure
calculated by dividing the 496 resolutions in a naive reverse of 30
elements (the most common PROLOG benchmark) by the time it took. The
third row of Table 7 is a measure of memory utilization. It is the
length of a list that caused a memory or stack-full error message from
naive reverse. List sizes of 30, 50, 60, 90, 120, 180, and 240 were
used in the test.

Map  coloring  is  a  simple  generate- 
and-test  recursive  loop.  This  sort  of 
thing  is  fairly  common  in  PROLOG  ap¬ 
plications.  Graph  connect  uses  a  lot  of 
database  hacking  with  asserta  and  re¬ 
tract.  Sieve  is,  of  course,  the  tradition¬ 
al  Byte  sieve  benchmark.  I  included  it 
to  give  some  idea  of  the  arithmetic 
performance  of  the  PROLOGS. 

AAIS Prolog comes up with about twice the LIPS of the MacPROLOG
optimizing compiler here, but the ratings are reversed with a more
complicated benchmark. Both the MacPROLOG compilers handily won the
map coloring test. The compilers, especially the optimizing compiler,
had much better memory utilization than any of the interpreters. The
number of resolutions in a naive reverse of length N is the triangular
number of N+1, that is, (N+1)(N+2)/2. Reverse of 30 takes 496
resolutions, 60 takes 1,891, 90 takes 4,186, and 240 takes 29,161
resolutions. The advantage of dif is also clear, improving Prolog-II's
performance on map coloring by a factor of 5.
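
The resolution counts and LIPS figures can be checked with a short Python sketch. It is an illustration under assumptions, not the benchmark itself: the function names are invented here, and each call to the Python reverse or append is counted as one resolution.

```python
def resolutions(n):
    """Closed form: the triangular number of n + 1."""
    return (n + 1) * (n + 2) // 2


def count_resolutions(n):
    """Count rule applications made by naive reverse on n elements."""
    calls = 0

    def append(xs, ys):
        nonlocal calls
        calls += 1
        if not xs:
            return ys
        return [xs[0]] + append(xs[1:], ys)

    def reverse(xs):
        nonlocal calls
        calls += 1
        if not xs:
            return []
        return append(reverse(xs[1:]), [xs[0]])

    reverse(list(range(n)))
    return calls


def lips(n, seconds):
    """Logical inferences per second for a naive reverse of n elements."""
    return resolutions(n) / seconds


for n in (30, 60, 90, 240):
    print(n, resolutions(n), count_resolutions(n))
print(round(lips(30, 0.7)))
```

The count breaks down as N+1 calls to reverse plus N(N+1)/2 calls to append, which sum to (N+1)(N+2)/2. Dividing 496 by the Reverse 30 times in Table 7 reproduces the LIPS row, for example 496 / 0.7 for AAIS Prolog.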

Conclusions 

PROLOG  takes  a  lot  of  memory.  All 
these  products  are  really  cramped  on 
a  512K  Macintosh. 

These  are  four  very  different  prod¬ 
ucts.  For  learning  PROLOG  and  play¬ 
ing  with  it  at  home,  I’d  recommend 


AAIS  Prolog.  It’s  fast;  has  many  fea¬ 
tures,  including  unlimited  potential 
use  of  the  Mac  environment;  and  the 
price  is  reasonable.  For  serious  re¬ 
search,  development,  or  prototyping 
in  which  graphics  are  not  an  issue,  I'd 
definitely  consider  MacPROLOG  as 
well.  Its  memory-efficient  compilers 
and  easy  use  of  the  Mac  user  interface 
make  it  well  worth  considering.  Pro- 
log-II  has  a  powerful  and  interesting 
base  language,  but  the  price  is  high 
and  both  the  user  interface  and  lan¬ 
guage  are  less  convenient  than  the 
other  PROLOGS.  Buy  it  if  you  like  the 
Prolog-II  language  model  or  need  its 
features;  otherwise,  wait  until  its  ven¬ 
dor  lowers  the  price.  The  only  PRO¬ 
LOG  I  cannot  recommend  at  all  is  PRO¬ 
LOG/m.  It's  not  bad  by  itself,  but  it’s 
just  not  competitive.  For  only  a  little 
more  money  you  can  get  AAIS  Prolog. 

If  you  are  determined  to  ship  a  Mac¬ 
intosh  product  in  PROLOG,  you  need 
two  features:  the  ability  to  supply 
your  own  user  interface  and  some 
sort  of  run-time  system.  Only  Mac¬ 
PROLOG  comes  close  to  meeting  these 
requirements.  It  meets  the  first  re¬ 
quirement  easily,  but  I'm  not  con¬ 
vinced  by  a  "run-time  system”  that  re¬ 
quires  that  both  the  language 
interpreter  and  the  run-time  system 
file  go  along  with  the  application  on 
every  disk. 

Notes 

1. J. A. Robinson, "A Machine-Oriented Logic Based on the Resolution
Principle," Journal of the ACM 12 (January 1965): 23-41.

2. C. E. Hewitt, "PLANNER: A Language for Proving Theorems in Robots,"
Proceedings of IJCAI-69 (Washington, D.C.: International Joint
Conference on Artificial Intelligence, May 1969): 10.

3. C. E. Hewitt, "Concurrency in Intelligent Systems," AI Expert (San
Francisco: CL Publications, 1986): 44-50.

4. E. Shapiro, Concurrent Prolog (Cambridge, Mass.: MIT Press, 1987).

5. K. Kahn, E. D. Tribble, M. Miller, and D. Bobrow, "Objects in
Concurrent Logic Programming Languages," OOPSLA'86 Conference
Proceedings: 242-257.



ARTICLES 


MYCIN-Like Expert
Systems

An approach to setting up an expert system with BASIC

by Richard W. Grigonis

Richard W. Grigonis, 49 Haring St., Bergenfield, NJ 07621. Richard is
a freelance programming consultant.

For Amiga owners interested in artificial intelligence but lacking
Amiga LISP, Metacomco's Cambridge LISP 68000, XLISP, or Micro Forge
PROLOG, here is an interesting approach to setting up a quick expert
system using the Amiga BASIC package that you already own. By now
you've probably noticed that this is the best Microsoft BASIC
interpreter ever written (so good, in fact, that it almost isn't
BASIC), and further improvements are on the way.

Rule-Based Systems

As all readers of DDJ's special AI issues are aware, most expert
systems are essentially a collection of production rules and are
therefore known as rule-based systems (RBSs). The rules have a
left-hand side (the antecedent, or propositions combined in logical
AND/OR form that comprise a situation) and a right-hand side (the
consequent, goal, or action to be taken).

A simple rule is as follows:

IF Left-hand side
THEN Right-hand side

IF X has hair
THEN X is a mammal.

Notice how the rules of an RBS are simple modules of logic, taking the
form of IF...THEN statements. The THEN sections of some rules match
the IF sections of others, forming an inference network that can be
drawn as an AND/OR tree. The rules themselves are quite useless unless
the program also possesses an inference engine that searches through
the rules in one of two ways. If the situation on the left side is
examined first and the right side is taken as the action, then the
system can be said to be a bottom-up, forward-chaining one. If,
however, the right side is examined first and taken as a goal or
hypothesis to be proved by demonstrating the truth of the propositions
on the left side, then such a system is a top-down or
backward-chaining one.
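
The two search directions can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the Amiga BASIC program from the listing: the function names are invented here, the first two rules are the hair and milk rules from this article, and the third (carnivore) rule is added only so that the THEN part of one rule feeds the IF part of another.

```python
# Each rule is (set of antecedents, consequent).
RULES = [
    ({"has hair"}, "is mammal"),
    ({"gives milk"}, "is mammal"),
    ({"is mammal", "eats meat"}, "is carnivore"),  # illustrative extra rule
]


def forward_chain(facts):
    """Bottom-up: fire every rule whose left side holds until nothing new appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in RULES:
            if antecedents <= known and consequent not in known:
                known.add(consequent)
                changed = True
    return known


def backward_chain(goal, facts, pending=frozenset()):
    """Top-down: prove the goal by proving the left side of a rule that concludes it."""
    if goal in facts:
        return True
    if goal in pending:  # already trying to prove this goal: avoid looping
        return False
    pending = pending | {goal}
    return any(
        all(backward_chain(a, facts, pending) for a in antecedents)
        for antecedents, consequent in RULES
        if consequent == goal
    )


print(sorted(forward_chain({"has hair", "eats meat"})))
print(backward_chain("is carnivore", {"gives milk", "eats meat"}))
```

Forward chaining derives everything the facts support, while backward chaining examines only the rules that bear on the hypothesis, which is why it suits diagnosis-style systems that start from a suspected conclusion.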

MYCIN for the Masses

Perhaps  the  most  famous  backward 
chaining,  depth-first  search  expert 
system  is  MYCIN,  which  was  also  one 
of  the  first  major  expert  systems  and 
is  still  used  as  a  sort  of  benchmark  in 
comparing  expert  systems.  MYCIN 
has  450  rules  that  are  used  to  diag¬ 
nose  and  suggest  antibiotic  treatment 
for  100  blood  and  meningitis  infec¬ 
tions.  The  program  listing  accompa¬ 
nying  this  article  (Listing  One,  page 
74)  is  a  simple  MYCIN-like  program 
written  as  a  demonstration  of  such  a 
system  in  Amiga  BASIC.  It  took  just  a 
few  hours  to  port  the  original  version 
to  the  Amiga  from  another  one  of  Jay 
Miner's  hardware  creations,  my  old 
Atari  800.  In  fact,  a  few  line  numbers 
from  the  original  still  exist  in  the 
Amiga  BASIC  code,  serving  as  labels. 

Instead  of  diagnosing  diseases  on 
the  basis  of  symptoms,  the  program 
in  the  listing  identifies  animals  on  the 
basis  of  physical  attributes  and  ob¬ 
served  behaviors.  It  is  a  toy  system 
designed  to  demonstrate  the  MYCIN 
reasoning mechanism in Amiga BASIC. You can find the knowledge base of
rules it uses on pages 242-243 of Winston and Horn's textbook LISP [1].

The  problem  with  writing  expert 
systems  in  BASIC  as  opposed  to  PRO¬ 
LOG  is  that  the  inference  engine  nec¬ 
essary  to  parse  the  knowledge  base  of 
rules  and  to  drive  a  path  through 
them  must  be  written  by  the 
programmer. 

You  can  store  production  rules  as 
DATA  statements  as  follows: 

DATA Rule 1, IF, has hair, THEN, is mammal
DATA Rule 2, IF, gives milk, THEN, is mammal

Unfortunately, as most BASICs are
not  recursive,  programming  a  gener¬ 
al-purpose  top-down  parser  to  act 
upon  these  DATA  statements  requires 
considerable  effort.  You  must  estab¬ 
lish  push-down  stacks,  queues,  stack 
conventions,  and  so  on  to  store  local 
variables  for  the  reentrant  subrou¬ 
tines  of  the  inference  mechanism.  An¬ 
other  problem  with  a  general-pur¬ 
pose  parser  of  this  type  is  that  the 
processing  overhead  needed  to 
search  through  the  knowledge  base 
of  production  rules  and  keep  track  of 
what  is  going  on  results  in  a  slow  pro¬ 
gram,  although  it  is  versatile.  Code  for 
a  BASIC  expert  system  using  DATA 
statements  such  as  the  preceding 
ones,  along  with  a  stack,  backward- 
chaining,  and  the  Winston  and  Horn 
rules  (but  without  the  MYCIN  certainty 
factor  inference  routines)  can  be 
found in the September 1981 issue of Byte [2].
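The bookkeeping burden described above can be sketched as follows: rules held as data strings in the style of the DATA statements, and a backward chainer that manages its own stack instead of relying on recursion. This is a Python sketch under stated assumptions, not the Byte program and not the code in Listing One; "Rule 3", the frame layout, and the names parse_rule and prove are hypothetical.

```python
RULE_DATA = [
    "Rule 1, IF, has hair, THEN, is mammal",
    "Rule 2, IF, gives milk, THEN, is mammal",
    "Rule 3, IF, is mammal, AND, eats meat, THEN, is carnivore",  # hypothetical
]

def parse_rule(line):
    """Split one DATA-style line into (premises, conclusion)."""
    fields = [f.strip() for f in line.split(",")]
    then_at = fields.index("THEN")
    premises = [f for f in fields[2:then_at] if f != "AND"]
    return premises, fields[then_at + 1]

RULES = [parse_rule(line) for line in RULE_DATA]

def prove(goal, facts):
    """Backward chaining with an explicit stack of [goal, rule index, premise index]."""
    stack = [[goal, 0, 0]]
    result = None                      # outcome of the sub-goal that just finished
    while stack:
        frame = stack[-1]
        if result is not None:
            if result:
                frame[2] += 1          # premise proved: move to the next premise
            else:
                frame[1] += 1          # premise failed: abandon this rule
                frame[2] = 0
            result = None
            continue
        if frame[0] in facts:
            stack.pop()
            result = True
            continue
        # Skip forward to the next rule that concludes this goal.
        while frame[1] < len(RULES) and RULES[frame[1]][1] != frame[0]:
            frame[1] += 1
        if frame[1] == len(RULES):     # no rule left: the goal fails
            stack.pop()
            result = False
            continue
        premises = RULES[frame[1]][0]
        if frame[2] == len(premises):  # every premise proved: the goal succeeds
            stack.pop()
            result = True
            continue
        stack.append([premises[frame[2]], 0, 0])
    return result
```

Even in this small form, most of the code is stack management rather than knowledge, which is the overhead the subroutine-per-hypothesis approach is meant to avoid.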

Elegant  BASIC? 

The (supposedly) quick-and-dirty approach I have elected to use for
developing an Amiga BASIC expert system requires a bit more code for
each rule, but it forces the BASIC interpreter (via its own stack) to
handle the bookkeeping required for searching through the rules in a
top-down, backward-chaining, depth-first manner. A single stack is
employed, but only to keep track of what routines have been activated
so that the system can explain its reasoning to users when they key in
"why" or "why?" in response to a question from the system. Otherwise,
the expert system is herein presented in a form similar to, though not
exactly the same as, a syntax-directed recognizer, such as those used
in computer language interpreters and compilers.

Dr. Dobb's Journal, April 1987

A  BASIC  interpreter,  for  example, 
can  be  constructed  from  these  pro¬ 
duction  rules: 

<statement> ::= LET <identifier> = <expression>
<statement> ::= NEXT <identifier>
<statement> ::= INPUT <identifier list>
<statement> ::= GOTO <line number>
<statement> ::= FOR <identifier> = <expression> TO <expression>
<statement> ::= GOSUB <line number>

Systems  programmers  writing  a  BA¬ 
SIC  interpreter  based  upon  these 
rules  must  now  write  a  syntax-direct¬ 
ed  recognizer  in  C  or  assembly  lan¬ 
guage  that  can  parse  all  the  kinds  of 
BASIC  language  statements  shown 
above.  They  do  this  by  writing  sepa¬ 
rate  recognizer  procedures  for  each 
nonterminal  in  the  language,  with 
one procedure calling others as dictated by the production rules. For
the preceding rules pertaining to BASIC statements, John Zarrella [3]
suggests one possible syntax recognition procedure (see Example 1).
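The idea of one recognizer procedure per nonterminal, dispatching on the first lexeme, can be tried out in a few lines of Python. This is a sketch under simplifying assumptions, not Zarrella's procedure: an expression is taken to be a single number or identifier, commas are skipped by the tokenizer, and the class and function names are hypothetical.

```python
import re

KEYWORDS = {"LET", "NEXT", "INPUT", "GOTO", "FOR", "TO", "GOSUB"}

def tokenize(text):
    # Words (keywords or identifiers), integers, and "="; commas and spaces are skipped.
    return re.findall(r"[A-Za-z][A-Za-z0-9]*|\d+|=", text)

class Recognizer:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def get_lexeme(self):
        if self.pos >= len(self.tokens):
            raise SyntaxError("unexpected end of statement")
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expect(self, literal):
        if self.get_lexeme().upper() != literal:
            raise SyntaxError("expected " + literal)

    def identifier(self):
        token = self.get_lexeme()
        if not token[0].isalpha() or token.upper() in KEYWORDS:
            raise SyntaxError("expected identifier")

    def line_number(self):
        if not self.get_lexeme().isdigit():
            raise SyntaxError("expected line number")

    def expression(self):
        # Simplifying assumption: one number or one identifier.
        token = self.get_lexeme()
        if not token.isdigit() and (not token[0].isalpha() or token.upper() in KEYWORDS):
            raise SyntaxError("expected expression")

    def identifier_list(self):
        self.identifier()
        while self.pos < len(self.tokens):
            self.identifier()

    def statement(self):
        lexeme = self.get_lexeme().upper()
        if lexeme == "LET":
            self.identifier()
            self.expect("=")
            self.expression()
        elif lexeme == "NEXT":
            self.identifier()
        elif lexeme == "INPUT":
            self.identifier_list()
        elif lexeme in ("GOTO", "GOSUB"):
            self.line_number()
        elif lexeme == "FOR":
            self.identifier()
            self.expect("=")
            self.expression()
            self.expect("TO")
            self.expression()
        else:
            raise SyntaxError("unknown statement")
        if self.pos != len(self.tokens):
            raise SyntaxError("trailing input")

def accepts(text):
    """Return True if text matches one of the six statement productions."""
    try:
        Recognizer(text).statement()
        return True
    except SyntaxError:
        return False
```

For instance, accepts("FOR I = 1 TO 10") succeeds while accepts("GOTO X") fails, because the GOTO production demands a line number.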

Because  an  expert  system  is  also 
based  upon  production  rules,  pro¬ 
grammers  can  examine  the  compo¬ 
nents  of  each  rule  and  write  recog¬ 
nizer  procedures  or  subroutines  in  a 
high-level  language  in  the  same  man¬ 
ner  as  with  the  BASIC  interpreter  dis¬ 
cussed  previously. 

In constructing such a program, you must write a subroutine for each
nonterminal and terminal in the language or, as in this case, each
hypothetical fact, assertion, or combined assertion to be proven. The
"words" analyzed by this expert system syntax-directed recognizer are
the numeric values supplied by users in response to questions asked by
the program. In other words, each hypothesis (albatross, penguin,
ostrich, zebra, and so on) is a subroutine "proved" by calling other
subhypotheses (bird, mammal, and so on), also in the form of
subroutines, that in turn call still other hypothesis/subroutines (has
feathers, lays eggs, and so on). If you could call such procedures
recursively (which is not required in this kind of system), then you
would have a recursive descent parser, not exactly what you would call
quick-and-dirty code.
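The shape of such a program, with one routine per hypothesis and the rules living only in the pattern of calls, can be sketched in Python. The rules below are a simplified illustration rather than the knowledge base of the listing, certainty factors are left out, and the answers dictionary stands in for questions put to the user; all names are hypothetical.

```python
# Terminals: each simply looks up the user's (pre-recorded) answer.
def prove_has_feathers(answers):
    return answers.get("has feathers", False)

def prove_lays_eggs(answers):
    return answers.get("lays eggs", False)

def prove_flies(answers):
    return answers.get("flies", False)

def prove_flies_well(answers):
    return answers.get("flies well", False)

def prove_cannot_fly(answers):
    return answers.get("cannot fly", False)

def prove_swims(answers):
    return answers.get("swims", False)

# Nonterminals: each hypothesis is proved by calling its sub-hypotheses.
def prove_bird(answers):
    # bird IF has feathers OR (flies AND lays eggs)
    return prove_has_feathers(answers) or (
        prove_flies(answers) and prove_lays_eggs(answers))

def prove_penguin(answers):
    # penguin IF bird AND cannot fly AND swims
    return (prove_bird(answers) and prove_cannot_fly(answers)
            and prove_swims(answers))

def prove_albatross(answers):
    # albatross IF bird AND flies well
    return prove_bird(answers) and prove_flies_well(answers)
```

Note that no rule table appears anywhere: the call graph is the AND/OR tree. Python's and/or short-circuit, whereas a chain of GOSUBs visits every branch, so this sketch asks fewer questions than a literal translation would.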

Such  syntax-directed  recognizers 
are  not  general  purpose — meaning 
that  you  cannot  simply  plug  in  a  new 
set  of  rules  describing  expertise  in 
some  other  domain  of  knowledge — 
but  they  are  faster  than  general-pur¬ 
pose  inference  engines  and  they  are 
more  in  keeping  with  the  procedural 
and  modular  knowledge  representa¬ 
tion  philosophy  suggested  by  pro¬ 
duction  rule  systems.  Indeed,  you 
can  now  insert  additional,  special 
subroutines  or  functions  of  arbitrary 
complexity  here  and  there  in  the 
program as needed. Such systems allow the BASIC language to do what it
does best: define the heuristic flow of control (procedures) of the
expert system. It does this so well that you can no longer refer to the
production rules explicitly because they are now implied in the pattern
of nested subroutines residing in the code. You must therefore now
speak not just of rules and combined assertions but of "facts" and
"combined facts" in numeric arrays worked upon by the subroutines, all
of which can be imagined as residing on the nodes and arcs of an
imaginary inference net or AND/OR tree.

Another  strength  of  this  kind  of 
system  is  that  error  messages  and  ex¬ 
planations  of  the  top-down  reason¬ 
ing  process  are  easier  to  program  as 
the  system  is  designed  specifically  for 
the  particular  knowledge  base  used. 

Like  MYCIN,  the  system  presented 
here  can  accept  information  volun¬ 
teered  by  users  at  any  level  of  the  rea¬ 
soning  process.  If  users  are  not  abso¬ 
lutely  sure  of  the  animal  or 
classification  they  are  thinking  of, 
the  program  ignores  their  input  and 
digs deeper into the AND/OR tree. Provision has been made in the
program for programmers to change this threshold easily (see the IF
statement two lines above the Test.for.a.positive.number: subroutine).

AND  Clauses 

The  system  can  handle  negative  in¬ 
ferences  and  degrees  of  certainty  in  a 
user's  answers  through  a  mathemati¬ 
cal  process  essentially  the  same  as 
that  used  by  MYCIN. 


procedure STATEMENT;
    local LEXEME;
    LEXEME = GETLEXEME;
    select LEXEME of
        "LET":   begin
                     call IDENTIFIER;
                     if GETLEXEME /= "=" then call ERROR;
                     call EXPRESSION;
                 end;
        "NEXT":  call IDENTIFIER;
        "INPUT": call IDENTIFIERLIST;
        "GOTO":  call LINENUMBER;
        "FOR":   begin
                     call IDENTIFIER;
                     if GETLEXEME /= "=" then call ERROR;
                     call EXPRESSION;
                     if GETLEXEME /= "TO" then call ERROR;
                     call EXPRESSION;
                 end;
        "GOSUB": call LINENUMBER;
    end;

Example 1: John Zarrella's syntax recognizer




EXPERT  SYSTEMS 



As  an  example,  let's  take  a  look  at 
one  of  the  rules  in  the  system's  AND/ 
OR  tree: 

IF  animal  is  an  UNGULATE 
AND  animal  HAS  BLACK  STRIPES 
THEN  animal  is  a  ZEBRA 
(attenuation factor = 0.8)

Let's say that the user's certainty on the UNGULATE branch of the ANDed
relation is 0.7 (1.0 being absolute certainty) and the certainty on the
HAS BLACK STRIPES branch is 0.8.

In normal probability theory, you would multiply the individual
fractional probabilities, yielding 0.56. MYCIN, however, does not use
standard probability theory. Conventional probability theory was
rejected because it was felt that the AND clauses in classification
systems violate the two foundations of standard probability theory
(particularly Bayes' rule): statistical independence and prior
probabilities (or priors).

Standard statistical probability assumes that the components in the
ANDed relations are independent of
each  other  and  that  an  examination  of 
a  sufficiently  large  number  of  exam¬ 
ples  of  a  rule  allows  you  to  construct  a 
statistical,  frequency  model  of  the 
rule  so  that  you  can  give  an  a  priori 
probability  of  a  hypothesis  being  true 
in  the  absence  of  any  evidence.  The 
developers  of  MYCIN  rejected  and/or 
modified  these  ideas  because  the 
symptoms  of  a  disease  (or  the  physical 
attributes  of  an  animal,  for  that  mat¬ 
ter)  are  not  independent  but  usually 
occur  in  groups.  Also,  it  is  difficult  to 
obtain  data  on  and  analyze  thousands 
of  cases  to  determine  frequencies. 

Instead,  MYCIN's  developers  creat¬ 
ed  their  own  technique  for  dealing 
with  uncertainty,  based  on  confirma¬ 
tion  theory  (logical  probability)  and 
the  use  of  certainty  factors  and  atten¬ 
uation  factors.  You  must  therefore 
distinguish  certainty  (the  degree  of 
confidence  you  have  in  a  fact  or  rule) 
from  ordinary  probability. 

A certainty factor (CF) is a number between -1 and 1 given to a fact or
relation to indicate the confidence a user has in providing data
concerning a fact or relation to the expert system. In this sense,
certainty factors are really confidence factors, not probability
coefficients. By the end of a user's session with an expert system, the
program itself has combined and computed new certainty factors. In
MYCIN, if the computed truth of a fact exceeds 0.8, then the fact is
judged to be proven and the certainty factor is set to 1.0. Also, if
the certainty factor falls into the range -0.2 to 0.2, then the
certainty factor is set to 0 (unknown), and a certainty factor in the
range -0.8 to -1.0 is converted to -1.0 (definitely false).
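The three thresholds just described amount to a small "snapping" function. The Python sketch below follows the ranges as stated in the text; the function name snap_cf and the treatment of the exact boundary values (0.8 itself is left alone, while -0.8 and plus or minus 0.2 are snapped) are assumptions, since the text does not say how the endpoints are handled.

```python
def snap_cf(cf):
    """Snap a computed certainty factor to 1, 0, or -1 when it crosses a threshold."""
    if cf > 0.8:
        return 1.0       # judged proven
    if cf <= -0.8:
        return -1.0      # definitely false
    if -0.2 <= cf <= 0.2:
        return 0.0       # unknown
    return cf            # otherwise keep the computed value
```

So a computed 0.94 is treated as certainty, 0.1 as unknown, and 0.56 is passed along unchanged.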




An  attenuation  factor  is  a  number 
between  0  and  1  that  is  multiplied  by 
a  certainty  factor,  yielding  a  new  cer¬ 
tainty  factor.  It  is  an  indicator  of  a 
rule's  inherent  reliability,  or  at  least 
the  confidence  a  human  expert  had 
in  the  efficacy  of  the  rule  when  the 
system  was  being  developed.  Just  as  a 
certainty  factor  starts  out  as  really  a 
confidence  factor  on  the  part  of  the 
user,  an  attenuation  factor  is  likewise 
a  confidence  factor  on  the  part  of  the 
human  expert  from  whom  the  rules 
were  derived. 

In the AND clause shown earlier, the certainty of one condition AND
another is taken as the minimum of their certainties, so the program
finds the lowest certainty factor on the branches of the AND clause
(the certainty factors of UNGULATE and HAS BLACK STRIPES) and
multiplies it by the attenuation factor of 0.8. If the certainty passed
up the tree by the Prove.ungulate: subroutine is 0.7 and the certainty
factor given by the Prove.black.stripes: subroutine is 0.8, the
Prove.zebra: subroutine will select the lower figure of 0.7 and
multiply it by the zebra AND clause attenuation of 0.8, yielding a new
output certainty factor of 0.56. The figure, coincidentally, matches
that given by standard probability. In any event, this result is passed
up to the user as the final certainty factor for the animal being a
zebra.
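The AND-clause arithmetic is short enough to check directly. The following Python sketch reproduces the zebra calculation; the function name and_clause_cf is hypothetical and is not the Compute.AND.clause.cf routine of the listing.

```python
def and_clause_cf(component_cfs, attenuation=1.0):
    """Certainty of an AND clause: the weakest branch times the rule's attenuation."""
    return min(component_cfs) * attenuation

# UNGULATE = 0.7, HAS BLACK STRIPES = 0.8, zebra rule attenuation = 0.8
zebra_cf = and_clause_cf([0.7, 0.8], attenuation=0.8)
```

Raising the stronger branch from 0.8 to 1.0 would leave the result unchanged, because only the minimum matters.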

OR  Clauses 

But  what  would  have  been  the  case  if 
the  rule  had  been  an  OR  clause  in¬ 
stead  of  an  AND  clause?  That  is,  what 
if  the  rule  had  looked  like  this: 

IF  animal  is  an  UNGULATE 
OR  animal  HAS  BLACK  STRIPES 
THEN  animal  is  a  ZEBRA 

The designers of MYCIN thought that the certainty factors on the
branches of an OR node should reinforce one another. Remember that the
certainty factor on the UNGULATE branch is currently 0.7 and the
certainty factor on the HAS BLACK STRIPES branch is 0.8. The CF on the
UNGULATE branch takes you 70 percent toward proving that the animal is
a zebra (0.7). That still leaves 30 percent (0.3) to go in order to
achieve absolute certainty. As it happens, the CF on the HAS BLACK
STRIPES branch (0.8) "carries" the total certainty an additional 80
percent over the remaining distance of 0.3. Because 80 percent of 0.3
is 0.24, the certainty factor of the animal being a zebra is now
0.7 + 0.24, or 0.94. Had a third branch existed with a certainty factor
of, say, 0.5, then the total certainty would have been carried another
50 percent over the remaining distance of 0.06, yielding a new total
certainty factor of 0.97.

One  way  of  expressing  this  proce¬ 
dure  mathematically  is  as  follows: 

New  CF  =  CF1  +  CF2(1  -  CF1) 

A  more  confusing  (though  equiva¬ 
lent)  expression  is  this  one: 

New CF = CF1 + CF2 - (CF1 × CF2)

In  order  to  handle  negative  numbers, 
however,  you  would  have  to  convert 
the  above  equation  into  the 
following: 

New CF = CF1 + CF2 + (CF1 × CF2)

Also,  things  get  awkward  when 
you  are  using  three  certainty  factors: 

New CF = CF1 + CF2(1 - CF1) + CF3(1 - [CF1 + CF2(1 - CF1)])

Fortunately,  this  equation  can  be  sim¬ 
plified  to  this  one: 

New CF = 1 - (1 - CF1)(1 - CF2)(1 - CF3)

My system uses this equation if at least one certainty factor on a
branch is positive. If all the certainty factors are negative or 0, the
system uses this equation:

New CF = -1 + (1 + CF1)(1 + CF2)(1 + CF3)
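Both product forms generalize to any number of branches. The Python sketch below applies the selection rule stated above (the first equation if any branch is positive, the second otherwise) and checks it against the worked zebra figures. The name or_clause_cf is hypothetical, no branch attenuations are applied here, and math.prod requires Python 3.8 or later.

```python
import math

def or_clause_cf(cfs):
    """Combine the certainty factors on the branches of an OR node."""
    if any(cf > 0 for cf in cfs):
        # New CF = 1 - (1 - CF1)(1 - CF2)...(1 - CFn)
        return 1 - math.prod(1 - cf for cf in cfs)
    # All branches negative or zero:
    # New CF = -1 + (1 + CF1)(1 + CF2)...(1 + CFn)
    return -1 + math.prod(1 + cf for cf in cfs)

two_branches = or_clause_cf([0.7, 0.8])          # 0.7 + 0.8 * 0.3
three_branches = or_clause_cf([0.7, 0.8, 0.5])   # a further 50% of the remaining 0.06
```

The product form gives the same answer as applying New CF = CF1 + CF2(1 - CF1) one branch at a time, and the result does not depend on the order of the branches.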

At  this  point  you  should  distinguish 
the  use  of  negative  numbers  in  stat¬ 
ing  the  confidence  in  a  hypothesis 
from  situations  where  a  lack  of  evi¬ 
dence  is  necessary  to  prove  a  hypoth¬ 
esis,  in  which  case  you  must  add  a 
rule  to  the  knowledge  base  testing  for 
the  lack  of  certain  information.  Here 
is  one  such  rule  from  MYCIN  itself: 

IF identity of organism is not known
AND gram stain of organism is not known
AND morphology of organism is not known
AND site of culture is csf
AND infection is meningitis
AND age of patient is less than or equal to 17
THEN (.3) category of organism is enterobacteriaceae.

Some researchers have pointed out some deficiencies with the MYCIN
approach to reasoning with certainty factors [4]. I find that all forms
of the OR clause equations increase the certainty factor values too
much, so I place attenuations on all possible branches to bring the
results closer to what you would expect from standard probability
theory. Researchers now use formulas closer to standard probability
theory in the new systems, but I will not examine them here.

Expanding  the  System 

The  system  is  currently  designed  to 
identify  seven  animals,  and  the  HY¬ 
POTHESIS  numeric  array  has  been  di¬ 
mensioned  to  accept  up  to  20  ani¬ 
mals.  I'll  now  demonstrate  how  you 
add  a  new  animal  to  the  system. 

Let's  say  you  want  the  program  to 
be  able  to  recognize  an  emu,  which  is 
a  large  flightless  bird  like  an  ostrich 
but  with  dark  feathers  and  large  red 
eyes.  The  rule  you  wish  to  express  is 
as  follows: 

IF  the  animal  is  a  BIRD 
AND  the  animal  CANNOT  FLY 
AND  the  animal  HAS  DARK  FEATHERS 
AND  the  animal  HAS  BIG  RED  EYES 
THEN  the  animal  is  an  EMU 

This AND clause looks as if it's going to be pretty reliable, so let's
give it an attenuation of 1.

Remember  that,  as  the  program  is 
backward  chaining,  it  looks  at  this 
rule  in  reverse,  taking  the  emu  iden¬ 
tity  as  a  hypothesis  to  be  proved  by 
calling  other  subroutines  that  in  turn 
attempt  to  prove  that  the  animal  is  a 
bird,  cannot  fly,  and  so  on. 

Let's deal with the emu subroutine itself first. Start by scrolling
down to the bottom of the DATA statements, where you see the following
lines:

DATA 35, has pointed teeth
pointed.teeth = 35
DATA -1, END OF DATA

Now  add  the  new  fact  and  user  re¬ 
quest  string  for  emu  (fact  #36)  above 
the  last  line.  The  result  should  look 
like  this: 

DATA 35, has pointed teeth
pointed.teeth = 35
DATA 36, is an emu
emu = 36
DATA -1, END OF DATA

As emu is one of the top-level hypotheses (animals to be identified),
there are two special areas in the program to change. Below the line
REM TOP-LEVEL HYPOTHESES (ROOTS) OF AND/OR TREE:, find these lines:

HYPOTHESIS(7) = cheetah
number.of.hypotheses = 7

and  change  them  to  look  like  these: 

HYPOTHESIS(7) = cheetah
HYPOTHESIS(8) = emu
number.of.hypotheses = 8

The  other  area  of  the  program  to 
change  as  a  result  of  emu  being  a  top- 
level  hypothesis  is  the  executive  call¬ 
ing  routine  a  little  farther  down. 
Make  an  addition  just  above  line 
10165  that  reads  as  follows: 

GOSUB Prove.emu
IF halt.on.success = 2 AND OUTPUT.CF(emu) = 1 THEN 10165

You then add the subroutine shown in Example 2, below, to prove the
animal is an emu (with an attenuation of 1) to the very bottom of the
program.

If  you  had  been  trying  to  represent 
an  OR  clause  instead  of  an  AND 
clause: 

IF  the  animal  is  a  BIRD 
OR  the  animal  CANNOT  FLY 
OR  the  animal  HAS  DARK  FEATHERS 
OR  the  animal  HAS  BIG  RED  EYES 
THEN  the  animal  is  an  EMU 

you  would  have  had  to  assign  attenu¬ 
ations  to  each  branch  because  the 
components  of  such  a  disjunctive  re¬ 
lation  can  each  be  considered  as  a 
separate  minirule  contributing  a  cer¬ 
tainty  factor  to  prove  that  the  animal 
is  an  emu.  This  is  known  as  a  multi¬ 
ply  argued  certainty.  A  subroutine 
describing  this  (again  with  arbitrarily 
chosen  attenuations)  would  look  like 
that  in  Example  3,  page  48. 

But what if the rule had taken the form of a compound relationship of
ANDs and ORs?

IF  (animal  is  a  BIRD 

AND  animal  CANNOT  FLY 

AND  animal  HAS  DARK  FEATHERS) 

OR  animal  HAS  BIG  RED  EYES 
THEN  animal  is  an  EMU 

This way of proving the animal is an emu is actually just an OR clause
with two components, one of which can be further reduced to some ANDs.
My solution is to write a separate routine for the AND clause (with an
additional attenuation to keep the OR clause certainty under control).
Thus, you now have the two subroutines shown in Example 4.

Prove.emu:
    current.fact = emu
    GOSUB Test.fact.for.human.input
    IF leave = yes THEN RETURN
    GOSUB Prove.bird
    GOSUB Prove.cannot.fly
    GOSUB Prove.dark.feathers
    GOSUB Prove.big.red.eyes
    number.of.and.clause.components = 4
    AND.COMPONENT(1) = bird
    AND.COMPONENT(2) = cannot.fly
    AND.COMPONENT(3) = dark.feathers
    AND.COMPONENT(4) = big.red.eyes
    at.factor.for.AND.clause = 1
    GOSUB Compute.AND.clause.cf
    GOSUB Deduce
    RETURN

Example 2: Emu-proving subroutine

Supposedly, the Amiga BASIC interpreter does not accept subroutine
labels longer than 40 characters, but the 43-character label
Prove.bird.and.cannot.fly.and.dark.feathers: works correctly. The
Microsoft people probably mean that only the first 40 characters of the
label are significant.

The  second  subroutine  will  also  re¬ 
quire  access  to  both  the  user  and  the 
numeric  arrays  to  compute  the  cer¬ 
tainty  factors,  so  you  must  log  it  in 
your  DATA  statements: 

DATA 37, is a bird and cannot fly and dark feathers
bird.and.cannot.fly.and.dark.feathers = 37


Prove.emu:
    current.fact = emu
    GOSUB Test.fact.for.human.input
    IF leave = yes THEN RETURN
    GOSUB Prove.bird
    GOSUB Prove.cannot.fly
    GOSUB Prove.dark.feathers
    GOSUB Prove.big.red.eyes
    number.of.or.clause.components = 4
    OR.COMPONENT(1) = bird
    AT.FACTOR.FOR.OR.COMPONENT(1) = .8
    OR.COMPONENT(2) = cannot.fly
    AT.FACTOR.FOR.OR.COMPONENT(2) = .85
    OR.COMPONENT(3) = dark.feathers
    AT.FACTOR.FOR.OR.COMPONENT(3) = .9
    OR.COMPONENT(4) = big.red.eyes
    AT.FACTOR.FOR.OR.COMPONENT(4) = 1
    GOSUB Compute.or.clause.cf
    GOSUB Deduce
    RETURN

Example 3: Alternate emu-proving subroutine


Prove.emu:
    current.fact = emu
    GOSUB Test.fact.for.human.input
    IF leave = yes THEN RETURN
    GOSUB Prove.bird.and.cannot.fly.and.dark.feathers
    GOSUB Prove.big.red.eyes
    number.of.or.clause.components = 2
    OR.COMPONENT(1) = bird.and.cannot.fly.and.dark.feathers
    AT.FACTOR.FOR.OR.COMPONENT(1) = .8
    OR.COMPONENT(2) = big.red.eyes
    AT.FACTOR.FOR.OR.COMPONENT(2) = .85
    GOSUB Compute.or.clause.cf
    GOSUB Deduce
    RETURN

Prove.bird.and.cannot.fly.and.dark.feathers:
    current.fact = bird.and.cannot.fly.and.dark.feathers
    GOSUB Test.fact.for.human.input
    IF leave = yes THEN RETURN
    GOSUB Prove.bird
    GOSUB Prove.cannot.fly
    GOSUB Prove.dark.feathers
    number.of.and.clause.components = 3
    AND.COMPONENT(1) = bird
    AND.COMPONENT(2) = cannot.fly
    AND.COMPONENT(3) = dark.feathers
    at.factor.for.and.clause = 1
    GOSUB Compute.and.clause.cf
    GOSUB Deduce
    RETURN

Example 4: Emu-proving routines for compound case
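To see what numbers the compound rule would produce, the two routines can be mimicked in Python. This is only a sketch: it assumes that each OR branch is multiplied by its attenuation before the branches are combined, which the text implies but the Compute.or.clause.cf routine itself (not shown in this section) may do differently. All function names are hypothetical.

```python
import math

def and_cf(cfs, attenuation):
    """AND clause: weakest branch times the clause attenuation."""
    return min(cfs) * attenuation

def or_cf(cfs, attenuations):
    """OR clause: attenuate each branch (an assumption), then combine."""
    scaled = [cf * a for cf, a in zip(cfs, attenuations)]
    if any(cf > 0 for cf in scaled):
        return 1 - math.prod(1 - cf for cf in scaled)
    return -1 + math.prod(1 + cf for cf in scaled)

def emu_cf(bird, cannot_fly, dark_feathers, big_red_eyes):
    # Inner AND clause with attenuation 1, as in the second routine.
    inner = and_cf([bird, cannot_fly, dark_feathers], 1.0)
    # Outer OR clause with branch attenuations .8 and .85, as in the first.
    return or_cf([inner, big_red_eyes], [0.8, 0.85])
```

With certainties of 0.9, 0.7, and 0.8 on the AND branch and 0.5 for the eyes, the inner clause yields 0.7, the attenuated branches are 0.56 and 0.425, and the combined certainty is 1 - (0.44)(0.575) = 0.747.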




So  much  for  the  emu  subroutine. 

As  for  the  other  subroutines  that 
Prove.emu:  calls,  you  already  have 
those  that  attempt  to  prove  that  the 
animal  is  a  bird  and  cannot  fly,  but 
you  need  two  subroutines  to  prove 
that  the  animal  has  dark  feathers  and 
big  red  eyes.  These  are  both  termi¬ 
nals  (they  don't  have  to  call  other 
parsing  subroutines — they  just  ask 
the  user  a  question),  so  they  are  easier 
to  write.  First,  you  add  this  new  set  of 
DATA  statements  near  the  top: 

DATA 38, has dark feathers
dark.feathers = 38

DATA 39, has big red eyes
big.red.eyes = 39

Having  done  this,  you  scroll  to  the 
bottom  of  the  program  and  add  two 
subroutines: 

Prove.dark.feathers:
    current.fact = dark.feathers
    GOSUB Test.fact.for.human.input
    IF leave = yes THEN RETURN
    GOSUB Deduce
    RETURN

Prove.big.red.eyes:
    current.fact = big.red.eyes
    GOSUB Test.fact.for.human.input
    IF leave = yes THEN RETURN
    GOSUB Deduce
    RETURN

That's  it. 

By writing some templates of various sizes for the subroutines handling
ANDs and ORs, you can copy them as needed and quickly assemble a
MYCIN-like expert system running under the Amiga BASIC (Microsoft
BASIC) interpreter. The system currently can handle rules with clauses
having 12 ANDs, 12 ORs, or a combination of both. You can change this
easily by redimensioning the AND.COMPONENT, OR.COMPONENT, and
AT.FACTOR.FOR.OR.COMPONENT arrays.

You might experiment by inserting additional subroutines that perform
other mathematical tests outside the MYCIN environment (go ahead, it's
OK: most mathematicians think MYCIN's mathematical reasoning is pretty
ad hoc anyway, even though it has been known to outperform experts,
which isn't saying much for experts!). You might also consider adding
subroutines allowing the program to explain how a certainty factor was
reached after the conclusion of each fact by listing the certainty
factors passed up from those below it in the AND/OR hierarchy.

Long names have been used to identify the subroutines and variables
only in an effort to make the program's flow of control understandable.
By shortening these considerably, you should be able to fit hundreds of
rules in an Amiga with 512K of memory. Alternatively, Amiga BASIC
allows you to set up deductive routines as separate programs that can
be called as overlays when needed by the top-level routines, then
deleted. The only precondition is that a called program has to be saved
as an ASCII file (SAVE "filespec",A) or else a "bad file mode" error
message appears.

You  can  call  a  secondary  program 
with: 

CHAIN MERGE "filespec", [expression], ALL, DELETE range

The ALL assures that all variables are shared between the main calling
program and the called program, a feature that this kind of system
requires. Using this technique in conjunction with a hard disk enables
you to construct an expert system with thousands of facts and rules.

Such  disk-intensive  software 
would  probably  be  quite  slow,  so  the 
best  thing  to  do  is  to  keep  the  whole 
program  in  memory  by  shortening 
the  variable  and  label  names.  Remov¬ 
al  of  the  Delay:  subroutine  will  also 
speed  things  up,  showing  how  inter¬ 
preted  BASIC  can  hold  its  own  against 
interpreted  LISP,  at  least  under  these 
circumstances.  Still,  I’m  waiting  for 
the  Amiga  BASIC  compiler  to  appear. 

Availability 

All  the  source  code  for  articles  in  this 
issue  (except  for  C  Chest)  is  available 
on  a  single  disk.  To  order,  send  $14.95 
to  Dr.  Dobb's  Journal,  501  Galveston 
Dr., Redwood City, CA 94063 or call (415)
366-3600  ext.  216.  Please  specify  the 
issue  number  and  format  (MS-DOS, 
Macintosh,  Kaypro). 

Notes 

1. P. H. Winston and B. K. P. Horn, LISP (Reading, Mass.:
Addison-Wesley, 1981).

2. Richard O. Duda and John G. Gaschnig, "Knowledge-Based Expert
Systems Come of Age," Byte, vol. 6, no. 9 (September 1981): 238-281.

3. John Zarrella, Language Translators (Suisun City, Calif.:
Microcomputer Applications, 1982).

4. J. B. Adams, "A Probability Model of Medical Reasoning and the MYCIN
Model," Mathematical Biosciences 32 (1976): 177-186.


DDJ 

(Listing  begins  on  page  74.) 



NEURAL  NETWORK 


Listing  One  (Text  begins  on  page  16.) 

#define PGM_ID "SILOAM CI-C86 Ver. of 11/22/86 for PC-DOS 2.X+"

/*  An Adaptive Template Matching Image Categorizer
 *  (An Experimental Computer Vision Program)
 *
 *  This program implements a trainable pattern classifier as
 *  a committee network of threshold logic units.  It learns to
 *  recognize patterns by being trained from a set of prototype
 *  patterns presented in a training file.  The training file is
 *  organized as a set of visual images represented as an orthogonal
 *  array of picture elements, or pixels.  Each pixel is a number
 *  representing the gray-scale value of that point in the image.
 *  Associated with each pattern is a number, or tag, that
 *  represents the category to which that pattern belongs.
 *
 *  R. J. Brown
 *  Elijah Laboratories International
 *  5225 N.W. 27th Court
 *  Margate, FL 33063
 *  (305) 979-1567
 *
 *  Ownership:  I hereby place this program in the public domain.
 *
 *  System:     Red River ATlas
 *              10 MHz 80286 IBM-PC/AT clone
 *  Compiler:   C86 Version 2.30H; Computer Innovations, Inc.
 */

#include "stdio.h"              /* needed for stream input/output */

#define FALSE   0               /* boolean constant for 'false' */
#define TRUE    !FALSE          /* boolean constant for 'true' */
#define NULL    ((int *)0)      /* the pointer to nowhere */
#define void                    /* function that returns no value */

#define forall(index,limit) \
        for((index)=0;(index)<(limit);(index)++)  /* looping word */

#define kase(id,stmt) \
        case(id): { \
                stmt; \
                break; \
                }               /* shorthand form for case statement */

#define u(x)    ((unsigned)(x)) /* shorthand for '(unsigned)' cast */

typedef unsigned char   byte;   /* an 8-bit byte of storage */
typedef unsigned int    word;   /* a 16-bit word of storage */
typedef word    boolean;        /* a decision variable,
                                 * 'true' or 'false' value only */
typedef ELTYPE  element;        /* an element is a real number */
typedef DOTYPE  DOT;            /* type of a dot product may be bigger! */
typedef element *vector;        /* a vector is a set of elements */
typedef vector  tlu;            /* a tlu is a vector */

typedef struct {                /* the collection of */
        tlu     *wtpt;          /* a set of tlu weight points, */
        DOT     *dot;           /* and dot product save cells */
        } committee;            /* is a committee */

typedef char    *pointer;       /* a general pointer to whatever... */

/*
 *  Global Variable Definitions
 */

FILE    *pat,           /* the input training pattern file */
        *fopen();       /* the file opener */
byte    patname[64],    /* ascii filename of input file */
        *index();       /* string search library function */

int     ncom,           /* number of committees in the network */
        patwide,        /* pattern width in pixels */
        pathite,        /* pattern height in pixels */
        pats_so_far,    /* how many patterns in file so far */
        pats_missed,    /* how many patterns were mis-recognized so far */
        missed,         /* # of patterns missed on this pass */
        tlus_trained,   /* how many tlu's have been adjusted so far */
        npass,          /* number of current pass thru pattern file */
        log_level,      /* level of detail for run-time logging */
        dim,            /* number of elements in a vector (dimension) */
        ntlu,           /* number of tlus per committee */
        corr_incr,      /* fixed increment correction constant */
        *vote;          /* pointer to vote count array */

boolean goofed,         /* mis-recognition indicator for training loop */
        start_over,     /* select start over on error training strategy */
        absolute,       /* flag for absolute correction training method */
        *decsn,         /* pointer to network's decision array */
        *class;         /* pointer to class (category) array */

DOT     patmag;         /* pattern magnitude (used for training) */

element fraction,       /* correction fraction for training */
        maxel=0,        /* maximum element in a weight point */
        radius;         /* average radius (distance from origin)
                         * of tlu weight point at initialization */

vector  pattern;        /* pointer to current input pattern */
committee *net;         /* pointer to network as an array of committees */


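/************************************************************************
 *
 *  Library Routines
 *
 ************************************************************************/

extern float   atof();  /* ascii to float library conversion routine    */
extern double  sqrt();  /* square root library function                 */
extern pointer calloc();/* memory allocation library function           */
extern long    time();  /* benchmark timing routine                     */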


/************************************************************************
 *
 *  BANNER -- Display Program I. D.
 *
 ************************************************************************/

void banner() {         /* display program identification information  */
    printf("\n%s",PGM_ID);      /* Program Identification is #define'd
                                 * at top of source file                */
    printf("\nWritten by: R. J. Brown, Elijah Laboratories Intn'l");
    printf("\nThis program is in the Public Domain.\n");
}


/************************************************************************
 *
 *  HELP -- Display Screen
 *
 ************************************************************************/

void help() {           /* some user friendly help for the uninitiated! */
    printf("Simple Image Learning On Adaptive Machinery\n");
    printf("An Adaptive Template Matching Image Categorizer\n");
    printf("\n");
    printf("R. J. Brown, Elijah Laboratories International\n");
    printf("5150 W. Copans Rd. Suite 1135, Margate FL 33063\n");
    printf("\n");
    printf("usage: siloam <options> filename[.ext]\n\n");
    printf("where: filename -- is the input pattern file.\n\n");
    printf("options: -r##.# -- gives initialization radius.\n");
    printf("         -t##   -- gives number of TLUs per committee.\n");
    printf("         -o     -- start over on error.\n\n");
    printf("choose one: -i## -- fixed increment correction, ## = incr.\n");
    printf("            -a   -- absolute correction.\n");


NEURAL  NETWORK 


Listing  One  (Listing  continued,  text  begins  on  page  16. ) 

    printf("            -f##.# -- fractional correction, ##.# is lambda.\n");
    printf("         -l#    -- logging level: 0=least; 3=most.\n");
    exit(0);
}


/************************************************************************
 *
 *  SIGN -- The Sign Of An Element +/- 1
 *
 ************************************************************************/

int sign(x)     /* return the sign of a number as plus or minus one     */
element x;      /* argument is an element                               */
{
    return( x<(element)0 ? -1   /* if number is negative, return -1     */
                         :  1 );/* else return +1                       */
}

/************************************************************************
 *
 *  ISIGN -- The Sign Of An Integer +/- 1
 *
 ************************************************************************/

int isign(x)    /* return the sign of a number as plus or minus one     */
int x;          /* argument is an integer                               */
{
    return( x<0 ? -1            /* if number is negative, return -1     */
                :  1 );         /* else return +1                       */
}

element eabs(x) /* the absolute value of an element                     */
element x;      /* argument is an element                               */
{
    return( x<0 ? -x            /* if number is negative, make it positive */
                :  x );         /* else return it like it is            */
}

int iabs(x)     /* the absolute value of an integer                     */
int x;          /* argument is an integer                               */
{
    return( x<0 ? -x            /* if number is negative, make it positive */
                :  x );         /* else return it like it is            */
}

/************************************************************************
 *
 *  ALPHA -- Step Function
 *
 ************************************************************************/

int alpha(x)    /* step function return zero or one                     */
int x;          /* argument is an integer (in this program...)          */
{
    return( x>0 ? 1             /* if argument strictly positive, return one */
                : 0 );          /* else return zero                     */
}


/************************************************************************
 *
 *  MOVE -- String Move Function
 *
 ************************************************************************/

char *move(src,dst)     /* move a string returning ptr to end of result */
char *src,*dst;         /* pointers to source & destination strings     */
{
    while(0!=((*dst++)=(*src++)));      /* copy bytes until end of source */
    return(--dst);                      /* return ptr to end of destination */
}
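
Because move returns a pointer to the terminating NUL of the destination, two calls can be nested to append one string to another, which is how main later adds the default ".PAT" extension. The following is a minimal sketch of that idiom in present-day C; the names move_str and with_extension are hypothetical and are not part of the listing.

```c
#include <assert.h>

/* Copy src to dst; return a pointer to dst's terminating NUL. */
static char *move_str(const char *src, char *dst)
{
    while ((*dst++ = *src++) != '\0')
        ;                       /* copy bytes including the NUL */
    return dst - 1;             /* step back onto the NUL       */
}

/* Build "<name><ext>" in out by chaining two moves.
 * The caller must supply a buffer large enough for both parts.
 * Returns the length of the combined string.                   */
static int with_extension(const char *name, const char *ext, char *out)
{
    assert(name != 0 && ext != 0 && out != 0);
    return (int)(move_str(ext, move_str(name, out)) - out);
}
```

Calling with_extension("DIGITS", ".PAT", buf) leaves "DIGITS.PAT" in buf; the inner call positions the outer one at the end of the name, so no separate length computation is needed.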


/************************************************************************
 *
 *  RADIUS STATISTICS -- Summary Info
 *
 ************************************************************************/

void radius_statistics() {      /* show how weight points are distributed */
    element     r,              /* current radius accumulator           */
                *pe;            /* pointer to current element           */
    float       mu=0,           /* mean of radii                        */
                sigma=0;        /* standard deviation of radii          */
    committee   *pc=net;        /* pointer to current committee         */
    vector      *pt;            /* pointer to current tlu               */
    int         c,              /* committee loop counter               */
                t,              /* tlu loop counter                     */
                e,              /* element loop counter                 */
                n=ncom*ntlu;    /* number of tlu's altogether           */

    forall(c,ncom) {                    /* for all committees...        */
        pt=pc++->wtpt;                  /* point to first tlu           */
        forall(t,ntlu) {                /* for all tlu's...             */
            pe=*pt++;                   /* point to first element       */
            r=0.;                       /* initialize radius tally      */
            forall(e,dim) {             /* for all elements...          */
                r+=(*pe)*(*pe);         /* accumulate radius sqr'd      */
                pe++; }                 /* point to next element        */
            mu+=sqrt((float)r);         /* accumulate sum of radii      */
            sigma+=(float)r; }          /* accumulate variance variable */
    }
    mu/=(float)n;               /* divide to get overall average radius */
    sigma-=mu*mu*n;             /* compute variance                     */
    sigma=sqrt(sigma)/mu;       /* compute standard deviation           */
    printf("\nmean of the radii:  %f",mu);      /* print statistical    */
    printf("\nstandard deviation: %f",sigma);   /* summary of weight    */
    printf("\n");                               /* point distribution   */
}

/************************************************************************
 *
 *  READ HEADER -- Read File Header
 *
 ************************************************************************/

void read_header()      /* read training file header information        */
{
    rewind(pat);                /* rewind pattern file                  */
    pats_so_far=0;              /* reset pattern sequence counter       */

    fscanf(pat,                 /* header comes from pattern file       */
        "hdr %d %d %d \n",      /* header must start with 'hdr'
                                 * then read header information
                                 * composed of three numbers            */

        /* put this information into the following global variables    */

        &ncom,                  /* number of committees in network      */
        &patwide,               /* pattern width in pixels              */
        &pathite);              /* pattern height in pixels             */
}

/************************************************************************
 *
 *  RANDOM -- Random Number Generator
 *
 ************************************************************************/

element random() {      /* generate a uniformly distributed
                         * random number from the open interval (0...1) */
    return(rand()/16384.);      /* return scaled random integer         */
}

/************************************************************************
 *
 *  INIT VAL -- Initial Element Value
 *
 ************************************************************************/

element init_val(radius)  /* generate init'l value for element of a tlu */
element radius;           /* the average radius of a weight point       */
{
    return(                                     /* return the           */
        (radius*sqrt(3.))/(sqrt((float)dim))    /* average weight value */
        *(2.*random()-1)                        /* scaled randomly by a */
    );                                          /* uniform distribution */
}


/************************************************************************
 *
 *  INITIALIZE -- Allocate Storage, Etc.
 *
 ************************************************************************/

void initialize() {     /* allocate & initialize network array storage  */
    committee   *pc;    /* pointer to current committee of network      */
    tlu         *pt;    /* pointer to current tlu of committee          */
    element     *pe,    /* pointer to current element of tlu            */
                x;      /* current initialization weight value          */
    int         c,      /* committee index in network                   */
                t,      /* tlu index in committee                       */
                e;      /* element index in tlu                         */

    printf("\ninitializing");           /* say what's taking so long !  */

    dim=patwide*pathite+1;              /* number of elements in a tlu  */

    pattern=(vector)calloc(u(dim),      /* allocate the pattern         */
        u(sizeof(element)));            /* vector                       */

    class=(boolean *)calloc(u(ncom),    /* allocate the class array     */
        u(sizeof(boolean)));            /* which will contain the
         * desired decision bits from the committees, as read from the
         * training file. the actual verdict of the network will be
         * compared with this to see if training is required.           */

    vote=(int *)calloc(u(ncom),         /* allocate the votes array     */
        u(sizeof(int)));                /* which will contain the
                                         * count of votes for each
                                         * committee.                   */

    decsn=(boolean *)calloc(u(ncom),    /* allocate the decision        */
        u(sizeof(boolean)));            /* array which will contain
                                         * the bits of the answer,
                                         * one bit per committee.       */

    pc=net=(committee *)calloc(u(ncom), /* allocate the network         */
        u(sizeof(committee)));          /* as an array of committees    */

    forall(c,ncom) {            /* for all committees in the network... */
        pc->wtpt=pt=(tlu *)calloc(u(ntlu),      /* allocate a committee */
            u(sizeof(tlu)));                    /* as an array of tlu's */
        pc++->dot=(DOTYPE *)calloc(u(ntlu),     /* together with dot    */
            u(sizeof(DOT)));                    /* product save cells   */
        forall(t,ntlu) {        /* for all tlu's in the committee...    */
            pe=*pt++=(element *)calloc(u(dim),  /* allocate a tlu       */
                u(sizeof(element)));            /* as an array
                                                 * of elements          */
            forall(e,dim) {                     /* for each weight...   */
                if(radius==0) *pe++=(e!=0);     /* grow connections?    */
                else {                          /* or adjust weights?   */
                    x=eabs(*pe++=init_val(      /* adjust, get initial  */
                        (element)radius));      /* weight value         */
                    if(x>maxel) maxel=x;        /* update max magnitude */
                }
            }
    /* initialize each element to a random value such that the average
     * radius, or distance from the origin, of each weight point is 'radius'.
     * this will produce a distribution of weight points clustered near the
     * surface of a hyper-sphere as the starting condition.  If the radius is
     * zero, then all weights will be set to zero except for the threshold
     * setting weight.  This is analogous to forcing the program to grow new
     * interneural connections on an as-needed basis, supposedly just like
     * the real brain does! */
        }
    }
    printf("\n");       /* perform new-line when initialize is done     */
}


/************************************************************************
 *
 *  DOTPROD -- Form A Dot Product
 *
 ************************************************************************/

DOT dotprod(x,y)        /* form the scalar product of two vectors       */
vector x,y;             /* both arguments are vectors                   */
{
    DOT z=0;            /* result accumulator, initialized to zero      */
    int i;              /* element index, used as loop counter          */

    forall(i,dim)               /* for all elements in each vector...   */
        z+=(*x++)*(*y++);       /* compute the dot product              */
    return(z);                  /* return it to the caller              */
}

/************************************************************************
 *
 *  READ CLASS -- Read The Class Tag
 *
 ************************************************************************/

boolean read_class() {  /* read the class tag number for the image      */
    int     i,          /* loop counter for index in class array        */
            tmp;        /* temp cell to hold decimal category           */
    boolean *pcl=class; /* pointer to class (category) array            */

    if(fscanf(pat,"%d",&tmp)!=1)    /* read the pattern category        */
        return(FALSE);              /* return FALSE for end of file     */
    forall(i,ncom) {                /* for each committee in network    */
        *pcl++=tmp&1;               /* extract desired committee output */
        tmp>>=1; }                  /* advance to next committee        */
    *pcl=1;             /* augment with a 1 to prevent singularity      */
    pats_so_far++;      /* update pattern sequence counter              */
    return(TRUE);       /* return TRUE if class read successfully       */
}

/************************************************************************
 *
 *  READ PATTERN -- Read Next Pattern
 *
 ************************************************************************/

boolean read_pattern() {    /* read next pattern from training file     */
    int     i,j;            /* loop counters for row & column of image  */
    element *pe=pattern;    /* pointer to element of pattern vector     */
    float   tmp;            /* temp cell for input conversion           */

    forall(i,patwide)                   /* for each row in the image,   */
        forall(j,pathite)               /* for each pixel in that row,  */
            if( fscanf(pat,"%f",&tmp)   /* input value of pixel         */
                !=1 ) return(FALSE);    /* return FALSE if end-of-file  */
            else *pe++=(element)tmp;    /* convert to type element      */

    return( read_class() );     /* read in its class as an array
     * of correct decisions for each committee in the network.  If the
     * entire pattern is read, together with its class, return TRUE.    */
}

/************************************************************************
 *
 *  COUNT VOTES -- Count The Votes
 *
 ************************************************************************/

int count_votes(pc)     /* count the votes for each tlu in a committee  */
committee *pc;          /* parameter is a pointer to committee          */
{
    DOT *pd=pc->dot;    /* dot product save cell pointer                */
    tlu *pt=pc->wtpt;   /* tlu pointer                                  */
    int ti,             /* tlu index (loop counter)                     */
        count=0;        /* the count of votes for the committee         */

    forall(ti,ntlu)                     /* forall tlus in committee     */
        count+=sign(                    /* count votes as + or -        */
            *pd++=                      /* & save dot product as        */
                dotprod(*pt++,pattern)  /* weight point dotted with     */
            );                          /* pattern vector               */
    return(count);                      /* return tally                 */
}


/************************************************************************ 

* 

*  RECOGNIZE  —  Recognize  A  Pattern 

* 

************************************************************************/ 

void recognize() {      /* recognize a pattern by taking the decision
                         * of each committee to be a bit in the category
                         * number for the pattern
                         */
    int       i,            /* loop counter                             */
              *pv=vote;     /* pointer to vote count array              */
    boolean   *pdec=decsn;  /* pointer to decision array.
                             * this holds the decision bits for each
                             * of the committees in the network.
                             */
    committee *pc=net;      /* pointer to current committee in network  */

    forall(i,ncom)          /* for all committees in the network...     */
        *pdec++=alpha(*pv++=count_votes(pc++));   /* how many votes ?   */
}
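
The recognition step can be summarized as: each TLU votes +1 or -1 according to the sign of its dot product with the pattern, each committee outputs 1 when its votes sum to more than zero, and committee number c supplies bit c of the category. The sketch below illustrates that flow in present-day C. It is an illustration under assumptions, not the listing's data layout: the weights are held in one flat array rather than in the listing's arrays of pointers, and the names tlu_vote, committee_bit and category are hypothetical.

```c
#include <assert.h>

/* +1 or -1 vote of one threshold logic unit:
 * the sign of (weights . pattern), with zero counted as +1. */
static int tlu_vote(const double *weights, const double *pattern, int dim)
{
    double dot = 0.0;
    int i;

    for (i = 0; i < dim; i++)
        dot += weights[i] * pattern[i];
    return dot < 0.0 ? -1 : 1;
}

/* Decision bit of one committee: 1 if its votes sum to more than zero.
 * `weights` holds ntlu consecutive weight vectors of length dim. */
static int committee_bit(const double *weights, int ntlu,
                         const double *pattern, int dim)
{
    int votes = 0, t;

    assert(ntlu > 0 && dim > 0);
    for (t = 0; t < ntlu; t++)
        votes += tlu_vote(weights + t * dim, pattern, dim);
    return votes > 0;
}

/* Category number: committee c contributes bit c,
 * least significant bit first. */
static int category(const double *weights, int ncom, int ntlu,
                    const double *pattern, int dim)
{
    int value = 0, c;

    for (c = 0; c < ncom; c++)
        if (committee_bit(weights + c * ntlu * dim, ntlu, pattern, dim))
            value |= 1 << c;
    return value;
}
```

Using an odd number of TLUs per committee avoids tied votes, which this sketch (like the step function alpha) would resolve as a 0 bit.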

/************************************************************************
 *
 *  GET WEAK TLU -- Sway Which One ?
 *
 ************************************************************************/

int get_weak_tlu(ci)    /* choose tlu most vulnerable to be swayed      */
int ci;                 /* argument is committee index                  */
{
    int weak=0,                 /* index of weakest tlu so far          */
        sv=isign(vote[ci]),     /* sign of committee's vote             */
        ti;                     /* tlu index                            */
    DOT *pd=(&net[ci])->dot,    /* pointer to dot product array         */
        conviction=INFINITY,    /* lowest conviction so far             */
        d;                      /* saved dot product value              */

    forall(ti,ntlu) {   /* for all of the tlu's in this committee...    */
        d=pd[ti];               /* get the saved dot product value      */
        if(sign(d)==sv) {       /* if tlu voted incorrectly             */
            if(eabs(d)<conviction) {    /* and if this tlu has the
                 * least conviction of any that have been examined so far, */
                weak=ti;        /* then remember it as the best one so
                 * far to adjust to sway the vote of this committee.    */
                conviction=eabs(d);     /* update lowest conviction     */
            }
        }
    }
    return(weak);       /* return subscript of weakest tlu in committee */
}


/************************************************************************
 *
 *  ADJUSTMENT -- Correction Coefficient
 *
 ************************************************************************/

element adjustment(ci,ti)       /* compute correction coefficient       */
int ci,                         /* committee index                      */
    ti;                         /* tlu index                            */
{
    DOT d=(&net[ci])->dot[ti];          /* saved dot product            */

    if(corr_incr)                       /* fixed increment correction   */
        return(corr_incr*sign(d));

    if(absolute)                        /* absolute correction          */
        return((int)(d/patmag)+sign(d));

    if(fraction)                        /* fractional correction        */
        return(d*fraction/patmag);

    return(abort("No correction method specified."));
}
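
The three correction rules differ only in how the coefficient lambda is chosen; in each case the weights are then moved by lambda times the pattern, which changes the saved dot product d to d - lambda*patmag, where patmag is the pattern dotted with itself. The sketch below works through that arithmetic in present-day C. The function names are hypothetical, and double stands in for the listing's element and DOT types.

```c
#include <assert.h>

/* Sign as plus or minus one, with zero counted as +1. */
static int sign_of(double x)
{
    return x < 0.0 ? -1 : 1;
}

/* Fixed increment: a step of constant size in the direction of d. */
static double fixed_increment(double d, int incr)
{
    return (double)(incr * sign_of(d));
}

/* Absolute correction: a whole-number step just large enough
 * to carry the dot product strictly past zero. */
static double absolute_correction(double d, double patmag)
{
    assert(patmag > 0.0);
    return (double)((int)(d / patmag) + sign_of(d));
}

/* Fractional correction: move a chosen fraction of the way;
 * a fraction above 1 overshoots zero and flips the vote. */
static double fractional_correction(double d, double fraction, double patmag)
{
    assert(patmag > 0.0);
    return d * fraction / patmag;
}

/* Dot product after the update w -= lambda * x, with patmag = x . x. */
static double dot_after(double d, double lambda, double patmag)
{
    return d - lambda * patmag;
}
```

For example, with d = 7.5 and patmag = 2 the absolute rule gives lambda = 4, and the dot product moves from 7.5 to -0.5, so that TLU changes its vote in a single adjustment. The fixed increment rule gives no such guarantee and may need several passes.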


/************************************************************************
 *
 *  ADJUST -- Change TLU's Weights
 *
 ************************************************************************/

void adjust(ci,ti)      /* adjust the weights of a single tlu           */
int ci,                 /* committee index                              */
    ti;                 /* tlu index                                    */
{
    vector  pw=(&net[ci])->wtpt[ti],    /* pointer to a weight          */
            pp=pattern;                 /* pointer to a pixel           */
    element lambda=adjustment(ci,ti),   /* the correction coefficient   */
            wt,awt;                     /* temps for max weight point   */
    int     i;                          /* element index & loop counter */

    tlus_trained++;                     /* count adjustment of tlu      */
    forall(i,dim) {                     /* for each coefficient         */
        wt=(*pw++)-=lambda*(*pp++);     /* adjust weights               */
        awt=eabs(wt);                   /* save magnitude               */
        if(maxel<awt) {                 /* new maximum ???              */
            maxel=awt;                  /* yes, update max elem         */
            if(log_level) {             /* if any logging,              */
                printf("\nmaxel=%f",    /* then display the             */
                    (float)maxel);      /* new maximum value            */
            }
        }
    }
    if(log_level>=3)
        printf("\n     com=%d tlu=%d lambda=%g",
            ci,ti,(float)lambda);
}


/************************************************************************
 *
 *  SWAY TLUS -- Sway TLUs To Change Vote
 *
 ************************************************************************/

void sway_tlus(ci)      /* sway enough tlu's to change the vote         */
int ci;                 /* parameter is committee index                 */
{
    int i,                              /* loop counter                 */
        lost_by=iabs(vote[ci]/2)+1,     /* how many votes we lost by    */
        weak_tlu;                       /* weakest wrong tlu in committee */
    DOT *pd=(&net[ci])->dot;            /* pointer to dot product array */

    forall(i,lost_by) { /* do this enough times to sway the vote...     */
        weak_tlu=get_weak_tlu(ci);      /* find most vulnerable tlu     */
        adjust(ci,weak_tlu);            /* adjust its weights to change
                                         * its mind about the pattern   */
        pd[weak_tlu]= -sign(pd[weak_tlu]);  /* flip sign of dot product
             * so this tlu won't be considered again in this loop       */
    }
}

/************************************************************************
 *
 *  SHOW BITS -- Display Bits On CRT
 *
 ************************************************************************/

void show_bits(ps,pb)   /* display a bit vector on the screen           */
char    *ps;            /* the label for the bit vector                 */
boolean *pb;            /* the pointer to the bit vector                */
{
    int i,              /* loop counter                                 */
        k=1,            /* power of two                                 */
        v=0;            /* value accumulator                            */

    forall(i,ncom) {            /* for all committees                   */
        if(*pb++) v+=k;         /* convert binary to decimal            */
        k<<=1;                  /* advance to next bit                  */
    }
    printf(" %s %d",ps,v);      /* display label and value              */
}

/************************************************************************
 *
 *  TRAIN -- Train The Network
 *
 ************************************************************************/

train() {               /* train the network to recognize the pattern   */
    int ci;             /* committee index                              */

    goofed=FALSE;       /* give benefit of doubt -- assume didn't goof  */
    patmag=dotprod(pattern,pattern);    /* find pattern magnitude       */
    forall(ci,ncom)     /* for all the committees in the network...     */
        if(decsn[ci]!=class[ci]) {      /* if the committee goofed up,  */
            goofed=TRUE;                /* then say so,                 */
            pats_missed++;              /* count misrecognized pattern  */
            sway_tlus(ci);              /* and change enough tlu's
                 * so it won't goof up on this pattern next time !      */
        }

    if(goofed) {                        /* did we goof?                 */
        missed++;                       /* yes, count the boo boo!      */
        if(log_level>=2) {              /* if detail requested,         */
            printf("\n");               /* start a new line             */
            show_bits("siloam ",decsn); /* show machine's decision      */
            show_bits("really ",class); /* display what really is       */
        }
    }
}


/************************************************************************ 

* 

*  TOTCONS  —  Total  Number  Of  Connects 

* 

************************************************************************/ 
int totcons() {         /* count total # of connections                 */
    committee *n=net;   /* neural network pointer                       */
    tlu       *c,       /* committee pointer                            */
              t;        /* tlu pointer                                  */
    int       i,j,k,    /* loop indices                                 */
              no=0;     /* totalizer accumulator                        */

    forall(i,ncom) { c=n++->wtpt;       /* for each committee...        */
        forall(j,ntlu) { t=*c++;        /* for each tlu in the committee */
            forall(k,dim-1)             /* for each element in the tlu  */
                if(*t++!=0) no++; }     /* count it if it is connected  */
    }
    return(no);                         /* return the count             */
}

/************************************************************************
 *
 *  SILOAM -- Outside Control Structure
 *
 ************************************************************************/

void siloam() {     /* outside control structure for pattern recognizer */
    long start,stop;        /* timer value cells for benchmarking       */
    int  cons,new,old=0;    /* connection counters                      */

    read_header();      /* read header information in the training file */
    initialize();       /* allocate the committees of TLUs and
                         * initialize the weight points randomly        */
    radius_statistics();    /* print starting radius statistics         */
    npass=0;                /* initialize pass counter                  */
    start=time(NULL);       /* remember start time                      */

    do {                /* start over in training file,
                         * we made a mistake...                         */
        missed=0;       /* reset misrecognition counter                 */
        read_header();  /* rewind training file
                         * and skip over header information...          */

        while(read_pattern()) { /* keep reading patterns until we've
             * done the entire training file and recognized them all    */
            recognize();        /* attempt to recognize the pattern     */
            train();            /* adjust any weights necessary to get
                                 * the correct recognition if we goofed */
            if(goofed&&start_over) break;   /* select training strategy */
        }               /* end of while loop to read next pattern       */

        npass++;                    /* increment pass counter           */
        if(log_level>=1) {          /* give pass summary report         */
            cons=totcons();         /* count the connections            */
            new=cons-old;           /* compute how many new ones        */
            old=cons;               /* remember for next time           */
            printf("\npass # %d  missed %d  cons=%d  new=%d",
                npass,missed,cons,new);
        }
    } while(missed);    /* end of do loop to train network              */

    stop=time(NULL);    /* get stop time                                */

    /*****************  print end of run summary  *************************/
    printf("\n");
    printf("\ntraining completed in %ld seconds.\n",stop-start);

    printf("\nnumber of committees:       %d",ncom);
    printf("\nnumber of tlus total:       %d",ncom*ntlu);
    printf("\nnumber of elements:         %d",ncom*ntlu*dim);
    printf("\nnumber of connections:      %d",totcons());
    printf("\n");

    printf("\nnumber of passes thru file: %d",npass);
    printf("\nnumber of patterns in file: %d",pats_so_far);
    printf("\nnumber of mis-recognitions: %d",pats_missed);
    printf("\nnumber of tlu adjustments:  %d",tlus_trained);
    printf("\nmaximum element magnitude:  %f",(float)maxel);
    printf("\n");

    radius_statistics();    /* print ending radius statistics           */
}


/************************************************************************
 *
 *  MAIN Program Starts Here
 *
 ************************************************************************/

main(paramct,params)    /********** main program entry point ************/
int  paramct;           /* number of parameters on command line         */
char *params[];         /* array of pointers to strings for each param  */
{
    int i;              /* array index variable                         */

    banner();           /* print program name, version, & release date  */

    printf("\nInvoked By:");                    /* show how the program */
    for(i=1;i<paramct;i++) printf(" %s",params[i]);  /* was started up! */
    printf("\nelement type is %s",eltype);      /* show arithmetic used */
    printf("\n");

    /*******************  parse the command line  **********************/

    if(paramct==1) help();  /* if no params, then give help and quit !  */
    patname[0]=0;           /* else set pattern filename to null string */

    for(i=1;i<paramct;i++) {            /* for each parameter...        */
        if('-'==params[i][0])           /* is it an option ?            */
            switch(toupper(params[i][1])) {     /* yes, which one ?     */
                kase('O',start_over=TRUE)                /* strategy    */
                kase('L',log_level=atoi(&params[i][2]))  /* log detail  */
                kase('T',ntlu=atoi(&params[i][2]))       /* # of TLUs   */
                kase('R',radius=atof(&params[i][2]))     /* init radius */
                kase('I',corr_incr=atoi(&params[i][2]))  /* fixed incr  */
                kase('A',absolute=TRUE)                  /* absolute    */
                kase('F',fraction=atof(&params[i][2]))   /* fractional  */
            }

        /*******************  parse filename  ***************************/

        else if(index(&params[i][0],'.'))       /* is '.' in it?        */
            move(&params[i][0],patname);        /* yes, pattern file    */
        else move(".PAT",                   /* no, default extension is */
                move(&params[i][0],patname));   /* '.pat' for pattern file */
    }

    /****************  check for command line errors  *********************/

    if(patname[0]==0)       /* check for missing pattern file name      */
        abort(
            "pattern filename not specified!");
    if(ntlu==0)             /* check for missing number of TLUs         */
        abort(
            "number of TLUs per committee not specified!");

    /*************************  open pattern file  ************************/

    if(!(pat=fopen(patname,"r")))       /* if open fails, abort         */
        abort(
            "can't open pattern file!");

    /*********  perform the training and recognition algorithm  ***********/

    /* srand(1); */     /* make random number generator repeatable --
                         * ...this may be removed, if desired, after the
                         * debug phase is complete!                     */
    siloam();           /* call the outside control structure for the
                         * trainable pattern recognizer.                */
}

End  Listing 




EXPERT  SYSTEMS 


Listing One (Text begins on page 42.)

REM Mycin-like expert system for Amiga BASIC
REM by Richard Grigonis

DIM AND.COMPONENT(12),AT.FACTOR.FOR.OR.COMPONENT(12)
DIM OR.COMPONENT(12),TRAIL(30),HUMAN.INPUT$(4)
DIM MESSAGE$(60),WHICH.EQ$(8),BLANK$(19)
DIM HYPOTHESIS(20) ' CHANGE THIS NUMBER IF MORE THAN 20 ANIMALS
no=0:yes=1

DATA 1,is an albatross
albatross=1
DATA 2,is a penguin
penguin=2
DATA 3,is an ostrich
ostrich=3
DATA 4,is a zebra
zebra=4
DATA 5,is a giraffe
giraffe=5
DATA 6,is a tiger
tiger=6
DATA 7,is a cheetah
cheetah=7
DATA 8,flies well
flies.well=8
DATA 9,swims
swims=9
DATA 10,is black and white
black.and.white=10
DATA 11,cannot fly
cannot.fly=11
DATA 12,has a long neck
long.neck=12
DATA 13,has black stripes
black.stripes=13
DATA 14,has long legs
long.legs=14
DATA 15,has dark spots
dark.spots=15
DATA 16,has a tawny color
tawny.color=16
DATA 17,is a bird
bird=17
DATA 18,is an ungulate
ungulate=18
DATA 19,is a carnivore
carnivore=19
DATA 20,is a mammal
mammal=20
DATA 21,has hair
has.hair=21
DATA 22,gives milk
gives.milk=22
DATA 23,eats meat
eats.meat=23
DATA 24,has pointed teeth and claws and forward pointing eyes
teeth.claws.eyes=24
DATA 25,is a mammal and has hoofs
mammal.and.hoofs=25
DATA 26,is a mammal and chews cud
mammal.and.chews.cud=26
DATA 27,has feathers
feathers=27
DATA 28,flies and lays eggs
flies.and.lays.eggs=28
DATA 29,lays eggs
lays.eggs=29
DATA 30,flies
flies=30
DATA 31,chews cud
chews.cud=31
DATA 32,has hoofs
hoofs=32
DATA 33,has forward pointing eyes
front.eyes=33
DATA 34,has claws
claws=34
DATA 35,has pointed teeth
pointed.teeth=35
DATA -1,END OF DATA

REM  TOP-LEVEL  HYPOTHESES  (ROOTS)  OF  AND/OR  TREE: 

HYPOTHESIS(1)=albatross
HYPOTHESIS(2)=penguin
HYPOTHESIS(3)=ostrich
HYPOTHESIS(4)=zebra
HYPOTHESIS(5)=giraffe
HYPOTHESIS(6)=tiger
HYPOTHESIS(7)=cheetah
number.of.hypotheses=7

REM DETERMINE TOTAL NUMBER OF FACTS:
number.of.facts=0
WHILE fact <> -1
  READ fact,MESSAGE$
  number.of.facts=number.of.facts+1
WEND
number.of.facts=number.of.facts-1
DIM BEEN.EXAMINED.BEFORE(number.of.facts),OUTPUT.CF(number.of.facts)

Start:
FOR A=0 TO UBOUND(OUTPUT.CF)
  OUTPUT.CF(A)=0:BEEN.EXAMINED.BEFORE(A)=0
NEXT A

PRINT  "I'm  a  backward-chaining  expert  system." 

PRINT  "Please  think  of  one  of  the";number .of .hypotheses 
PRINT  "animals  listed  below.  I  will  ask  you" 

PRINT  "questions  about  the  animal  and  compute" 


PRINT  "the  certainty  of  it  being  one" 

PRINT  "of  the  following"; number. of .hypotheses; "animals :":PRINT 
FOR  fact-1  TO  number .of .hypotheses 
which. fact-HYPOTHES IS (fact) 

GOSUB  Find. message :PRINT  "ANIMAL  "; MESSAGES 
NEXT  fact 
PRINT 

10030 PRINT "DO YOU WANT: "
  PRINT "AN EXHAUSTIVE SEARCH (1) OR,"
  PRINT "STOP-ON-SUCCESS (2)? ":PRINT
  PRINT "Press the NUMBER of YOUR SELECTION"
  PRINT "and then press the RETURN KEY."
  halt.on.success=0:INPUT halt.on.success
  IF 0>halt.on.success OR halt.on.success>2 THEN PRINT "TRY AGAIN!":GOTO 10030

10050 REM PROVE HYPOTHESES
  GOSUB Prove.albatross
  IF halt.on.success=2 AND OUTPUT.CF(albatross)=1 THEN 10165
  GOSUB Prove.penguin
  IF halt.on.success=2 AND OUTPUT.CF(penguin)=1 THEN 10165
  GOSUB Prove.ostrich
  IF halt.on.success=2 AND OUTPUT.CF(ostrich)=1 THEN 10165
  GOSUB Prove.zebra
  IF halt.on.success=2 AND OUTPUT.CF(zebra)=1 THEN 10165
  GOSUB Prove.giraffe
  IF halt.on.success=2 AND OUTPUT.CF(giraffe)=1 THEN 10165
  GOSUB Prove.tiger
  IF halt.on.success=2 AND OUTPUT.CF(tiger)=1 THEN 10165
  GOSUB Prove.cheetah
  IF halt.on.success=2 AND OUTPUT.CF(cheetah)=1 THEN 10165

10165 REM DISPLAY RESULTS
  CLS
  PRINT "HERE ARE THE COMPUTED CERTAINTY FACTORS:"
  PRINT "(Correct animal has highest positive CF#)":PRINT
  FOR fact=1 TO number.of.hypotheses
    which.fact=HYPOTHESIS(fact)
    GOSUB Find.message
    BLANK$=SPACE$(19)
    MESSAGE$=MESSAGE$+MID$(BLANK$,1,LEN(BLANK$)-LEN(MESSAGE$))
    PRINT "ANIMAL ";MESSAGE$;" CF=";OUTPUT.CF(which.fact)
  NEXT fact
  PRINT:PRINT "TO GO AGAIN, press the RETURN button."
  INPUT HUMAN.INPUT$:PRINT
  GOTO Start

REM SUBROUTINES TO COMPUTE CF'S (IN ALPHABETICAL ORDER)

Compute.and.clause.cf:
  GOSUB Find.lowest.cf.branch
  GOSUB Multiply.lowest.cf.by.at.factor
  GOSUB Trim.to.zero
  OUTPUT.CF(TRAIL(depth))=new.cf
  RETURN

Compute.or.clause.cf:
  GOSUB Multiply.component.cfs.by.at.factors
  GOSUB Test.for.a.positive.number
  GOSUB Run.or.equation
  GOSUB Trim.to.zero
  OUTPUT.CF(TRAIL(depth))=new.cf
  RETURN

Dec.stack:
  depth=depth-1:RETURN

Deduce:
  which.fact=TRAIL(depth):GOSUB Find.message
  PRINT:PRINT "The fact that the animal "
  PRINT MESSAGE$;" (FACT # ";fact.number;")"
  PRINT "Now has a Certainty Factor of: ";OUTPUT.CF(TRAIL(depth))
  PRINT:GOSUB Dec.stack:GOSUB Delay
  RETURN

Delay:
  FOR D=1 TO 10000:NEXT D:RETURN

Explain.why:
  which.fact=TRAIL(1):GOSUB Find.message:CLS
  PRINT "I AM INVESTIGATING THE HYPOTHESIS"
  PRINT "THAT THE ANIMAL..."
  PRINT MESSAGE$;" (FACT # ";fact.number;")":PRINT
  IF depth=1 THEN
    PRINT "...BY FIRST ASKING YOU.":PRINT
    PRINT "If you are not sure (-8 < CF < 8)"
    PRINT "then I will investigate this hypothesis further."
  ELSE
    FOR A=2 TO depth
      which.fact=TRAIL(A)
      GOSUB Find.message:PRINT "...BY PROVING THAT THE ANIMAL..."
      PRINT MESSAGE$;" (FACT # ";fact.number;")":PRINT
    NEXT A
    PRINT "...BY ASKING YOU."
  END IF
  PRINT:INPUT "PRESS RETURN KEY TO CONTINUE",HUMAN.INPUT$
  RETURN

Find.lowest.cf.branch:
  lowest.number=OUTPUT.CF(AND.COMPONENT(1))
  FOR branch=1 TO number.of.and.clause.components
    number.to.test=OUTPUT.CF(AND.COMPONENT(branch))
    IF lowest.number>number.to.test THEN lowest.number=number.to.test
  NEXT branch
  RETURN

Find.message:
  RESTORE
  FOR C=1 TO which.fact


    READ fact.number,MESSAGE$
  NEXT C
  RETURN

Inc.stack:
  depth=depth+1:TRAIL(depth)=current.fact
  RETURN

Multiply.component.cfs.by.at.factors:
  FOR branch=1 TO number.of.or.clause.components
    new.cf=OUTPUT.CF(OR.COMPONENT(branch))*AT.FACTOR.FOR.OR.COMPONENT(branch)
    GOSUB Trim.to.zero
    OUTPUT.CF(OR.COMPONENT(branch))=new.cf
  NEXT branch
  RETURN

Multiply.lowest.cf.by.at.factor:
  new.cf=lowest.number*at.factor.for.and.clause
  RETURN

Negative.or.equation:
  new.cf=1
  FOR branch=1 TO number.of.or.clause.components
    new.cf=new.cf*(1+OUTPUT.CF(OR.COMPONENT(branch)))
  NEXT branch
  new.cf=-1+new.cf
  RETURN

Positive.or.equation:
  new.cf=1
  FOR branch=1 TO number.of.or.clause.components
    new.cf=new.cf*(1-OUTPUT.CF(OR.COMPONENT(branch)))
  NEXT branch
  new.cf=1-new.cf
  RETURN

Run.or.equation:
  IF WHICH.EQ$="POSITIVE" THEN
    GOSUB Positive.or.equation
  ELSE
    GOSUB Negative.or.equation
  END IF
  RETURN
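
The two OR equations can be restated compactly. The C sketch below is illustrative only and is not part of the listing: the name `combine_or` and its array interface are assumptions. It mirrors the subroutines above by using 1 - product(1 - cf) when at least one component is positive, and -1 + product(1 + cf) otherwise. Attenuation and trimming are left out here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in the listing): combine the certainty
 * factors of an OR clause. cf[] holds n component CFs, each assumed
 * to lie in the range -1 to 1. */
double combine_or(const double *cf, size_t n)
{
    size_t i;
    int any_positive = 0;
    double product = 1.0;

    assert(cf != NULL && n > 0);

    /* Same test as Test.for.a.positive.number */
    for (i = 0; i < n; i++)
        if (cf[i] > 0.0)
            any_positive = 1;

    if (any_positive) {
        /* Positive.or.equation */
        for (i = 0; i < n; i++)
            product *= 1.0 - cf[i];
        return 1.0 - product;
    }

    /* Negative.or.equation */
    for (i = 0; i < n; i++)
        product *= 1.0 + cf[i];
    return -1.0 + product;
}
```

With two components of 0.5 each, the positive equation gives 1 - (0.5 * 0.5) = 0.75, so two moderately supportive pieces of evidence reinforce one another.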

Test.fact.for.human.input:
  leave=no:GOSUB Inc.stack
  IF BEEN.EXAMINED.BEFORE(current.fact)=yes THEN leave=yes:GOSUB Dec.stack:RETURN
  BEEN.EXAMINED.BEFORE(current.fact)=yes
4156 which.fact=current.fact
  GOSUB Find.message
4160 CLS:PRINT "(FACT # ";fact.number;")":PRINT
  PRINT "ON A SCALE OF -10 TO 10 WHERE,"
  PRINT " 10=absolutely certain it's true"
  PRINT "  8=almost certain"
  PRINT "  6=probably"
  PRINT "  3=slight evidence"
  PRINT "  0=unknown"
  PRINT " -6=probably not"
  PRINT " -8=almost certainly not"
  PRINT "-10=definitely not"
  PRINT:PRINT "TO WHAT DEGREE DO YOU BELIEVE THAT"
  PRINT "The animal ";MESSAGE$;"?":PRINT
  PRINT "TYPE NUMBER AND PRESS RETURN KEY,"
  PRINT "OR TYPE 'why?' AND PRESS RETURN KEY"
  INPUT HUMAN.INPUT$
  HUMAN.INPUT$=UCASE$(HUMAN.INPUT$)
  IF HUMAN.INPUT$="WHY" OR HUMAN.INPUT$="WHY?" THEN GOSUB Explain.why:GOTO 4156
  I=VAL(HUMAN.INPUT$)
  IF -10>I OR I>10 THEN GOTO 4160
  I=I/10:OUTPUT.CF(current.fact)=I
  IF -.8>I OR I>.8 THEN leave=yes:GOSUB Deduce
  RETURN

Test.for.a.positive.number:
  WHICH.EQ$="NEGATIVE"
  FOR branch=1 TO number.of.or.clause.components
    number.to.test=OUTPUT.CF(OR.COMPONENT(branch))
    IF number.to.test>0 THEN WHICH.EQ$="POSITIVE":branch=number.of.or.clause.components
  NEXT branch
  RETURN

Trim.to.zero:
  IF -.2<=new.cf AND new.cf<=.2 THEN
    new.cf=0
  ELSEIF new.cf>=.8 THEN
    new.cf=1
  ELSEIF new.cf<=-.8 THEN
    new.cf=-1
  END IF
  RETURN
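
For comparison, the AND-clause computation (lowest component CF, times the clause's attenuation factor, then trimmed) can be sketched in C. This is a restatement under assumptions, not code from the listing: `trim_to_zero` and `combine_and` are hypothetical names, and the thresholds are the .2 and .8 values used by Trim.to.zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: snap a CF to 0, 1, or -1 using the listing's
 * thresholds; values between the thresholds pass through unchanged. */
double trim_to_zero(double cf)
{
    if (cf >= -0.2 && cf <= 0.2)
        return 0.0;
    if (cf >= 0.8)
        return 1.0;
    if (cf <= -0.8)
        return -1.0;
    return cf;
}

/* Hypothetical helper: CF of an AND clause. The weakest component
 * decides the outcome, scaled by the clause's attenuation factor. */
double combine_and(const double *cf, size_t n, double at_factor)
{
    size_t i;
    double lowest;

    assert(cf != NULL && n > 0);

    lowest = cf[0];
    for (i = 1; i < n; i++)
        if (cf[i] < lowest)
            lowest = cf[i];

    return trim_to_zero(lowest * at_factor);
}
```

Taking the minimum means a single doubtful component caps the whole clause, which is why Find.lowest.cf.branch runs before the multiplication.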

REM ***DEDUCTIVE ROUTINES FOLLOW***

Prove.albatross:
  current.fact=albatross:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Prove.bird:GOSUB Prove.flies.well
  number.of.and.clause.components=2
  AND.COMPONENT(1)=bird:AND.COMPONENT(2)=flies.well
  at.factor.for.and.clause=1
  GOSUB Compute.and.clause.cf
  GOSUB Deduce
  RETURN

Prove.penguin:
  current.fact=penguin:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Prove.bird:GOSUB Prove.cannot.fly
  GOSUB Prove.black.and.white:GOSUB Prove.swims
  number.of.and.clause.components=4
  AND.COMPONENT(1)=bird:AND.COMPONENT(2)=cannot.fly
  AND.COMPONENT(3)=black.and.white:AND.COMPONENT(4)=swims
  at.factor.for.and.clause=.8
  GOSUB Compute.and.clause.cf
  GOSUB Deduce
  RETURN

Prove.ostrich:
  current.fact=ostrich:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Prove.bird:GOSUB Prove.cannot.fly
  GOSUB Prove.black.and.white:GOSUB Prove.long.neck
  number.of.and.clause.components=4
  AND.COMPONENT(1)=bird:AND.COMPONENT(2)=cannot.fly
  AND.COMPONENT(3)=black.and.white:AND.COMPONENT(4)=long.neck
  at.factor.for.and.clause=.85
  GOSUB Compute.and.clause.cf
  GOSUB Deduce
  RETURN

Prove.zebra:
  current.fact=zebra:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Prove.ungulate:GOSUB Prove.black.stripes
  number.of.and.clause.components=2
  AND.COMPONENT(1)=ungulate:AND.COMPONENT(2)=black.stripes
  at.factor.for.and.clause=.8
  GOSUB Compute.and.clause.cf
  GOSUB Deduce
  RETURN

Prove.giraffe:
  current.fact=giraffe:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Prove.ungulate:GOSUB Prove.long.neck
  GOSUB Prove.long.legs:GOSUB Prove.dark.spots
  number.of.and.clause.components=4
  AND.COMPONENT(1)=ungulate:AND.COMPONENT(2)=long.neck
  AND.COMPONENT(3)=long.legs:AND.COMPONENT(4)=dark.spots
  at.factor.for.and.clause=.85
  GOSUB Compute.and.clause.cf
  GOSUB Deduce
  RETURN

Prove.tiger:
  current.fact=tiger:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Prove.mammal:GOSUB Prove.carnivore
  GOSUB Prove.black.stripes:GOSUB Prove.tawny.color
  number.of.and.clause.components=4
  AND.COMPONENT(1)=mammal:AND.COMPONENT(2)=carnivore
  AND.COMPONENT(3)=black.stripes:AND.COMPONENT(4)=tawny.color
  at.factor.for.and.clause=.95
  GOSUB Compute.and.clause.cf
  GOSUB Deduce
  RETURN

Prove.cheetah:
  current.fact=cheetah:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Prove.mammal:GOSUB Prove.carnivore
  GOSUB Prove.tawny.color:GOSUB Prove.dark.spots
  number.of.and.clause.components=4
  AND.COMPONENT(1)=mammal:AND.COMPONENT(2)=carnivore
  AND.COMPONENT(3)=tawny.color:AND.COMPONENT(4)=dark.spots
  at.factor.for.and.clause=.95
  GOSUB Compute.and.clause.cf
  GOSUB Deduce
  RETURN

Prove.flies.well:
  current.fact=flies.well:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.swims:
  current.fact=swims:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.black.and.white:
  current.fact=black.and.white:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.cannot.fly:
  current.fact=cannot.fly:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.long.neck:
  current.fact=long.neck:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.black.stripes:
  current.fact=black.stripes:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN




Prove.long.legs:
  current.fact=long.legs:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.dark.spots:
  current.fact=dark.spots:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.tawny.color:
  current.fact=tawny.color:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.bird:
  current.fact=bird:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Prove.feathers:GOSUB Prove.flies.and.lays.eggs
  number.of.or.clause.components=2
  OR.COMPONENT(1)=feathers
  AT.FACTOR.FOR.OR.COMPONENT(1)=1
  OR.COMPONENT(2)=flies.and.lays.eggs
  AT.FACTOR.FOR.OR.COMPONENT(2)=.8
  GOSUB Compute.or.clause.cf
  GOSUB Deduce
  RETURN

Prove.ungulate:
  current.fact=ungulate:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Prove.mammal.and.hoofs
  GOSUB Prove.mammal.and.chews.cud
  number.of.or.clause.components=2
  OR.COMPONENT(1)=mammal.and.hoofs
  AT.FACTOR.FOR.OR.COMPONENT(1)=.85
  OR.COMPONENT(2)=mammal.and.chews.cud
  AT.FACTOR.FOR.OR.COMPONENT(2)=.8
  GOSUB Compute.or.clause.cf
  GOSUB Deduce
  RETURN

Prove.carnivore:
  current.fact=carnivore:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Prove.eats.meat:GOSUB Prove.teeth.claws.eyes
  number.of.or.clause.components=2
  OR.COMPONENT(1)=eats.meat
  AT.FACTOR.FOR.OR.COMPONENT(1)=.85
  OR.COMPONENT(2)=teeth.claws.eyes
  AT.FACTOR.FOR.OR.COMPONENT(2)=1
  GOSUB Compute.or.clause.cf
  GOSUB Deduce
  RETURN

Prove.mammal:
  current.fact=mammal:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Prove.has.hair:GOSUB Prove.gives.milk
  number.of.or.clause.components=2
  OR.COMPONENT(1)=has.hair:AT.FACTOR.FOR.OR.COMPONENT(1)=.85
  OR.COMPONENT(2)=gives.milk:AT.FACTOR.FOR.OR.COMPONENT(2)=.8
  GOSUB Compute.or.clause.cf
  GOSUB Deduce
  RETURN

Prove.has.hair:
  current.fact=has.hair:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.gives.milk:
  current.fact=gives.milk:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.eats.meat:
  current.fact=eats.meat:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.teeth.claws.eyes:
  current.fact=teeth.claws.eyes:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Prove.pointed.teeth:GOSUB Prove.claws
  GOSUB Prove.front.eyes
  number.of.and.clause.components=3
  AND.COMPONENT(1)=pointed.teeth:AND.COMPONENT(2)=claws
  AND.COMPONENT(3)=front.eyes
  at.factor.for.and.clause=.85
  GOSUB Compute.and.clause.cf
  GOSUB Deduce
  RETURN

Prove.mammal.and.hoofs:
  current.fact=mammal.and.hoofs:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Prove.mammal:GOSUB Prove.hoofs
  number.of.and.clause.components=2
  AND.COMPONENT(1)=mammal:AND.COMPONENT(2)=hoofs
  at.factor.for.and.clause=.8
  GOSUB Compute.and.clause.cf
  GOSUB Deduce
  RETURN

Prove.mammal.and.chews.cud:
  current.fact=mammal.and.chews.cud:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Prove.mammal:GOSUB Prove.chews.cud
  number.of.and.clause.components=2
  AND.COMPONENT(1)=mammal:AND.COMPONENT(2)=chews.cud
  at.factor.for.and.clause=.8
  GOSUB Compute.and.clause.cf
  GOSUB Deduce
  RETURN

Prove.feathers:
  current.fact=feathers:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.flies.and.lays.eggs:
  current.fact=flies.and.lays.eggs:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Prove.flies:GOSUB Prove.lays.eggs
  number.of.and.clause.components=2
  AND.COMPONENT(1)=flies:AND.COMPONENT(2)=lays.eggs
  at.factor.for.and.clause=1
  GOSUB Compute.and.clause.cf
  GOSUB Deduce
  RETURN

Prove.lays.eggs:
  current.fact=lays.eggs:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.flies:
  current.fact=flies:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.chews.cud:
  current.fact=chews.cud:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.hoofs:
  current.fact=hoofs:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.front.eyes:
  current.fact=front.eyes:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.claws:
  current.fact=claws:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN

Prove.pointed.teeth:
  current.fact=pointed.teeth:GOSUB Test.fact.for.human.input
  IF leave=yes THEN RETURN
  GOSUB Deduce
  RETURN


End  Listing 
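
The Prove routines in Listing One amount to a recursive walk over an AND/OR tree. The C sketch below shows that walk in isolation, as an illustration under assumptions rather than a translation: the `struct node` layout, `MAX_CHILDREN`, and the function names are hypothetical, leaf CFs are supplied directly instead of being asked of the user, and the listing's extra step of asking about each intermediate fact first (and stopping when the answer is beyond .8 or -.8) is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4

enum kind { LEAF, AND_NODE, OR_NODE };

struct node {
    enum kind kind;
    double cf;                              /* LEAF: answer scaled to -1..1    */
    double at_factor;                       /* AND_NODE: clause attenuation    */
    size_t nchild;
    const struct node *child[MAX_CHILDREN];
    double child_factor[MAX_CHILDREN];      /* OR_NODE: attenuation per branch */
};

/* Same thresholds as Trim.to.zero */
static double trim(double cf)
{
    if (cf >= -0.2 && cf <= 0.2) return 0.0;
    if (cf >= 0.8)  return 1.0;
    if (cf <= -0.8) return -1.0;
    return cf;
}

/* Compute the CF of a node from the CFs of its children. */
double prove(const struct node *n)
{
    double v[MAX_CHILDREN], result, product = 1.0;
    size_t i;
    int any_positive = 0;

    assert(n != NULL && n->nchild <= MAX_CHILDREN);
    if (n->kind == LEAF)
        return n->cf;
    assert(n->nchild > 0);

    for (i = 0; i < n->nchild; i++)
        v[i] = prove(n->child[i]);

    if (n->kind == AND_NODE) {              /* lowest branch times factor */
        result = v[0];
        for (i = 1; i < n->nchild; i++)
            if (v[i] < result)
                result = v[i];
        return trim(result * n->at_factor);
    }

    /* OR_NODE: attenuate each branch, then pick the equation by sign */
    for (i = 0; i < n->nchild; i++) {
        v[i] = trim(v[i] * n->child_factor[i]);
        if (v[i] > 0.0)
            any_positive = 1;
    }
    for (i = 0; i < n->nchild; i++)
        product *= any_positive ? 1.0 - v[i] : 1.0 + v[i];
    return trim(any_positive ? 1.0 - product : -1.0 + product);
}
```

Building the albatross rule as AND(bird, flies well) with bird as OR(feathers, flies and lays eggs) and calling `prove` on the root reproduces the shape of Prove.albatross without the GOSUB bookkeeping.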




Dr.  Dobb's  Journal,  April  1987 


C  CHEST 


Listing  Nineteen  (Text  begins  on  page  130.) 


#include <stdio.h>
#include <hash.h>
#include "nr.h"

/*
 * NRMAC.C: macro, diversion, & trap support for nr
 *
 * Copyright (c) 1987, Allen I. Holub.
 *
 * This module holds routines for manipulating and
 * accessing macros and strings. In addition the line
 * trap mechanism is implemented here.
 *
 * Macros are kept in a hash table. If they are smaller
 * than MAXMBUF characters, they are stored in memory; else
 * they're stored on the disk. When a macro gets too big it
 * is put into a file called XXYY.mac where XX is the first
 * character in the name represented as two hex digits, YY
 * is the second character.
 *
 * VERT is the height of the current diversion, accessible as
 * \n(.v  It is used three places other than this module:
 *
 *   1) It's incremented after every line is output in nrout.c
 *   2) It's used to spring diversion traps in nrout.c
 *   3) It's used when setting a diversion trap in nrprocs.c
 *
 * Its value is equal to \n(nl if no diversion is active.
 * HEIGHT is the height of the most recently completed
 * diversion, accessible as \n(dn. It's set equal to VERT
 * when a diversion is closed. It's not used anywhere by nr.
 */


/* MACRO-RELATED DEFINES: */

typedef struct _macro_
{
    char  mode;   /* Open mode: 'r' 'w' 'a' 0=not open        */
    FILE  *fd;    /* Fd of macro on disk or 0 if in memory    */
    char  *buf;   /* Pointer to buffer or 0 if macro on disk  */
    char  *ptr;   /* If buf valid, pointer to the             */
                  /* last valid character, (current char if  */
                  /* writing).                                */

    int   vert;   /* Macro size in lines and macro width in   */
    int   width;  /* characters. These fields are only        */
                  /* defined if the macro is a diversion.     */
}
MACRO;


/*global*/ int    mcreate     (char *, char *);
/*global*/ int    mappend     (char *, char *);
/*global*/ void   mac_clean   (void);
/*global*/ int    screate     (char *, char *);
/*global*/ int    sappend     (char *, char *);
/*global*/ int    printm      (void);

/* DIVERSION-RELATED */

/*global*/ int    dcreate     (char *);
/*global*/ int    dappend     (char *);
/*global*/ int    endiv       (void);

/* TRAP-RELATED */

/*global*/ int    setlinetrap (char *, int);
/*global*/ int    movetrap    (char *, int, int);
/*global*/ int    pr_traps    (void);
/*global*/ int    do_divtrap  (void);
/*global*/ int    do_linetrap (int);
/*global*/ int    distance    (void);

/* USED LOCALLY */

/*local */ void   delm        (char*, MACRO*);
/*local */ char   *fname      (char*);
/*local */ void   swrite      (MACRO*, char*);
/*local */ void   prnt        (char*, MACRO*);
/*local */ int    pushdiv     (MACRO*);
/*local */ MACRO  *popdiv     (void);
/*local */ int    findtrap    (char*);

static char *fname( name )
char *name;
{
    /* Create a unique file name for a macro temporary file.
     * The name is volatile (it won't be preserved between
     * fname() calls). If a TMP environment exists, it's
     * prefixed to the name.
     */

    static char buf[ 80 ];
    char        *env;

    if( !(env = getenv("TMP")) )
        env = "";

    sprintf(buf, "%s%02x%02x.mac", env, name[0] & 0xff,
                                        name[1] & 0xff);
    return buf;
}
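
The naming scheme is easy to check in isolation. The sketch below is a hypothetical variant, not the listing's `fname()`: it takes the prefix as a parameter instead of reading the TMP environment variable, writes into a caller-supplied buffer rather than a static one, and uses C99 `snprintf`. The name `macro_file_name` is an assumption made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical re-entrant variant of the naming scheme: the first two
 * characters of the macro name become two hex digits each, followed
 * by ".mac". A one-character name yields "00" for the second pair,
 * because the terminating null is formatted as the second character. */
void macro_file_name(char *out, size_t outsize,
                     const char *prefix, const char *macro)
{
    assert(out != NULL && prefix != NULL);
    assert(macro != NULL && macro[0] != '\0');

    snprintf(out, outsize, "%s%02x%02x.mac", prefix,
             (unsigned)(macro[0] & 0xff),
             (unsigned)(macro[1] & 0xff));
}
```

On an ASCII system a macro named PP maps to 5050.mac. Encoding the characters as hex keeps the file name legal even when the macro name contains characters that the file system would reject.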


typedef UCHAR LTRAP[4];

static LTRAP     Linetrap[ MAXLTRAP+1 ];
static HASH_TAB  *Macros = NULL;


/* DIVERSION-RELATED DEFINES: */

typedef struct
{
    FILE  *ofile;          /* Output file of previous level  */
    int   isdiv;           /* 1 if file is a diversion       */
    int   divtrap;         /* Diversion trap                 */
    char  dtrap_name[2];   /* Invoke when divtrap reached    */

    int   width;           /* width of current diversion.    */
    int   vert;            /* Vertical place in current div  */
}
DIV;

#define MAXDIV 8           /* Maximum diversion nesting level */

DIV  Dstack[MAXDIV];       /* Diversion environment stack     */
int  Dsp = MAXDIV;         /* Diversion stack pointer (index) */


/* Function prototypes for external routines not declared
 * in a .h file
 */

extern void  err        ( char*, ... );                  /* nrout.c */
extern int   getline    ( char*, int, int (*)() );       /* nrinp.c */
extern void  process    ( FILE*, char*, int, char** );   /* nrinp.c */
extern char  *skipspace ( char*, int );
extern char  *skipto    ( int, char*, int );
extern char  *getenv    ( char* );

/*----------------------------------------------------------------------*/
/* Function prototypes for routines in this module */

/* MACRO-RELATED (primitives): */

/*global*/ int    mgetc        (MACRO *);
/*global*/ void   mwrite       (MACRO *, char *);
/*global*/ void   mputc        (int, MACRO *);
/*global*/ void   munlink      (char *);
/*global*/ MACRO  *mopen       (char*, char*);
/*global*/ void   mclose       (MACRO*);

/* MACRO-RELATED (high level): */

/*global*/ char   *expandstr   (char *, char *, int);
/*global*/ int    expand_macro (char *);

MACRO *mopen( m_name, how )
char *m_name;
char *how;
{
    /* Open the macro "m_name" in the specified mode. Mode
     * may be "r" "w" or "a". If the macro doesn't exist it
     * is created. An open for write will delete the contents
     * of the macro if they exist. Only the first two
     * characters of the name mean anything. Return 0 on
     * error or a pointer to the macro on success.
     */

    register int    existing ;
    register MACRO  *pnode ;
    char            *name ;

    if( *m_name == '\0' )
        return( (MACRO *)0 );

    if( !Macros )                    /* Create macro table    */
        Macros = maketab( 127 );     /* if it doesn't exist.  */

    name = fname( m_name );          /* Convert macro name to */
                                     /* associated file name. */

    pnode    = (MACRO *) findsym(Macros, m_name);
    existing = pnode != NULL;

    if( !pnode )
    {
        pnode = (MACRO*) addsym(Macros, m_name, sizeof(MACRO));
    }
    else if( pnode->mode )
    {
        err("May not access .%2s macro recursively\n", m_name);
        return( (MACRO *)0 );
    }

switch (  pnode->raode  -  *how  ) 

{ 

case  'a': 

if(  existing  ££  pnode->fd  ) 

pnode->fd  -  fopen(  name,  "ab"  ); 


case  *w ' : 

if(  existing  ) 

/*  If  the  macro  already  exists,  truncate  it* 

*  buffer,  or  buffer  file,  to  zero 

*  length. 

*/ 

if(  pnode->fd  ) 

pnode->fd  -  fopen(  name,  “w"  ); 



else  if(  pnode->buf  ) 


free (  pnode->buf  ) ; 
pnode->buf  -  0  ; 

pnode->ptr  -  "M  ; 


) 

break; 


if{  ! existing  ) 

{ 

delsyra (  Macros,  (BUCKET  *)  pnode  ); 
return (  (MACRO  *)0  ); 


Position  the  buffer  (or  file)  pointer  to 
beginning  of  buffer  (or  file) . 


if(  pnode- >buf  ) 

pnode->ptr  -  pnode->buf  ; 

else  if(  pnode->fd  ) 

( 

if(  ! (pnode->fd  -  fopen(  name, 

err ("Can't  open  macro  file;  <%s>\n",  name); 
return (  (MACRO  *)0  ); 

} 

) 

break; 


")  )) 


default: 


err ("Internal  errror: 
return (  (MACRO  *)0  ); 

return (  pnode  ) ; 

) 

bad  mopen  mode\n"); 

void mclose( mptr )
MACRO *mptr;
{
    if( mptr )
    {
        if( mptr->fd )
        {
            fclose( mptr->fd );     /* Close the file */
        }
        mptr->mode = 0;
    }
}

int mgetc( mptr )
MACRO *mptr;
{
    /* Read from a macro opened with a previous mopen call.
     * mgetc is called by getline which is called by
     * process(). It's also called by expandstr() which may
     * be called while expanding a macro. Bunches of
     * recursion, use a big stack. Return EOF at end of macro.
     */

    register int rval;

    if( mptr->buf )
        rval = *mptr->ptr ? (int)( *(mptr->ptr)++ ) : EOF ;
    else if( mptr->fd )
        rval = ( mptr->fd ) ? getc( mptr->fd ) : EOF ;
    else
        rval = EOF ;

    return rval;
}

/*----------------------------------------------------------------------*/

static void swrite( mptr, buf )
MACRO *mptr;
char  *buf;
{
    /* Write into a string. buf is a pointer to the
     * string itself and mptr is a pointer to the macro.
     */

    while( *buf )
        mputc( *buf++, mptr );

    mputc( 0, mptr );
}

/*----------------------------------------------------------------------*/

void mwrite( mptr, term )
MACRO *mptr;
char  *term;
{
    /* Write into a macro. Input is taken from the
     * current input file and put into the macro until
     * a line starting with "term" is encountered. Note
     * that the input is not modified (escape sequences
     * are not expanded, etc.). Terminate on end of file
     * or on encountering term at the beginning of a
     * line. Mwrite is also used by the .ig command to
     * ignore input text. To do this, set mptr to 0.
     */

    char buf[ MAXSTR ], *bp ;
    int  not_eof ;

    while( not_eof = getline(buf, 1, Ismacro ? mgetc : fgetc) )


if(  !  ‘terra  ) 


( 


} 


if(  buf (0]  —  • 
break; 


/•  Terminate  on 
'  ((  bu  f ( 1 ]  — 


at  bol 
*  ) 


) 


else  if(  ‘term  —  '  \n*  ) 

{ 

if(  !*buf  )  /•  Terminate  on  ah  empty  line  •/ 

break; 

> 

else  if (  buf [01  —  S4  buf (X )  —  term[0)  u  buf [2]— terrain 
break; 


if(  Iraptr  )  /•  For  the  .ig  command.  Ignore  */ 

continue;  /*  input.  * / 

for (  bp  -  buf  ;  *bp  ;  mputc (  *bp++,  mptr  )  ) 

mputc(  ' \n',  mptr  );  /•  Getline  doesn't  buffer  LF  */ 


if(  not_eof  ) 


( 


) 

else 

< 


if  (  mptr  ) 

mputc (  0,  mptr  ); 


err ("EOF  encountered  while  writing  to  %s  macro\n", 
exit(1).  syraname (mptr ) ) ; 


) 

/*----------------------------------------------------------------------*/

void mputc( c, mptr )
MACRO *mptr;
{
    char *name;

    if( mptr->fd )
    {
        /* Macro is on the disk:
         * Don't write a terminating null to a file. This
         * simplifies our life when we append to a macro
         * in a file.
         */

        if( c )
            putc( c, mptr->fd );
    }
    else if( mptr->buf )
    {
        /* Macro is in memory: Write the character into
         * the buffer. Don't increment when we write a null
         * to make appending easier.
         */

        if( mptr->ptr - mptr->buf < MAXMBUF )
        {
            if( c )
                *(mptr->ptr)++ = c ;
            else
                *mptr->ptr = 0 ;
        }
        else
        {
            /* Macro has grown too large. Create a disk
             * file and write it out to there. Free the
             * memory previously used by the macro.
             */

            name = fname( symname(mptr) );

            if( !(mptr->fd = fopen(name, "w")) )
            {
                err("Can't open temporary macro file <%s>\n",
                                                        name );
            }
            else
            {
                fwrite( mptr->buf, MAXMBUF, 1, mptr->fd );
                free( mptr->buf );
                mptr->buf = mptr->ptr = 0 ;

                if( c )
                    putc( c, mptr->fd );
            }
        }
    }
    else
    {
        /* New macro, allocate a buffer then write */

        if( mptr->buf = (char *) malloc( MAXMBUF ) )
        {
            mptr->ptr = mptr->buf ;

            if( c )
                *(mptr->ptr)++ = c ;
            else
                *mptr->ptr = 0;
        }
        else
            err("Insufficient memory for macro\n");
    }
}

/*----------------------------------------------------------------------*/

void munlink( m_name )
char *m_name;
{
    /* Remove macro "m_name" if it exists. Return 0 if
     * the register didn't exist; return 1 if it was
     * removed successfully.
     */

    register MACRO *node;

    if( !(node = (MACRO *) findsym( Macros, m_name )) )
        err("Macro <%2.2s> doesn't exist\n", m_name );
    else
    {
        if( node->mode )
        {
            err("May not remove active macro, aborting\n");
            exit( 1 );
        }

        if( node->buf )
            free( node->buf );
        else if( node->fd )
            unlink( fname( m_name ) );

        delsym( Macros, (BUCKET *) node );
    }
}
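
The buffer-then-file strategy in mputc() can be tried on its own. What follows is a minimal sketch under assumptions, not the module's code: `struct spill`, `spill_putc`, and `SPILL_CAP` are hypothetical names, the capacity is deliberately tiny, and the standard `tmpfile()` stands in for the XXYY.mac file that nr creates by name.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SPILL_CAP 16    /* deliberately tiny; stands in for MAXMBUF */

struct spill {
    char   *buf;        /* in-memory text, or NULL once spilled   */
    size_t len;         /* characters currently held in buf       */
    FILE   *fp;         /* temporary file, or NULL while in memory */
};

/* Append one character. Returns 0 on success, -1 on failure. */
int spill_putc(struct spill *s, int c)
{
    assert(s != NULL);

    if (s->fp)                          /* already on disk */
        return putc(c, s->fp) == EOF ? -1 : 0;

    if (!s->buf) {                      /* first write: allocate */
        s->buf = malloc(SPILL_CAP);
        if (!s->buf)
            return -1;
        s->len = 0;
    }

    if (s->len < SPILL_CAP) {           /* still fits in memory */
        s->buf[s->len++] = (char)c;
        return 0;
    }

    /* Buffer full: copy it to a temporary file and continue there. */
    s->fp = tmpfile();
    if (!s->fp)
        return -1;
    if (fwrite(s->buf, 1, s->len, s->fp) != s->len)
        return -1;
    free(s->buf);
    s->buf = NULL;
    s->len = 0;
    return putc(c, s->fp) == EOF ? -1 : 0;
}
```

Small macros never touch the disk under this scheme, and a large one pays the file cost only once, at the moment it outgrows the buffer.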

/*----------------------------------------------------------------------*/

char *expandstr( name, target, maxstr )
char *name, *target;
{
    /* Expand str into the target string. Return the updated
     * target pointer, which won't be modified if the string
     * doesn't exist. In this last case print an error message.
     * Expand at most maxstr characters. Note that there's
     * an indirect recursion if the expanded string contains
     * an escape sequence. expandstr() is called from
     * escape().
     */

    register MACRO  *mptr;
    register int    c ;
    char            *p ;

    if( mptr = (MACRO *) mopen(name, "r") )
    {
        c = mgetc(mptr) ;

        while( c != EOF && maxstr > 0 )
        {
            if( c != Esc )
            {
                *target++ = c ;
                c = mgetc(mptr) ;
                --maxstr;
            }
            else
            {
                p = target;
                c = escape(p, &target, 0, mgetc, mptr, maxstr);
                maxstr -= (target - p) ;
            }
        }

        mclose( mptr ) ;
    }

    return target;
}

/ - - - - 

expand  macro (  str  ) 
char  *str; 

1  /•  Expand  the  macro.  The  first  word  in  str  is  the  name. 

*  Note  that  expand  is  called  by  process ()  which  calls 

*  expand.  There  can  be  some  nasty  recursion 

*  going  on  if  macro  expansion  is  nested  too  far.  On  the 

*  other  hand  the  code  needed  to  expand  nested  macros 

*  is  much  cleaner.  MAX NEST  will  help  this  a  little. 

*  Return  0  if  the  macro  doesn't  exist  of  it  the  nest 

*  level  is  too  high,  1  otherwise. 

*  This  routine  is  called  recursively  in  the  case  of 

*  nested  macro  expansions.  Be  careful  with  static 

*  variables. 


register  int 
register  MACRO 


i,  onargs; 
•mptr; 


(continued  on  next  page) 


Dr.  Dobb's  Journal,  April  1987 


89 

301 


C  CHEST 


Listing  Nineteen 

(Listing  continued,  text  begins  on  page  130.) 

char  ‘raacv [MAXARGS ] ,  ‘name  ; 

static  nestlev  -  0  ; 

if<  nestlev  <  MAXNEST  ) 

++nestlev; 

else 


register  MACRO  ‘mptr; 

if(  mptr  -  mopen {  name,  "a"  )  ) 
( 

mvrite (  mptr,  term  ) ; 
ncloset  mptr  ); 


err (  “Macro  nesting  too  deep,  ignoring  <%s>\n",  str) ; 
return  0; 


name  -  str;  /•  extract  name  */ 

str  -  skipto(  •  • ,  str.  Esc  );  /*  Skip  past  name  */ 

if(  ‘str  ) 

*str++  -  '\0';  /*  terminate  name  £  */ 

str  -  skips pace ( str.  Esc);  /‘  skip  whitespace  ‘/ 

if(  !(  mptr  -  (MACRO  *)  mopentnarae,  "r")  )  ) 
return  0; 

/*  Create  the  vector  array  pointing  into  the  argument 

*  array.  Null  terminate  each  argument  string.  Quoted 

*  arguments  are  recognized. 

*/ 

for (  i  -  0  ;  “Str  ((  i  <  MAXARGS  ;  i++  ) 

( 

if(  ‘str  —  ,M'  ) 

{ 

macv[i]  r  ++str  ; 

str  -  skipto(  str.  Esc  ); 


macv[i]  -  str  ; 

str  -  8kipto(  '  ',  str.  Esc  ); 


screate(name,  str) 
char  ‘name,  »str; 


Create  a  string.  If  it  exist,  delete  it 


register  MACRO  ‘mptr; 

if (  mptr  -  mopen {  name,  “v"  )  ) 

( 

swrite(  mptr,  str  ); 
mclose(  mptr  ); 


s append (name,  str) 
char  ‘name,  ‘str; 


/*  Append  to  an  existing  string.  If  it  doesn't 
*  exist,  create  it. 

*/ 

register  MACRO  ‘mptr; 

if(  mptr  -  mopen (  name,  "a"  )  ) 

I 

swrite (  mptr,  str  ); 
mclose(  mptr  ); 


•str++  -  0; 

str  -  skipspace (str.  Esc); 


onargs  —  NARGS;  /•  set  #  args  at  current  lev  */ 

NARGS  -  i; 

while (  i  <  MAXARGS  )  /*  put  null  in  unused  args  */ 

macv(i++)  -  ""  ; 

process  (  (FILE  •)  mptr,  symname (mptr) ,  1,  roacv  ); 

NARGS  -  onargs;  /*  clean  up  * / 

— nestlev; 
mclose(  mptr  ); 
return  1; 


dump_mac( macro, file )
char *macro, *file;

{

/* Dump the indicated macro out to the indicated file. */


MACRO *mptr;

FILE *fp;

register int c;

if( !(fp = fopen(file, "w")) )

err("Can't open <%s> for output\n", file );

else if( mptr = (MACRO *) mopen(macro, "r") )

while( (c = mgetc(mptr)) != EOF )
putc(c, fp);

mclose( mptr );
fclose( fp );


static void prnt( m_name, p )
char *m_name ;

MACRO *p;

FILE *stream = NULL;

int len;

char str[80];

if( p->buf )

len = strlen( p->buf );

else if( p->fd )

{

if( !(stream = fopen( fname(m_name), "rb" )) )

err("Can't open %s\n", str );
return;


len = filelength( fileno(stream) ) ;

printf("+------<%s> (%d chars in %s)------+\n",

m_name, len, stream ? "file" : "memory" );

#ifdef DEBUG

printf(" | mode=0x%x=<%c>, buf=0x%x, ptr=0x%x, fd=0x%x\n",
p->mode, p->mode, p->buf, p->ptr, p->fd );

printf("+------\n" ) ;


if( !stream )

fputs( p->buf, stdout );

else

{

while( fgets(str, 80, stream) )
fputs( str, stdout );

fclose(stream);


mcreate( name, term )
char *name, *term;


putchar('\n');


/* Create a macro. If it already exists, delete it
 * first.
 */

register MACRO *mptr;

if( mptr = mopen( name, "w" ) )

{

mwrite( mptr, term );
mclose( mptr );

}


mappend( name, term )
char *name, *term;

{ 

/*  Append  to  an  existing  macro.  Create  it  if  it 
*  doesn't  exist. 

*/ 


printm()

/* Print out all the macros */

register int lev;

if( !Macros )

printf("*** There are no macros ***\n");

else

{

ptab( Macros, prnt );

printf( "\nThe end macro is <%s>\n",

*Endm ? Endm : "NONEXISTANT" );

}

static void delm( m_name, p )
char *m_name;
MACRO *p;



Dr.  Dobb's  Journal,  April  1987 


C  CHEST 


Listing  Nineteen 


/*  Delete  disk  file  associated  with  macro  */ 

if(  p->fd  ) 

unlink( fname( m_name ) );


register MACRO *mptr;

if( mptr = mopen( name, "w" ) )

{

if( !pushdiv(mptr) )
mclose( mptr );

VERT = 1;       /* Vertical place in current div */

Divwidth = 0;   /* Width of current diversion */


mac_clean()

/* Delete all macros that are on the disk */

if( Macros )

ptab( Macros, delm );


/* Stuff to handle diversions. It is too complicated to
 * use recursion here, mostly because you can be processing
 * a macro while you are diverting output (and have several
 * levels of nesting to boot). Modifying process() to handle
 * changes in both input and output proved to be too
 * difficult. The easy thing to do is maintain a special
 * diversion stack on which we keep the various globals we
 * want to save when we change diversions. Pushdiv() and
 * popdiv() (below) do the stack maintenance.
 */
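The save-and-restore idea described in the comment above can be sketched in isolation. This is a minimal illustration, not the listing's code: `OutState`, `Cur`, `push_state()`, `pop_state()` and the limit `MAXDIV = 4` are all hypothetical stand-ins for the formatter's globals and its DIV stack. Like the listing, the stack grows downward and an index equal to the limit means "empty".

```c
#include <assert.h>
#include <stddef.h>

#define MAXDIV 4                  /* hypothetical nesting limit */

/* Hypothetical stand-in for the globals saved per diversion. */
typedef struct {
    void *ofile;                  /* current output destination   */
    int   isdiv;                  /* nonzero while diverting      */
    int   vert;                   /* vertical place in diversion  */
    int   width;                  /* width of current diversion   */
} OutState;

static OutState Cur = { NULL, 0, 1, 0 };
static OutState Stack[MAXDIV];
static int      Sp = MAXDIV;      /* grows downward; MAXDIV == empty */

/* Save the current state and start diverting to dest.
 * Returns 1 on success, 0 if nesting is too deep. */
int push_state(void *dest)
{
    if (Sp <= 0)
        return 0;
    Stack[--Sp] = Cur;
    Cur.ofile = dest;
    Cur.isdiv = 1;
    Cur.vert  = 1;
    Cur.width = 0;
    return 1;
}

/* Restore the state saved by the matching push_state().
 * Returns the destination that was active, or NULL if none. */
void *pop_state(void)
{
    void *old;

    if (Sp >= MAXDIV)
        return NULL;
    old = Cur.ofile;
    Cur = Stack[Sp++];
    assert(Sp <= MAXDIV);
    return old;
}
```

Keeping the saved values in one structure means a single assignment restores everything, which is the main attraction of an explicit stack over threading the state through recursive calls.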


static int pushdiv( ndptr )

MACRO *ndptr;

{

register DIV *div;

if( Dsp <= 0 )

{

err("Diversion nesting too deep\n");
return 0;

}

div = &Dstack[ --Dsp ];

#ifdef DEBUG

printf("Opening diversion, saving at Dstack[%d]\n",

Dsp);

#endif


div->ofile         = Ofile ;
div->isdiv         = Isdiv ;
div->divtrap       = Divtrap;
div->dtrap_name[0] = Dtrap_name[0];
div->dtrap_name[1] = Dtrap_name[1];
div->vert          = VERT;
div->width         = Divwidth;

Divwidth      = 0;              /* Width of current diversion  */
Ofile         = (FILE *) ndptr; /* output macro pointer        */
Isdiv         = 1 ;             /* Ofile points at a macro     */
Divtrap       = -1 ;            /* No diversion trap set       */
Dtrap_name[0] = 0 ;             /* Diversion trap has no name  */


dappend( name )
char *name;


/* Open an existing diversion for appending. The
 * current diversion height and width number registers
 * will be changed to reflect this diversion.
 */

register MACRO *mptr;

if( mptr = mopen( name, "a" ) )

{

if( !pushdiv(mptr) )

mclose( mptr );

VERT = mptr->vert + 1 ;

Divwidth = mptr->width ;


/* Close the most recently opened diversion.
 * We must decrement VERT because it's incremented after
 * the final \n of the diversion is processed.
 * If no diversion is active, nothing is done.
 */

register MACRO *mptr;

int height, width ;

height = VERT - 1 ;
width = Divwidth ;

if( mptr = popdiv() )

{

mputc( 0, mptr );

mptr->vert = HEIGHT = height ;
mptr->width = WIDTH = width ;

mclose( mptr );


Stuff to handle traps:


static MACRO *popdiv()

/* Restore the environment active before the most recent
 * pushdiv call. Return a pointer to the diversion macro
 * or 0 if no environment to restore.
 */

register MACRO *rval;

register DIV *div;

if( Dsp >= MAXDIV )     /* No diversion is active */

return 0;

#ifdef DEBUG

printf("Closing diversion, popping Dstack[%d]\n", Dsp);
#endif

div = &Dstack[Dsp++];

/* Put back the old environment. Note that VERT is
 * initialized to 1 and incremented after every line
 * output to the diversion. This way \n(.d will be 1
 * on line 1 of the diversion. HEIGHT, however, is the
 * height of the most recent diversion (which will be
 * one less than VERT).
 */

rval = (MACRO *) Ofile;

Ofile         = div->ofile ;
Isdiv         = div->isdiv ;
Divtrap       = div->divtrap ;
Dtrap_name[0] = div->dtrap_name[0];
Dtrap_name[1] = div->dtrap_name[1];
VERT          = div->vert;
Divwidth      = div->width;


return( rval );


static int findtrap( name )

char *name;

/* Look for the trap associated with the macro
 * "name." Return an index if found, -1 if not.
 */

register int i;

for( i = MAXLTRAP + 1 ; --i >= 0 ; )

if( !strcmp(name, Linetrap[i]) )
return i;

return -1;


set_linetrap( name, lnum )
char *name;

int lnum;

{   /* Set a line trap that will execute the macro called
     * "name" when output line number "lnum" is passed. If
     * the name is missing, clear the trap at the
     * indicated location.
     */

register UCHAR *lp;

if( lnum < 0 )

lnum += PGLEN ;

lp = (UCHAR *)( Linetrap + lnum );

if( lnum < 0 || lnum > MAXLTRAP )

err("line trap must be in the range 0 - %d\n",

MAXLTRAP );

}

else if( !*name )


dcreate( name )
char *name;


/* Create a diversion from scratch. */



{

*lp++ = *name++ ;

*lp++ = *name ;

*lp = 0 ;

}

}

/*----------------------------------------------------------*/

movetrap( name, where, isoffset )
char *name;
{

/* Deal with the .ch command. If "where" is 0
 * delete the trap else move it to the indicated
 * position. If "isoffset" then add "where" to
 * the current position. Note that it is not an
 * error to clear a non-existent trap.
 * The first set_linetrap call deletes the
 * existing trap, the second one reinstalls it
 * at the new location.
 */

register int i;

if( (i = findtrap(name)) >= 0 )

{

set_linetrap( "", i );
if( where )

set_linetrap( name, isoffset ? i+where : where );

}

}
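The set, find and move operations on the line-trap table can be sketched together as follows. This is a simplified illustration under stated assumptions: the names `trap_set()`, `trap_find()` and `trap_move()`, the value `MAXLTRAP = 66`, and the three-byte `LTRAP` layout (two name characters plus a terminator) are hypothetical, and the negative-line wraparound that the listing handles with PGLEN is left out.

```c
#include <assert.h>
#include <string.h>

#define MAXLTRAP 66                     /* hypothetical highest trap line */

typedef char LTRAP[3];                  /* two-character name plus '\0'   */
static LTRAP Linetrap[MAXLTRAP + 1];    /* indexed by output line number  */

/* Install name at line lnum; an empty name clears the slot.
 * Returns 0 on success, -1 if lnum is out of range. */
int trap_set(const char *name, int lnum)
{
    if (lnum < 0 || lnum > MAXLTRAP)
        return -1;
    if (!*name) {
        Linetrap[lnum][0] = '\0';
    } else {
        Linetrap[lnum][0] = name[0];
        Linetrap[lnum][1] = name[1];
        Linetrap[lnum][2] = '\0';
    }
    return 0;
}

/* Return the line holding the trap called name, or -1 if none. */
int trap_find(const char *name)
{
    int i;

    assert(name != NULL);
    if (!*name)                         /* empty slots must not match */
        return -1;
    for (i = MAXLTRAP; i >= 0; --i)
        if (strncmp(name, Linetrap[i], 2) == 0)
            return i;
    return -1;
}

/* Move a trap: clear the old slot, then reinstall at the new line.
 * where == 0 deletes; is_offset makes where relative to the old line.
 * Moving a trap that does not exist is not an error. */
int trap_move(const char *name, int where, int is_offset)
{
    int old = trap_find(name);

    if (old < 0)
        return 0;
    trap_set("", old);
    if (where)
        return trap_set(name, is_offset ? old + where : where);
    return 0;
}
```

Because the table is indexed by line number, springing a trap is a constant-time lookup; the cost is paid in the linear search when a trap is moved by name, which happens far less often.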


/*----------------------------------------------------------*/

pr_traps()

{

/* Print all active traps: */
register int i, none = 1;


printf("Line traps:\n");

for( i = 0; i <= MAXLTRAP ; i++ )

{

if( Linetrap[i][0] )

if( none )

{

printf("execute:  on  line:\n");
none = 0;

}

printf("  %2.2s  %4d\n", Linetrap[i], i);

}

if( none )

printf( "There are no line traps set.\n");
if( Divtrap != -1 )

printf("Diversion trap <%s> set at line %d\n",

Dtrap_name, Divtrap);

if( Itrap != -1 )

printf("Input line trap <%s> set at line %d\n",
Itrap_name, Itrap);


/*----------------------------------------------------------*/

do_divtrap()

{

/* Spring the diversion trap */

if( !expand_macro( Dtrap_name ) )

err("Can't spring .%2.2s from diversion trap\n",

Dtrap_name );


/*----------------------------------------------------------*/

do_linetrap( lnum )

{

/* Spring a line trap on line "lnum", if one exists */


register UCHAR *trap ;

if( 0 <= lnum && lnum <= MAXLTRAP )

{

trap = (UCHAR *)( Linetrap + lnum );

if( *trap && !expand_macro(trap) )

err("Can't spring trap for line %d (%2.2s).\n",
(char  "llnum,  trap  ); 


/*----------------------------------------------------------*/


distance()

{

register char *trap ;

register int line ;

/* Compute distance from current line to next trap */


line = OLINE ;  /* Current output line */


trap = (UCHAR *)( Linetrap + OLINE );

while( ++line <= PGLEN && line <= MAXLTRAP && !*trap )

trap += sizeof(LTRAP);


return line - OLINE ;


Listing  Twenty 


End Listing Nineteen


/* NRMAP.C: Routine to map strings of type char to strings
 *          of type CTYPE. The only externally accessible
 *          subroutine is:
 *
 *              void map( dest, src )
 *              UCHAR *src;
 *              CTYPE *dest;
 *
 * Copyright (c) 1987 Allen I. Holub.
 */


if( Num_under ) --Num_under;
if( Num_bold  ) --Num_bold ;
if( Num_os    ) --Num_os   ;
if( Cont_ul   ) --Cont_ul  ;


}


*dest = 0;


End  Listing  Twenty 


Listing  Twenty-one 

/* NRMSC.C  Stuff that didn't fit anywhere else
 *
 * (C) 1987, Allen I. Holub.
 */


#include <stdio.h>
#include <ctype.h>
#include "nr.h"
#include "nrmap.h"


#include <stdio.h>
#include <ctype.h>
#include "nr.h"


void map( dest, src )

UCHAR *src;

CTYPE *dest;

{

/* Map the input character array over to a CTYPE array
 * in order to get some room for attribute bits (i.e.
 * bold, italics, overstruck, etc.). Set the attributes
 * as we process.
 * Only printing characters can have attributes. Motion
 * is transmitted to text() as two bytes, the first
 * indicates the direction and the second is a count.
 * map puts the direction in the low byte and the count
 * in the high byte. If dest == 0 then the various
 * character attributes are set but no other processing
 * is done.
 */

register unsigned i, c ;

while( i = *src )
{
    ++src ;

    if( i == VMOVE || i == HMOVE )
    {
        i = *src++ | MODE_BIT;
        i |= (i == VMOVE) ? VM_BIT : HM_BIT ;
        if( i & 0x80 )
            i |= 0x0f00;        /* Sign extend */
    }
    else if( i == CH_FONT )
    {
        Bold = Italics = Over = 0;      /* attributes off */

        i = *src++ | (FONT_BIT | MODE_BIT);
    }
    else if( i == CH_ATTRIB )
    {
        if( *src )
        {
            switch( *src++ )
            {
            case BOLD:    Bold    = 1; break;
            case ITALICS: Italics = 1; break;
            case OVER:    Over    = 1; break;
            }
        }
        CLRWIDTH( i );
        continue;
    }
    else    /* Set the appropriate attribute bits */
    {
        c = i;

        switch( i )
        {
        case LITCHAR:     c = i = *src++;               break;
        case SOFT_HYPHEN: c = i = *src++; HYPHENATE(i); break;
        case ZWIDTH:      i = *src++;     c = -1;       break;
        case UP_SPACE:    c = i = ' ';    SETNOPAD(i);  break;
        }

        if( (Num_under && isalnum(i)) || Cont_ul || Italics )
            SET_UL( i );

        if( !WHITE(i) )         /* Don't boldface or */
        {                       /* overstrike spaces */
            if( Num_os || Over )
                SET_OS( i );

            if( Num_bold || Bold )
                SET_BD( i );
        }

        if( 0 <= c && c <= MAX_CHARS_IN_FONT )
            SETWIDTH( i );
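The core idea in map(), widening each character so that spare bits can carry attributes, can be shown with a small stand-alone sketch. Everything named here is an assumption for illustration: `CELL`, the `ATTR_*` bit values and `widen()` are hypothetical and do not reproduce the listing's CTYPE layout or its SET_UL/SET_BD/SET_OS macros. The sketch also simplifies the rule for spaces: a space keeps only the underline bit.

```c
#include <assert.h>

typedef unsigned short CELL;      /* hypothetical 16-bit output cell */

#define CH_MASK   0x00ffu         /* low byte: the character itself  */
#define ATTR_UL   0x0100u         /* hypothetical attribute bits     */
#define ATTR_BD   0x0200u
#define ATTR_OS   0x0400u

/* Widen src into dest, tagging each character with attrs.
 * Spaces keep only the underline bit, so they are never
 * boldfaced or overstruck. dest must have room for one cell
 * per source character plus a terminating zero cell.
 * Returns the number of cells written, not counting the terminator. */
int widen(CELL *dest, const unsigned char *src, unsigned attrs)
{
    int n = 0;

    assert(dest != 0 && src != 0);
    for (; *src; ++src, ++n) {
        CELL cell = *src;

        if (*src == ' ')
            cell |= (CELL)(attrs & ATTR_UL);
        else
            cell |= (CELL)attrs;
        dest[n] = cell;
    }
    dest[n] = 0;
    return n;
}
```

Carrying the attributes with each character means later stages (filling, justification, output) can move characters around freely without tracking where a bold or underlined run starts and ends.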


extern char *skipto();


typedef struct
{
    unsigned adjusting : 1;     /* adjustment enabled            */
    unsigned bold      : 1;     /* boldface active               */
    unsigned fill      : 1;     /* filling enabled               */
    unsigned italics   : 1;     /* italics active                */
    unsigned over      : 1;     /* overstrike active             */

    int   adjmode;              /* adjustment mode (.ad M)       */
    int   cmd;                  /* current command character     */
    int   cont_ul;              /* lines to underline (.cu)      */
    int   curfont;              /* Current font (.f)             */
    int   esc;                  /* current escape character      */
    int   indent;               /* current indent (.in)          */
    int   itrap;                /* current input line trap       */
    char  itrap_name[2];        /* name of the above             */
    int   lspace;               /* line spacing (.ls)            */
    int   nobreak;              /* current nobreak character     */
    int   nm_blanks;            /* line numbering stuff (.nm)    */
    int   nm_on;                /*                               */
    int   nm_mult;              /*                               */
    char  *nm_str;              /*                               */
    int   num_bold;             /* # of lines to boldface (.bo)  */
    int   num_center;           /* # of lines to center (.ce)    */
    int   num_under;            /* # of lines to underline (.ul) */
    int   num_os;               /* # of lines to overstrike (.ul)*/
    int   offset;               /* current page offset (.po)     */
    char  *rmarg_str;           /* margin character              */
    char  *lmarg_str;           /*                               */
    int   tab;                  /* tab expansion character       */
    int   leader;               /* leader expansion character    */
    int   linlen;               /* line length (.ll)             */
    int   tempin;               /* temporary indent              */
    int   title_len;            /* length of 3-part title        */
    TSTOP tabs;                 /* previous tab stops            */
    int   *fillbuf;             /* fill buffer contents          */
}
ENVIORNMENT;
#define ESTACKSIZE


static ENVIORNMENT Env_stack[ ESTACKSIZE ] ;
static int         Esp = ESTACKSIZE ;


/* ENVIRONMENTS (.ev command processing) */


push_env()

{

ENVIORNMENT *env;
extern int  *saveq();


if( Esp <= 0 )

{

err("Environment stack full\n");
return;

}


env = &Env_stack[ --Esp ];


env->adjmode       = Adjmode;
env->adjusting     = Adjusting;
env->bold          = Bold;
env->cmd           = Cmd_chr;
env->cont_ul       = Cont_ul;
env->esc           = Esc;
env->fill          = FILL;
env->curfont       = CURFONT;
env->fillbuf       = saveq();
env->indent        = INDENT;
env->italics       = Italics;
env->itrap         = Itrap;
env->itrap_name[0] = Itrap_name[0];
env->itrap_name[1] = Itrap_name[1];
env->lspace        = LSPACE;



env->linlen        = LINLEN;
env->nm_on         = Nm_on;
env->nm_blanks     = Nm_blanks;
env->nm_mult       = Nm_mult;
env->nm_str        = Nm_str;
env->num_bold      = Num_bold;
env->num_center    = Num_center;
env->num_under     = Num_under;
env->num_os        = Num_os;
env->nobreak       = Nobreak;
env->offset        = OFFSET ;
env->over          = Over;
env->rmarg_str     = Rmarg_str;
env->lmarg_str     = Lmarg_str;
env->tab           = Tab;
env->leader        = Leader;
env->tempin        = Tempin;
env->title_len     = Title_len;
memcpy( env->tabs, Tabstop, NUMTABS );

Num_under  = 0 ;
Num_bold   = 0 ;
Num_center = 0 ;
Num_os     = 0 ;
Bold       = 0 ;
Over       = 0 ;
Italics    = 0 ;
Cont_ul    = 0 ;
Tempin     = 0 ;


pop_env()

{

ENVIORNMENT *env;


if( Esp >= ESTACKSIZE )

{

err("Environment stack empty\n");
return;

}


D( printf("Restoring from Env_stack[%d]\n", Esp) );
env = &Env_stack[ Esp++ ];


Adjmode       = env->adjmode;
Adjusting     = env->adjusting;
Bold          = env->bold;
Cmd_chr       = env->cmd;
Cont_ul       = env->cont_ul;
Esc           = env->esc;
FILL          = env->fill;
restorq       ( env->fillbuf );
INDENT        = env->indent;
Italics       = env->italics;
Itrap         = env->itrap;
Itrap_name[0] = env->itrap_name[0];
Itrap_name[1] = env->itrap_name[1];
LINLEN        = env->linlen;
LSPACE        = env->lspace;
Nm_blanks     = env->nm_blanks;
Nm_on         = env->nm_on;
Nm_mult       = env->nm_mult;
Nm_str        = env->nm_str;
Num_bold      = env->num_bold;
Num_center    = env->num_center;
Num_under     = env->num_under;
Num_os        = env->num_os ;
Nobreak       = env->nobreak;
OFFSET        = env->offset;
Over          = env->over;
Rmarg_str     = env->rmarg_str;
Lmarg_str     = env->lmarg_str;
Tab           = env->tab;
Leader        = env->leader;
Tempin        = env->tempin;
Title_len     = env->title_len;
memcpy( Tabstop, env->tabs, NUMTABS );


if( CURFONT != env->curfont )

chgfont( Fonts[CURFONT].name );

CURFONT = env->curfont;


Tabs,  Leaders,  and  Fields 


tabset( s )
char *s;


/* s is a string of comma or space delimited elements
 * of the form: [+]Nt
 * N is the position of the tab, t is the type
 * (L/C/R). The optional + means add N to the previous
 * tab stop value.
 */

int prevtab = 0, tab = 0 ;

for( ; *s ; prevtab = tab )
{
    if( *s == '+' )
    {
        s++;
        tab = prevtab + uatoi( &s );
    }
    else
        tab = uatoi( &s );

    if( tab < 1 || tab >= NUMTABS )
    {
        err("Tab stop must be in the range 1-%d\n",
                                        NUMTABS-1);
        return;
    }

    if( *s == 'R' || *s == 'C' || *s == 'L' )
        Tabstop[tab] = *s++ ;

    else if( *s == ' ' || *s == ',' || !*s )
        Tabstop[tab] = 'L';

    while( *s == ' ' || *s == ',' )
        s++;
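A stand-alone sketch of parsing the `[+]Nt` tab specification follows. It is written from the comment's description rather than from the listing: `parse_tabs()`, `clear_tabs()`, the local `Tabstop` array and `NUMTABS = 80` are hypothetical, the digits are scanned inline instead of through the listing's uatoi(), and errors are reported by return value rather than through err().

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define NUMTABS 80                /* hypothetical table size */

static char Tabstop[NUMTABS];     /* 0 = no stop, else 'L', 'C' or 'R' */

/* Remove every tab stop. */
void clear_tabs(void)
{
    memset(Tabstop, 0, sizeof Tabstop);
}

/* Parse specs such as "8 +8C,30R". A leading '+' makes the
 * position relative to the previous stop; a missing type means 'L'.
 * Returns 0 on success, -1 on a malformed or out-of-range element. */
int parse_tabs(const char *s)
{
    int prev = 0, tab;

    assert(s != NULL);
    while (*s == ' ' || *s == ',')
        s++;
    while (*s) {
        int relative = (*s == '+');

        if (relative)
            s++;
        if (!isdigit((unsigned char)*s))
            return -1;
        for (tab = 0; isdigit((unsigned char)*s); s++) {
            tab = tab * 10 + (*s - '0');
            if (tab >= NUMTABS)           /* also guards against overflow */
                return -1;
        }
        if (relative)
            tab += prev;
        if (tab < 1 || tab >= NUMTABS)
            return -1;

        if (*s == 'L' || *s == 'C' || *s == 'R')
            Tabstop[tab] = *s++;
        else if (*s == ' ' || *s == ',' || *s == '\0')
            Tabstop[tab] = 'L';
        else
            return -1;

        while (*s == ' ' || *s == ',')
            s++;
        prev = tab;
    }
    return 0;
}
```

Storing the type letter at the index of the column makes the later lookup trivial: a nonzero entry at column i is both "there is a stop here" and "this is how to align at it".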


tabclr()

{

/* Clear all tabs */

memset( Tabstop, 0, NUMTABS );


tabprint()

{

/* Print tabs */

register int i;


for( i = OFFSET; --i >= 0 ; outc(' ') )
;

for( i = 1; i <= LINLEN ; i++ )

outc( Tabstop[i] ? Tabstop[i]

outc('\r');
outc('\n');


FONT  support  routines: 


findfont( fname )
int fname;

{

/* Search for a font in the font table ( Fonts[] ).
 * Return an index into the table or -1 if the
 * entry isn't there.
 */

register int i;

register FONT *fp;

if( fname == 'R' )              /* Roman font is at Fonts[0] */
    return 0;

if( isdigit( fname ) )
{
    if( (i = fname - '0') > NUMFONTS-1 )

        err( "%d: Illegal font number\n", i );

    else if( !Fonts[i].name )

        err( "Font number %d is undefined\n", i );

    else

        return i;
}
else
{
    fp = &Fonts[NUMFONTS-1];

    for( ; fp > Fonts ; --fp )
        if( fp->name == fname )
            return( fp - Fonts );
}


chfont( nfont_name )

{

/* Called from nrout.c to actually change fonts.
 * nfont and prevfont are indexes into the Fonts[]
 * table for the new and previous fonts. The CURFONT
 * number register holds the index for the current font.
 * Fonts[0] is the Roman font.
 */

int nfont;

static int prevfont = 0;        /* Roman */

FONT *p;

if( nfont_name == PREVIOUS )

{

nfont = prevfont;
prevfont = 0;

}

else if( (nfont = findfont(nfont_name)) < 0 )

{

err( "Font <%c> unavailable, using (R)oman\n",

nfont_name);

nfont = 0;

}



if( CURFONT != nfont )

{

if( *Fonts[CURFONT].emac )

expand_macro( Fonts[CURFONT].emac );

p = &Fonts[nfont] ;

Right_str = p->right ;

Left_str = p->left ;
Hs_amt = p->resolution ;

if( *(p->smac) )

expand_macro( p->smac );

}

prevfont = CURFONT;




radjust( line, col )

CTYPE *line ;
int col ;
{

/* Do right adjustment. That is, spread the words
 * as evenly as possible on the line so that the
 * rightmost character of the rightmost line is at
 * the indicated column (col). This is accomplished
 * by replacing all paddable space characters with
 * motion characters.
 */

static int left = 0;

int num_units, num_chars, gaps, need ;

int space_between_characters;

int extra_space;

gaps = ngaps( line, &num_units, &num_chars );

if( gaps == 0 )
return;


CURFONT = nfont ;

}


End  Listing  Twenty-one 


need = (col * SPACE_SIZE) - num_units ;
space_between_characters = need / gaps ;
extra_space = need % gaps ;
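The arithmetic above divides the required padding evenly and leaves a remainder that has to go somewhere. A minimal sketch of one way to hand out that remainder is shown below; `spread()` and its `from_left` parameter are hypothetical, and the policy (all leftover units to one side, with the side chosen by the caller) is an assumption modelled on the listing's static `left` flag rather than a copy of its loop.

```c
#include <assert.h>

/* Fill widths[0..gaps-1] with the padding for each gap so that the
 * widths sum to need. Every gap gets need/gaps units; the need%gaps
 * leftover units go one apiece to the leftmost gaps when from_left
 * is nonzero, otherwise to the rightmost gaps. */
void spread(int *widths, int gaps, int need, int from_left)
{
    int base, extra, i;

    assert(gaps > 0 && need >= 0);
    base  = need / gaps;
    extra = need % gaps;
    for (i = 0; i < gaps; i++) {
        int gets_extra = from_left ? (i < extra)
                                   : (i >= gaps - extra);
        widths[i] = base + (gets_extra ? 1 : 0);
    }
}
```

Alternating `from_left` on successive lines keeps the extra space from always collecting on the same side of the page, which would otherwise show up as a visible vertical gap running through a justified paragraph.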


Listing  Twenty-two 


/*----------------------------------------------------------
 * NROUT.C: Output routines for nr.
 * Copyright (c) 1987 Allen I. Holub. All rights reserved.
 */

#include <stdio.h>
#include <ctype.h>
#include <stdarg.h>
#include "nr.h"
#include "nrmap.h"
#include "nrtlen.h"


/*----------------------------------------------------------
 * isword(c) evaluates to true if c can be in a word. Words
 * are composed of all characters except paddable
 * space characters.
 */


D( outints( "radjust(), input line:", line );                    )
D( printf("Total units = %d, ", num_units );                     )
D( printf("total chars = %d\n", num_chars );                     )
D( printf("Padding to %d columns" , col );                       )
D( printf("= %d units\n", col * SPACE_SIZE );                    )
D( printf("%d gaps spread over %d units\n", gaps, need);         )
D( printf("%d units/gap, ", space_between_characters);           )
D( printf("%d extra units\n\n", extra_space);                    )

for( ; gaps > 0 ; line++ )

{

if( WHITE(*line) )

{

--gaps;

*line = MOTION( space_between_characters );

if( left || (gaps <= extra_space) )

{

if( --extra_space >= 0 )


#define isword(c) ( !(WHITE(c) && PADDABLE(c)) )

/*----------------------------------------------------------*/

err( fmt )
char *fmt;

{

/* Print out an error message. If a macro is being
 * processed, the macro name is given, otherwise the
 * current input file name is given. This routine
 * works like printf() in all other respects.
 */

va_list args;
va_start( args, fmt );

fprintf( stderr, "\007ERROR (%s%s, line %d): ",

Ismacro ? "macro " : "" ,

*Ifilename ? Ifilename : "stdin", INLINES);

vfprintf( stderr, fmt, args );

}
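A printf-style error routine of this kind can be sketched in standard C as follows. This is an illustration only: `format_error()` and its `where` and `line` parameters are hypothetical, the result is written to a caller-supplied buffer instead of stderr so that it can be inspected, and the prototype uses the ANSI `...` form with a matching `va_end()` rather than the listing's older declaration style.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Format "ERROR (<where>, line <n>): <message>" into buf, where the
 * message is built printf-style from fmt and the remaining arguments.
 * Output is truncated to fit size bytes, terminator included.
 * Returns the number of characters stored in buf, or -1 if buf is
 * too small to hold even the prefix. */
int format_error(char *buf, size_t size, const char *where, int line,
                 const char *fmt, ...)
{
    va_list args;
    int used;

    assert(buf != NULL && where != NULL && fmt != NULL);
    used = snprintf(buf, size, "ERROR (%s, line %d): ", where, line);
    if (used < 0 || (size_t)used >= size)
        return -1;

    va_start(args, fmt);
    vsnprintf(buf + used, size - (size_t)used, fmt, args);
    va_end(args);
    return (int)strlen(buf);
}
```

A caller would pass the current macro or file name as `where`, for example `format_error(buf, sizeof buf, "stdin", 12, "bad tab %d\n", 99)`, and then write the buffer wherever diagnostics belong.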


ngaps( line, total_units, total_chars )

CTYPE *line;

int *total_units;

int *total_chars;

{

/* Returns the number of paddable space characters on
 * the line. Modifies *total_units to hold the number
 * of horizontal units occupied by non-space characters.
 * *total_chars is modified to hold the total number of
 * characters on the line.
 */

register int utotal = 0;
register int ngaps = 0;

CTYPE *p;

FONT *ftab;

for( p = line; *p ; ++p )

{

if( WHITE(*p) )

{

++ngaps;

}

else if ISCHAR( *p )

{

utotal += CWIDTH( *p );

D( printf( "ngaps(): %c ", *p ) );

D( printf( "= %d units\n", CWIDTH(*p) ) );

}

}

*total_chars = p - line ;

*total_units = utotal;


left = !left;

}


/*----------------------------------------------------------*/

justify( line, mode )

CTYPE *line ;
int mode ;

{

/* Do simple line adjustment on str (i.e. do left,
 * right or center mode adjusting). If the
 * adjustment mode is BOTH then the routine radjust()
 * is called and we don't do anything here.
 */

int num_units, num_chars, need, gaps ;

if( !*line || mode == LEFT )    /* Left adjustment  */
return;                         /* is no adjustment */

if( mode == BOTH )

return radjust( line, TLEN );

gaps = ngaps( line, &num_units, &num_chars );
need = U_TLEN - (num_units + (gaps * SPACE_SIZE)) ;

if( need <= 0 )

return;

/* Source and destination overlap, so use memmove. */
memmove( line + 1, line, (num_chars + 1) * sizeof(CTYPE) );

*line = MOTION( mode == CENTER ? need/2 : need );

}

/*----------------------------------------------------------*/

title( instr )
char *instr ;

{

/* Do a three part title: /str1/str2/str3/
 * The character held in Page_ch is expanded to the
 * current page number. In the absence of an argument
 * do nothing.
 * Titles are done using the normal output function.
 * The delimiters are replaced with normal spaces and
 * normal spaces are replaced with unpaddable spaces.
 * Then outbuf() is called with adjustment mode
 * turned on to print the line. Outbuf() will spread
 * the three parts of the title evenly on the line.
 */

UCHAR *s;

int delim, ndelim ;

static CTYPE dest[ MAXSTR ];

int i, fmt;

UCHAR pagenum[60];

if( !(delim = *instr++) )

return ;


return  ngaps; 

}


ndelim = 3;

for( s = instr; *s ; s++ )



if( *s == ' ' )

{

*s = UP_SPACE;

}

else if( *s == Page_ch )
deletes( 1, s );


}


i = nrtoi( &fmt );

ltoascii( pagenum, fmt, i );

s += inserts( pagenum, s, MAXSTR ) - 1;

else if( *s == delim )

{

if( --ndelim > 0 )

else

{

*s = '\0';
break;

}

}

}


map    ( dest, instr );         /* map to CTYPE array   */
radjust( dest, Title_len );     /* Spread the line      */

pad    ( OFFSET );              /* Output page offset,  */
outs   ( dest );                /* title string,        */
outchar( TO_CTYPE('\n') );      /* and a newline        */


/*----------------------------------------------------------*/


outbuf( line, addhyphen )
register CTYPE *line;

{

/* Output a line of text, adding indent, and page
 * offset. Add a \n at eol. Do line adjusting or
 * centering as required. If addhyphen is true, a hyphen
 * is added immediately after the line is printed.
 *
 * The line array must be MAXSTR characters long.
 *
 * This routine is called by dofill when filling is on
 * and is called directly by text when it is not. If
 * Nospace is enabled (by a .ns command) then lines
 * consisting of single newline characters will not
 * be printed.
 *
 * If *line == 0 then we are printing a blank line.
 * In this situation line numbering (.nm) is not done
 * and only one blank line will be printed, regardless
 * of the line spacing (.ls). If there is no left or
 * right margin string defined, no offset is printed.
 */

register int i ;

char numbuf[32] ;


Tempin++;

}


if( Num_center )

{

justify( line, CENTER );
--Num_center;

}

else if( Adjusting )

justify( line, Adjmode );


}


if( addhyphen )

--Tempin;


pad( i )

register int i;

{

/* Print i spaces using outchar() */


while( --i >= 0 )

outchar( TO_CTYPE(' ') );


outc( c )
int c;

{

/* Lowest level output function. Handles control
 * character suppression. "Ispage" tests to see if a
 * page is in the list specified with a -o command line
 * switch. Note that direct calls to outc don't update
 * any global variables; outchar() is the normal output
 * function and should be used except in weird
 * situations (i.e. the .ou command). Note that \n is not
 * translated into \r\n by outc (it is so translated by
 * outchar).
 *
 * Note that outc() processes normal char's, not CTYPE's.
 *
 * A minor problem here is diversions. We are writing
 * in untranslated mode so that control sequences can
 * get to the printer unmolested. With diversions though,
 * we need to translate characters by hand (strip off \r)
 * so that when they are read back (via mgetc) they
 * won't be re-translated on output.
 */

c &= 0xff ;

if( Isdiv )             /* if we're processing a   */

{                       /* diversion, put the text */

if( c != '\r' )         /* there.                  */

mputc( c, Ofile );

}

else if( ispage( PAGE ) )

{

/* If this is a printing page as per the -o flag
 * print the character
 */


D( outints( "outbuf", line ); )

if( !*line && Nospace )
return;

Nospace = 0;    /* Re-enable spacing on a non-blank */
                /* line.                            */

                /* Then output the page offset and  */
                /* the left margin string           */

if( !Isdiv && (*line || *Lmarg_str || *Rmarg_str) )
pad( OFFSET );


if( No_cntl && (c < ' ' || c > 0x7f) )

{

printf( "<%02x>", c );
if( c == '\n' )

printf( "\n" );

}

else

{

/* It's virtually impossible to get untranslated
 * standard output out of the Microsoft compiler,
 * so do it here with a direct DOS call.
 */


if( *Lmarg_str )

outchs( Lmarg_str );

if( Nm_on && (Nm_blanks || *line) )
{

/* Number the line if necessary */

if( LINE % Nm_mult )

pad( strlen(Nm_str) + 3 );
else

{

sprintf( numbuf, "%3d%s", LINE, Nm_str );
outchs( numbuf );

LINE++;


if( *line )     /* followed by the line if there */

pad( INDENT + Tempin );

/* Now, either center or adjust the line as
 * required. Note: Num_center (set by the .ce
 * command) is different from a centering
 * adjustment mode (i.e. the former applies to
 * input lines, the latter to output lines).
 * Centering and adjusting are mutually exclusive.
 */

if( addhyphen )

{

/* This kludge tricks justify() into
 * thinking that the line is shorter than
 * it really is when we add a hyphen.
 */


bdos( 2, c & 0xff, 0 );         /* putc( c, stdout ) */

}

}
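The control-character display that outc() performs when suppression is enabled can be sketched as a pure function. This is a hypothetical rendering, not the listing's code: `render_char()` and its `show_ctl` flag are invented names, and the result goes to a caller-supplied buffer rather than to printf() or a DOS call so that the behaviour can be checked.

```c
#include <assert.h>
#include <stdio.h>

/* Write c to buf either verbatim or, when show_ctl is set and c lies
 * outside the printable ASCII range, as "<hh>" (followed by a real
 * newline if c was '\n', so the output still breaks into lines).
 * buf must hold at least 6 bytes. The result is NUL-terminated.
 * Returns the number of characters written, terminator excluded. */
int render_char(char *buf, int c, int show_ctl)
{
    assert(buf != NULL);
    c &= 0xff;
    if (show_ctl && (c < ' ' || c > 0x7f)) {
        int n = sprintf(buf, "<%02x>", (unsigned)c);

        if (c == '\n') {
            buf[n++] = '\n';
            buf[n] = '\0';
        }
        return n;
    }
    buf[0] = (char)c;
    buf[1] = '\0';
    return 1;
}
```

Showing the hex value instead of sending the byte makes printer control sequences visible on a screen preview without letting them disturb the terminal.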

/*----------------------------------------------------------*/

outchar( big_c )

CTYPE big_c;

{

/* Medium level character output function. Translates
 * attributes into strings (i.e. for bold chars etc.).
 * Updates the line and page numbers when necessary.
 * Springs traps as required.
 * Handles the various motion escape sequences \u \d etc.
 * Returns the width on the output line of c. This will
 * be 0 if c is a motion character, a change font, a
 * newline, etc. It will be 1 otherwise.
 * Note that outchar() is passed CTYPE's, not normal
 * characters.
 */

register unsigned int rval, c ;
static int            lastch = 0;
int                   width;

rval = 0 ;

if( ISMOTION(big_c) )

{

do_ul  ( (CTYPE)0 );    /* All attributes off first */

do_bold( (CTYPE)0 );




do_over( (CTYPE)0 );

motion( MVAL(big_c), big_c );

}

else if( ISFONT(big_c) )

{

do_ul  ( (CTYPE)0 );
do_bold( (CTYPE)0 );
do_over( (CTYPE)0 );

chfont( FVAL(big_c) );  /* then change fonts */

}

else

{                       /* ... or print a character: */

do_ul  (big_c);         /* Underline character if reqd. */


{ 


) 

/ 


/* Output a character string using outc. Note that
 * \n will not be translated into \r\n and no global
 * vars will be updated.
 */

while( *str )

outc( *str++ );


/*----------------------------------------------------------*/

do_bold(big_c);         /* Boldface character if reqd.   */
do_over(big_c);         /* Overstrike character if reqd. */
c = CHAR(big_c);        /* Translate to normal char      */

if( c == '\n' )

{

nextline( lastch == '\n' );


TEXTLEN = outs( line ); /* Finally output the line */


) 

else 


if(  addhyphen  ) 


{ 


{

TEXTLEN++;

outchar( TO_CTYPE('-') );

}


width  -  CWIDTH (  big_c  ); 

if(  width  >  1  )  /*  In  a  proportional  font  */ 

motion (  width/2,  (CTYPE) (MODE_BIT |HM_BIT) ) ; 


/* Print the right margin character if needed. The
 * first call to pad() (in the if) prints the left
 * margin padding only if this is a blank line (normally
 * we don't want to print any padding on blank lines).
 * The call to pad() gets us to the right margin.
 */


outc( c );              /* Print char and advance */
                        /* 1 HMI unit             */

if( width > 1 )

motion( (width/2) - 1,

(CTYPE)(MODE_BIT|HM_BIT) );


if( *Rmarg_str )
{
    if( !*line )
        pad( INDENT + Tempin );

    pad( TLEN - TEXTLEN );
    outchs( Rmarg_str );
}

/* Reset Temporary indent. We couldn't do it earlier
 * because it's used in the TLEN macro. We don't want
 * to reset it if the line is blank though.
 */
if( *line )
    Tempin = 0;

/* Output at least 1 but as many newlines as are
 * required by the .ls N command. Only one line is
 * output if we are doing a blank line.
 */
for( i = (*line ? LSPACE : 1); --i >= 0; )
    outchar( TO_CTYPE('\n') );
}

/*------------------------------------------------------*/

outs( line )
CTYPE *line;
{
    /* Output an integer string using outchar().
     * Returns amount of space occupied on the output
     * line by p. This will not include motion, font
     * changes, etc.
     */
    register rval = 0;

    D( outints( "outs", line ); )

    while( *line )
        rval += outchar( *line++ );

    return rval;
}

#ifdef DEBUG  /*----------------------------------------*/

outints( str, line )
char  *str;
CTYPE *line;
{
    /* Output a CTYPE string using putchar, putting
     * "str" in front of it.
     */
    printf("%s <", str );

    for(; *line ; putchar( *line++ & 0xff ) )
        ;


        if( !HASWIDTH(big_c) )
            motion( -Fonts[CURFONT].widths[c],
                            (CTYPE)( MODE_BIT | HM_BIT ) );

        rval = 1;
    }

    lastch = c;
}

return rval;


- - */ 


motion( count, how )
int   count;
CTYPE how;
{
    /* Take care of motion. \u \d \v etc. */

    register int negative = 0;

    if( count < 0 )
    {
        negative = 1;
        count    = -count;
    }

    while( --count >= 0 )
    {
        if( HORIZONTAL(how) )
            ots( negative ? Left_str : Right_str );
        else
            ots( negative ? Up_str   : Dn_str    );
    }
}


nextline( harder )
{
    /* - Adjust the line and page numbers or diversion
     *   height as appropriate.
     * - Spring a line trap if one is present. A line
     *   trap is expanded when a "line of text is output
     *   whose vertical size reaches or sweeps past the
     *   trap position." In other words, a line trap at
     *   line 10 will be triggered immediately after line
     *   10 is printed. Trap 0 is used to spring a trap
     *   at the top of the page; it is done in outchar()
     *   before the first character on a page is printed
     *   (we can't do it at the end of the current page
     *   because it wouldn't be sprung on the first page).
     * - Adjust various number registers as appropriate.
     * - Process the -s command line switch if one was
     *   given.
     * - Translate \n into \r\n or do Wordstar mode if
     *   necessary.
     */


    printf(">\n");
}

#endif  /*----------------------------------------------*/

outchs( str )
char *str;
{
    /* Output a character string using outchar() */

    while( *str )
        outchar( TO_CTYPE( *str++ ) );
}

/*------------------------------------------------------*/


    extern int Stop;          /* Declared in nr.c */

    if( !Wordstar )           /* Not in Wordstar mode    */
    {
        outc( '\r' );
        outc( '\n' );
    }
    else if( harder )         /* In Wordstar mode:       */
    {
        outc( '\r' );         /* Saw two successive      */
        outc( '\n' );         /* newlines in             */
        outc( '\r' );         /* outchar().              */
        outc( '\n' );
    }
    else if( Wordstar == 1 )  /* Use Wordstar soft       */
    {                         /* carriage return for     */

ots(  str  ) 
char  *str; 




        outc( 0x8d );         /* newlines                */
        outc( '\n' );

                              /* Replace newline with    */
                              /* a space character       */


    if( Isdiv )     /* Adjust diversion height and width */
    {               /* and spring trap if appropriate    */
        if( Divwidth < TEXTLEN )
            Divwidth = TEXTLEN;

        if( Divtrap == VERT )
            do_divtrap();

        ++VERT;     /* increment diversion height */


do_bold( c )
CTYPE c;
{
    /* Same as above but do boldface instead of underline */

    static int ambold = 0;

    if( Plain )
        return;

    if( IS_BD(c) )
    {
        if( *Bd_on )
        {
            if( !ambold )
            {
                ambold = 1;
                ots( Bd_on );


        /* Spring any line traps for the current line and
         * then adjust the distance to the next trap. Note
         * that the line number has to be incremented as
         * part of the do_linetrap() call or else the trap
         * for the current line will be sprung recursively.
         */
        do_linetrap( OLINE++ );
        TOTRAP = distance();

        if( OLINE > PGLEN )
        {
            if( Stop && !(PAGE % Stop) )
            {
                /* Process the -s command line switch */

                fprintf(stderr, "\nHit any key to continue.");
                getchar();


            outc( (int) c );
            outc( '\b' );

        if( *Bd_off && ambold )
            ots( Bd_off );

        ambold = 0;           /* Turn off boldface */


            /* Adjust line and page # */
            OLINE = 1;
            ++PAGE;

            do_linetrap(0);   /* Spring top of page trap */
            TOTRAP = distance();

do_over( c )
CTYPE c;
{
    /* Same as above but do overstrike. */

    static int am_os = 0;

    if( Plain )
        return;

    if( IS_OS(c) )
    {
        if( *Os_on )
        {
            if( !am_os )
            {
                am_os = 1;
                ots( Os_on );

            ots( "-\b" );

        if( *Os_off && am_os )
            ots( Os_off );    /* Turn off overstrike */

go_up( amt )
{
    /* Called from sp(), in nrprocs.c, to handle negative
     * spacing. Amt is a negative number.
     */
    if( Isdiv )
        VERT = max( VERT + amt, 0 );
    else
    {
        OLINE  = max( OLINE + amt, 1 );
        TOTRAP = distance();

    motion( Vs_amt * amt, (CTYPE)( MODE_BIT | VM_BIT ) );


End  Listing  Twenty-two 


do_ul( c )
CTYPE c;
{
    /* Take care of underlining c but don't actually print
     * c. If Ul_on is defined toggle underline mode at the
     * printer at the appropriate times. If Ul_on is not
     * defined then just print a "_\b".
     */
    static int amunder = 0;

    if( Plain )
        return;

    if( IS_UL(c) )
    {
        if( *Ul_on )
        {
            if( !amunder )
            {
                amunder = 1;
                ots( Ul_on );     /* Turn on underlining */
            }
        }
        else
            ots( "_\b" );
    }
    else
    {
        if( *Ul_off && amunder )
            ots( Ul_off );

        amunder = 0;
    }


Listing  Twenty-three 


#include <stdio.h>
#include "nr.h"

/* NRTAB.C -- Tab-processing stuff for NR
 * (C) 1987, Allen I. Holub, All rights reserved.
 *
 * Tab processing is rather primitive. In particular, it
 * assumes that we're using a monospaced font.
 */

int width( c, advance )
int c, *advance;
{
    /* Return the amount of space taken by the character
     * on the output line. Modify "advance" to be the
     * amount of space required to skip past it.
     */

    switch( c )
    {
    case VMOVE:        *advance = 2;   return 0;
    case HMOVE:        *advance = 2;   return 0;
    case CH_FONT:      *advance = 2;   return 0;
    case CH_ATTRIB:    *advance = 2;   return 0;
    case ZWIDTH:       *advance = 2;   return 0;
    case LITCHAR:      *advance = 2;   return 1;
    case SOFT_HYPHEN:  *advance = 1;   return 0;
    }

/*  Turn  off  underlining  */ 


    *advance = 1;      /* UP_SPACE and default case */
    return 1;



/*------------------------------------------------------*/

dist_to_tab( col )
{
    /* Col is the current column position. Return the
     * distance to the next tab set in the Tabstop array.
     * That is, the next tab will be at Tabstop[col + rval].
     * A tab at the current column position is ignored.
     */
    register int ocol;

    for( ocol = col++; !Tabstop[col] && col < LINLEN; col++ )
        ;

    return (col < LINLEN) ? (col - ocol) : 0;
}

. . . . . . . 

field_width( p )
UCHAR *p;
{
    /* Return the distance between p and either end of line
     * or a tab character (\t) or a leader character (SOH).
     */
    register int count = 0;
    int          advance;

    if( !*p++ )
        return 0;

    /* Three types of tabs are recognized. Left justifying
     * tabs will print the character following the \t at the
     * tabstop. Right and Centering tabs both use a field
     * width (ie. the number of characters following the \t
     * up to, but not including, a following \t or end of
     * line). A centering tab will cause this "field" to be
     * centered on the tabstop. A right adjusting tab will
     * cause the rightmost character in the field to rest
     * immediately to the left of the tabstop.
     */

    int    w;                 /* Current field width          */
    int    d;                 /* Distance to next tabstop     */
    int    col;               /* Current output column        */
    UCHAR  *p;                /* Current character            */
    UCHAR  *startstr;         /* Original beginning of string */
    int    advance;           /* amount to advance past char  */
    static UCHAR buf[ MAXSTR ];

    col      = 1;
    startstr = str;


    /* Copy str into buf with strncpy() and then copy it
     * back expanding tabs and leaders.
     */
    strncpy( buf, str, MAXSTR );

    for( p = buf; *p ; )
    {
        if( !(*p == '\t' || *p == SOH) )

            if( str-startstr >= MAXSTR-1 )   /* out of space */
                break;

            col += width( *p, &advance );


    while( *p && *p != '\t' && *p != SOH )
    {
        count += width( *p, &advance );
        p     += advance;
    }


            while( --advance >= 0 )
                *str++ = *p++;

            continue;


return  count  ; 

) 

- - - / 

dotab( str )
UCHAR *str;
{
    /* Expand the tab (^I) and leader (^A = SOH)
     * characters in str to the proper number of tab or
     * leader characters.
     */


        if( !(d = dist_to_tab(col)) )
            break;

        w = field_width( p );

        /* Convert d to the number of spaces to print
         * to get to the specified tab stop. If there
         * are no characters between the current
         * \t and the next \t then just move to the next
         * tab.
         */
        switch( Tabstop[ col + d ] )
        {
        case 'R':   d -= w;     break;
        case 'C':   d -= w/2;   break;
        }

        while( --d >= 0 )
        {
            if( str-startstr >= MAXSTR )
                goto exit;

            *str++ = ( *p == '\t' ? Tab : Leader );
            col++;
        }

        p++;
    }

exit:
    *str = 0;
}

End Listing Twenty-three

Listing Twenty-four

#include <stdio.h>
#include <ctype.h>
#include "nr.h"
#include "nrmap.h"
#include "nrtlen.h"

/* NRTEXT.C: Text processing portion of nr
 *
 * Copyright (c) 1985 Allen I. Holub. All rights reserved.
 */

typedef CTYPE QUEUE;     /* Dummy typedef for queue routines */

/*------------------------------------------------------*/

extern QUEUE *makequeue();
extern CTYPE *shownext();
extern void  map(CTYPE*, char*);

/*------------------------------------------------------
 * Globals:
 * blankline:    Used in outbuf() calls when we need to print
 *               a blank line.
 * Input_queue:  Input queue used for line filling. Words are
 *               enqueued until an entire line is in the
 *               queue, then the queue's contents are printed.
 * Last_queued:  Most recently enqueued character.
 * Owidth:       Space occupied on the output line by the
 *               characters now in the queue, in units
 *               (modified by putq() and used in several
 *               places).
 */

#define QSIZE (MAXSTR * 2)

static CTYPE  blankline = NULL;
static QUEUE  *Input_queue;
static CTYPE  Last_queued;
static int    Owidth = 0;

/*------------------------------------------------------*/

void text( str )
UCHAR *str;
{
    /* Highest-level text processing routine, called
     * from process().
     */

    CTYPE      line[MAXSTR];
    static int been_called = 0;

    D( printf("text(): working on <%s>\n", str); )

    if( Inhibit )         /* Input has been inhibited by an */
        return;           /* .if or .ie command             */

    if( !been_called )
    {
        /* If this is the first time we've been called,
         * spring the top of page macro for the first page.
         * (All other top of page macros are sprung
         * immediately after the bottom line of the previous
         * page is printed.)
         */
        been_called = 1;
        do_linetrap( 0 );
        TOTRAP = distance();
    }

    if( TLEN >= MAXSTR )
    {
        err("Output line too long\n");
    }
    else if( !*str )


        /* A blank line always causes a break, use .sp 1 if
         * you want blank lines without a buffer flush.
         * Note that outbuf won't print the blank line
         * itself unless spacing is enabled.
         */
        outbuf( &blankline, 0 );    /* print the blank line */


return  rval; 


restorq( qp )
CTYPE *qp;
{
    /* Restore the queue to the condition it was in before
     * a previous saveq. p is a pointer returned from a
     * saveq() call. The queue is flushed before it is
     * reloaded from p.
     */


        map( line, str );

        if( FILL )
            dofill( line );
        else
            outbuf( line, 0 );


    register CTYPE *p;

    /* Flush current queue */

    for( p = qp ; *p ; putq( p++ ) )
        ;


    if( Itrap > 0 )               /* input line trap */
        if( --Itrap == 0 )
            expand_macro( Itrap_name );


free  (  qp  ) ; 


init_text()
{
    /* Do one-time initializations for this module. This
     * routine is called from main() when the program boots.
     */
    if( !(Input_queue = makequeue( QSIZE, sizeof(CTYPE) )) )
    {
        err("Not enough memory for fill buffer\n");
        exit( 1 );
    }


prblank( n )
int n;
{
    /* Flush the buffer and print n blank lines. This
     * routine handles the .sp command.
     * If spacing is inhibited (ie .ns was given) do
     * nothing. The nospace test is done in outbuf.
     */
    while( --n >= 0 )
        outbuf( &blankline, 0 );


Everything  above  this  point  works  on  char  strings, 
everything  below  this  point  works  on  CTYPE  strings.  See 
nrmap.c  for  the  mapping  routine. 


static int first_white()
{
    /* Return true if the first character in the queue
     * (the one to come out next) is a space.
     */
    CTYPE next = *show_next( Input_queue );

    return( sp_used(Input_queue) && WHITE(next) );


add_sep( line )
CTYPE *line;
{
    /* If there is something in the queue and that something
     * doesn't start with white space then enqueue an extra
     * space character as a word separator.
     */
    if( sp_used(Input_queue) && !WHITE(Last_queued) )
    {
        c = TO_CTYPE(' ');
        putq( &c );
    }


static int putq( cp )
CTYPE *cp;
{
    Last_queued = *cp & 0xff;
    Owidth     += CWIDTH( *cp );

    if( !enqueue(cp, Input_queue) )
    {
        err( "Fill buffer full\n" );
        brk();
        return 0;
    }


CTYPE *saveq()
{
    /* Save the current queue contents and flush the queue.
     * Return a pointer which, when passed to a subsequent
     * restorq() call, will restore the queue to its
     * previous state.
     */


dofill( line )
CTYPE *line;
{
    /* Collect words until we have filled an entire line
     * and then print it.
     */

    int   pad;          /* Amount of padding needed        */
    int   tail;         /* padding needed on current line  */
    int   nchars;       /* # of chars to print out         */
    int   prevwidth;    /* amount of space used by nchars  */
    CTYPE *word;        /* beginning of current word.      */
    CTYPE *white;       /* pointer to beginning of white   */
                        /* space preceding current word    */
    CTYPE *p;           /* pointer into hyphenated word.   */
    CTYPE *chop_here;
    int   extra;
    int   addhyphen;

    D( outints("dofill", line) );
    D( printf(">\ndofill: prevwidth=%d, Owidth=%d\n", )
    D(         prevwidth, Owidth) );

    if( Num_center )
    {
        /* If we're doing centered lines, output the current
         * buffer without filling.
         */


    p = (CTYPE *) malloc( (sp_used(Input_queue) + 1)
                                        * sizeof(CTYPE) );

        err("Not enough memory to save fill buffer\n");

    while( dequeue(p, Input_queue) )
        ++p;

    *p     = 0;
    Owidth = 0;


        outbuf( line, 0 );
        return 0;
    }

    while( *line )
    {
        prevwidth = Owidth;
        pad       = U_TLEN - prevwidth;
        nchars    = sp_used( Input_queue );
        white     = line;

        /* Insert a word into the queue. Add leading blanks
         * first and then the word itself. Putq() increments
         * Owidth as necessary to reflect the amount of
         * space occupied on the output line by the
         * characters in the queue.
         */




        if( CHAR(*line) )             /* this is the first word */
            add_sep( line );          /* of an input line       */
        else while( CHAR(*line) == ' ' )
            putq( line++ );

        chop_here = word = line;

        while( *line && CHAR(*line) != ' ' )
            putq( line++ );


        /* If the new word put more characters into the
         * queue than will fit on the line (U_TLEN) then
         * output everything up to (but not including)
         * the most recently added word. If hyphenation
         * is enabled, output a prefix too.
         */
        if( Owidth >= U_TLEN )
        {
            addhyphen = 0;

            if( prevwidth > (LINLEN * SPACE_SIZE) )
            {
                err("Line overflow, truncating\n");
                nchars = LINLEN;
            }
            else if( Hyphenate && *word
                               && pad > (3 * SPACE_SIZE) )
            {
                /* pad == difference between line width
                 * and width of line up to the beginning of
                 * the previous word.
                 */
                hyphen( word, line-1 );
                p = word;

                prevwidth += CWIDTH( TO_CTYPE('-') );

                for(; p < line && prevwidth < U_TLEN; p++ )
                {
                    prevwidth += CWIDTH(*p);

                    if( HAS_HYPHEN(*p) )
                    {
                        chop_here = p + 1;
                        addhyphen = 1;
                    }
                    else if( CHAR(*p) == '-' )
                    {
                        chop_here = p + 2;
                        addhyphen = 0;
                    }
                }

                nchars += (chop_here - word);
            }

            outqueue( nchars, addhyphen );


    while( --numchars >= 0 && dequeue(&ochar, Input_queue) )
    {
        Owidth -= CWIDTH( ochar );
        *p++    = ochar;
    }

    /* Delete trailing white space from the output buffer
     * and leading white space from the queue.
     */
    while( WHITE(*p) )
        if( --p < buf )
            break;

    *++p = '\0';

    while( first_white() && dequeue(&ochar, Input_queue) )
        --Owidth;

    outbuf( buf, addhyphen );     /* Output the line */

End  Listing  Twenty-four 
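
The core of line filling, collecting words until the next one would overflow the line, can be sketched in a few lines of C. This is only an illustration of the idea under simplifying assumptions, not a reconstruction of dofill(): it works on plain char strings rather than CTYPE strings, assumes every character is one column wide, and does no hyphenation or adjusting. The name fill_text and its parameters are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sketch (not nr's dofill): copy the words of `text` into
 * `out`, separated by single spaces, starting a new output line whenever
 * the next word would push the current line past `width` columns.
 * `out` must hold at least strlen(text) + 1 bytes.
 * Returns the number of output lines produced.
 */
int fill_text(const char *text, int width, char *out)
{
    int   col = 0, lines = 0;
    char *o = out;

    assert(text != NULL && out != NULL && width > 0);

    while (*text) {
        const char *word;
        int len;

        while (*text == ' ' || *text == '\n')   /* skip white space */
            ++text;
        if (!*text)
            break;

        word = text;                            /* find end of word */
        while (*text && *text != ' ' && *text != '\n')
            ++text;
        len = (int)(text - word);

        if (col > 0 && col + 1 + len > width) { /* won't fit: break line */
            *o++ = '\n';
            col = 0;
        } else if (col > 0) {                   /* fits: add a separator */
            *o++ = ' ';
            ++col;
        }
        if (col == 0)
            ++lines;

        memcpy(o, word, (size_t)len);
        o   += len;
        col += len;
    }
    *o = '\0';
    return lines;
}
```

Calling fill_text("the quick  brown fox\njumps", 10, buf) leaves three lines in buf. A word longer than the line width is simply placed on a line of its own, which is roughly where nr would instead try hyphenation or report an overflow.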


Listing  Twenty-five 


Makefile  for  Lattice  lmk  to  manufacture  nr  using 
the  Microsoft  C  compiler,  version  4.0 


CV_CSWITCH = /Zi
CV_LINKSW  = /CO /M


CSWITCH 

LINKSW 


/NOI /STACK:4096


OBJ1 = nr.obj nrcmd.obj nreg.obj nrexcept.obj nrglbls.obj
OBJ2 = nrhyphen.obj nrinp.obj nrmac.obj nrmap.obj nrmsc.obj
OBJ3 = nrout.obj nrprocs.obj nrtab.obj nrtext.obj


cc  -c  $(CV  CSWITCH)  s (CSWITCH)  $*.c  >>err 


5 (OBJ1 )  $ (OBJ2 )  $ (OBJ 3) 

link  5 (CV  LINKSW)  $ ( LIWKSW)  <g< 


S(OBJl)  + 
$ ( OBJ  2 )  ♦ 
$  (OBJ3 ) 


\lib\tools .lib 


final:  $(OBJl)  $(OBJ2)  S(OBJ3) 

link  $ (LINKSW)  <§< 

$ ( OBJ  1 )  ♦ 

$  ( OBJ  2 )  ♦ 

$ (OBJ 3 )  > 


\lib\tools .lib 


    /* Process a line break. If there's anything in the
     * buffer, flush it out. If the adjustment mode is BOTH
     * then adjustment is turned off when the line is
     * flushed (so that the last line of a paragraph looks
     * correct). This routine assumes that the queue always
     * has less than one output line's worth of text in it.
     */
    register int oadj;

    D( printf("Doing break\n"); )

    oadj = Adjusting;

    if( Adjmode == BOTH )
        Adjusting = 0;

    outqueue( sp_used(Input_queue), 0 );
    Adjusting = oadj;

nr.obj:        nr.c        nr.h
nrcmd.obj:     nrcmd.c     nr.h
nreg.obj:      nreg.c      nr.h
nrglbls.obj:   nrglbls.c   nr.h
nrhyphen.obj:  nrhyphen.c  nr.h  nrmap.h
nrinp.obj:     nrinp.c     nr.h
nrmac.obj:     nrmac.c     nr.h
nrmap.obj:     nrmap.c     nr.h  nrmap.h
nrmsc.obj:     nrmsc.c     nr.h
nrout.obj:     nrout.c     nr.h  nrmap.h  nrtlen.h
nrprocs.obj:   nrprocs.c   nr.h
nrtab.obj:     nrtab.c     nr.h
nrtext.obj:    nrtext.c    nr.h  nrmap.h  nrtlen.h
nrexcept.obj:  nrexcept.c  nr.h  nrmap.h


End  Listings 

(Listing  26  will  be  in  next  issue.) 


outqueue( numchars, addhyphen )
                                /* # of chars. to dequeue */
{
    /* Output numchars characters from the input queue,
     * using outbuf(). If "addhyphen" is true then a
     * hyphen is printed at the end of the line.
     */
    register CTYPE  *p;
    static CTYPE    buf[MAXSTR];
    CTYPE           ochar;

    D( printf("outqueue: putting %d characters\n", numchars) );

    if( numchars <= 0 )
        return;




STRUCTURED  PROGRAMMING 


Listing  One  (Text  begins  on  page  140. . 


( Elemental tools                            Ham 10:31 12/13/86 )

: -CUR  14 0 SET-CURSOR ;  ( no cursor )
: +CUR   6 7 SET-CURSOR ;  ( normal cursor )
: BACK  ( n - )  0 ?DO 8 EMIT LOOP ;  ( backspace word )
 0 CONSTANT NEW  ( to collect new digits )
-1 CONSTANT OLD  ( to provide existing number to routine )
: INCR  ( a - )   1 SWAP +! ;  ( increments variable )
: DECR  ( a - )  -1 SWAP +! ;  ( decrements variable )
: Bs?   ( n - f )   8 = ;  ( T if backspace pressed )
: Cr?   ( n - f )  13 = ;  ( T if carriage return pressed )
VARIABLE OK-NEG  ( T allows for entry of - ; F rejects - )
VARIABLE SOUND   ( T if using sound )
: BELL  ( - )  SOUND @ IF 440 8 BEEP ( short beep ) THEN ;
CREATE #PAD 15 ALLOT  ( work area )
: #P!   ( c n - )  #PAD + C! ;  ( stores character c at offset n )
CREATE #VAR 14 ALLOT  ( holds various values )
#VAR       CONSTANT #DEC    ( no. of fractional digits ALLOWED )
#VAR 2+    CONSTANT #dec    ( no. of fractional digits ENTERED )
#VAR  4 +  CONSTANT #WHOLE  ( no. of whole digits entered )
#VAR  6 +  CONSTANT #HIT    ( no. of keystrokes )
#VAR  8 +  CONSTANT NEG-    ( T if number is negative )
#VAR 10 +  CONSTANT dec-    ( T if decimal point entered )
#VAR 12 +  CONSTANT DIGCNT  ( counts no. of digits for old nos. )
: PLACES  ( n - )  #DEC ! ;  ( sets # of decimal places allowed )
: #init   #dec 12 ERASE  ( don't erase #DEC )  #PAD 15 ERASE ;

:  “HIT/NEG  ♦HIT  4  ERASE  ;  (  resets  no.  hit  and  negative  flag  ) 

: NEG?  ( - f )  NEG- @ ;  ( T if number is negative )
: dec?  ( - f )  dec- @ ;  ( T if dec point entered )


( Get and edit keystroke                     Ham 10:33 12/13/86 )

: CAPITALIZE  ( c - c )  DUP 96 > OVER 123 < AND IF BL - THEN ;
: FIXUP  ( c - c )  DUP ASCII B = OVER BL = OR IF DROP ASCII C THEN
     ( convert B and space bar to C := clear number entry )
     ( L := 1 )  DUP ASCII L = IF DROP ASCII 1 THEN
     ( O := 0 )  DUP ASCII O = IF DROP ASCII 0 THEN ;
: #?  ( n - f )  DUP ASCII / > SWAP ASCII : < AND ;  ( T if digit )
: BAD?  ( n - f )  DUP #?  OK-NEG @ IF OVER ASCII - = OR THEN
     #DEC @ IF OVER ASCII . = OR THEN
     OVER ASCII C = OR  OVER Bs? OR  SWAP Cr? OR  NOT ;
: GET#  ( - n )  BEGIN KEY CAPITALIZE FIXUP DUP BAD?
     WHILE DROP BELL REPEAT ;


( Collection box                             Ham 10:34 12/13/86 )

: #,S  ( #w - #, )  3 /MOD SWAP 0= + 0 MAX ;
     ( takes # of whole-number digits, leaves # of commas required )
     ( Warning: Assumes 83-Std flag = -1; negate flag if 79 Std )
: FULLCNT  ( n - n' )  #DEC @ IF 1+ THEN  OK-NEG @ IF 1+ THEN ;
     ( adds to char cnt the decimal point and minus sign if any )
: BOXSIZE  ( n - m )  DUP ( # of digits )  #DEC @ - ( # whole digits )
     DUP 1 < ( T if no whole digits )  NEGATE ( 83-Std flag )  >R
     #,S ( # of commas )  R> + + 2+ ( space at either end )
     FULLCNT ;  ( leaves number of characters in box )
: BOX  ( n - )  BOXSIZE SPACES ;
     ( prints inverse spaces to define field for number entry )


(  Sign/decimal 


Ham  10:34  12/13/86  ) 


(  displays  -  or  .  or  both  when  no  digits  yet  entered  ) 
NEG?  dec?  AND  IF  3  BACK  " 

ELSE  NEG?  IF  2  BACK  . "  -  “ 

ELSE  dec?  IF  2  BACK  .  " 

THEN  THEN  THEN  ; 


(  Count  digits;  show  number 


Ham  10:35  12/13/86  ) 


:  2,  (  d  -  )  ,  ,  ;  (  store  double  into  dictionary  ) 

CREATE  NINES  9.  2,  99.  2,  999.  2,  9999.  2,  99999.  2, 

999999.  2,  9999999.  2,  99999999.  2,  999999999.  2, 

:  ♦OFDIGITS  (  d  -  ♦  )  DABS  1  DIGCNT  ! 

BEGIN  2 DUP  DIGCNT  01-4*  NINES  +  20  D> 

WHILE  DIGCNT  INCR  REPEAT  2DROP  DIGCNT  e  ; 


:  PUT^  (  -  adr  cnt  )  (  prepares  number  for  display  ) 

0.  ♦PAD  1-  CONVERT  DROP  2DUP  ♦OFDIGITS  >R 
<♦  dec?  IF  ♦dec  0  0  ?DO  ♦  LOOP  ASCII  .  HOLD  THEN 
R>  ♦dec  0  -  *,S  0  ?DO  ♦  ♦  ♦  ASCII  ,  HOLD  LOOP 
♦S  NEG?  SIGN  ♦>  ; 


(  Display  the  number  nicely 


Ham  19:39  12/04/86  ) 


:  DISPLAYS  (  n  -  n  )  DUP  (  get  another  copy  of  max  ♦  of  digits  ) 
BOXSIZE  DUP  BACK 

(  back  up  to  beginning  of  entry  field;  top  of  stack  is  ) 

(  size  of  box,  which  is  greater  than  ♦  of  digits  ) 

♦HIT  0 

IF  1-  {  space  at  end  )  PUT^  ROT  OVER  -  SPACES  TYPE  SPACE 
ELSE  SPACES  (  new  box  )  -.  THEN  ; 

(  n  is  max  no.  of  digits  to  be  entered,  which  stays  on  stk  ) 


(  Wrap-up  routine  Ham  21:15  11/27/86  ) 

: 10D*  ( d - 10*d )  2DUP 2DUP D+ 2DUP D+ D+ 2DUP D+ ;

: SCALED  #DEC @ ?DUP IF #dec @ - 0 ?DO 10D* LOOP THEN ;
     ( scale up to integer from decimal fraction )
: #DONE  ( - d # )
     ( leaves double number entered and no. of digits entered )
     ( no. of digits = zero means no digits entered )
     0. #PAD 1- CONVERT  ( leaves addr of 1st nonconverting char )
     #PAD -  ( number of digits )  >R
     NEG? IF DNEGATE THEN  SCALED  R> DUP 0=
     IF  ( number is 0, see whether key pressed or no entry )
         DROP #HIT @ 0> NEGATE  ( Note: 83-Std flag )  THEN ;


(  Adjust  counts 


Ham  18:51  11/06/86  ) 


:  ♦dec- AD J  ♦dec  0  IF  ♦dec  DECR  THEN  ;  (  down  one  decimal  ) 

:  ♦WHOLE-ADJ  dec?  NOT  IF  ♦WHOLE  DECR  THEN  ;  (  down  1  whole  nc 
:  NO-.?  (  -  f  )  ♦HIT  0  0-  NEG?  0-  dec?  0-  AND  AND  ; 


(  When  decimal  point  is  hit 


Ham  18:53  11/06/86  ) 


:  .ROUTINE  dec?  IF  BELL  (  decimal  point  already  entered  ) 
ELSE  dec-  ON  (  mark  entry  of  decimal  point  ) 
THEN  ; 


Check  if  need  to  adjust  digits 


Ham  18:23  11/06/86  ) 


(  WHOLE-CK  t  DEC-CK  have  this  stack  diagram:  (  n  f  -  n  f‘  ) 

(  where  n  is  the  no.  of  digits  entered  so  far  ) 

:  WHOLE-CK  dec?  0-  IF  OVER  ♦DEC  0  -  ♦WHOLE  0  -  OR  THEN  ; 

(  makes  flag  T  if  dec  pt  not  entered  AND  we  have  all  the  ) 

(  whole  number  digits  that  we  can  accept  ) 

:  DEC-CK  ♦DEC  0  ?DUP  IF  ♦dec  0  -  OR  THEN  ; 

(  makes  flag  T  if  we  have  all  the  digits  to  the  right  ) 

(  of  the  decimal  that  we  can  accept  ) 

(  The  true  flag  will  cause  the  latest  digit  entered  to  be  ) 
(  dropped  and  the  bell  to  sound  (if  SOUND  is  on) 


(  Count  each  digit  entered 


Ham  18:24  11/06/86  ) 


VARIABLE  0START  (  T  if  starting  with  whole  number  zero  ) 

(  A  starting  whole  number  value  of  zero  is  in  effect  a  ) 

(  leading  zero  and  should  not  be  counted  in  the  total  of  ) 

(  digits  entered,  or  else  the  final  numeric  digit  will  not  ) 

(  be  accepted.  ) 

CNT-DIGIT  dec?  IF  ♦  dec  INCR 

ELSE  0START  0  IF  0START  OFF  (  1-time  switch  ) 
ELSE  ♦WHOLE  INCR  THEN  THEN  ; 


(  Initialization  for  "old"  numbers 


Ham  18:26  11/06/86  ) 


(  If  old  number  is  decimal,  all  places  are  present.  ) 

:  SET-dec  ♦DEC  0  ?DUP  IF  ♦dec  •  dec-  ON  THEN  ; 

:  SET-NEG  (  d  n  -  n  d  )  ROT  ROT  (  move  dbl  to  top  )  2DU?  0 .  D< 

IF  (  neg:  convert  and  note  sign  )  DNEGATE  NEG-  ON  THEN  ; 

(  Put  number  into  ♦PAD  as  an  string  of  ASCII  values:  ) 

:  SET-^P  (  d  -  )  <♦  dec?  IF  ♦dec  0  0  DO  ♦  LOOP  THEN 

DIGCNT  0  ♦DEC  0  >  IF  ♦S  THEN  ♦>  ♦PAD  SWAP  CMOVE  ; 


(  Initializes  for  loop 


Ham  18:34  11/06/86  ) 


:  DSET  (  d  ♦  T I ♦  F  —  mnp) 

(  m  -  ♦  of  digits  to  collect,  n  p  -  limits  for  loop  ) 




OSTART  OFF  #init  OVER  BOX 

IF  (  old  number  present  )  SET-dec  SET-NEG  2DUP 

2DUP  OR  0-  #DEC  0  0-  AND  (  double  is  both  zero  and  whole  ) 

IF  OSTART  ON  THEN  (  so  mark  it  as  a  zero  start  ) 

IOFDIGITS  #DEC  0  MAX  DUP  #HIT  !  (  set  #  of  digits  entered  ) 

DUP  #DEC  0-0  MAX  *WHOLE  1  {  set  #  of  whole  digits  ent) 

ROT  ROT  SET-#P  (  make  &  save  ASCII  string  ) 

SWAP  DUP  1+  ROT  OSTART  0  +  (  using  83-Std  flag  to  deer) 

ELSE  (  no  old  number  present  )  DUP  1+  0  THEN  ; 


(  Backspace  routine  Ham  10:35  12/13/86  ) 

VARIABLE  INDX  (  holds  index  from  loop  ) 

:  “I"  (  -  index  )  INDX  0  ;  (  lets  me  get  I  from  outside  loop  ) 

:  NOf?  (  -  f)  #HIT  0  0-  ;  (  T  -  no  digits  entered  ) 

:  BSP-ROU  (  —  loop-incr  )  dec?  #dec  0  0-  AND 

IF  dec-  OFF  0  (  just  backed  over  the  decimal  point  ) 

ELSE  "I"  IF  0  "I"  1-  #P !  (  zap  previous  entry  in  string  ) 

#HIT  DECR  #dec-ADJ  #WHOLE-ADJ  (  adjust  counts  ) 
NO-.?  (  no  minus  sign  or  decimal  point?  ) 

IF  BELL  "I"  NEGATE  (  back  up  all  the  way  ) 

ELSE  NO#?  IF  "I"  NEGATE 

ELSE  -1  THEN  THEN 
ELSE  ‘HIT/NEG  BELL  0  THEN  THEN  ; 

(  The  above  above  takes  care  of  the  details  of  the  backspace  in 

numeric  entry  &  leaves  the  proper  loop  increment  on  the  stk  ) 


THEN  THEN  THEN  THEN  THEN  +LOOP 
DROP  (  count  )  #DONE  REVERSE  ; 


(  Test 

0  PLACES 
OK-NEG  OFF 
SOUND  ON 
-CUR 

5  NEW  DIGITS 

CR  CR 
2  PLACES 
OK-NEG  ON 

7  NEW  DIGITS 

+CUR 


Ham  18:51  11/06/86  ) 


End  Listing 
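
Two of the small helper words in this listing are easier to follow when written out long-hand. The C sketch below mirrors the arithmetic of the comma-counting word in the "Collection box" screen and of 10D* in the "Wrap-up routine" screen. It is an illustration only: the function names commas_needed and times_ten are hypothetical, and a C long stands in for a Forth double number.

```c
#include <assert.h>

/* Hypothetical C rendering of the comma-counting word:
 *     3 /MOD SWAP 0= + 0 MAX
 * The quotient of digits/3 is reduced by one when the remainder is zero
 * (an 83-Standard true flag is -1) and the result is clamped at zero.
 */
int commas_needed(int whole_digits)
{
    int n;

    assert(whole_digits >= 0);
    n = whole_digits / 3;
    if (whole_digits % 3 == 0)
        --n;                    /* adding the -1 flag */
    return n > 0 ? n : 0;       /* 0 MAX */
}

/* Hypothetical C rendering of 10D*: multiply by ten using only additions,
 * following the same steps as  2DUP 2DUP D+ 2DUP D+ D+ 2DUP D+ .
 */
long times_ten(long d)
{
    long t = d + d;             /* 2d  */
    t = t + t;                  /* 4d  */
    t = t + d;                  /* 5d  */
    return t + t;               /* 10d */
}
```

For example, commas_needed(7) is 2 (as in 1,234,567) and times_ten(7) is 70. Building the product from additions avoids the need for a double-precision multiply, which is presumably why the Forth word is written that way.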


(  Final  input  word  Ham  18:51  11/06/86  ) 

:  DIGITS  (  d  ♦  T  I  #  F  —  d  #)  REVERSE  DSET  DO  DISPLAY#  GET# 

DUP  Bs?  IF  DROP  I  INDX  \  BSP-ROU 

ELSE  DUP  ASCII  -  IF  DROP  NEG-  0  NOT  NEG-  !  0 

ELSE  DUP  ASCII  .  -  IF  DROP  .ROUTINE  0 

ELSE  DUP  ASCII  C  -  IF  DROP  #PAD  C0  ASCII  0  <> 

IF  #HIT  0  NEGATE  ELSE  0  THEN  #init 
ELSE  DUP  I  *P!  (  store  char  )  Cr?  IF  LEAVE  THEN 
11+  #HIT  !  (  count  of  net  keystrokes  ) 

DUP  (  *  of  digits  to  enter  )  I  -  WHOLE-CK  DEC-CK 

IF  (  at  end:  reject  digit  )  0  I  #P !  I  #HIT  !  BELL  0 

ELSE  #PAD  I  +  C0  ASCII  0  <>  I  0<>  OR  dec?  OR 

NEGATE  (  83  Std  flag  )  DUP  IF  CNT-DIGIT  THEN 




COLUMNS 


C  CHEST 


Nr: A C Implementation of Nroff, Part 3


This  is  the  continuation  of  the  us¬ 
ers'  manual  for  nr,  my  text  for¬ 
matting  program.  On  the  source  code 
disk  I  show  how  the  commands  are 
used  by  presenting  an  implementa¬ 
tion  of  the  ms  macro  package.  (See 
the  end  of  this  column  for  informa¬ 
tion  about  the  source-code  disk.) 

Tabs  and  Leaders 

Nr  supports  arbitrary  tab  stops  that 
can  be  placed  at  any  column.  Tabs 
are  represented  in  the  text  with  ei¬ 
ther  an  ASCII  TAB  character  (Ctrl-I)  or 
with  the  \T  escape  sequence.  Tabs 
are  expanded  at  input  time  to  the 
number  of  space  characters  needed 
to  get  to  the  indicated  column.  They 
don't  work  very  well  in  proportion¬ 
ally  spaced  fonts  for  this  reason. 
Three  types  of  tabs  are  supported: 
left  adjusting,  centering,  and  right  ad¬ 
justing.  The  left-adjusting  tabs  are  the 
normal  variety — the  text  following 
the  Ctrl-I  or  \T  is  aligned  with  the 
next  tab  stop.  Centering  and  right-ad¬ 
justing  tabs  are  more  complicated.  If 
a  tab  stop  is  a  centering  one,  then  all 
text  between  the  next  two  Ctrl-Is  is 
centered  on  the  next  tab  stop.  Simi¬ 
larly,  right-adjusting  tabs  cause  the 
text  to  be  right-adjusted  on  the  next 
tab  stop.  For  example,  the  following 
two  commands  clear  all  the  default 


\Tleft\Tcentered\Tright\T 
the  following  will  be  printed: 


left  centered  right 

The  vertical  bars  mark  the  tab-stop 
positions. 

Leaders  are  like  tabs  except  that 
the  leader  character  rather  than  the 
tab  character  is  used  to  pad  out  the 
text.  The  default  leader  character  is  a 
period.  You  use  leaders  for  things 
such  as  tables  of  contents  in  which 
you  want  a  string  of  dots  between  the 
last  word  in  the  chapter  title  and  the 
page  number.  A  leader  is  signaled  by 
embedding  an  ASCII  Ctrl-A  in  the  text 
(or  by  using  the  \A  escape  sequence). 
For  example,  the  following  sets  up  a 
tab  stop  at  column  20  and  then  prints 
a  table  of  contents  entry  with  an  in¬ 
tervening  leader: 

.ta 

.ta  20 

Chapter  2  \A  17 


by  Allen  Holub 

tab  stops  and  then  set  up  a  left-adjust¬ 
ing,  centering,  and  right-adjusting 
tab  in  columns  10,  20,  and  30: 

.ta 

.ta  10L,20C,30R 
Given  the  input: 

\T!\T!\T!\T 


The  foregoing  will  print  as: 

Chapter  2 . 17 

.ta [A,B,...Z] — the argument, if present, is a comma-delimited list of
tab stops. Each element of the list can be a specific column, as in
.ta 9,17,25,33, or an offset from the previous number, as in
.ta 8,+8,+8,+8. In addition, each number can be followed by one of the
following tab types: R, for right-justified; C, for centered; and L,
for left-justified. An example was given earlier. If no tab type is
specified, L is assumed. If no argument is given, all tab stops are
cleared.
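
The .ta argument syntax described above can be sketched in C roughly as follows. This is only an illustration, not the parser from the nr sources: the TabStop structure and the parse_tabs() name are assumptions made for the example, and the sketch treats the first relative offset as relative to column 0.

```c
#include <stdlib.h>

/* Hypothetical representation of one tab stop (not taken from nr). */
typedef struct {
    int  col;    /* column of the stop */
    char type;   /* 'L', 'C', or 'R' */
} TabStop;

/* Parse a .ta argument such as "10L,20C,30R" or "8,+8,+8,+8".
 * Returns the number of stops stored. An empty argument yields 0,
 * which corresponds to "all tab stops are cleared". */
int parse_tabs(const char *arg, TabStop *stops, int max)
{
    int n = 0, prev = 0;

    while (*arg && n < max) {
        int   relative, col;
        char *end;

        while (*arg == ' ' || *arg == ',')   /* skip separators */
            ++arg;
        if (!*arg)
            break;

        relative = (*arg == '+');            /* "+8" is an offset */
        if (relative)
            ++arg;

        col = (int)strtol(arg, &end, 10);
        if (end == arg)
            break;                           /* not a number: stop parsing */
        arg = end;
        if (relative)
            col += prev;

        stops[n].col  = col;
        stops[n].type = 'L';                 /* L is assumed by default */
        if (*arg == 'L' || *arg == 'C' || *arg == 'R')
            stops[n].type = *arg++;
        prev = col;
        ++n;
    }
    return n;
}
```

Calling parse_tabs("10L,20C,30R", stops, 8) fills three entries, and parse_tabs("8,+8,+8,+8", stops, 8) produces left-adjusting stops at 8, 16, 24, and 32.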

.tp — print  out  all  the  current  tab  stops 
in  a  graphical  form: 

■ . . .  L . C . R....L.. 

where  the  tab  positions  are  marked 
with  an  L,  C,  or  R  depending  on  the 
tab  type. 

.tc C — set the tab-expansion character to C. The default tab-expansion
character is a space. If no C is given, tab expansion is disabled.

.lc  C — change  the  leader  character 
from  a  period  to  C. 
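
Tab and leader expansion differ only in the padding character, so one routine can serve both. The following C sketch shows one way the three tab types might be expanded. The place_field() name, the 0-based column numbering, and the rule that a field never backs up over existing text are assumptions of this example, not details taken from nr.

```c
#include <string.h>

/* Append `text` to `line` (currently `cur` characters long) so that it is
 * left-adjusted ('L'), centered ('C'), or right-adjusted ('R') on column
 * `stop`, counting columns from 0. The gap is filled with `pad`: a space
 * for a tab, a period for a leader. Returns the new line length. The
 * caller must supply a buffer large enough for the result. */
int place_field(char *line, int cur, const char *text,
                int stop, char type, char pad)
{
    int len   = (int)strlen(text);
    int start = stop;                 /* 'L': text begins at the stop */

    if (type == 'C')
        start = stop - len / 2;       /* text straddles the stop */
    else if (type == 'R')
        start = stop - len;           /* text ends just before the stop */
    if (start < cur)
        start = cur;                  /* never back up over existing text */

    while (cur < start)
        line[cur++] = pad;
    memcpy(line + cur, text, (size_t)len);
    cur += len;
    line[cur] = '\0';
    return cur;
}
```

Placing "Chapter 2" at column 0 and then "17" on a left-adjusting stop at column 20 with a period as the pad character gives the table-of-contents style line described above.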

Control  Flow 

Though nr doesn't support a fancy control-flow language, it does support
if and if...else mechanisms. The control-flow statements nest. The
expression syntax described earlier is also used in an if statement, so
all the expression operators are available to you here. An expression
that evaluates to zero is false; nonzero expressions are true. The basic
form of the if statement is .if expr action, where expr is any
expression involving constants, number registers, and so forth, and
action is any single dot command or text. For example, in:

.if "\n% != 1" .bp

the .bp will be executed only if you're not on page 1. In

.if "\n% == 1" This is the Title

the text This is the Title is printed only on page 1. You could also use
the if...else form of the command:




.ie "\n% != 1" .bp
.el This is the Title

You can combine several statements into a block using the two block
commands (.{ and .}). For example, in:

.ie "\n% != 1" .{
.bp
.sp 10
.}
.el This is the Title

both .bp and .sp 10 will be executed if you're not on page 1. For
nroff-compatibility reasons, you can also use the less readable:

.ie "\n% != 1" \{\
.bp
.sp 10 \}
.el This is the Title

if you like.

.if  condition — a  simple  if  statement 
(that  doesn’t  take  an  else  clause).  The 


expression  parser  described  earlier  is 
used  to  evaluate  condition.  Two  spe¬ 
cial  conditions  are  supported: 

.if  e  action 
.if  o  action 

The  e  evaluates  to  true  if  the  current 
page  number  is  even;  the  o  is  true  if 
you’re  doing  an  odd  page. 

.ie condition — the if part of an if...else. It is otherwise used like
an .if.

.el — the else clause part of the .ie command.

.{ — start a block for an .if, .ie, or .el. The \{\ and \{ escape
sequences are mapped to a .{ for nroff compatibility.

.} — terminate a .{ block. The \} escape sequence is mapped to a .} for
nroff compatibility.

Hyphenation 

Nr  supports  automatic  hyphenation, 
enabled  with  a  .hy  command  and  dis¬ 
abled  with  a  .nh  command.  The  nroff 
.hy  command  takes  arguments,  but 
the  nr  variant  ignores  its  arguments. 


A conservative hyphenation algorithm is used to avoid incorrect hyphens.
In addition, only words composed of lowercase alphabetic characters are
hyphenated. If the word contains a hyphen, it is always subject to being
broken at the explicit hyphen. In general, nr won't hyphenate a word if
it's not sure what to do. Nonetheless, it does make occasional mistakes.
You can put a soft hyphen into the word to tell the program where a
hyphen can go; the \% escape sequence is a soft hyphen. For example, nr
will not hyphenate the word hyphenate correctly (it will try to make it
hyphe-nate). You can correct this with hyphen\%ate. The \% is ignored if
no hyphen is inserted. If \% precedes the word, that word won't be
hyphenated. If a word contains a soft hyphen, nr will not rehyphenate
it.

.hy  [N] — enable  hyphenation;  N  is 
ignored. 

.nh — turn  off  hyphenation  (turned 
on  with  a  .hy  command). 
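
The soft-hyphen rules can be illustrated with a small C routine that strips the \% markers from a word and records where a break is allowed. This is a sketch under stated assumptions: the soft_breaks() name and the -1 return value for "never hyphenate" are inventions for this example, and the escape character is assumed to be the default backslash.

```c
#include <string.h>

/* Strip \% markers from `in`, copying the bare word to `out` and
 * recording in `pos` the offsets in `out` at which a hyphen may be
 * inserted. Returns the number of break points, or -1 if the word
 * starts with \% (meaning "never hyphenate this word"). `out` must be
 * at least as large as `in`. */
int soft_breaks(const char *in, char *out, int *pos, int maxpos)
{
    int n = 0, len = 0, never = 0;

    while (*in) {
        if (in[0] == '\\' && in[1] == '%') {
            if (len == 0)
                never = 1;            /* \% before the first letter */
            else if (n < maxpos)
                pos[n++] = len;       /* a hyphen may go here */
            in += 2;
        } else {
            out[len++] = *in++;
        }
    }
    out[len] = '\0';
    return never ? -1 : n;
}
```

For the input hy\%phen\%ate the routine yields the word hyphenate with break points after hy and after hyphen. A formatter would fall back to its automatic algorithm only when the count is 0.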

Three-Part  Titles 

Three  commands  are  used  to  support 




three-part  titles: 

.tl /A/B/C/ — print a three-part title. A, B, and C are strings, where A
is left-justified, B is centered, and C is right-justified on the page.
Any of these can be omitted, as in:

.tl ///page %/
.tl //-%-//

The first character in the string (a / here) is used as a delimiter and
can be any character. The title is printed at the current page offset
but indent is ignored. The title width is defined with the .lt command.
A % character is special in a title. It is replaced by the current page
number, printed in the current format associated with the % number
register. For example, you can produce Roman-numeral page numbers with:

.af % i
.tl //-%-//

The .tl command is usually used in a top-of-page or bottom-of-page
macro.

.lt [+-]N — specify the width of a three-part title, in spaces. Because
this command does not affect the .ll command, it's possible to have a
title with a different width from that of the body of the text.

.pc  C — change  the  character  used  to 
indicate  a  page  number  in  a  three- 
part  title  from  %  to  C.  This  is  useful  if 
you  want  to  put  a  percent  sign  in  the 
title  itself. 
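
A three-part title can be laid out along the following lines. This C sketch is not the nr implementation: format_title() and title_part() are hypothetical names, page numbers are always printed in decimal (the .af formats are ignored), and each part is limited to 63 characters.

```c
#include <stdio.h>
#include <string.h>

/* Copy the next delimited part of a title into `dst`, replacing each
 * `pc` character with the page number. Returns a pointer just past the
 * closing delimiter (or to the end of the string). */
static const char *title_part(const char *s, char delim, char pc,
                              int page, char *dst, size_t size)
{
    size_t n = 0;

    while (*s && *s != delim) {
        if (*s == pc) {
            n += (size_t)snprintf(dst + n, size - n, "%d", page);
            if (n >= size)
                n = size - 1;         /* output was truncated */
        } else if (n + 1 < size) {
            dst[n++] = *s;
        }
        ++s;
    }
    dst[n] = '\0';
    return *s ? s + 1 : s;
}

/* Build a `width`-column title line from a spec such as "/A/B/C/".
 * The first character of the spec is the delimiter and `pc` is the
 * page-number character (% unless changed). `out` must hold at least
 * width + 1 characters. */
void format_title(const char *spec, char pc, int page, int width, char *out)
{
    char        part[3][64];
    char        delim = *spec;
    const char *s;
    int         i, len, start;

    memset(out, ' ', (size_t)width);
    out[width] = '\0';
    if (!delim)
        return;

    s = spec + 1;
    for (i = 0; i < 3; ++i)
        s = title_part(s, delim, pc, page, part[i], sizeof part[i]);

    for (i = 0; i < 3; ++i) {
        len = (int)strlen(part[i]);
        if (len > width)
            len = width;
        start = (i == 0) ? 0 : (i == 1) ? (width - len) / 2 : width - len;
        memcpy(out + start, part[i], (size_t)len);
    }
}
```

With a width of 20 and page 7, the spec //-%-// puts -7- in the middle of an otherwise blank line. Passing a different pc value mimics the effect of the .pc command.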

Output  Line  Numbering 

    Nr can automatically number output
    lines for you; in fact, I use it to
    generate all the numbered listings
    for C Chest. You enable output line
  5:numbering with a .nm command. Note
    that this command behaves a little
    differently from the nroff
    equivalent, mostly because I can't
    figure out how the nroff one works.
 10:Numbers are printed right-justified
    in a three-space-wide field. Syntax
    is .nm N M S, where N is the number
    used for the first line, M is a line
    number multiplier, and S is a string
 15:that's printed to the right of each
    number. The number is printed only
    when the current output line number
    is an integer multiple of M. When
    it's not, a filler composed of three
 20:space characters is printed instead
    of the number. This paragraph was
    output with .nm 1 5 :

.nm  N  M  S — enable  or  disable  line 
numbering.  If  you  need  to  change  M 
without  changing  N,  use  .nm  y  M  S, 
where  y  is  any  nonnumber.  The 
same  goes  for  .nm  y  y  S.  You  can  turn 
off  line  numbering  by  issuing  a  .nm 
with  no  arguments.  If  you  want  to 




reenable  it  without  resetting  the  line 
number,  use  .nm  y,  where  y  is  any 
nondigit.  In  addition,  the  line  num¬ 
ber  used  by  .nm  is  stored  in  the  pre¬ 


defined  nl  number  register. 

.nb  [args] — enable  or  disable  blank¬ 
line  numbering.  Usually  blank  lines 
are  not  numbered — they  are  output 
but  no  number  is  printed  and  the  out¬ 
put  line  number  is  not  incremented. 
A  .nb  X  causes  blank  lines  to  be  num¬ 
bered  too.  A  .nb  with  no  arguments 
disables  blank-line  numbering.  This 
command  is  not  supported  by  nroff. 
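
The numbering rule for .nm N M S can be sketched in C as follows. The number_prefix() name is an assumption of this example, and so is the choice to print blanks in place of S on unnumbered lines so that the text stays aligned; the description above only says that the number itself is replaced by three spaces.

```c
#include <stdio.h>
#include <string.h>

/* Write the prefix for output line `lineno` into `out`: the number,
 * right-justified in three columns and followed by `s`, when lineno is
 * a multiple of `m`; otherwise blanks of the same total width.
 * `m` must be at least 1. */
void number_prefix(int lineno, int m, const char *s, char *out, size_t size)
{
    if (lineno % m == 0)
        snprintf(out, size, "%3d%s", lineno, s);
    else
        snprintf(out, size, "%*s", 3 + (int)strlen(s), "");
}
```

With M set to 5 and S set to a colon, lines 5, 10, 15, and so on receive a numbered prefix and every other line receives four blanks.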

Input  and  Output 

Nr provides several mechanisms other than the command line for getting
input or sending output to the file or the console:

.cf file — copy file directly to standard output without any sort of
processing. This command is useful for automatically downloading fonts.

.tm  string — print  string  directly  to 
standard  error.  One  of  the  uses  for 
this  command  is  to  print  diagnostics. 
My  version  of  the  ms  package,  for  ex¬ 
ample,  prints  the  page  number  at  the 
top  of  each  page  so  you  can  see 




where  you  are  in  the  formatting  pro¬ 
cess,  even  if  output’s  redirected.  You 
have  to  use  \N  to  get  a  new  line  into  a 
.tm  string  (as  in  .tm  page  \n%  \N ). 

.mf  macro  file — copy  the  contents  of 
the  named  macro  into  the  indicated 
file.  This  is  not  an  nroff  command. 
It's  useful  primarily  for  indexes.  You 
can  write  a  macro  for  collecting  in¬ 
dex  entries,  and  this  macro  would 
automatically  append  some  sort  of 
reference  information  and  the  cur¬ 
rent  page  number  to  a  special  macro 
every  time  that  it  was  called.  After 
the  entire  document  has  been  pro¬ 
cessed,  the  macro  could  then  be 
transferred  to  a  file  so  that  you  could 
modify  the  data. 

.ou string — send string directly to the current output, without going
through the normal text-processing mechanism. This also is not an nroff
command. This command is for sending control sequences directly to the
printer (that is, for initializations and so on). Use \x<2 hex digits>
to send nonprinting characters. Note that the -c flag (which causes
control characters to be printed in readable form) affects the output
from this command.

.rd  [ prompt ] — read  insertion  from 
standard  input  rather  than  from  the 
current  input  file.  Reading  stops 
when  two  new  lines  in  a  row  are  en¬ 
countered.  This  command  allows 
you  to  insert  text  interactively  into  a 
document.  The  prompt,  if  any,  is 
printed  (and  the  bell  is  rung)  before 
any  text  is  read. 

.so file — get (source) input from the named file. The position in the
current file is remembered, and processing will continue when the
source file is exhausted. This command works like an #include directive
in C does. The .so command is replaced by the contents of the indicated
file.

Miscellaneous 

.\" — signifies a comment. The entire line is ignored. Note that a dot
on a line by itself is also considered to be a comment line.




.db [1] — enable or disable debugging mode. A .db y enables debugging
mode (same as -v -c on the command line), and a .db without an argument
disables debugging mode. This is not an nroff command.

.ex — exit back to the operating system just as if input had ended. The
end macro is executed.

.ig [yy] — ignore all input until a line starting with .yy is found,
where yy is the argument to .ig. In the absence of yy, .. is used.

.mc string [N] — specify a right margin character, and print string, N
spaces to the right of the current right margin. This usage differs
from nroff, which uses a single character rather than a string. If no
arguments are present, the margin character is disabled. The string is
limited to 20 characters (including any spaces implied by N). If N is
missing or 0, 2 is used.

.ml string — print the string at the left margin rather than the right
margin (it works like .mc does). The page offset must be at least as
large as the string, which is limited to 21 characters. This command is
not supported by nroff.

.wa  [N] — wait  for  about  N  seconds  (at 
most  N  +  1).  If  N  is  0  or  if  no  argu¬ 
ment  is  given,  a  prompt  is  printed 
and  the  program  waits  for  you  to 
type  Enter.  This  command  is  not  sup¬ 
ported  by  nroff. 

.ws  N — enable  WordStar-mode  out¬ 
put.  If  N  is  0  (or  missing),  WordStar 
mode  is  disabled.  If  N  is  1,  all  single 
new  lines  are  mapped  to  WordStar 
soft  carriage  returns.  Note  that  dou¬ 
ble  new  lines,  as  are  used  to  create  a 
blank  line,  map  to  two  hard  carriage 
returns.  This  way  you  can  get  a  hard 
carriage  return  at  the  end  of  a  para¬ 
graph  by  putting  a  blank  line  after 
every  paragraph.  An  N  of  2  is  han¬ 
dled  like  N=1  except  that  single  car¬ 
riage  returns  are  replaced  with  space 
characters  rather  than  soft  carriage 
returns.  Note  that  you’ll  also  want  to 
do  the  following: 

.po 0           \" No page offset
.bd \x02 \x02   \" ^B for bold
.ud \x13 \x13   \" ^S for underline
.od \x18 \x18   \" ^X for overstrike




This  command  is  not  supported  by 
nroff. 

Escape  Sequences 

Escape  sequences  are  special  se¬ 
quences  of  characters  that  either  tell 
nr  to  do  something  or  signal  some 
sort  of  macro  expansion  (that  is,  the 
escape  sequence  will  be  replaced  by 
some  other  text).  They  are  all  intro¬ 
duced  with  a  leading  escape  charac¬ 
ter.  This  character  is  a  backslash  by 
default,  but  you  can  change  it  with 
the  .ec  C  command,  where  C  is  the 
new  escape  character. 

Nr expands escape sequences in three distinct modes: normal mode, copy
mode, and nroff copy mode. In normal mode, usually in effect, all escape
sequences are expanded. Copy mode becomes active when a macro is being
defined; that is, at definition time, the contents of the macro or
string are copied into the macro in copy mode. The macro is expanded in
normal mode, however. In copy mode only two escape sequences are
recognized: \", which introduces a comment; and \(CR), a backslash at
the end of the line, which is a hidden line feed (that is, the current
line is merged with the next line). All other escape sequences are just
copied into the macro intact. The real nroff has a less restricted copy
mode in which all the following are recognized:

\(CR)  \"  \.  \`  \$N  \nx  \n(xx  \n+x  \n+(xx  \*x  \*(xx  \\

Usually this is more trouble than it's worth because you have to use \\
every time you want to get a \ into a macro (to prevent it from being
expanded at definition time). You can use nroff copy mode instead of
normal copy mode, however. It is enabled with a .cm 1 command and
disabled using .cm without the argument. Supported escape sequences are
summarized in Table 1, page 137, and are described in depth in the
following paragraphs.

\" — introduces a comment. All characters following the \" on the line
and all white space preceding the \" on the line are ignored. A \" at
the beginning of line is treated as if it were a blank line (the fill
buffer is flushed and a blank line is written to the output). The .\"
command, however, causes the line to be ignored entirely (for example,
no blank line is printed).

\(CR)  — ignores  the  end  of  line — that 
is,  all  text  on  the  following  line  is 
merged  with  text  on  the  current  line, 
as  if  the  two  lines  were  one. 

\. — a dot that's never interpreted as a command character.

\` — a backquote that's never interpreted as a secondary command
character.

\\ — expands to a backslash.

\$N — a macro argument, where N is a number in the range 1-9. A macro
defined with:

.de XX
arg 1 <\$1>
arg 2 <\$2>
arg 3 <\$3>
..

and invoked with:

.XX "this is one argument" doo wha

will print:

arg 1 <this is one argument>
arg 2 <doo>
arg 3 <wha>
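
The argument substitution shown above can be sketched in C. The expand_args() name and the rule that missing arguments expand to nothing are assumptions of this example, not details confirmed by the nr sources. The sketch also assumes the arguments have already been split, with quoted strings treated as single arguments.

```c
#include <string.h>

/* Expand \$1..\$9 in `body` using argv[0..argc-1]. Arguments that were
 * not supplied expand to nothing. The result is truncated to fit in
 * `size` bytes, including the terminating null. */
void expand_args(const char *body, const char *const *argv, int argc,
                 char *out, size_t size)
{
    size_t n = 0;

    if (size == 0)
        return;
    while (*body && n + 1 < size) {
        if (body[0] == '\\' && body[1] == '$' &&
            body[2] >= '1' && body[2] <= '9') {
            int i = body[2] - '1';            /* \$1 is argv[0] */
            if (i < argc) {
                size_t len = strlen(argv[i]);
                if (len > size - 1 - n)
                    len = size - 1 - n;
                memcpy(out + n, argv[i], len);
                n += len;
            }
            body += 3;
        } else {
            out[n++] = *body++;
        }
    }
    out[n] = '\0';
}
```

Applied one line at a time to the body of the XX macro with the three arguments from the invocation above, the routine reproduces the three printed lines.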

\*x — expands to the contents of a string named x. Strings are created
with a .ds or .as command. Note that the string could also be expanded
as if it were a macro (using .x). Nested \* expansions are supported,
and the strings can contain other escape sequences (such as \nx).

\*(xx — expands to the contents of a string having the two-character
name xx (works like \*x does).

\nx, \n(xx, \n+x, \n+(xx — interpolates number registers, discussed
earlier.

\% — indicates  a  soft  hyphen.  Soft  hy¬ 
phens  are  used  to  indicate  places 
where  a  word  may  be  hyphenated.  If 
hyphenation  is  disabled  or  if  the 
word  isn’t  at  the  end  of  a  line,  then 
the  soft  hyphen  is  ignored.  For  exam¬ 
ple,  hy\%phen\%ate  tells  nr  to  hy¬ 
phenate  hyphenate  either  after  y  or  n. 
Words  with  soft  hyphens  in  them 
will  not  be  rehyphenated  by  the 
automatic-hyphenation  algorithm. 
You  can  prevent  a  word  from  being 
hyphenated  by  preceding  the  first 
letter  with  a  soft  hyphen. 

\&c — treats the c as a literal character. Note that this is different
from the normal nroff syntax, which treats \& as a zero-width,
nonprinting character. Because nr uses several characters whose values
are greater than 0x7f internally, \& is the only safe way to get such a
character through to the printer. For example, if you need to get a
0x8a to the printer without nroff intercepting it, use \&\x8a. The
character following the \& can be any single character or escape
sequence that evaluates to a single character.

\(SP)  (a  backslash  followed  by  a 
space) — a  nonpaddable  and  non¬ 
breaking  space.  Given  two  words 
separated  with  a  nonpadding  space 
( word\  word),  the  justification  algo¬ 
rithm  will  never  add  additional 
spaces  between  the  words  and  the 
two  words  will  always  be  on  the 
same  output  line. 

\- — evaluates to a dash (it's a minus sign in troff).

\^, \|, \0 — nr ignores the first two and maps a \0 to a normal space
character. These sequences are supported for compatibility with troff,
which treats \^ as a thin space, \| as a somewhat thicker but
nonetheless thin space, and \0 as a digit-width space.

\A — a  visible  leader  character.  It's 
the  same  as  a  Ctrl-A  embedded  in  the 




text. 

\L'Nc', \l'Nc' — the two line-drawing functions. \L'Nc' evaluates to a
vertical line composed of N cs stacked one on top of the other. For
example, the command \L'3+' prints:

+
+
+

The cursor is positioned immediately below the bottom plus sign (you
can use \v [discussed later] to get back to the top). If no character
is specified explicitly, a vertical bar is used: \L'3' prints a
three-line-high bar.

The \l'Nc' works just like \L does except that a horizontal line is
drawn. For example, \l'20-' draws a horizontal line composed of 20
dashes. The default character is an underscore. Note that you can't use
an escape sequence for the line character (as in \l'10\x85'). You can
define a string to do this though:

.ds li \\l'10\x85'
\*(li

\N — a  new  line  that  can  be  embed¬ 
ded  in  a  string  or  .tm  command. 

\T — a  visible  tab  character,  it  is  re¬ 
placed  with  a  Ctrl-I,  which  will  be  ex¬ 
panded  as  the  input  is  processed. 


\a — a  nonexpanded  leader  charac¬ 
ter.  That  is,  it’s  a  Ctrl-A  that  will  make 
it  through  to  the  output  without  be¬ 
ing  transformed  into  a  series  of  dots 
or  whatever. 

\d — sends the cursor down half a line. A superscript such as N² can be
done with N\u2\d.

\e — a  printing  version  of  the  current 
escape  character  (the  one  that  was  ac¬ 
tive  when  the  \e  was  encountered  in 
the  input).  This  is  more  convenient 
than  \\  because  many  macros  will 
create  other  macros  on  the  fly,  and 
each  level  of  secondary  macro  will 
also  expand  backslashes.  Sometimes  it 
can  take  as  many  as  six  or  eight  back¬ 
slashes  for  one  to  make  it  all  the  way 
to  the  output.  A  single  \e,  however,  is 
never  interpreted  as  a  backslash — it 
always  gets  to  the  output  unmolested. 

\fF — changes  fonts.  For  example, 
the  word  italics  was  created  with 
\fIitalics\fP.  (See  the  description  of 
the  .ft  command  for  more 
information.) 

\h'N', \h'Nu', \v'N', \v'Nu' — give you fine control over cursor motion.
The \h command is for horizontal motion, and the \v is for vertical. N
is the number of lines or spaces, as appropriate. N can be negative.
For example:

X\h'4'X\h'-1'\v'2'X\h'-5'X

prints:

X    X
X    X

It can be broken up into:

X        print an X
\h'4'    move four spaces to the right
X        print another X
\h'-1'   back up one position (over the last X)
\v'2'    go down two lines
X        print a third X
\h'-5'   back up five spaces
X        print the last X

If  the  character  u  follows  the 

count,  then  motion  will  be  in  terms 
of  vertical  and  horizontal  units,  as  de¬ 
fined  with  the  .vd  and  .hd  com¬ 
mands,  instead  of  lines  and  spaces. 

\o'ab' — superimposes all characters between the quotes one on top of
the other. For example, \o'0/' can be used to print a slashed zero. Use
the special escape sequence \' to put a single quote into the list.


Copy mode

\"        Comment (deletes all following text and all preceding white
          space).
\(CR)     Ignore the end of line.

Expanded in nroff copy mode but not in normal copy mode

\.        A dot that's never interpreted as a command character.
\`        A backquote that's never interpreted as a command character.
\\        A backslash.
\$N       Macro argument, where 1 <= N <= 9.
\*x       String x. Nested \* expansions are supported, and the strings
          can contain other escape sequences (such as \nx).
\*(xx     String xx.
\nx       Number register x.
\n(xx     Number register xx.
\n+x      Number register x with auto preincrement.
\n+(xx    Number register xx with auto preincrement.

Expanded only when not in either normal or nroff copy mode

\%        Soft hyphen.
\&c       c is literal (can be \xDD) (nonstandard).
\(SP)     Nonpaddable nonbreaking space.
\-        Dash (minus sign in troff).
\0        Normal space (digit-width space in troff).
\A        Same as ^A.
\L'Nc'    Vertical line of N cs (default c is |).
\l'Nc'    Horizontal line of N cs (default c is underscore).
\N        New line that can be embedded in a string or .tm command.
\T        Same as ^I.
\X        Where X is any other character, that character.
\a        Nonexpanded leader character.
\d        Down a half line.
\e        Printable version of current escape character.
\fF       Change to font F.
\h'N'     Horizontal motion by N spaces (N can be negative).
\h'Nu'    Horizontal motion by N units (as defined with .hd).
\v'N'     Vertical motion by N lines (N can be negative).
\v'Nu'    Vertical motion by N units (as defined with .vd).
\o'ab'    Superimpose all characters between the quotes (overstrike).
\r        Up one line.
\t        Nonexpanded tab character.
\u        Up a half line.
\xDD      Where DD is two hex digits, that character.
\zc       c is zero width.
\{        Start block (see also .{).
\|        Ignored (thin space with troff).
\}        End block (see also .}).

Table 1: Nr-supported escape sequences




Flotsam  and  Jetsam 


Declarations  and 
Definitions  in  One  Pile 
The  matter  of  declarations  and  defi- 
nitions  that  I  discussed  in  last  month's 
Flotsam  and  Jetsam  can  cause  main¬ 
tenance  problems.  You  can  define  (al¬ 
locate  space  for)  a  variable  in  only 
one  place  in  your  program.  Nonethe¬ 
less,  you  have  to  declare  the  variables 
(with  extern  statements)  in  every  file 
that  uses  the  variable. 

Michael Yam of NYC suggests a solution to this problem: "Managing
globals can be messy, particularly in a C program that has many modules.
One of the more difficult tasks in a large program is coordinating the
variable declarations (the extern statements in a .H file) with the
definitions (where the space is allocated in a .C file). You can both
define and declare all globals in one place by using the C preprocessor,
however, thereby making globals easy to track and document. Let's say
you have three modules: testmain.c, test1.c, and test2.c. The main()
subroutine is in testmain.c, which also includes the following
statements:

#define ALLOCATE
#include "testmain.h"

Other files may include testmain.h, but none of these other files
includes the #define ALLOCATE; that's only in testmain.c. Testmain.h
holds all global definitions and declarations and looks like this:

#ifdef ALLOCATE
#define GLOBAL
#else
#define GLOBAL extern
#endif

GLOBAL int glob1;
GLOBAL int glob2;
GLOBAL struct
{
    int x, y, z;
}
worlds;

When you compile testmain.c, GLOBAL expands to nothing and the variables
are defined (space is allocated for them). In all other modules, because
ALLOCATE isn't #defined, GLOBAL expands to the keyword extern and
variables are declared.

"I don't think this approach is anything new, but so few programs take
advantage of this technique."

Michael's  correct  in  thinking  that 
the  technique's  not  new,  but  it's  cer¬ 
tainly  useful  at  times.  When  you  use 
it,  however,  be  careful  of  static  initia¬ 
lizers,  which  can’t  be  used  in  extern 
statements.  A  good  solution  is  to  use 
the  variant  of  the  D( )  macro  I  dis¬ 
cussed  a  few  months  ago: 

#ifdef ALLOCATE
#define INIT(x) ={ x }
#define GLOBAL
#else
#define INIT(x)
#define GLOBAL extern
#endif

GLOBAL int x[ ] INIT( 1, 2, 3, 4 );
GLOBAL char *y[ ] INIT( "quick", "brown", "fox" );

As before, if ALLOCATE isn't #defined then the initializations aren't
compiled. When ALLOCATE is #defined, the declarations expand to:

int x[ ] ={ 1, 2, 3, 4 };
char *y[ ] ={ "quick", "brown", "fox" };

You may need two INIT definitions because some compilers won't accept
curly braces around single objects, as in:

int z ={1};

Use:

#define INIT2(x) = x

Finally you can use a general-purpose initialization macro that includes
all the brackets and equal signs with it:

#ifdef ALLOCATE
#define INIT(x) x
#else
#define INIT(x)
#endif

You can then say:

GLOBAL x[ ] INIT( ={1,2,3} );
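
A one-parameter INIT macro relies on the preprocessor tolerating commas inside its argument, which a standard-conforming preprocessor rejects. On a compiler that supports C99 variadic macros, the same idea might be written as in the sketch below. This variant is an assumption of this note rather than part of the technique as originally described. The header and the file that defines ALLOCATE are collapsed into a single file here purely for illustration.

```c
/* In real use this line appears only in the one file that owns the
 * globals (the file containing main); everything below it would live in
 * the shared header. */
#define ALLOCATE

#ifdef ALLOCATE
#define GLOBAL                          /* definition: space is allocated */
#define INIT(...)  = { __VA_ARGS__ }    /* keep the initializer */
#else
#define GLOBAL     extern               /* declaration only */
#define INIT(...)                       /* drop the initializer */
#endif

GLOBAL int         x[] INIT( 1, 2, 3, 4 );
GLOBAL const char *y[] INIT( "quick", "brown", "fox" );
```

In a file that does not define ALLOCATE, the same two lines expand to extern declarations with no initializers, so the array sizes are known only in the defining file.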



\r — sends  the  cursor  up  one  line. 

\t — a nonexpanded tab character (Ctrl-I). Like a \a, it will make it
all the way to the output without being expanded by the tab-processing
routines.

\u — sends  the  cursor  up  half  a  line. 

\xDD — gets binary information to the printer. DD is two hex digits (two
are required). For example, an ASCII escape character can be sent to
the printer with a \x1b.
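
Decoding the two digits that follow \x is a small job. The following C sketch shows one way to do it. The decode_hex_pair() name is hypothetical, and rejecting anything other than exactly two hex digits is an assumption based on the "two are required" rule above.

```c
#include <ctype.h>

/* Decode the two hex digits that follow \x. Returns the character value
 * (0-255), or -1 if `s` does not start with two hex digits. */
int decode_hex_pair(const char *s)
{
    int value = 0, i;

    for (i = 0; i < 2; ++i) {
        int c = (unsigned char)s[i];

        if (!isxdigit(c))
            return -1;                /* also catches a short string */
        value = value * 16 +
                (isdigit(c) ? c - '0' : tolower(c) - 'a' + 10);
    }
    return value;
}
```

Given the digits 1b the routine returns 27, the ASCII escape character mentioned above.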

\zc — says that c is a zero-width character. For example, an accented á
can be printed with \z'a.

\{, \} — form a block in an .if, .ie, or .el. For example, in:

.if (\ng) \{\
.in +10
.ti -10 \}

both the .in and .ti are executed if \ng contains a nonzero number. The
\{ and \} are supported primarily for nroff compatibility. The nr
commands .{ and .} tend to be more readable:

.if (\ng) .{
.in +10
.ti -10
.}

Availability 

The  February,  March,  and  April  1987 
C  Chests  have  been  combined  in  Nr: 
An  Nroff-Like  Text  Processor  for  MS- 
DOS.  This  reprint  is  available  with  a 
source-code  disk  for  $29.95.  Send  pre¬ 
paid  orders  to  M&T  Books,  501  Gal¬ 
veston  Dr.,  Redwood  City,  CA  94063 
or  call  (415)  366-3600,  extension  216. 
Please  add  $2.25  for  shipping  and 
handling  ($5  for  foreign  orders). 

Missing  Subroutines 

The subroutines newsample(), running_mean(), and deviation() were
referenced in February but will be published in the May listings. The
ferr() subroutine was referenced in February and published in March
(page 48). The err() subroutine is just ferr() without the exit() call.

DDJ 

(Listings  begin  on  page  84.) 



Dr.  Dobb's  Journal,  April  1987 


COLUMNS 


STRUCTURED  PROGRAMMING 

People  in  Programming 


Programming  begins  with  peo¬ 
ple  and  their  problems:  a  good 
program  is  a  sound  solution  that  the 
user  likes.  The  initial  problem  in  pro¬ 
gramming  is  not  communicating 
with  the  machine  but  with  the  client. 
Discovering  what  the  client  wants 
and — more  important — what  the  cli¬ 
ent  needs,  helping  the  client  under¬ 
stand  trade-offs  and  alternatives,  and 
in  general  learning  enough  about  the 
client  and  his  or  her  relationship 
with  the  problem  so  that  the  pro¬ 
gram  is  born  with  the  greatest 
chance  of  success — all  this  requires 
techniques  that  do  not  fold  neatly 
into  algorithms. 

Programmers  sometimes  come  to 
grief  not  because  of  their  lack  of  tech¬ 
nical  skills  with  the  hardware  and 
the  software  but  because  of  misun¬ 
derstandings  and  imperfect  compro¬ 
mises  with  the  people  involved. 

You  might  be  interested  in  two 
books  that  seem  particularly  interest¬ 
ing  and  helpful  in  extending  skills  in 
the  people  direction.  The  first  is  Get¬ 
ting  to  Yes,  by  Roger  Fisher  and  Wil¬ 
liam  Ury  (Boston:  Houghton  Mifflin, 
1981).  This  is  the  only  book  on  negoti¬ 
ation  I  have  seen  that  describes  a 
method  instead  of  offering  only  a 
potpourri  of  unrelated  tactics.  Not 
only  does  it  provide  a  method  but  it 
also  gives  a  rationale  that  demon¬ 
strates  both  the  effectiveness  and  the 
legitimacy  of  the  techniques. 

Negotiation is a primary people
skill in programming. Who has not
encountered a client whose desires
exceed the budget or whose whims
overburden the hardware? The programmer
often must tactfully devise
and suggest practical alternatives to
wishful thinking and then negotiate
for their acceptance. The unprofessional
programmer's approach is either
to do slavishly whatever the client
asks, however inefficient or
dubious, or to implement the programmer's
own ideas, arrogantly ignoring
the wishes of the client. The
professional, on the other hand, recognizes
the importance of educating
the client and negotiating a sound solution.

by Michael Ham

© 1986 by Michael Ham. All rights
reserved.

Besides  negotiating  on  the  techni¬ 
cal  aspects,  programmers  (particular¬ 
ly  freelance  programmers,  whose 
every  job  involves  a  contract)  must  be 
competent  at  blunt  business  negotia¬ 
tion.  Getting  to  Yes  is  as  close  to  a  com¬ 
plete  manual  of  negotiation  as  you 
can  get,  with  application  in  every 
area  of  negotiating. 

The  key  to  successful  client  rela¬ 
tions  involves  more  than  a  sound 
technical  solution  in  the  context  of  a 
well-negotiated  agreement.  The  rela¬ 
tionship  is  woven  from  a  myriad  of 
daily  exchanges,  with  the  resulting 
fabric  called  cooperation.  The  Evolu¬ 
tion  of  Cooperation,  by  Robert  Axel¬ 
rod  (New  York:  Basic  Books,  1984) 
provides  a  computer  context  for  this 
elemental  mode  of  human  exchange. 

The book describes two tournaments
in which programs were developed
to play out strategies of cooperation
(or noncooperation) in a
prisoner's dilemma situation, in
which mutual betrayal gains nothing,
a lone betrayal gains a significant
advantage over mutual cooperation,
and being betrayed gains little or
nothing. The explorations are intriguing.
You cannot push the social
analogies of computer strategies too far,
but playing with the ideas is at least
stimulating, especially since the strategy
that won the two tournaments
has a long and honorable history.

Number  Input 

In  my  inaugural  column  (July  1986),  I 
proposed  a  number-input  word  to 
collect  numbers  calculator-style,  dis¬ 
playing  appropriate  punctuation,  tol¬ 
erating  normal  user  errors,  and  giv¬ 
ing  the  programmer  control  over  the 
number  of  digits  that  can  be  entered 
before  and  after  the  decimal  point. 

Other  languages  (notably  C)  come 
provided  with  a  toolbox  already 
stocked  with  useful  routines  such  as 
number  input.  Forth  instead  offers  to 
its  programmers  the  elements  from 
which  they  can  build  tools.  Over 
time,  Forth  programmers  accumu¬ 
late  a  collection  of  handmade  tools, 
often  beautifully  worked  to  fit  pre¬ 
cisely  the  special  nature  of  their  par¬ 
ticular  applications.  Moreover,  be¬ 
cause  he  or  she  built  the  tool,  the 
programmer  understands  exactly 
not  only  what  it  does  but  also  how  it 
does  it — the  limitations  and  the 
strengths — and  can  readily  modify 
the  tool  to  fit  any  mutation  of  the 
original  situation. 

The  drawback  to  Forth’s  approach 
is  that  building  and  polishing  the 
tools  take  time.  Sometimes  it  would 
be  nice  to  take  a  close-enough  solu¬ 
tion  off  the  shelf  instead  of  fashion¬ 
ing  the  perfect  fit  to  the  specific  prob¬ 
lem.  One  source  of  such  solutions  is 
the  set  of  modules  written  in  other 
languages — C  or  FORTRAN,  for  exam¬ 
ple — that  have  a  store  of  existing 
tools.  Laboratory  Microsystems  has 
recently  published  a  Forth,  which  it 
calls  UR/Forth,  that  can  be  linked  to 
such  alien  modules. 

Another  approach  is  to  augment 
your  collection  of  software  tools  by 
reading  (and  adapting)  published 
tools. Here I discuss my current solution
to the common problem of collecting
a number from the keyboard:
the word DIGITS (see Listing
One, page 118). This word leaves two
numbers  on  the  stack — the  number 
entered  (as  a  signed  double-precision 
number)  and  above  it  the  count  of  the 
digits  entered.  Decimal  fractions  are 
scaled  to  integers,  and  the  decimal 
display  is  merely  cosmetic — for  ex¬ 
ample,  the  number  displayed  as 
0.0023  is  in  fact  stored  as  a  double¬ 
precision  23. 

The  count  of  digits  distinguishes  an 
entered  0  from  the  0  that  results  from 
no  entry.  "No  entry"  results  from  the 
user  pressing  Return  when  the  entry 
field  is  blank.  I  designed  "no  entry”  to 
leave  a  double-precision  0  on  the 
stack  because  I  like  my  general-use 
tools  to  leave  the  same  number  of  ar¬ 
guments  under  all  conditions. 
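
The scaled representation and the digit count can be illustrated with a small sketch. This is Python rather than the Forth of Listing One, and the function name read_entry, its parameters, and the choice to count every digit keystroke (including leading zeros) are assumptions made for illustration; they are not taken from the listing.

```python
def read_entry(entry, places):
    """Return (scaled_value, digit_count) for a typed entry.

    entry  -- the characters typed, e.g. "0.0023", or "" for no entry
    places -- number of fractional digits the field allows (hypothetical parameter)
    """
    sign = -1 if entry.startswith("-") else 1
    whole, _, frac = entry.lstrip("-").partition(".")
    if len(frac) > places:
        raise ValueError("more fractional digits than the field allows")
    digit_count = len(whole) + len(frac)  # assumption: count digit keystrokes only
    # Pad the fraction with zeros so the value is always scaled by 10**places.
    scaled = int((whole or "0") + frac.ljust(places, "0"))
    return sign * scaled, digit_count
```

Under these assumptions, an entered 0 yields a count of 1 and an empty entry yields a count of 0, although both leave a value of 0.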

The  count  of  digits  has  some  other 
uses.  If  the  count  is  less  than  5,  for 
example,  the  entered  number  is  less 
than  10,000  and  thus  the  top  cell  of 
the  double-precision  number  will  be 
0,  which  can  be  dropped  to  leave  a 
single-precision  version  of  the  entry. 
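
As a rough check of that claim, the following Python sketch splits a double-precision value into its two cells. It assumes 16-bit cells and a non-negative entry; the names split_double, CELL_BITS, and CELL_MASK are hypothetical.

```python
CELL_BITS = 16                      # assumption: 16-bit Forth cells
CELL_MASK = (1 << CELL_BITS) - 1

def split_double(n):
    """Return (low_cell, high_cell) of a 32-bit double-precision value."""
    n &= 0xFFFFFFFF                 # keep 32 bits, two's complement
    return n & CELL_MASK, n >> CELL_BITS
```

Every non-negative number below 10,000 fits in the low cell, so the high cell is 0 and can be dropped. A negative entry would have a high cell of all ones instead, so the shortcut applies to the magnitude only.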

Most  of  the  complexities  in  the 
word  are  because  of  various  special 
cases.  Each  of  the  following  revealed 
a  bug  during  the  development  of 
DIGITS : 

•  pressing  a  clear-entry  key  (C  or  B  or 
the  space  bar)  when  the  entry  is  al¬ 
ready  clear 

•  starting  with  an  old  number  equal 
to  0 

•  having  fewer  nonzero  digits  in  the 
number  than  the  number  of  decimal 
places  entered 

When you start with an old number
of 0, the single 0 digit must not be
counted as you begin to enter numbers.
If it is counted, you will be able
to enter one digit fewer than
you should because the keystroke
count routine will be initialized to
count what is in effect a leading 0: the
0 that was the old number.

The  number  of  nonzero  digits  in 
the  number  can  be  less  than  the  num¬ 
ber  of  decimal  places  entered  (for  ex¬ 
ample,  0.0023  has  four  decimal  places 
entered  but  is  represented  as  the  two- 
digit  number  23).  The  difference  be¬ 
tween  the  number  of  digits  in  the 
number and the number of digits to
the right of the decimal can thus be
negative.  It  is  for  this  reason  that  you 
see  0  MAX  in  computing  the  number 
of  commas. 

Another  minor  complexity  results 
from  the  incomplete  complement  of 
double-precision  operators  provided 
by  most  Forths.  My  January  column 
suggested  a  naming  scheme  for  arith¬ 
metic  operators.  In  terms  of  those 
names,  this  application  could  use  the 
two  operators  M*D  (a  double  and  a 
single  factor,  single  on  top,  giving  a 
double product) and M/D (a double
dividend,  a  single  divisor,  and  a  dou¬ 
ble  quotient).  The  pair  D*  (two  dou¬ 
bles  as  factors  with  a  double  product) 
and  D/  (a  double  dividend  and  a  dou¬ 
ble  divisor  giving  a  double  quotient) 
would  serve  equally  well. 

With  M*D  (or  D*)  the  routine  could 
accumulate  the  number  as  a  double: 
each  time  a  digit  is  entered,  the  num¬ 
ber  so  far  entered  would  be  multi¬ 
plied  by  10  and  the  new  digit  added. 
With  M/D  (or  D/)  the  backspace 
would  be  easy  to  implement  by  divid¬ 
ing  the  number  so  far  accumulated 
by  10  to  strip  off  the  last  digit  entered. 

One  approach  is  to  write  defini¬ 
tions  for  these  operators.  I  use  an  al¬ 
ternate  route:  I  accumulate  the  num¬ 
ber  as  a  string  of  ASCII  characters  and 
use  CONVERT  whenever  I  want  the 
value  of  the  number. 
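
The two routes can be compared in a short Python sketch. It is only an analogy to the Forth: push_digit and drop_digit stand in for the M*D and M/D arithmetic, and the hypothetical DigitString class stands in for the character string that is converted on demand.

```python
def push_digit(value, digit):
    """Arithmetic route: multiply by 10 and add the new digit."""
    return value * 10 + digit

def drop_digit(value):
    """Arithmetic route: backspace by dividing by 10."""
    return value // 10

class DigitString:
    """String route: keep the typed characters, convert only when needed."""
    def __init__(self):
        self.chars = []

    def push(self, ch):
        self.chars.append(ch)

    def drop(self):
        if self.chars:              # backspacing over nothing is ignored
            self.chars.pop()

    def value(self):
        return int("".join(self.chars) or "0")
```

Both routes give the same value after the same keystrokes; the string route simply moves the arithmetic into the conversion step.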

I’ll  briefly  discuss  the  code,  but  let 
me  first  point  out  that  I  do  some  arith¬ 
metic  with  the  flags.  Deplorable  as 
the  practice  may  be,  I  find  it  irresist¬ 
ible.  The  important  thing  for  you  to 
know  is  that  these  are  83  Standard 
flags, in which true is shown by -1
(all  bits  on),  unlike  the  79  Standard 
true,  which  is  1.  Thus  I  use  the  (83 
Standard)  flag  to  decrement  or  (after 
negating  it)  to  increment  a  count.  The 
sign  must  be  reversed  if  you  are  using 
79  Standard  flags. 
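
A minimal Python sketch of the flag arithmetic follows. The helper names are hypothetical; the only facts used are the ones stated above, that an 83 Standard true is -1 and a 79 Standard true is 1.

```python
TRUE_83, TRUE_79, FALSE = -1, 1, 0

def flag83(condition):
    return TRUE_83 if condition else FALSE

def flag79(condition):
    return TRUE_79 if condition else FALSE

def decrement_if(count, condition):
    """Adding an 83 Standard flag decrements the count when it is true."""
    return count + flag83(condition)

def increment_if(count, condition):
    """Negating the 83 Standard flag first turns the add into an increment."""
    return count - flag83(condition)

def decrement_if_79(count, condition):
    """With 79 Standard flags the sign is reversed: subtract to decrement."""
    return count - flag79(condition)
```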

This  code  may  be  more  complex 
than  necessary:  simplicity  is  not  easi¬ 
ly  achieved.  The  code  does,  however, 
get  the  job  done,  and  the  response 
time  (on  an  IBM  PC)  is  totally  adequate. 
Share  with  me  any  simplifications 
you  discover. 

The  first  few  definitions  are  tiny 
tools.  You  will  note  that  the  words  to 
turn  the  cursor  on  and  off  are  vendor 
dependent,  and  you  should  check 
with  your  own  Forth  for  this  type  of 
control. Some Forths automatically
extinguish the cursor when KEY is executed;
this was written in Laboratory
Microsystems' PC/Forth, which
does not. I use -CUR to turn the cursor
off, +CUR to turn it back on.

The  function  of  the  phrase  8  EMIT 
is  also  vendor  dependent:  some 
Forths  execute  a  backspace,  some  dis¬ 
play  a  character.  So  you  may  have  to 
revise  the  definition  of  BACK  to  make 
it  work  as  intended:  to  backspace  the 
cursor  as  many  positions  as  specified 
by  the  number  on  the  stack. 

Some  definitions,  such  as  NEW  and 
OLD  and  Bs?  and  Cr? ,  are  nonce 
words  to  improve  the  readability  of 
the  code,  always  an  important  objec¬ 
tive.  As  usual,  I  prefer  short  defini¬ 
tions  based  on  normal  English  usage. 
In  my  eyes,  playfulness  is  more  an 
asset  than  a  detriment,  provided  that 
the  word’s  name  reflects  its  effect. 

Control  of  the  sound  generator  is 
also  vendor  dependent.  In  the  defini¬ 
tion  of  BELL,  the  stack  holds  numbers 
that  define  the  pitch  and  the  dura¬ 
tion.  I  prefer  a  short  beep.  The  vari¬ 
able  SOUND  provides  an  easy  on/off 
control  for  beeping. 

Because  PAD  was  occupied  with 
other  tasks,  I  created  a  separate  work 
area  for  the  number  being  entered. 
This  work  area,  #PAD,  will  contain 
the  string  of  ASCII  characters  that  rep¬ 
resent  the  number. 

#VAR  is  an  array  in  which  I  name 
each  cell  using  a  constant.  The  con¬ 
stant,  returning  the  address  that  is  its 
value,  acts  as  a  variable  name.  Some 
naming conventions are apparent: a
name ending with ~ is a Boolean
variable; a name prefixed with * initializes
an array or variable by setting
it to zero. (I read * as "zap.")

The  words  that  collect  and  edit  the 
character  typed  have  some  points  of 
interest.  You  will  note  that  FIXUP  con¬ 
verts  B  or  blank  to  C.  C  clears  the  dis¬ 
play,  and  B  and  blank  become  syn¬ 
onyms.  Also,  L  and  O  are  converted  to 
1  and  0,  respectively;  the  user's  inten¬ 
tion  is  clear,  and  overpunctilious  pro¬ 
grams  quickly  make  enemies. 
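
The conversions described for FIXUP amount to a small character mapping. The Python sketch below is an illustration only: the function name fixup is borrowed from the Forth word, and folding lowercase input to uppercase first is an assumption not stated in the text.

```python
def fixup(ch):
    """Normalize one keystroke the way the text describes FIXUP."""
    ch = ch.upper()            # assumption: lowercase keys are accepted too
    if ch in ("B", " "):
        return "C"             # B and blank become synonyms for clear-entry
    if ch == "L":
        return "1"             # a typed L is taken to mean the digit 1
    if ch == "O":
        return "0"             # a typed O is taken to mean the digit 0
    return ch
```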

The  word  *?  checks  whether  the 
ASCII  value  is  in  the  range  for  a  deci¬ 
mal  digit.  BAD?  leaves  a  flag  (denoted 
by  the  suffix  ? )  that  is  true  if  the  entry 
was  bad.  Because  Forth  is  not  typed,  I 
can  use  as  a  flag  the  value  from  #DEC 
(the  number  of  places  to  the  right  of 
the  decimal).  A  0  from  #DEC  (that  is, 
there  are  no  places  to  the  right  of  the 
decimal) acts as a false flag; any other
value (that is, there are places to the
right  of  the  decimal)  acts  as  a  true 
flag.  The  decimal  point  is  thus  al¬ 
lowed  as  a  valid  character  only  if 
#DEC  is  greater  than  0. 

The  word  #,S  computes  the  num¬ 
ber  of  commas  required,  given  the 
number  of  whole-number  digits.  If 
the  number  of  whole-number  digits 
is  a  multiple  of  3,  the  computed  com¬ 
ma  count  is  corrected  by  decrement¬ 
ing  it  by  1  (using  the  83  Standard  true 
flag of -1). The phrase 0 MAX corrects
for those instances in which the
number  of  nonzero  digits  entered  mi¬ 
nus  the  number  of  places  to  the  right 
of  the  decimal  is  negative,  thus  pro¬ 
ducing  a  negative  quotient. 
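
The comma computation can be sketched in Python as follows. The name comma_count is hypothetical, and Python's floored division is assumed to match the Forth division in use for negative operands.

```python
def comma_count(whole_digits):
    """Number of commas needed for a given count of whole-number digits."""
    # 83 Standard style flag: -1 when the digit count is a multiple of 3.
    is_multiple = -1 if whole_digits % 3 == 0 else 0
    # The flag corrects the quotient; max(0, ...) plays the role of 0 MAX.
    return max(0, whole_digits // 3 + is_multiple)
```

For example, six whole-number digits (as in 100,000) need one comma, seven need two, and a zero or negative digit count is clamped to no commas.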

FULLCNT  adjusts  the  character 
count  of  the  number  of  digits  and 
commas  to  include  the  decimal  point 
and  minus  sign  (if  they  are  allowed). 
BOXSIZE  uses  this  word  to  calculate 
from the number of digits being collected
how large a box will be needed:
as many spaces as the maximum
number of printable characters, plus
one  space  at  either  end  of  the  box. 
Because  the  inverse  video  on  the  col¬ 
or  screen  clips  the  edges  of  some 
characters,  I  include  an  extra  space  at 
the  beginning  and  end  of  the  number 
display. 

BOXSIZE  has  one  tricky  aspect:  be¬ 
cause  I  print  the  leading  0  for  decimal 
fractions  less  than  1  (for  example, 
0.0023  instead  of  .0023),  I  have  to  add  1 
to  the  character  count  if  I  am  collect¬ 
ing  such  a  fraction.  I  save  on  the  re¬ 
turn  stack  the  flag  that  tells  me 
whether  this  is  that  sort  of  fraction. 
The  flag,  after  correcting  the  sign,  is 
added  to  the  count.  BOX  then  uses 
BOXSIZE  to  print  an  inverse  field  in 
which  the  number  will  be  entered. 

One  peculiarity  in  number  entry  is 
that  some  data  must  be  displayed  be¬ 
fore  any  number  has  been  entered  at 
all:  the  minus  sign  and  the  decimal 
point  could  be  the  first  two  key¬ 
strokes,  and  when  those  are  entered, 
there  is  still  no  number  to  edit.  The 
word -. displays these characters.
You will note, by the way, that the
minus key works as a toggle so that
the minus sign can be entered or altered
at any time during number
entry. 

PUT#  prepares  the  accumulated 
number  for  display,  leaving  on  the 
stack  the  address  and  count  of  the 
string,  which  is  then  displayed  in  DIS¬ 
PLAY#.  ?DO  is  found  in  most  Forths:  it 
executes  the  loop  only  if  the  two  ar¬ 
guments  are  unequal.  If  your  Forth 
lacks ?DO, you can substitute 2DUP =
IF 2DROP ELSE DO for it and follow
LOOP by THEN.

When  number  entry  is  complete, 
the  number  is  left  on  the  stack  with 
any  decimal  fractions  scaled  up  to  an 
integer  value.  It  is  the  programmer's 
responsibility  to  make  sure  that  the 
size  of  the  entries  allowed  (total  num¬ 
ber  of  digits,  number  of  decimal 
places)  will  result  in  a  number  that 
will  fit  within  the  bounds  of  a  double¬ 
precision  number  after  scaling. 

SCALE# takes care of the situation
in which the user did not enter all the
fractional digits allowed. If the entry
parameters allowed for thousandths
(three fractional digits), for example,
and the user pressed Return before
entering a decimal point (or any digits
after it), SCALE# will multiply the
entered  number  by  10  three  times  in 
order  to  force  the  three  fractional 
digits  (0s  in  this  case).  The  word  10D* 
fakes  a  double-precision  multiply  by 
10. 
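
The scaling step can be sketched in Python. The function name scale and its parameters are hypothetical, and an ordinary multiplication stands in for 10D*.

```python
def scale(value, places_entered, places_allowed):
    """Multiply by 10 once for each fractional digit the user did not enter."""
    for _ in range(places_allowed - places_entered):
        value *= 10
    return value
```

With three fractional digits allowed, an entry of 12 with no decimal point becomes 12000, and an entry of 12.5 (accumulated as 125) becomes 12500.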

The  count  adjustments  in  the  next 
group  take  care  of  decrements  re¬ 
quired  when  backspacing  (deleting 
digits). The words WHOLE-CK and DEC-CK
check the limits imposed by the
programmer  on  the  number  of  digits 
that  can  be  entered  as  whole  number 
digits  and  the  number  of  digits  that 
can  be  entered  as  fractional  digits.  At¬ 
tempts  to  transcend  the  program¬ 
mer-imposed  limit  are  rejected. 

SET-NEG  sets  the  negative  variable 
when  an  old  number  is  entered.  Note 
that  the  number  is  accumulated  as  a 
positive  number,  with  the  actual  sign 
indicated  by  the  variable  NEG~. 

DSET  sets  up  everything  to  start  the 
loop  and  leaves  on  the  stack  the  loop 
limits  and  the  number  of  digits  to  col¬ 
lect.  When  you  are  collecting  a  new 
number,  the  loop  limits  are  m+1 
(one  more  than  the  number  of  digits 
to  collect)  and  0,  but  if  the  routine 
starts with a number already entered,
then the limits are adjusted appropriately.
The upper limit is m+1
because the user must be able to enter
one more keystroke than the
number  of  digits  to  collect.  Typically 
that  one  additional  keystroke  will  be 
a  Return,  but  it  might  be  a  backspace. 
Any  other  final  keystroke  is  rejected. 

Because  the  topmost  word  DIGITS  is 
so  complex,  I  wanted  to  factor  out 
some  subroutines  to  improve  read¬ 
ability  and  comprehensibility  (and 
therefore  debugging).  But  then  I  had 
to  be  able  to  access  the  index  value 
from  a  word  outside  the  loop.  The 
simplest  solution  was  to  store  the  in¬ 
dex  value  into  a  variable  and  define 
the  word  “I”  to  fetch  the  variable’s 
value. This would have been a natural
place for a QUAN, but most Forths
don't have them.

You  will  note  that  DIGITS  is  a 
DO ... +LOOP structure. This structure
seemed the easiest way to move
back  (backspacing)  and  forth  (enter¬ 
ing  a  valid  character)  in  the  entry  or 
just  to  remain  in  place  (attempting  to 
enter  an  invalid  character):  by  hav¬ 
ing  the  index  increment  be  negative, 
positive,  or  0,  respectively. 
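
The idea of an index that steps by -1, +1, or 0 can be sketched in Python. This is a much reduced analogy to DIGITS: it handles only digits, backspace, and Return, and the names collect, DIGIT_KEYS, BACKSPACE, and RETURN are hypothetical.

```python
DIGIT_KEYS = set("0123456789")
BACKSPACE, RETURN = "\b", "\r"

def collect(keys, max_digits):
    """Process keystrokes; the index moves by -1, +1, or 0 on each key."""
    digits = []
    index = 0
    for key in keys:
        if key == RETURN:
            break
        if key == BACKSPACE and index > 0:
            digits.pop()
            step = -1               # move back over the last digit
        elif key in DIGIT_KEYS and index < max_digits:
            digits.append(key)
            step = 1                # move forward past the new digit
        else:
            step = 0                # invalid key: stay in place
        index += step
    return "".join(digits)
```

Keeping the three outcomes as a single signed step is what lets one loop serve entry, deletion, and rejection.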

The backspace routine, by the
way, has to keep track of whether
the  backspace  is  over  a  digit,  the  deci¬ 
mal  point,  or  the  minus  sign,  or  disal¬ 
lowed  because  no  valid  character  has 
yet  been  entered  to  backspace  over. 
The  variety  of  situations  tends  to 
complicate  the  definition.  BSP-ROU 
leaves  the  appropriate  increment  to 
the  loop  index.  When  this  is  the  nega¬ 
tive  of  the  current  index  value,  the 
loop  returns  to  its  starting  point. 

DIGITS  keeps  on  the  stack  the  num¬ 
ber  of  digits  to  be  entered  because 
that  number  is  periodically  refer¬ 
enced.  It  would  perhaps  have  been 
cleaner  to  park  that  number  inside 
one  of  the  pseudovariables  that  make 
up #VAR, but by the time the idea occurred
to me, the routine was already
working. Once the routine was
working,  I  was  disinclined  to  toy 
with  it. 

I  suggest  that,  as  an  exercise,  you 
modify  the  routine  so  that  it  stashes 
the  number  of  digits  to  be  entered 
into  an  additional  cell  in  #VAR, 
whence  it  is  fetched  when  needed. 
When  you  have  the  revised  routine 
working once more, you should have
a good understanding of this tool,
which  I  hope  proves  useful  to  you. 

Availability 

All  the  source  code  for  articles  in  this 
issue  (except  for  C  Chest)  is  available 
on  a  single  disk.  To  order,  send  $14.95 
to  Dr.  Dobb's  Journal,  501  Galveston 
Dr.,  Redwood  City,  CA  94063  or  call 
(415)  366-3600  ext.  216.  Please  specify 
the  issue  number  and  format  (MS-DOS, 
Macintosh,  Kaypro). 

DDJ 

(Listing  begins  on  page  118.) 



COLUMNS 


ARTIFICIAL  INTELLIGENCE 


Object-Oriented  Programming  in  AI 


This month I continue the
theme of object-oriented LISPs
by focusing on three object-oriented
extensions to Common LISP: ObjectLISP,
New Flavors, and CommonLoops.
Although it is unlikely that
any of these precise implementations
will become a standard, any such
standard will probably combine features
of all three.

ObjectLISP 

ObjectLISP  is  the  candidate  for  an  ob¬ 
ject-oriented  extension  to  Common 
LISP  that  was  offered  by  LISP  Ma¬ 
chines.  Although  it  does  not  now  look 
as  though  it  will  provide  a  substantial 
part  of  the  standard  currently  being 
defined,  it  is  a  relatively  easy  system 
to  understand  and  has  several  com¬ 
mendable  features.  One  of  the  dis¬ 
tinctive  features  of  the  ObjectLISP  ap¬ 
proach  is  eliminating  any  special 
syntax  for  message  sending  so  that 
object-oriented  methods  are  invoked 
with  essentially  the  same  syntax  as 
any  Common  LISP  function.  Another 
important  feature  is  that  ObjectLISP 
departs  from  most  other  object-ori¬ 
ented  systems  by  deliberately  mak¬ 
ing  the  relation  of  a  class  to  a  subclass 
the  same  as  the  relation  of  a  class  to 
an  instance.  What  this  means  on  the 
implementation  level  is  that  the  nest¬ 
ing  of  closures  is  used  for  both  spe¬ 
cializing  and  instantiating  classes. 
When  combined,  as  they  are  in  Ob¬ 
jectLISP,  these  two  features  result  in  a 
simple  and  streamlined  system  so 
that class variables, class functions,
instance variables, and instance functions
all exhibit the same basic behavior.
One of the byproducts of not differentiating
between an instance and
a class is that, during development,
you can use a class as a prototype instance
or an instance as a prototype
class.

by Ernest R. Tello


ObjectLISP also has the convenient
feature, often not present in object-oriented
systems, of allowing objects
to be created and modified
dynamically while
programs are running. Also, all inheritance
operates dynamically: an
inheritable change in the state of an
object's superclass is inherited
as soon as the
change occurs.

The basic ObjectLISP system is
based on five primitive functions:
make-obj, kindof, ask, have, and defobfun.
Creating an object can be as
simple as writing:

(setq  business  (make-obj)) 

Frequently,  though,  the  object  will 
be  a  specialized  version  of  another 
object  that  already  exists.  In  this  case, 
an  object  is  created  with  something 
such  as: 

(setq  wholesaler  (kindof  business)) 

The  ask  function  is  used  to  evaluate  a 
Common  LISP  expression  in  a  particu¬ 
lar  object's  environment.  The  have 
function  creates  variable  bindings 
that  are  local  to  objects.  These  func¬ 
tions  are  usually  used  together  in  Ob¬ 
jectLISP  to  declare  class  variables  and 
instance  variables.  This  could  be 
done  for  the  business  object  just  cre¬ 
ated  like  this: 

(ask business (have 'type-of-activity 'economic))

This  can  then  be  checked  to  make 
sure  it  has  been  accepted  by  the  sys¬ 
tem.  If  you  do  so,  the  terminal  screen 
might  read: 


(ask  business  type-of-activity) 

economic 

The  defobfun  function  is  used  to  de¬ 
fine  Common  LISP  functions  that  are 
bound  or  assigned  only  to  a  particu¬ 
lar  object  or  class  of  objects.  Continu¬ 
ing  with  the  example  I  have  been  us¬ 
ing,  I  might  say: 

(defobfun (calc-net-gain business) (gross-sales costs)
  (setq net-gain (- gross-sales costs)))

The  ObjectLISP  syntax  for  calling 
such  a  function  can  be  illustrated  by: 

(ask  business  (calc-net-gain  500  300)) 

Another  important  capability  of 
ObjectLISP  is  the  ability  to  create  shad¬ 
owed  functions.  This  is  a  way  in 
which  the  inherited  functions  can  be 
used  to  create  more  specific  versions 
of  the  function  for  more  specialized 
objects.  In  many  cases,  the  way  this 
can  be  done  efficiently  is  by  adding 
only  the  more  specialized  parts  and 
then  making  a  call  to  the  inherited 
function. 

Although  there  is  no  real  differ¬ 
ence  between  an  instance  and  a  sub¬ 
class  in  ObjectLISP,  in  practice  it  is 
convenient  to  have  a  way  of  using  an 
object  as  a  template  for  creating 
other instances of it. One way of doing
this is to define an exist function
for that object as shown in Example
1. With exist functions of
this kind, it becomes much easier in
ObjectLISP  to  define  instances  of  ob¬ 
jects.  So,  for  example,  you  could  de¬ 
fine  several  business  instances  as 
follows: 

(setq  unicomp  (kindof  business)) 

(ask  unicomp  (exist)) 

(setq  softrend  (kindof  business)) 

(ask softrend (exist 'ownership-type 'sole-proprietor))




In  ObjectLISP,  an  object  is  really  a 
list  of  frames.  The  first  member  of 
the  list  is  its  innermost  frame,  the 
original  bindings  supplied  to  it  when 
it  is  created.  The  remaining  elements 
of  the  list  are  all  the  elements  that  it 
inherits,  appearing  in  the  order  in 
which  it  inherits  them. 

Multiple  inheritance  in  ObjectLISP 
is  accomplished  by  supplying  multi¬ 
ple  arguments  to  the  kindof  function. 
So,  for  example,  if  you  have  also  de¬ 
fined  an  object  called  adversary, 
then  you  could  define  a  class  called 
competitor  using  multiple  inheri¬ 
tance  as  follows: 

(setq  competitor  (kindof  business 

adversary)) 
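
The frame model described above can be approximated in a few lines of Python. This is a loose sketch, not ObjectLISP: the Obj class is hypothetical, ask here only looks up a single name rather than evaluating an arbitrary expression in the object's environment, and the attribute "stance" is invented for the example.

```python
class Obj:
    """An object as its own frame plus the frames of the objects it inherits from."""
    def __init__(self, *parents):
        self.frame = {}             # innermost frame: this object's own bindings
        self.parents = parents

    def have(self, name, value):
        self.frame[name] = value

    def frames(self):
        """Own frame first, then inherited frames in order of inheritance."""
        seen = []
        def walk(obj):
            if not any(obj is s for s in seen):
                seen.append(obj)
                for parent in obj.parents:
                    walk(parent)
        walk(self)
        return [obj.frame for obj in seen]

    def ask(self, name):
        for frame in self.frames():  # looked up at call time, so inheritance is dynamic
            if name in frame:
                return frame[name]
        raise NameError(name)

def make_obj():
    return Obj()

def kindof(*parents):
    return Obj(*parents)
```

Because the frames are searched on every lookup, a binding added to business after wholesaler has been created is still visible through wholesaler, and a binding made directly in wholesaler shadows the inherited one.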

At  this  point,  ObjectLISP  does  not  ap¬ 
pear  to  be  one  of  the  winning  con¬ 
tenders  for  the  standard,  partly  be¬ 
cause  it  uses  dynamic  binding  but  also 
because  it  is  a  new  approach  that  still 
has  not  been  tried  and  proven  for  any 
appreciable  time.  Because  there  are 
still  some  difficult  and  controversial 
issues  in  object-oriented  LISP,  I  think 
that  some  of  the  aspects  of  ObjectLISP, 
particularly  the  placing  of  classes  and 
instances  on  a  common  footing  and 
the  ability  to  modify  objects  on  the  fly, 
deserve  some  serious  consideration. 

Old and New Flavors

As  I  said  in  my  last  column,  the  origi¬ 
nal  Symbolics  Flavors  system  was  the 
first commercial object-oriented extension
to LISP to gain relatively widespread
popularity and to prove the
extreme value of object-oriented LISP
in  practice.  The  latest  software  re¬ 
lease  for  the  Symbolics  3600  series 
machines,  Genera  Release  7.0,  now 
includes  New  Flavors,  the  candidate 
from  Symbolics  for  the  object-orient¬ 
ed  standard  for  Common  LISP.  Sym¬ 
bolics  Flavors  grew  out  of  the  Flavors 
system  developed  by  the  MIT  LISP  Ma¬ 
chine  group  back  in  1979.  By  1981, 
the  Symbolics  software  group  had 
developed  a  more  efficient  Flavors 
system,  an  object-oriented  system 
that  has  come  to  be  a  favored  pro¬ 
gramming  approach  both  for  much 
of  the  in-house  systems  program¬ 
ming  at  Symbolics  and  for  numerous 
AI  projects  carried  out  by  users. 

New  Flavors  represents  an  attempt 
to  overcome  some  of  the  weaknesses 
encountered  by  users  of  the  Symbo¬ 
lics  Flavors  system  during  the  five 
years  of  its  existence.  David  A.  Moon 
of  Symbolics  recently  outlined  the 
main  goals  of  New  Flavors  as  follows: 

•  to  encourage  greater  program 
modularity 

•  to  facilitate  writing  large,  complex
programs 

•  to  provide  favorable  run-time 
performance 

•  to  maintain  downward  compatibili¬ 
ty  with  old  Flavors 

Like  the  original  Flavors,  New  Fla¬ 
vors  uses  the  defflavor,  defmethod, 
and  make-instance  functions  for  cre¬ 
ating  objects  and  procedures.  The 
way  the  example  introduced  in  the 
discussion of ObjectLISP would be
coded in New Flavors is shown in Example
2, below.

One  of  the  central  ideas  in  New  Fla¬ 
vors  is  the  notion  of  generic  func¬ 
tions.  The  main  point  of  this  is  to  al¬ 
low  distributed  definition  of 
functions  as  well  as  multiple  inheri¬ 
tance  of  properties.  This  means  both 
having  the  same  name  for  a  method 
that  varies  depending  upon  the  class 
to  which  it  is  bound  and  being  able  to 
use  parts  of  code  from  various  differ¬ 
ent  objects.  Toward  this  end,  the  def- 
generic  function  has  been  provided. 

In  New  Flavors,  generic  functions 
have  the  same  syntax  as  do  nonge¬ 
neric  functions.  This  has  the  advan¬ 
tage  that  any  function  that  is  a  caller 
of  another  function  does  not  need  to 
know  which  to  specify.  Other  advan¬ 
tages  are  that  all  debugging  and  utili¬ 
ty  functions  designed  to  work  with 
ordinary  Common  LISP  functions  can 
also  work  with  generics. 
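
The notion of a generic function that is called exactly like an ordinary one can be illustrated with Python's standard functools.singledispatch. This is an analogy only, not New Flavors; the classes and the describe function are hypothetical.

```python
from functools import singledispatch

class Business:
    pass

class Wholesaler(Business):
    pass

@singledispatch
def describe(obj):
    """Default definition, used when no more specific one is registered."""
    return "some object"

@describe.register(Business)
def _(obj):
    return "a business"

@describe.register(Wholesaler)
def _(obj):
    return "a wholesaler"

def describe_all(items):
    """A caller does not need to know that describe is generic."""
    return [describe(item) for item in items]
```

The definitions are distributed across the classes they apply to, yet describe_all calls describe with ordinary function-call syntax.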

New  Flavors  has  adopted  a  clear  set 
of  rules  for  ordering  flavor  object 
components.  Components  are  all 
parts  of  an  object,  both  those  declared 
directly  and  those  that  are  inherited. 
The  three  rules  that  are  followed  are: 

•  The  flavor's  own  binding  always
precedes  those  of  its  components. 

•  The  local  order  of  components  of  fla¬ 
vors  always  adopts  the  order  stipulat¬ 
ed  in  the  defflavor  declarations. 

•  All  duplicate  flavors  are  automatical¬ 
ly  removed  from  the  sequence. 
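
Python's own class linearization happens to obey the same three constraints, so it can serve as a quick illustration. This is not the New Flavors algorithm, only an ordering that satisfies the rules listed above; the class names are hypothetical.

```python
class Entity:
    pass

class Business(Entity):
    pass

class Adversary(Entity):
    pass

class Competitor(Business, Adversary):
    pass

def component_names(cls):
    """List a class and its components in Python's resolution order."""
    return [c.__name__ for c in cls.__mro__ if c is not object]
```

Competitor comes first, Business precedes Adversary as declared, and the shared Entity appears once, after both of the classes that include it.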

Method  Combination 

As  I've  mentioned  before,  Flavors 


(defobfun (exist business) (&rest args &key
                            (name 'no-name-yet)
                            (location 'no-location-yet)
                            (industry 'no-industry-yet)
                            (business-type 'no-bus-type-yet)
                            (size 'no-size-yet)
                            (year-founded 'no-year-founded-yet)
                            (ownership-type 'no-ownership-type-yet)
                            (market-share 'no-market-share-yet)
                            &allow-other-keys)
  (have 'name name
        'location location
        'industry industry
        'business-type business-type
        'size size
        'year-founded year-founded
        'ownership-type ownership-type
        (apply 'shadowed-exist args)
  ))

Example 1: Defining an exist function in ObjectLISP


(defflavor business
    (name location industry business-type size
     year-founded ownership-type market-share)
    ()
  :readable-instance-variables
  :writable-instance-variables
  :inittable-instance-variables)

(setq unicomp
      (make-instance 'business
                     :name 'unicomp
                     :location 'santa-clara
                     :industry 'computer
                     :business-type 'software
                     :size 18
                     :year-founded 1976
                     :ownership-type 'private
                     :market-share 11.3))

(defmethod (calc-net-gain business) (gross-sales costs)
  (- gross-sales costs))

Example 2: Creating objects and procedures with New Flavors


Dr.  Dobb's  Journal,  April  1987 


was  the  first  object-oriented  system 
to  provide  the  form  of  abstraction 
that  allowed  different  parts  of  the 
code  for  functions  to  be  mixed  in 
modular  fashion  just  as  complete 
methods  and  variables  may  be  inher¬ 
ited.  Various  built-in  combination 
methods  are  provided  for  this  pur¬ 
pose.  So,  for  example,  programmers 
can  choose  between  such  method- 
combination  modes  as: 

•  calling  only  the  most  specific  meth¬ 
od  available  in  the  hierarchy 

•  calling  all  the  methods  in  order  of 
specificity,  either  upward  or 
downward 

•  trying  each  method  in  turn,  starting 
with  the  most  specialized,  until  one  is 
found  that  does  not  return  nil 

There  are  also  several  other  built-in 
combination-method  modes. 
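
The three modes listed above can be imitated in Python to make the differences concrete. This is only a sketch of the idea, not Flavors syntax: the helper names (applicable_methods, call_most_specific, call_all, call_until_non_nil) and the describe and risk methods are invented for the illustration, and Python's None plays the role of nil.

```python
def applicable_methods(cls, name):
    """Methods called `name` in the hierarchy, most specific class first."""
    return [vars(k)[name] for k in cls.__mro__ if name in vars(k)]


def call_most_specific(obj, name, *args):
    """Mode 1: run only the most specific method."""
    return applicable_methods(type(obj), name)[0](obj, *args)


def call_all(obj, name, *args, most_specific_first=True):
    """Mode 2: run every method, in either direction of specificity."""
    methods = applicable_methods(type(obj), name)
    if not most_specific_first:
        methods = methods[::-1]
    return [m(obj, *args) for m in methods]


def call_until_non_nil(obj, name, *args):
    """Mode 3: try each method in turn until one returns a non-nil value."""
    for m in applicable_methods(type(obj), name):
        result = m(obj, *args)
        if result is not None:
            return result
    return None


class Business:
    def describe(self):
        return "a business"

    def risk(self):
        return "low"


class Competitor(Business):
    def describe(self):
        return "a competitor"

    def risk(self):
        return None      # no opinion; defer to a less specific method


c = Competitor()
print(call_most_specific(c, "describe"))
print(call_all(c, "describe"))
print(call_until_non_nil(c, "risk"))
```

In Flavors the mode is chosen once, when the generic function is declared, rather than by the caller as it is here.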

In  addition  to  defining  new  meth¬ 
ods  and  selecting  built-in  combina¬ 
tion-method  types,  programmers 
can  also  define  new  combination 
methods  using  the  define-method- 
combination  and  define-simple- 
method-combination  functions. 

Development  Tools 

New  Flavors  provides  various  facili¬ 
ties  for  inspecting  the  current  state  of 
an  object-oriented  system  under  de¬ 
velopment.  You  can  either  invoke 
them  by  entering  commands  or  by 
pointing  the  mouse  at  the  names  of 
various  items  on  the  display.  So,  for 
example,  you  can  view  either  the 
subclasses  or  superclasses  of  a  cur¬ 


rent  flavor,  and  you  can  view  all  the 
instances  of  a  given  flavor  that  are 
currently  alive.  These  are  some  of 
the  really  useful  facilities  that  an  ob¬ 
ject-oriented  system  needs  if  it  is  to  be 
used  for  serious  AI  applications.  De¬ 
spite  its  lush  user  environment,  some 
of  these  features  are  absent  even  in 
Smalltalk. 

CommonLoops 

The  object-oriented  extension  that 
has  been  developed  at  Xerox  PARC 
has  several  definite  goals  and  key 
concepts.  It  is  not  surprising  that,  of 
all  the  systems  discussed  here,  it  has 
the  most  in  common  with  Smalltalk 
because  the  amount  of  expertise  pre¬ 
sent  at  Xerox  with  this  type  of  object- 
oriented  system  is  still  considerable. 
But  CommonLoops  also  represents  a 
departure  from  Smalltalk  in  that  it  of¬ 
fers  a  clear  philosophical  vision  of 
how  object-oriented  programming 
can  be  fitted  most  naturally  into  the 
Common  LISP  dialect  and  in  a  way 
that  preserves  the  greatest  amount  of 
generality.  It  is  therefore  intended  to 
provide  a  basis  for  as  many  as  possi¬ 
ble  of  the  serious  approaches  to  ob¬ 
ject-oriented  AI.  One  of  the  stated 
goals  of  CommonLoops  was  to  pro¬ 
vide  a  general  kernel,  written  in 
Common  LISP,  from  which  any  of  the 
major  object-oriented  systems  in  use 
today,  such  as  Flavors,  Smalltalk-80, 
and  Loops,  could  all  be  implemented. 
Like Smalltalk, therefore, CommonLoops makes use of the metaclass
protocol to implement its class hierarchy system.

The  Kernel 

The direction taken by CommonLoops is to use an option to the
defstruct construct in Common LISP to define classes. CommonLoops
uses the :class option to defstruct to specify the metaclass, which
determines how the object-oriented system being implemented will
behave. The standard metaclasses provided in CommonLoops are
built-in-class, structure-class, list-structure-class, and
vector-structure-class.

classes
class-of
defmethod
get-dynamic-slot
get-function
get-slot
mlet
ref
remove-dynamic-slot
remove-method
run-super
specialize
with

Table 1: Main CommonLoops primitives

As  does  Smalltalk,  CommonLoops 
has  various  built-in  classes.  This 
means  that  even  before  a  program¬ 
mer  defines  any  classes,  there  are  al¬ 
ready  various  ones  present  that  de¬ 
scribe  the  behavior  of  the  system. 
Example  3,  below,  shows  the  hierar¬ 
chy  of  CommonLoops'  built-in  class¬ 
es,  shown  as  they  might  appear  in  a 
"class  browser,”  with  the  type  of 
class  represented  in  parentheses  to 
the  right. 

Through  this  hierarchy  of  meta¬ 
classes,  CommonLoops  controls  the 
way  that  the  options  to  defstruct  de¬ 
termine  the  form  of  object-oriented 
system  that  will  be  present.  The 
structure-class,  for  example,  is  the  de¬ 
fault  class  that  defstruct  uses  when 
no  :class  option  is  specified.  The  class¬ 
es  that  are  then  created  default  to  a 
structure  that  acts  like  the  ordinary 
defstruct in Common LISP. Abstract-class is used in the :class option
for classes that will not themselves be instantiated but act as
placeholders in the hierarchy; for example:

(defstruct (business (:class list-structure-class)))

Table  1,  left,  gives  a  list  of  the  main 
CommonLoops  primitives. 

Multiple  Inheritance 

Specifying inheritance from multiple classes is accomplished in
CommonLoops through an extension of the :include option of defstruct
that allows it to accept a list of class names. Following the same
example that I have been using to illustrate multiple inheritance, the
competitor class, which inherits from the two parent classes business
and adversary, would be implemented in the following way in
CommonLoops:

(defstruct (competitor (:include business adversary)))


t                            (abstract-class)
object                       (class)
essential-class              "
abstract-class               "
built-in-class               "
class                        "
structure-class              "
list-structure-class
vector-structure-class

number
integer
fixnum
sequence
list
cons

(abstract-class)
(built-in-class)
(abstract-class)
(built-in-class)

Example 3: The hierarchy of CommonLoops' built-in




Multimethods 

One  of  the  most  important  innova¬ 
tions  in  CommonLoops  is  that  of  mul¬ 
timethods — procedures  that  are,  in 
effect,  messages  sent  to  any  number 
of  objects  of  different  types.  So  in¬ 
stead  of  defining  the  method  draw-at: 

(defmethod (rectangle :draw-at) (upper-left-x upper-left-y
                                 lower-right-x lower-right-y)
  (draw-box upper-left-x upper-left-y lower-right-x lower-right-y))

as you would in Flavors, in CommonLoops you would write it:

(defmeth draw-at
    ((r rectangle) (upper-left-x integer) (upper-left-y integer)
     (lower-right-x integer) (lower-right-y integer))
  (draw-box upper-left-x upper-left-y lower-right-x lower-right-y))

In  this  definition,  the  first  argument 
to  draw-at  is  r,  which  is  declared  as  of 
the  class  rectangle,  and  the  remaining 
are  screen  coordinates,  all  declared 
as  of  the  class  integer. 
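
To show what selecting a method on the classes of all the arguments involves, here is a small Python sketch. It is an analogy only, not the CommonLoops mechanism: the registry, the defmeth decorator, the call function, and the Shape class are hypothetical, and the rule that earlier arguments weigh more when choosing the most specific method is an assumption made for the example.

```python
_methods = {}   # generic function name -> list of (argument classes, function)


def defmeth(name, *types):
    """Register a method for `name`, specialized on one class per argument."""
    def register(fn):
        _methods.setdefault(name, []).append((types, fn))
        return fn
    return register


def call(name, *args):
    """Pick the applicable method whose declared classes are most specific."""
    candidates = [
        (types, fn)
        for types, fn in _methods.get(name, [])
        if len(types) == len(args)
        and all(isinstance(a, t) for a, t in zip(args, types))
    ]
    if not candidates:
        raise TypeError("no applicable method for " + name)

    def distance(types):
        # How far each declared class sits above the argument's own class;
        # tuples compare left to right, so earlier arguments weigh more.
        return tuple(type(a).__mro__.index(t) for a, t in zip(args, types))

    types, fn = min(candidates, key=lambda c: distance(c[0]))
    return fn(*args)


class Shape:
    pass


class Rectangle(Shape):
    pass


@defmeth("draw-at", Rectangle, int, int, int, int)
def draw_rectangle(r, upper_left_x, upper_left_y, lower_right_x, lower_right_y):
    return ("box", upper_left_x, upper_left_y, lower_right_x, lower_right_y)


@defmeth("draw-at", Shape, int, int, int, int)
def draw_shape(s, upper_left_x, upper_left_y, lower_right_x, lower_right_y):
    return ("outline", upper_left_x, upper_left_y, lower_right_x, lower_right_y)


print(call("draw-at", Rectangle(), 0, 0, 10, 5))
print(call("draw-at", Shape(), 0, 0, 10, 5))
```

The point of the sketch is that no argument is privileged as "the receiver": every argument's class takes part in the choice.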

The implementation of CommonLoops is itself fully object-oriented in
the  sense  that  all  data  structures  used 
to  implement  the  system  are  objects 
that  are  instances  of  a  class.  So,  for  ex¬ 
ample,  when  a  new  method  is  de¬ 
fined,  three  new  objects  are  created: 
the  method  object,  the  discriminator, 
and  the  discriminating  function.  The 
method  object  is  the  object  that  de¬ 
scribes  the  method  to  be  created,  and 
the  discriminating  function  is  an  ob¬ 
ject  that  selects  the  method  that  will 
be  called.  The  discriminator  and  its 
own  methods  use  the  information  in 
the  method  object  and  its  own  de¬ 
scription  of  a  generic  function  to  com¬ 
pile  the  code  for  the  method.  Generic 
function  is  used  here  in  the  same  sense 
as  in  the  discussion  of  New  Flavors. 

Method  Combination 

Method  combination  is  accomplished 
in  CommonLoops  using  the  run-super 
mechanism.  It  closely  resembles  the 
method  combination  approach  used 
in  Smalltalk,  LOOPS,  and  ObjectLISP. 


(LOOPS  is  an  AI  development  tool  used 
at  Xerox  PARC  and  will  be  described  in 
detail  in  a  subsequent  column.)  The 
run-super  mechanism  is  implement¬ 
ed  using  the  method  and  discrimina¬ 
tor  object  described  earlier.  Because 
of  the  use  of  metaobjects  in  the  imple¬ 
mentation  of  method  combination, 
many  interesting  research  possibili¬ 
ties  for  AI  languages  are  opened  up. 
For  example,  through  defining  spe¬ 
cialized  method  and  discriminator  ob¬ 
jects,  a  means  is  available  for  integrat¬ 
ing  logic  programming  into 
CommonLoops.  A  prototype  for  such 
a  system,  called  CommonLog,  has 
been  implemented  at  Xerox  PARC.  It  is 
hoped  that  this  will  provide  the  basis 
for  a  more  advanced  AI  tool  called 
Vulcan.  (This  name  was  apparently 
not  chosen  by  accident.  One  of  the 
times  I  called  the  Intelligent  Systems 
Lab  at  Xerox  PARC,  the  entire  staff  was 
at  the  movie  theatre  to  see  the  debut 
of  Star  Trek  IV). 

Future  Directions  in 
Object-Oriented  LISP 

One of the issues still to be settled by
the  object-oriented  LISP  community 
and  the  object-oriented  program¬ 
ming  community  in  general  is  that  of 
the  structural  vs.  the  procedural 
view  of  objects.  This  is  the  issue  of 
whether  the  specification  or  inter¬ 
face  description  of  classes  should  be 
purely  procedural  or  split  into  proce¬ 
dural  and  structural  parts.  If  purely 
procedural,  an  object  is  defined  ex¬ 
clusively  by  its  message  protocols.  As 
you  have  seen,  CommonLoops  is  of 
the  second  type  because  method 
lookup  is  achieved  by  a  combination 
of  object  structures  and  discrimina¬ 
tion  procedures.  If  things  continue  to 
proceed  as  they  have  been,  it  is  antici¬ 
pated  that  this  approach  will  come  to 
be  adopted  as  the  standard  one. 

On  the  whole,  I  don't  think  it  is  nec¬ 
essarily  a  problem  of  overwhelming 
difficulty  to  determine  what  would 
be  the  best  kind  of  standard  for  the 
present  as  an  object-oriented  exten¬ 
sion  to  Common  LISP.  Because  it  is  a 
case  of  needing  some  standard  now 
but  not  having  enough  experience 
with  this  area  to  fully  define  the  pos¬ 
sibilities,  the  only  kind  of  standard 
that  can  at  all  serve  is  a  partial  stand¬ 
ard  based  on  those  features  of  the 
technology  that  have  shown  them¬ 
selves  to  be  the  most  useful  and  reli¬ 


able,  while  leaving  the  options  as 
open  as  possible.  To  be  more  specific, 
I  think  that  a  new  design  for  the 
standard  needs  to  be  constructed 
from the best features of CommonLoops, New Flavors, and ObjectLISP
that  address  as  many  of  the  key  issues 
I  have  been  discussing  as  is  feasible. 

The  main  things  from  ObjectLISP 
that  I  feel  should  not  be  lost  are  the 
ability  to  modify  objects  and  their 
variables  on  the  fly  and  to  keep  in¬ 
stances  and  classes  on  an  equal  foot¬ 
ing.  One  thing  I  would  particularly 
like  to  see  is  a  standard  that  did  not 
prevent  the  option  of  having  instanti¬ 
ated  objects  that  were  not  yet  formal¬ 
ly  members  of  any  class  but  that  at  a 
later  time  could  become  "associated” 
with  various  classes  and  gain  from 
what  could  be  inherited  from  them. 
Another  important  issue  from  the  AI 
perspective  is  allowing  for  the  coexis¬ 
tence  of  multiple  types  of  hierarchy 
in  the  same  binding  environment, 
where  the  same  object  can  be  a  mem¬ 
ber  of  each  of  the  different  hierar¬ 
chies  simultaneously  if  this  is  so  de¬ 
sired.  As  explained  earlier,  this 
feature  appears  essential  for  using 
objects  to  develop  systems  with  deep 
models  capable  of  reasoning  about 
objects  in  real-world  settings  in  terms 
of  function,  location,  and  generic 
significance. 

In  my  next  column  I  will  continue 
the  discussion  of  object-oriented  pro¬ 
gramming  in  AI  with  a  review  of  PC 
Scheme — an  object-oriented  pro¬ 
gramming  system  for  IBM  PCs  and 
compatibles. 

Bibliography 

Bobrow, D., et al. "CommonLoops: Merging Common LISP and
Object-Oriented Programming." OOPSLA '86 Proceedings.

Dresher, G. "ObjectLISP." LMI (1985).

Moon, D. "Object-Oriented Programming with Flavors." OOPSLA '86
Proceedings.

Ressler, J. "Introduction to ObjectLISP." LMI (1985).


DDJ 



FORUM

DDJ  ON  LINE 


The  following  exchange  took  place  on 
the  message  board  of  DDJ  FORUM,  our 
SIG on CompuServe.

#: 7972 S9/UNIX for the PC
17-Nov-86  18:15:23 
Sb:  Dead  Child  Floating 
Fm:  Steve  Sampson  75136,626 
To:  Multitasking  expert 
I’ve  been  trying  to  get  a  CRON  pro¬ 
gram  running  on  a  Unix  clone,  but 
after  each  task  finishes  it  leaves  an 
entry  in  the  process  table — "termi¬ 
nated."  Of  course,  this  is  unsightly, 
and  I’d  like  to  get  it  out.  Is  there  a  way 
to pop this without a wait( ) call? If it
will  help  I  can  upload  the  offensive 
code.  Maybe  someone  has  seen  this 
type  of  behavior  before? 

*** There is a reply: 7986

#:  7986 

Sb:  Dead  Child  Floating 
Fm:  Fred  Buck  73327,3604 
Is  your  only  beef  that  the  ps  com¬ 
mand  shows  the  process  and  says  it’s 
terminated?  For  how  long  a  time 
does  this  condition  continue?  If  stuff 
keeps  clogging  the  process  table, 
then  that's  bad  news,  but  you  say  it’s 
"unsightly,”  so  I  assume  your  objec¬ 
tion  is  just  to  the  cosmetic  aspects. 
The  ps  command  just  takes  a  snap¬ 
shot  of  the  process  table  at  the  time 
you  invoke  it,  and  sometimes  the 
dead  stuff  just  hasn’t  been  cleared  out 
by  that  time.  Do  successive  ps  calls 
keep  showing  the  same  terminated 
process? 

*** There is a reply: 7995
#:  7995 

Sb:  Dead  Child  Floating 
Fm:  Steve  Sampson  75136,626 


Yes  on  the  ps.  I  need  to  find  out  if  the 
process  entry  ever  goes  away,  but  so 
far  the  terminated  entry  stays 
around  for  many  minutes.  This  is  just 
a  cosmetic  thing  and  more  a  question 
of  programming.  I’m  still  coming  up 
to  speed,  and  there's  much  to  discov¬ 
er.  I  have  found  that  when  I  kill  CRON 
all  the  children  drop  out  of  the  ps  ta¬ 
ble. 

*** There is a reply: 8009
#: 8009

Sb:  Dead  Child  Floating 
Fm:  Fred  Buck  73327,3604 
OK, I now believe I understand your situation. Let's see if I can
summarize it correctly. You have a single, long-running process that
spawns special-purpose children, each of which dies when its special
purpose is finished. Such "dead" children show up in the process table
as terminated as long as the long-running parent continues to run. You
know that a wait( ) call will clear the entries out of the process
table, but you don't want to paralyze your long-running parent by
"waiting" for children. If I'm correct about what your problem is, I
don't know any way other than a wait( ) call to do what you want to do,
but on the other hand a wait( ) should be able to be broken by a
signal, including an alarm signal. You should be able to clear out the
dead children by placing, in some frequently traveled portion of the
parent's code, something such as alarm(1); followed by wait( );. Each
time wait( ) is called, a dead kid will be stripped out of the process
table (I assume you don't care about its exit status). If there're no
dead kids but one or more alive kids, the alarm( ) will break out of
wait( ) within one second. If there are no kids at all, wait( ) should
return immediately. I'm assuming you've trapped the alarm signal
appropriately. If I'm wrong about this, then just what is your
objection to a wait( ) call?

Oops, wait( ); should be wait(0); or whatever variant of wait( ) that
doesn't care about the child's exit status.

*** There are replies: 8020, 8090


#: 8090

Sb:  Dead  Child  Floating 
Fm:  Steve  Sampson  75136,626 
Well,  I  did  state  the  problem  in  my 
nonverbose  mode,  but  you  summa¬ 
rized  it  very  well.  Thanks  for  the  sug¬ 
gested  work-arounds;  I’ll  give  them  a 
try. 

*** There is a reply: 8101
#:  8101 

Sb:  Dead  Child  Floating 
Fm:  Fred  Buck  73327,3604 
I spoke to a friend tonight who has more Unix smarts than I do and can
now provide the following additional details: the defunct child
processes are more than just cosmetic; they in fact clog up the system
process table and can (if sufficiently numerous) inhibit new process
creation. In vanilla Unix there isn't really any way to handle such a
situation other than the way I suggested or something very much like
it. In some versions of Unix (perhaps standard in Berkeley; my friend
wasn't off-top-of-head sure), there's a call called wait3( ), which can
be set to serve the single purpose of disposing of dead children and
doesn't wait for a child to die if nothing but living children exist.
The reason the dead children disappear when your CRON is killed is that
child processes, upon the death of their parent, become adopted
children of process 1 (init), which is almost always in a wait( ) and
so takes care of them handily and which ignores deaths of children it
doesn't remember having spawned itself.
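
A non-blocking reap of the kind described for wait3( ) can be sketched as follows. This is written in Python for brevity and is only an illustration under stated assumptions: it relies on Python's os.wait3 wrapper and the WNOHANG option being available on the system, and the helper names reap_dead_children and spawn are invented for the example.

```python
import os
import time


def reap_dead_children():
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            # WNOHANG: return at once instead of waiting for a child to die.
            pid, status, _usage = os.wait3(os.WNOHANG)
        except ChildProcessError:
            break        # this process has no children at all
        if pid == 0:
            break        # children exist, but none has exited yet
        reaped.append(pid)
    return reaped


def spawn(seconds):
    """Fork a child that sleeps for `seconds` and then exits."""
    pid = os.fork()
    if pid == 0:
        time.sleep(seconds)
        os._exit(0)
    return pid


if __name__ == "__main__":
    kids = [spawn(0) for _ in range(3)]
    time.sleep(0.5)      # give the children time to exit
    print("reaped:", reap_dead_children())
```

Calling reap_dead_children from a frequently traveled part of the parent's main loop would clear terminated entries without the alarm( ) trick, on systems where the call exists.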

*** There is a reply: 8143

#: 8143

Sb:  Dead  Child  Floating 
Fm:  Steve  Sampson  75136,626 
Good  stuff  here,  Fred!  Appreciate 
your  help  on  this.  I  was  thinking 
maybe  there  was  a  way  to  do  it  but 
was  off  on  a  very  different  tangent. 
The  info  on  process  1  adopting  chil¬ 
dren  I  suspected  but  didn't  know 
how  to  get  them  adopted.  Thanks. 

#: 8020

Sb:  #8009  Dead  Child  Floating 



Fm:  Levi  Thomas  (sysop)  76703,4060 
To:  Fred  Buck  73327,3604 
Why do these messages read like a Steven King novel? Dead kids?
Paralyzed parents? Macabre metaphors, eh? Shall we start an on-line
novel? A two-tiered story: on one level, a technical problem is stated
and various solutions suggested. On another level is a horror story in
the Edward Gorey style. Sorry, I get carried away sometimes; carry on.

-- Levi ("N is for Nevil who died of ennui") Thomas

*** There are replies: 8027, 8054

#:  8027 

Sb:  Dead  Child  Floating 
Fm:  Fred  Buck  73327,3604 
Not  to  mention  those  instances  in 
which  a  dead  child  prevents  stuff 
from  being  flushed  down  a  pipe.  Es¬ 
pecially  if  the  child  has  been 
spawned by a demon.

*** There is a reply: 8029

#:  8029 

Sb:  Dead  Child  Floating 
Fm:  Duane  Ellis  76064,1107 
If  I  did  not  know  what  you  were  talk¬ 
ing  about,  you  would  get  some  very 
strange  responses  from  me.  As  it  is,  a 
coworker  came  by  and  looked  at  my 
screen — I  had  to  explain  what  is 
meant  by  dead  children  and  flushing 


them  down  pipes  and  also  the  fact 
that  they  could  have  a  demon  for  a 
parent! Egads!

*** There is a reply: 8044

#:  8044 

Sb:  Dead  Child  Floating 
Fm:  Fred  Buck  73327,3604 
Just  imagine  if  Pat  Robertson  ever  got 
a  hold  of  this. 

#:  8054 

Sb:  #8020  Dead  Child  Floating 
Fm: Neil J. Rubenking 72267,1531
To: Levi Thomas (sysop) 76703,4060
"S is for Sarah, who perished of fits;
T is for Titus, who flew into bits."

-- Neil ("C is for Cora, who wasted away; D is for Desmond, thrown out
of a sleigh; I is for Ina, who drowned in the lake; J is for Jake, who
took lye, by mistake.") Rubenking

*** There is a reply: 8064

#:  8064 

Sb:  Dead  Child  Floating 

Fm:  Levi  Thomas  (sysop)  76703,4060 

Thanks for the Gorey details.

<grin>

*** There is a reply: 8076
#:  8076 

Sb:  Dead  Child  Floating 
Fm:  jhon  Stanley  73765,1026 


We  called  in  an  agent  from  M4  to  in¬ 
vestigate  the  dead  children  problem. 
His  name  was  LEX.  He  lost  his  GREP  on 
reality  after  spending  a  day  YACCing 
with  CAL,  the  supposed  perpetrator. 
CAL  gave  him  the  month  but  not  the 
DATE.  "MORE,”  said  LEX,  "TALK,  and 
MAKE  my  day.”  Then  LEX  used  his 
pipe  on  CAL.  He  pulled  CAL’s  FINGER 
out  of  its  socket,  which  caused  LEX’s 
pipe  to  break. 

LEX’s  partner  walked  in.  He  asked 
LEX  to  swap  space  because  he  had 
CAL's  partner  in  the  other  room. 

Lisa  was  a  real  looker.  Her  rap 
sheet  was  longer  than — well,  it  was 
long.  She  was  ready  to  confess.  LEX 
was  thinking  of  other  things.  Like, 
was  the  FTP  florist  still  open? — Lisa 
needed  some  roses.  "Inode  what  I 
was  doin’.  We  took  'em  down  to  the 
STREAM  and  let  them  float  away. 
They was SLEEPing real peaceful.
What’s  the  DIFF?  The  superuser 
would  a  KILLed  ’em  when  they  EXIT- 
ed  anyway.” 

It  was  a  dark  and  stormy  night.  .  .  . 

DDJ 




PROGRAMMER'S  SERVICES 


THE  STATE  OF  BASIC 


The  New  Face 
of  Subroutines 

Those who learned BASIC with popular dialects such as Applesoft BASIC,
TRS-80 BASIC, MS-BASIC, and BASICA know how inefficient the
GOSUB <line number> syntax is. First,
the  line  numbers  do  not  correspond 
obviously  to  the  tasks  that  the  subrou¬ 
tines  carry  out.  Second,  subroutines 
have  no  argument  lists  because  these 
BASIC  dialects  support  global  vari¬ 
ables  only.  These  conditions  make 
the  readability  of  BASIC  programs,  es¬ 
pecially  those  lacking  numerous 
comments,  difficult.  The  good  news 
is  that  the  "new  wave”  BASICS  effec¬ 
tively  handle  these  problems.  In  this 
issue  I  will  discuss  the  way  in  which 
subroutines  are  implemented  in 
QuickBASIC  and  True  BASIC,  taking 
advantage  of  their  similar  implemen¬ 
tations.  In  the  next  column,  I  will 
look  at  BetterBASIC's  subroutines, 
which  are  more  like  those  of  Pascal. 

The  subroutines  in  QuickBASIC  re¬ 
flect  Microsoft's  response  to  the  need 
for  more  sophisticated  syntax.  The 
solution  QuickBASIC  offers  is  twofold. 
First,  alphanumeric  labels  replace 
line  numbers.  Thus,  you  can  rewrite 
an  ambiguous  GOSUB  2000  in  the  old 
BASIC  as  GOSUB  READ.FILE.  The  label 
READ.FILE says much more than a line
number  of  2000! 

The  second  boost  to  subroutines  in 
QuickBASIC  is  the  implementation  of 
named  subroutines  that  have  an  op¬ 
tional  argument  list,  much  like  those 
in  FORTRAN.  QuickBASIC  supports  a 
strict  data  interface:  all  variables 
used  by  the  subroutine  that  do  not  ap¬ 
pear  in  the  argument  list  are  local, 
even  if  they  have  identical  names  to 
those  of  variables  in  the  main  pro¬ 


gram.  Example  1  (below)  shows  two 
simple  subroutines:  the  first  clears  a 
screen  line,  and  the  other  centers  text 
on  a  specified  line  number.  Subrou¬ 
tine  Center.Text  calls  subroutine 
ClrLine  and  passes  the  line  number 
specified.  The  STATIC  declaration  is 
mandatory  and  also  serves  as  a  re¬ 
minder  that  recursive  subroutines 
are  not  yet  implemented. 

QuickBASIC  subroutine  parameters 
are  passed  by  reference  when  a  vari¬ 
able  is  used  and  by  value  when  an 
expression  is  used.  To  protect  a  scalar 
variable  from  being  altered  by  a  sub¬ 
routine,  enclose  it  in  parentheses  to 
make  it  an  expression  (see  comment¬ 
ed  CALL  ClrLine  in  Example  1). 

Passing  arrays  is  also  simple. 
QuickBASIC  needs  to  know  the  num¬ 
ber  of  dimensions  the  array  has 
when  you  declare  the  subroutine. 
Example  2,  page  157,  shows  a  subrou¬ 
tine  that  calculates  the  average  and 
standard  deviation  values  of  a  speci¬ 
fied  column  in  a  numeric  table.  The 
matrix  X  is  written  as  X(2)  to  indicate 
that  it's  two-dimensional.  QuickBASIC 
provides  the  LBound  and  UBound 
functions  to  return  the  array’s  lower 
and  upper  bounds,  respectively.  For 
one-dimensional  arrays  you  simply 
enclose  the  array  name  in  either 
function  to  obtain  the  sought  bounds. 
In  the  case  of  multidimensional  ar¬ 
rays,  a  second  argument  is  needed — 
namely,  the  dimension  number.  In 
Example  2,  the  FOR  . . .  NEXT  loop  iter¬ 
ates  for  all  rows  in  matrix  X.  Assum¬ 
ing  that  the  rows  of  the  numeric  ma¬ 
trix  are  represented  by  the  first 
dimension  and  the  columns  by  the 
second, I use LBound(X,1) and UBound(X,1) to obtain the row limits.
The  array-bound  functions  are  very 
powerful  for  helping  you  write  gen¬ 
eral-purpose  routines  that  manipu¬ 
late  arrays  of  any  size. 
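
The calculation that such a routine performs can be written out compactly. The following Python sketch mirrors the approach of the BASIC listings (running sums over one column, skipping a sentinel for missing values) but is an illustration, not a translation supplied by the column: the function name column_stats, the list-of-rows table layout, and the guard for fewer than two valid values are assumptions.

```python
import math

MISSING = -1e30      # numeric code for missing numbers, as in the BASIC listings


def column_stats(table, col, missing=MISSING):
    """Return (average, standard deviation) of column `col`, skipping missing values."""
    count = 0
    total = 0.0
    total_sq = 0.0
    for row in table:                 # works for a table with any number of rows
        value = row[col]
        if value > missing:           # valid data?
            count += 1
            total += value
            total_sq += value * value
    if count < 2:
        raise ValueError("need at least two valid values")
    average = total / count
    # Sample variance from the running sums; clamp tiny negative rounding errors.
    variance = max(0.0, (total_sq - total * total / count) / (count - 1))
    return average, math.sqrt(variance)


data = [
    [1.0, 2.0],
    [2.0, 4.0],
    [MISSING, 6.0],
    [3.0, 8.0],
]
print(column_stats(data, 0))
print(column_stats(data, 1))
```

Because the loop takes its limits from the table itself, the same routine serves tables of any size, which is the role the array-bound functions play in QuickBASIC.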

QuickBASIC  also  provides  the 
SHARED  attribute,  used  with  COM¬ 
MON,  DIM,  and  REDIM  statements,  to 
shorten  argument  lists  of  subrou¬ 
tines.  The  SHARED  attribute  declares 
the  variables  and  arrays  as  global  and 
accessible  to  all  routines  within  a  sin¬ 
gle  program.  Thus,  you  should  only 
declare  variables  that  are  logically 
global  (that  is,  needed  by  most  rou¬ 
tines)  as  SHARED  to  avoid  the  side  ef¬ 
fects  of  the  old  BASICS. 

Finally,  QuickBASIC  subroutines 
can  be  exited  from  using  the  END  SUB 
statement.  Subroutines  cannot  be 
nested  among  themselves  or  with 
function  definitions. 

True BASIC implements named subroutines in a similar manner to
QuickBASIC. True BASIC does not support labels, so the GOSUB <label>
syntax is not available; only GOSUB <line-number> can be used (if line
numbers are used at all). The preceding discussion of QuickBASIC
subroutines applies to True BASIC, with the following exceptions:

•  True  BASIC  supports  recursive  calls. 

•  The  syntax  for  declaring  arrays  in 
the  subroutine  argument  lists  is 
slightly  different.  True  BASIC  re¬ 
quires  a  comma  for  each  additional 
dimension.  Thus,  a  simple  array  X  is 
declared  as  X(  ),  whereas  a  matrix  X 
is  written  as  X(,),  and  so  on.  With  the 
advent  of  True  BASIC  Version  2,  called 
subroutines  passing  array  arguments 


SUB ClrLine(Line.Num) STATIC
   LOCATE Line.Num, 1
   PRINT STRING$(80, " ");      ' clear line
END SUB

SUB Center.Text(T$, Line.Num) STATIC
   ' Subroutine to center a text
   CALL ClrLine(Line.Num)       ' pass Line.Num by reference
   ' or
   ' CALL ClrLine((Line.Num)) to pass Line.Num by value
   LOCATE Line.Num, (40 - LEN(T$)/2)
   PRINT T$
END SUB

Example 1: QuickBASIC subroutines to clear a line and center a text on
the screen




can  optionally  (for  enhanced  read¬ 
ability)  include  parentheses,  follow¬ 
ing  exactly  the  same  rules  as  subrou¬ 
tine  declarations  do. 

•True  BASIC  supports  both  internal 
and  external  subroutines.  Internal 
subroutines  are  declared  within  the 
main  BASIC  program  and  before  the 
unique  END  statement.  External  sub¬ 
routines  can  reside  in  external  librar¬ 
ies,  modules,  or  beyond  the  END  state¬ 
ment.  The  difference  between  the 
subroutine  types  is  their  accessibility 
to  variables  in  the  main  program. 
The  main  program  (up  to  the  END 
statement)  is  regarded  as  one  pro¬ 
gramming  unit  within  which  all  vari¬ 
ables  are  accessible.  Thus,  internal 
subroutines  can  also  create  and  ma¬ 
nipulate  global  variables  not  appear¬ 
ing  in  the  argument  list.  Unlike 
QuickBASIC  subroutines,  internal 
True  BASIC  subroutines  have  no  local 
variables.  This  is  an  important  differ¬ 
ence  to  remember  if  you  ever  trans¬ 
late  programs  between  the  two  im¬ 
plementations.  External  subroutines 
do  not  enjoy  the  same  privilege  and 
thus  have  a  stricter  data  interface, 
similar  to  that  in  QuickBASIC.  Exam¬ 
ples  3  and  4  (right)  show  the  True  BA¬ 
SIC  versions  of  Examples  1  and  2, 
respectively. 

Looking  at  the  subroutines  in  the 
new  wave  BASICS,  you  can  see  a  touch 
of  FORTRAN  present,  and  why  not? 
They  offer  a  radical  solution  to  a 
chronic  problem  that  plagued  the  old 
microcomputer  BASICS.  Callable  sub¬ 
routines  are  also  endorsed  by  Bor¬ 
land  International  in  its  new  Turbo 
BASIC.  At  the  time  of  writing  this  col¬ 
umn,  I  have  Borland’s  Comdex  Fall- 
86  press  release,  which  indicates  that 
Turbo  BASIC  will  have  the  more  pow¬ 
erful  subroutine  syntax.  Perhaps  the 
Beatles'  lyrics  from  a  song  on  the  Sgt. 
Pepper  album  provide  a  suitable 
comment  on  BASIC  — “It's  getting  bet¬ 
ter  all  the  time!” 




DEF FNMissing = -1E+30 ' define numeric code for missing numbers

SUB Get.Stat(X(2), Col%, Average, StdDev) STATIC
' Get average and stnd. deviation of column Col% of
' two-dimensional array X(,)

Sum = 0
SumX = 0
SumXX = 0

FOR Row% = LBOUND(X, 1) TO UBOUND(X, 1)
    IF X(Row%, Col%) > FNMissing THEN ' Valid data?
        Sum = Sum + 1
        SumX = SumX + X(Row%, Col%)
        SumXX = SumXX + X(Row%, Col%)^2
    END IF
NEXT Row%

Average = SumX / Sum
StdDev = SQR((SumXX - SumX^2/Sum) / (Sum - 1))

END SUB

Example 2: QuickBASIC subroutine to obtain the average and standard
deviation of data stored in an array


SUB ClrLine(Line_Num)
    SET CURSOR 1,Line_Num
    PRINT REPEAT$(" ",80); ! clear line
END SUB

SUB Center_Text(T$, Line_Num)
    ! Subroutine to center a text
    CALL ClrLine(Line_Num) ! pass Line_Num by reference
    ! or
    ! CALL ClrLine((Line_Num)) to pass Line_Num by value
    SET CURSOR (40 - LEN(T$)/2),Line_Num
    PRINT T$
END SUB


Example 3: True BASIC subroutines to clear a line and center a text on
the screen


DEF Missing = -1E+30 ! define numeric code for missing numbers

SUB Get_Stat(X(,), Col, Average, StdDev)
    ! Get average and stnd. deviation of column Col of
    ! two-dimensional array X(,)

    LET Sum = 0
    LET SumX = 0
    LET SumXX = 0

    FOR Row = LBound(X,1) TO UBound(X,1)
        IF X(Row,Col) > Missing THEN ! Valid data?
            LET Sum = Sum + 1
            LET SumX = SumX + X(Row,Col)
            LET SumXX = SumXX + X(Row,Col)^2
        END IF
    NEXT Row

    LET Average = SumX / Sum
    LET StdDev = SQR((SumXX - SumX^2/Sum) / (Sum - 1))

END SUB

Example 4: True BASIC subroutine to obtain the average and standard
deviation of data stored in an array
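
For readers working outside BASIC, the one-pass calculation used in Examples 2 and 4 can be sketched in Python. This is an illustration only. The function name column_stats, the list-of-rows table layout, and the guard against fewer than two valid values are assumptions made for the sketch; the -1E+30 missing-value code and the three running sums come from the listings.

```python
import math

# Sentinel for missing numbers, mirroring the -1E+30 code in the listings.
MISSING = -1e30


def column_stats(x, col):
    """Return (average, sample standard deviation) of column `col`.

    `x` is assumed to be a list of rows (a hypothetical stand-in for the
    two-dimensional BASIC array). Entries equal to or below MISSING are
    skipped, as in the listings' "Valid data?" test.
    """
    n = 0          # count of valid observations (Sum in the listings)
    sum_x = 0.0    # running sum of values (SumX)
    sum_xx = 0.0   # running sum of squared values (SumXX)
    for row in x:
        value = row[col]
        if value > MISSING:
            n += 1
            sum_x += value
            sum_xx += value * value
    if n < 2:
        # The BASIC versions would divide by zero here; this guard is an
        # addition for the sketch.
        raise ValueError("need at least two valid observations")
    average = sum_x / n
    variance = (sum_xx - sum_x ** 2 / n) / (n - 1)
    # Rounding can push a tiny variance slightly below zero.
    return average, math.sqrt(max(variance, 0.0))
```

Called as column_stats(table, 1), the function returns both results as a pair rather than through the argument list. The one-pass formula is compact, but it can lose precision when the mean is large compared with the spread of the data.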


Dr.  Dobb's  Journal,  April  1987 



PROGRAMMER'S  SERVICES 

OF  INTEREST 


For  the  Mac 

A  full-featured  AI  development  sys¬ 
tem  for  the  Mac,  ExperCommon  Lisp, 
from  ExperTelligence,  allows  users 
to  develop  applications  that  are  inde¬ 
pendent  of  the  LISP  environment. 
Some  of  the  key  features  of  Exper¬ 
Common  Lisp  include  a  development 
environment  with  more  than  1,000 
primitives;  an  incremental  compiler 
that  generates  68000  native  code  di¬ 
rectly  from  LISP  source  code;  a  trans¬ 
parent  "load-on-call,  load-on-return” 
memory-management/optimization 
technique;  an  on-line,  automatic 
symbolic  debugger;  a  class  system; 
and  direct  access  to  the  Macintosh 
Toolbox.  The  debugger,  class  system, 
and  direct  Toolbox  access  are  totally 
integrated  with  the  compiler.  Exper¬ 
Common  Lisp  sells  for  $995.  Reader 
Service  No.  16. 

ExperTelligence  Inc. 

559  San  Ysidro  Rd. 

Santa Barbara, CA 93108
(805)  969-7871 

Musicware,  a  line  of  products  for 
computer  music  enthusiasts  and  re¬ 
cording  engineers,  is  now  available 
from  Opcode  Systems.  The  line  in¬ 
cludes  two  MIDI  interfaces  that  work 
with  the  512K  Mac  and  the  Mac  Plus. 
The  full-specification  Professional 
Plus  MIDI  Interface  sells  for  $150,  and 
the  Studio  Plus  MIDI  Interface,  which 
features  two  independent  MIDI  ins  so 
you  can  record  from  two  keyboards 
at  once  or  record  and  sync  at  the 
same  time,  is  priced  at  $225.  Reader 
Service  No.  17. 

Opcode  Systems 
444  Ramona  St. 

Palo  Alto,  CA  94301 
(415)  321-8977 


MagicProducts  has  introduced  the 
Magic  800K  External  Disk  Drive  and 
the  compact  MiniMagicDrive  for  the 
Mac.  The  External  Drive  sells  for 
$199,  and  the  MiniMagicDrive,  which 
is  available  in  20-  or  40-megabyte  ver¬ 
sions,  costs  $995  or  $1,495.  Reader  Ser¬ 
vice  No.  18. 

MagicProducts  Inc. 

4505  Spicewood  Springs  Rd.,  Ste.  304 
Austin,  TX  78759 
(512)  343-0781 

MacC  Jr.,  from  Consulair  Corp.,  is  an 
introductory  C  language  develop¬ 
ment  system  for  the  Mac  and  Mac 
Plus.  A  complete  K  &  R  implementa¬ 
tion  of  C,  MacC  Jr.  takes  full  advan¬ 
tage  of  the  Mac's  features,  letting  you 
produce  stand-alone  applications  in  a 
single  step.  It  includes  an  extensive 
standard  C  and  Macintosh  support  li¬ 
brary,  Pascal  function  calls  for  easy 
interface  to  ROM,  enhanced  symbolic 
debugging  capabilities,  and  several 
example  programs.  MacC  Jr.  sells  for 
$79.95.  Reader  Service  No.  19. 
Consulair  Corp. 

140  Campo  Dr. 

Portola  Valley,  CA  94025 
(415)  851-3272 

Boards  for  the  IBM  PC 

The  Rocket  286,  from  FiveStar  Elec¬ 
tronics,  is  an  add-in  accelerator 
board  for  IBM  PCs,  PC/XTs,  and  com¬ 
patibles.  The  board  provides  an  80286 
running  at  8  MHz  and  8K  of  zero- 
wait-state  cache  memory.  The  hard¬ 
ware  switchable  8088  remains  in  the 
system  for  full  compatibility  with 
most  BIOSs.  The  Rocket  286  measures 
5 x 9 inches and costs $250. Reader
Service  No.  20. 

FiveStar  Electronics  Inc. 

3220  Commander  Dr.,  Ste.  102 
Dallas,  TX  75006 
(214)  733-4077 

Designed  for  the  Compaq  386,  IBM  PC/ 
AT,  PC/XT,  and  compatibles,  Mighty 
Meg  is  a  new  3.5-megabyte  extended 
memory  board  from  Quadram  Corp. 
The  board  is  also  upgradable  to  14 
megabytes  with  1-megabit  parts  and 
switchless  installation.  It  is  designed 
for  use  with  RAM  disks,  protected 
mode  operating  systems  such  as  ADOS, 
Topview,  and  Xenix  applications. 


Mighty  Meg  is  priced  from  $545  for  0.5 
megabyte  of  memory  to  $1,475  for  3.5 
megabytes.  Reader  Service  No.  21. 
Quadram  Corp. 

One  Quad  Way 
Norcross,  GA  30093-2919 
(404)  923-6666 

PML  Systems  has  released  an  add-on 
board  and  software  package  that  al¬ 
low  non-English  speaking  PC  users  to 
run  English-language  application 
programs  in  their  native  languages. 
The  PML86  board  fits  inside  a  full-  or 
half-length  PC  expansion  slot  and 
comes  with  a  disk  containing  the  for¬ 
eign-language  drivers  needed  to 
translate  commands  into  English. 
Language  disks  are  available  for 
French,  Spanish,  German,  Greek,  Ital¬ 
ian,  Russian,  Swedish,  Finnish,  Thai, 
Vietnamese,  and  Sanskrit.  With  one 
language  disk,  PML86  sells  for  $375. 
Reader  Service  No.  22. 

PML  Systems 
3139  E.  Almond  Ave. 

Orange,  CA  92669 
(714)  771-7744 

The  Professional  Image  Board  from 
ATronics  International  allows  you 
to  plug  a  video  camera  into  a  PC  or 
compatible  and  capture  live-action 
images. You can then freeze, computer-enhance, and store pictures on
disk.  The  board  sells  for  $595.  Reader 
Service  No.  23. 

ATronics  International  Inc. 

1830  McCandless  Dr. 

Milpitas,  CA  95035 
(408)  943-6629 

Ariel  Corp.  has  introduced  a  plug-in 
board  that  provides  a  complete  signal 
acquisition,  synthesis,  and  processing 
system. The DSP-16 combines two channels of high-speed,
high-resolution input/output conversion; a large data buffer; and
Texas Instruments' second-generation Digital Signal Processing (DSP)
microprocessor, the TMS32020. Supplied with the DSP-16 is a software
package consisting of the Program Development System and five
software application programs: Data Acquisition, Digital Audio
Effects, Storage Oscilloscope, Audio Loop Editor, and Waveform
Synthesizer.
The  Program  Development  System 




includes  all  driver  routines,  a 
TMS32020  assembler,  and  debug  facil¬ 
ities.  A  royalty  arrangement  is  avail¬ 
able  for  qualified  independent  devel¬ 
opers.  The  DSP-16  is  priced  at  $2,495. 
Reader  Service  No.  24. 

Ariel  Corp. 

110  Green  St.,  Ste.  404 
New  York,  NY  10012 
(212)  925-4155 

The  CADcard  Model  1040  from  Intel¬ 
ligent  Graphics  Corp.  features  an 
80186  CPU  with  512K  of  memory  dedi¬ 
cated  to  storing  the  emulating  micro¬ 
code  for  IBM's  color  graphics  and  pro¬ 
fessional  graphics  controller.  The 
CPU  provides  an  additional  380K  of 
storage  for  display  lists,  graphic  pa¬ 
rameters,  and  user-defined  applica¬ 
tion  code.  IGC  also  offers  several  high- 
end  performance  options  to  the  basic 
unit,  including  a  graphics  accelera¬ 
tor,  a  Z-buffer  with  hardware  hidden 
surface  removal  for  solids  modeling, 
and  an  8087  coprocessor  for  user  ap¬ 
plication  software.  CADcard  Model 
1040  sells  for  $1,750.  Reader  Service 
No.  25. 

Intelligent  Graphics  Corp. 

4800 Great America Pkwy.

Santa  Clara,  CA  95050 
(415)  986-8373 

Discovery  Systems  has  released  an 
audio-cassette  training  program  for 
Autodesk’s  AutoLISP,  a  training 
course  for  AutoCAD  users.  The  eight 
lessons  provide  a  step-by-step  pro¬ 
gram  with  complete  instructions  to 
create  custom  AutoLISP  functions, 
custom  menus,  and  other  time-saving 
utilities.  The  price  is  $179.  Reader  Ser¬ 
vice  No.  26. 

Discovery  Systems 
34  Autumnleaf 
Irvine,  CA  92714 
(714)  783-9890 

DDJ 




FORUM 


SWAINE'S  FLAMES 


Mattel  and  Axlon  have  been 
granted  permission  by  the  FCC 
to  broadcast  robotic  control  signals  to 
Mattel  and  Axlon  toys  as  part  of  a 
new  departure  in  children's  televi¬ 
sion  programming.  The  toys  will 
move  in  response  to  events  on  screen 
and  shoot  at  targets  on  the  screen. 
This  scheme  is  being  touted  as  "inter¬ 
active,"  which  is  nonsense. 

There  is  an  interesting  precedent 
for  such  "interactive"  television,  and 
it  seems  to  include  all  the  elements  of 
the  present  case  but  one.  The  prece¬ 
dent  had  special  products  that  al¬ 
lowed  a  similar  level  of  "interaction” 
with  the  television  program;  it  creat¬ 
ed  two  classes  of  viewers — those 
who  had  the  products  and  those  who 
didn't;  and  it  was  initially  perceived 
as  a  departure  in  children's  television 
programming.  What  it  did  not  have 
was  an  individualized  signal  sent  out 
over  the  public  airways. 

The  precedent  I  have  in  mind  was 
a  cartoon  series  called  "Winky  Dink 
and  You”  from  the  days  of  black-and- 
white  television.  If  you  sent  for  a  spe¬ 
cial  plastic  sheet  to  place  over  your 
television  screen  and  special  crayons 
for  drawing  on  it,  you  could  custom¬ 
ize  W.  Dink's  on-screen  adventures.  I 
don't  recall  that  the  producers  were 
accused  of  economic  discrimination, 
but  those  were  black-and-white 
times. 

Subtract Winky Dink from the Mattel/Axlon scenario and you are left
with  the  individualized  television  sig¬ 
nal  and  the  question  of  the  appropri¬ 
ate  use  of  a  limited  information  chan¬ 
nel.  I  suspect  that  any  effective 
protest  of  the  plan  will  focus  on  that 
signal.  I  also  suspect  that  this  skir¬ 
mish,  like  the  Captain  Video  airwave 
hijacking of last year, presages an increasing number of battles for
bandwidth.

There  has  been  some  discussion  in 
our pages and in some more archly
academic  journals  of  the  failings  of 


conventional  implementations  of 
PROLOG  as  a  tool  for  logic  program¬ 
ming.  One  writer  who  has  not  only 
criticised  but  suggested  ways  to  bring 
PROLOG  closer  to  the  ideal  of  logic 
programming is Lee Naish, who addresses PROLOG's knottiest problems
in  his  book  Negation  and  Control  in 
Prolog  (Berlin/Heidelberg:  Springer- 
Verlag,  1986). 

All  PROLOG  implementations  have 
some  problem  with  negation.  PRO¬ 
LOG  is  based  on  Horn-clause  logic,  and 
negative  information  cannot  really 
be  expressed  in  Horn-clause  form. 
PROLOG implementations typically treat negation by more or less
identifying it with failure: letting "x cannot be proved" stand for
"x is false." This is
often  a  perfectly  reasonable  thing  to 
do,  but  if  implemented  naively  it  can 
lead  to  illogical  conclusions.  Naish 
lays  out  recommendations  for  the  ef¬ 
fective  implementation  of  negation. 
PROLOG  vendors  should  read  them. 
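
To make the hazard concrete, here is a toy sketch in Python rather than PROLOG. It is not Naish's method and not any vendor's implementation. The two-fact database, the helper names, and the convention that None stands for an unbound variable are all assumptions made for illustration. The point it shows is that a naive "cannot be proved" test gives different answers to logically equivalent queries, depending on whether the variable is bound when the negation is evaluated.

```python
# Hypothetical database: p(a). q(b).
FACTS = {("p", "a"), ("q", "b")}
CONSTANTS = {"a", "b"}


def provable(pred, arg):
    """True if pred(arg) can be proved; arg=None means an unbound variable,
    so the goal succeeds if any constant satisfies it."""
    if arg is None:
        return any((pred, c) in FACTS for c in CONSTANTS)
    return (pred, arg) in FACTS


def naive_not(pred, arg):
    """Negation as failure: 'cannot be proved' stands for 'is false'."""
    return not provable(pred, arg)


def query_negation_first():
    """Models ?- not p(X), q(X).  X is still unbound at the negation."""
    if not naive_not("p", None):   # p(X) succeeds with X = a, so not fails
        return []
    return sorted(c for c in CONSTANTS if provable("q", c))


def query_negation_last():
    """Models ?- q(X), not p(X).  X is bound before the negation runs."""
    return sorted(c for c in CONSTANTS
                  if provable("q", c) and naive_not("p", c))
```

Here query_negation_last() finds X = b, while query_negation_first() finds nothing, even though b satisfies both conditions. Evaluating a negated goal only after its variables are bound avoids this particular trap.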

The  problem  with  control  in  PRO¬ 
LOG  programs  is  that  semantically  in¬ 
nocuous  variations  in  the  control  log¬ 
ic  can  cause  huge  changes  in 
performance.  Naish  contends  that 
the  heuristics  of  good  PROLOG  coding 
that  programmers  have  developed  to 
deal  with  this  problem  are  by  and 
large  simple,  effective,  and  automat¬ 
able.  His  approach  is  to  let  a  prepro¬ 
cessor  restructure  the  code  to  avoid 
the  worst  of  the  inefficiencies.  PRO¬ 
LOG  programmers  should  read  this 
part. 

My  cousin  Corbett  has  recently 
been  combining  catastrophe  theory 
with  market  analysis  and  producing 


surprising  results.  Catastrophe  the¬ 
ory  is  the  fledgling  discipline  that 
studies  systems  on  the  brink,  where 
the  fundamental  assumptions  on 
which  the  study  of  the  systems  is 
built  cease  to  apply. 

His  inspiration  was  an  article  on 
parallel  computers  in  High  Technolo¬ 
gy  magazine.  Having  approached  the 
article  to  learn  about  the  parallel- 
computer  market,  he  was  non¬ 
plussed  to  find  that  there  was  no 
such  thing:  that  there  are  degrees  of 
parallelism;  that  different  architec¬ 
tures  serve  different  purposes;  that 
the  parallel  machines  do  not  com¬ 
pete  in  one  market  but  define  several 
overlapping,  interdependent  market 
fragments. 

Expanding  his  research,  Corbett 
found  other  computer  industry  mar¬ 
kets  in  catastrophic  transition.  The 
80386  computer  market,  with  its  pre¬ 
natal  cloning,  spectre  of  proprietary 
designs,  and  its  nonexistent  software, 
has  been  a  particularly  fruitful  ob¬ 
ject  of  study.  Then  there’s  the  C  com¬ 
piler  market,  clearly  catastrophic, 
through  which  Corbett  discerns  a 
major  fault  line  developing  with  op¬ 
timization  on  one  side  and  conve¬ 
nience  on  the  other.  When  the  big 
split  comes,  Corbett  says,  half  the  C 
market  will  fall  away  into  ease  of  use. 

The  night  before  it  happens,  farm 
animals  will  be  restless. 

Oh,  yes.  If  I  remember  correctly, 
the  Winky  Dink  television  program 
went  away  in  the  face  of  parental 
protest.  Apparently  the  producers 
had  not  reckoned  with  the  resource¬ 
fulness of American youth, who
quickly  discerned  that  there  was  no 
need  to  send  for  the  special  plastic 
sheet  or  the  special  crayons.  You 
could  just  draw  on  the  glass  with 
your  Crayolas. 

Michael Swaine

editor-in-chief 




Software  Tools 


Notes  On  Computer  Music 


Languages: 

Ada 


C  Subroutines 
True  BASIC  and  Modula-2 
Object-Oriented  LISP 
New  BASIC  Subroutines 


Ray Duncan
Command 
Processors 


Getting  Started 
with  MIDI 


Scientific 
Programming 
Dimensional 
Data  Types 


with 


MAY  1987 


CONTENTS 


VOLUME  12,  ISSUE  5 


ARTICLES


MUSIC: Pushing the Sound Envelope 16
by David Levitt
A brief history of computer music and a look at some recent
developments in MIDI programming, sampling, transient-oriented
synthesis methods, and programs that compose and collaborate on
original music.

MUSIC: Designing a Music Recorder 22
by Mark Garvin
Mark shows how to design a software-based music recorder using MIDI.

SCIENTIFIC: Dimensional Data Types 50
by Do-While Jones
Using dimensional units as data types can facilitate the writing of
clearer, more easily maintained code. Do-While presents example
programs in Ada.


COLUMNS


C CHEST 102
by Allen Holub
Allen looks at statistical applications of digital low-pass filters,
a set of subroutines that has applications in both scientific and
music programming.

16-BIT SOFTWARE TOOLBOX 118
by Ray Duncan
Ray looks at some "quality of life tools" for MS-DOS programmers,
namely Command Plus and ProCED.

STRUCTURED PROGRAMMING 124
by Namir Clement Shammas
Namir analyses True BASIC modules and compares modules of True BASIC
and Modula-2.

ARTIFICIAL INTELLIGENCE 132
by Ernest R. Tello
Ernie discusses the features of PC Scheme, "the Turbo Pascal of
object-oriented LISPs."


FORUM 


PROGRAMMER'S 

SERVICES 


Computers and music
Getting started with MIDI
Dimensional data types in Ada
C filters in statistics
Command processors
True BASIC and Modula-2
Object-oriented LISP
Ignoble and mercenary motives
Things computers can never do

EDITORIAL  6 

by  Michael  Swaine 
RUNNING  LIGHT  8 

by  Allen  Holub 

ARCHIVES  8 

LETTERS 10

by  you 

VIEWPOINT  14 

by  Philip  J.  Erdelsky 
SWAINE’S  FLAMES  152 

by  Michael  Swaine 


THE  STATE  OF  BASIC:  144 
More  on  new  BASIC 
subroutines 

BOOKS:  146 

Numerical  Recipes:  The  Art  of 
Scientific  Computing 
OF  INTEREST:  148 

New  products  out  there 
ADVERTISER  INDEX:  151 
Where  to  find  those  ads 


Dr.  Dobb's  Journal,  May  1987 



About  the  Cover 

Programmers  who  have  always 
longed  to  play  (or  play  with)  music 
and  musicians  with  techie  tenden¬ 
cies  and  a  yen  to  create  new 
sounds  would  feel  right  at  home  in 
the  tableau  pictured  on  this 
month's  cover. 

This  Issue 

What  is  it  that  draws  so  many  pro¬ 
grammers  towards  making  music? 
Maybe  it's  the  interactiveness  of 
the  process  or  the  fact  that  a  piece 
of  music  is  something  that  can  be 
shared  with  everyone  (unlike  a 
good  piece  of  code,  which  only 
other  programmers  can  appreci¬ 
ate),  or  maybe  it’s  because  comput¬ 
er  music  is  a  new  frontier.  Most 
likely  it  is  a  combination  of  these 
factors,  plus  the  fact  that  making 
tools  for  making  music  is  as  fun 
and  interesting  as  the  end  result. 
Here  we  look  at  the  roots  of  com¬ 
puter  music  and  at  some  interest¬ 
ing  work  to  be  done  in  this  area  on 
today's  microcomputers.  We  also 
look  at  scientific  programming 
with  an  article  on  dimensional 
data  types  by  Do-While  Jones,  and 
Allen  Holub  covers  both  music  and 
scientific  programming  in  C  Chest. 

Next  Issue 

We've  planned  some  very  practi¬ 
cal  pieces  for  June.  The  lead  article 
presents  an  algorithmic  solution  to 
the problem of sharing on-line resources using large priority
queuing. We'll also have how-to articles about an extended
communications port driver and building a two-bit analog-to-digital
converter.




FORUM 


EDITORIAL 


Software Tools

FOR  THE  PROFESSIONAL  PROGRAMMER 


When  giants 
sneeze,  they  set 
the  sod  ashiver  and 
shake  the  sedentary 
stones. 

When  the  Lotus  De¬ 
velopment  corporate 
nose  got  irritated  by 
products  that  copied 
the  look  and  feel  of  its 
successful  spreadsheet 
product,  it  blew  up  a 
lot  of  dirt,  and  from 
under  scattered  rocks  slithered  and 
scuttered  the  hungry  lawyers. 

At  press  time,  things  were  looking 
bad  for  the  lawyers. 

Lotus  had  brought  suit  against  Pa¬ 
perback  Software  and  Mosaic  Soft¬ 
ware  over  alleged  copyright  in¬ 
fringement  of  Lotus  1-2-3  by  these 
companies’  products,  VP-Planner 
and  The  Twin,  respectively.  Lotus 
was  not  contending  that  these  com¬ 
panies  copied  Lotus  code  but  rather 
that their products, in being keystroke-for-keystroke compatible
with  1-2-3,  infringed  on  the  copy¬ 
righted  "look  and  feel”  of  1-2-3. 

Now  that  the  copyright  office  has 
determined  that  there  is  no  copy¬ 
rightable  look  and  feel  to  1-2-3,  Lotus' 
case  is  much  weaker,  but  the  issues 
the  case  raised  will  not  go  away.  Crit¬ 
ics  such  as  Dan  Bricklin,  who  created 
VisiCalc,  the  first  electronic  spread¬ 
sheet,  have  argued  that  to  protect  the 
keystroke sequences of a product denies competitors access to perhaps
the  most  compelling  feature  a  com¬ 
mercial  software  product  can  have: 
familiarity. 

The  suit,  or  one  like  it,  could  have 
enormous  implications  for  innova¬ 
tion  and  advance  in  software  devel¬ 
opment.  Although  VP-Planner  and 
The  Twin  are  anything  but  innova¬ 
tive,  a  decision  against  the  imitators 
in  such  a  case  could  make  even  truly 
innovative  developers  more  cautious 
in  bringing  products  to  market.  It 
also  would  remove  one  incentive  to 
improve  existing  products.  As  Brick¬ 
lin  put  it,  "you  won’t  have  to  do  ver¬ 


sion  two,  because  no¬ 
body  else  will  be  able 
to.” 

At  least  one  editor 
has  questioned  the 
motives  of  Lotus  exec¬ 
utives,  but  surely  their 
motives  are  beyond 
question.  In  this  partic¬ 
ular  instance,  the  mo¬ 
tives  of  the  executives 
at  Lotus  and  Paper¬ 
back  and  Mosaic  were 
ignoble  and  mercenary.  Innovation 
issues  have  been  raised  by  the  Lotus 
look-and-feel  case,  to  be  sure,  but  the 
actions  and  motivations  of  the  corpo¬ 
rations  involved  had  to  do  not  with 
innovation  but  with  profits. 

These  executives  were  undoubted¬ 
ly  behaving  appropriately  in  this. 
Corporations  exist  to  make  profits  for 
their  stockholders,  not  to  innovate, 
pioneer,  or  upgrade  products,  except 
as  such  actions  may  seem  to  them  to 
be  necessary  steps  to  profits.  These 
corporate  executives  smelled  wealth 
in  the  well-trod  ground  of  the  Lotus 
1-2-3  user  interface.  If  the  companies 
involved  in  the  Lotus  suit  did  not  take 
a  particularly  high-minded  view,  it 
was  because  they  were  keeping  their 
eyes  on  the  turf. 

As  a  market,  the  1-2-3  turf  is  rich 
indeed, but, surveying it technologically, the ground these companies
chose  to  squabble  over  is  played  out. 
The  pace  of  technological  develop¬ 
ment  in  personal  computer  software 
has  left  Lotus  1-2-3  behind,  even 
though  it  is  still  healthy  as  a  product. 

But  what-can-be  still  can  drive 
what-will-be  in  the  software  market, 
and  before  long  someone  will  bring 
the  spreadsheet  market  up  to  date 
with  the  technology.  I  look  forward 
to  some  innovative  competitor  leav¬ 
ing  Lotus  1-2-3  and  its  archaic  inter¬ 
face  in  the  dust — even  if  it  has  to  be 
Microsoft. 

Michael  Swaine 
editor-in-chief 


Editorial

Editor-in-Chief Michael Swaine
Managing Editor Vince Leone
Assistant Editors Sara Noah Ruddy
    Levi Thomas
Technical Editor Allen Holub
Consulting Editor Nick Turner
Contributing Editors Ray Duncan
    Michael Ham
    Bela Lubkin
    Namir Shammas
    Ernest R. Tello
Copy Editor Rhoda Simmons

Production

Production Manager Bob Wynne
Art Director Michael Hollister
Assoc. Art Director Joe Sikoryak
Typesetter Jean Aring
Technical Illustrator Frank Pollifrone
Cover Artist Michael Carr

Circulation

Circulation Director Maureen Kaminski
Newsstand Sales Mgr. Stephanie Ericson
Book Marketing Mgr. Jane Sharninghouse
Circulation Coordinator Kathleen Shay

Administration

Finance Director Kate Wheat
Business Manager Betty Trickett
Accounts Payable Supv. Mayda Lopez-Quintana
Accts. Receivable Supv. Laura Di Lazzaro

Account Managers

Lisa Boudreau (415) 366-3600
Gary George (404) 897-1923
Michael Wiener (415) 366-3600
Cynthia Zuck (718) 499-9333
Promotions/Srvcs. Mgr. Anna Kittleson
Advertising Coordinator Charles Shively

M&T Publishing Inc.

Chairman of the Board Otmar Weber
Director C. F. von Quadt
President and Publisher Laird Foshay
Associate Publisher Michael Swaine


Dr. Dobb's Journal of Software Tools (USPS 307690) is published
monthly by M&T Publishing Inc., 501 Galveston Dr., Redwood City, CA
94063; (415) 366-3600. Second-class postage paid at Redwood City and
at additional entry points.

Article Submissions: Send manuscripts and disk (with article and
listings) to the Editor.

DDJ on CompuServe: Type GO DDJ

Address Correction Requested: Postmaster: Send Form 3579 to Dr.
Dobb's Journal, P.O. Box 27809, San Diego, CA 92128. ISSN 0888-3076

Customer Service: For subscription problems call: outside CA (800)
321-3333; in CA (619) 485-9623 or 566-6947. For book/software order
problems call (415) 366-3600.

Subscriptions: $29.97 per 1 year; $56.97 for 2 years. Canada and
Mexico add $27 per year airmail or $10 per year surface. All other
countries add $27 per year airmail. Foreign subscriptions must be
prepaid in U.S. funds drawn on a U.S. bank. For foreign
subscriptions, TELEX: 752-351.

Foreign Newsstand Distributor: Worldwide Media Service Inc., 386 Park
Ave. South, New York, NY 10016; (212) 686-1520 TELEX: 620430 (WUI).

Entire contents copyright © 1987 by M&T Publishing Inc. unless
otherwise noted on specific articles. All rights reserved.

People's Computer Company

Dr. Dobb's Journal of Software Tools is published by M&T Publishing
Inc. under license from People's Computer Company, 2682 Bishop Dr.,
Suite 107, San Ramon, CA 94583, a nonprofit corporation.




FORUM 


RUNNING  LIGHT 


Most  of  the  con¬ 
tent  of  this  issue 
has  to  do  with  either 
scientific  programming 
or  music.  Scientific 
programming  yes,  but 
why  music  in  a  pro¬ 
grammers’  magazine? 

Maybe  you  think  the  connection 
needs  no  explanation,  particularly  if 
you,  like  many  professional  pro¬ 
grammers,  are  a  musician.  It  seems 
that  just  as  visual  artists  often  sup¬ 
port  themselves  doing  paste-up, 
many  musicians  earn  a  living  as  pro¬ 
grammers.  In  any  case,  there  is  a  re¬ 
markably  large  overlap  between 
computer  programmers  and 
musicians. 

But  more  to  the  point,  the  applica¬ 
tion  of  computers  to  musical  prob¬ 
lems  is  often  fascinating  from  a  pure 
programming  perspective.  A  truly 
interactive  music  system  has  to  un¬ 
derstand  what's  happening  musical¬ 
ly — a  nontrivial  problem  in  artificial 
intelligence.  Moreover,  music  pro¬ 
grams  have  some  of  the  most  com¬ 
plex,  and  most  interesting,  user  inter¬ 
faces  going.  They  use  graphics  to  a 
higher  degree  than  most  programs 
do  and  have  to  do  sophisticated 
things  on  the  output  side,  juggling 
the  control  of  several  other  comput¬ 
ers  and  synthesizers.  What  is  MIDI  if 
not  a  multiprocessor  system  commu¬ 
nicating  over  a  network? 

What  I’m  saying  here  is  that  the 
problems  of  computer  music  are 
most  often  programming  problems, 
of  interest  to  the  musician  and  non¬ 
musician  alike. 

This  issue  covers  the  music  theme 
with  three  music-related  articles.  Da¬ 
vid  Levitt’s  piece  discusses  some  of 
the  newer  trends  in  music-related 
hardware  and  software.  It  may 
pique  the  curiosity  of  even  tone-deaf 
coders  enough  to  get  them  interested 
in  solving  some  of  the  programming 


problems  David  pre¬ 
sents.  Mark  Garvin  dis¬ 
cusses  MIDI,  but  from 
the  not-often-taken 
perspective  of  how  to 
store  and  retrieve  the 
staggering  amount  of 
data  that  comes  across 
the  network.  Finally,  this  month’s  C 
Chest  discusses  digital  low-pass  fil¬ 
ters  and  demonstrates  the  universali¬ 
ty  of  music  software  by  applying 
these  filters  to  statistical  applications. 

The  one  sour  note  in  this  issue,  to 
my  ear  at  least,  is  that  there’s  not  a 
whole  lot  of  code.  I'm  hoping  that 
those  of  you  who  are  both  program¬ 
mers  and  musicians  will  send  us  arti¬ 
cles  about  some  of  the  programs 
you've  written.  If  you  have  some¬ 
thing  useful  you'd  like  to  share,  send 
it  in. 

Maybe  together  we  can  advance 
the  state  of  the  music  software  art. 


Allen Holub
technical  editor 


ARCHIVES 


Math in DDJ

"My experiences during the last few
months  vividly  illustrate  the  fact  that  there 
are  plenty  of  good  mathematical  problems 
still  waiting  to  be  solved  almost  every¬ 
where  you  look — especially  in  areas  of  life 
where  mathematics  has  rarely  been  ap¬ 
plied  before.  Mathematicians  can  provide 
solutions  to  these  problems,  receiving  a 
double  payoff — namely  the  pleasure  of 
working  out  the  mathematics,  together 
with  the  appreciation  of  the  people  who 
can  use  the  solutions.  So  let’s  go  forth  and 
apply  mathematics  in  new  ways.” — 
" Mathematical  Typography,"  Donald  E. 
Knuth,  DDJ,  March  1980. 

"I  use  a  Polymorphic  8813  as  a  home  sys¬ 
tem.  As  a  precision  buff,  I  was  delighted  by 
their  newest  release  of  BASIC  which  among 
the  many  other  features,  has  a  variable 
precision  ‘settable’  from  6  to  26  digits.  By 
the  way,  this  precision  holds  for  all  of  the 
trigonometric  and  other  math  functions 
unlike  so-called  double  precision  calcula¬ 
tions  in  some  other  BASICS." — Letter  to  the 
Editor,  John  W.  McGraw,  DDJ,  March  1980. 

"I  trust  that  anyone  with  even  the  slight¬ 
est  love  for  mathematics  (however  deep) 
will  want  to  see  how  mathematical  tools 
can  be  applied  to  the  problems  of  pro¬ 
gramming.” — "An  Introduction  to  Algo¬ 
rithm  Design, "  Jon  Louis  Bentley,  DDJ,  April 
1980. 


Ten  Years  Ago  in  DDJ 

"Computers  are  considered  to  be  useful 
tools  with  which  to  achieve  a  specific  end 
result  such  as  processing  a  payroll  or  calcu¬ 
lating  a  trajectory.  This  view  of  computers 
has  often  been  carried  over  into  education¬ 
al  applications  with  the  computer  cast  in 
the  role  of  teacher/tutor.  The  low-cost 
home/school  system  described  here 
(called  FRED — Flexible  Recreational  and 
Educational  Device)  is  intended  as  a  play¬ 
thing  which  encourages  experimentation 
and  stimulates  a  desire  to  learn. — "A  Prac¬ 
tical,  Low-Cost,  Home/School  Micropro¬ 
cessor  System, "  Joe  Weisbecker,  DDJ,  May 
1977. 

"One  difference  from  other  versions  of 
[computer  game]  CHASE — if  two  robots  col¬ 
lide  in  my  version,  they  do  not  annhialate 
[sic],  but  travel  as  a  pair — with  some 
strangly  unpredictable  consequences! 
(This  was  originally  a  ‘bug’  in  my  program, 
but  it  was  so  cute  I  decided  to  leave  it 
in!)” — "Video  Chase  for  8080/VDM,"  Joseph 
Jay  Sanger,  DDJ,  May  1977. 


Dr. Dobb's Journal of Computer Calisthenics & Orthodontia: Running
Light Without Overbyte




FORUM 


LETTERS 


BASIC09  and  OS-9 

Dear  DDJ, 

Brian  Capouch  was  mistaken  when 
he asserted in "The OS-9 Operating
System”  (January  1987)  that  the  lan¬ 
guage  BASIC09  and  the  operating  sys¬ 
tem  OS-9  "appeared  together  in  1981, 
a  few  months  after  the  6809  came 
into  production.”  For  example,  the 
May  1979  issue  of  Byte  contained  an 
ad  from  Southwest  Technical  Prod¬ 
ucts  offering  both  an  MP-09  Proces¬ 
sor  Card  and  a  68/09  Computer  w/ 
48K.  As  I  recall,  these  products  were 
being  shipped  in  the  second  half  of 
1979. 

This  confusion  is  especially  mis¬ 
leading  because  the  reason  for  the 
BASIC09 project (which came
to include OS-9) was to promote
the 6809 processor: BASIC09
was to have provided
an  example  of  efficient  cod¬ 
ing,  fast  execution,  and  also 
a  language-on-a-ROM  for  use 
by  OEMs.  I  was  the  supervis¬ 
ing  engineer  for  the  project, 
and  it  was  especially  unfor¬ 
tunate  for  me  that  it  had  still 
failed  to  terminate  by  early 
1981,  two  years  after  the 
6809  was  introduced.  The  8- 
bit  6809  was  competing  with 
the  16-bit  8088  and  often 
came  up  "a  day  late  and  a 
dollar  short”  by  paper  com¬ 
parison.  BASIC09  would  have 
helped,  but  by  1981  the  new- 
design  window  for  the  6809 
had  pretty  well  closed. 

BASIC09 was to be composed
of dynamically replaceable
memory modules; OS-9 was to be
multitasking, real-time, reentrant,
and dynamically reconfigurable. Of course,
these  are  great  buzzwords  now,  but 
consider  how  different  these  con¬ 
cepts  are  from  the  current  MS-DOS 
system  and  how  much  better  off 
we’d  be  if  MS-DOS  had  been  forced, 
by  competition,  to  include  these 
features. 

Terry  Ritter 
2609  Choctaw  Trail 
Austin,  TX  78745 

Hashing 

Dear  DDJ, 

In  response  to  Edwin  T.  Floyd's 
"Hashing  for  High-Performance 
Searching”  (February  1987),  I  present 
the  following  items: 

1.  A  hash  code  "qualifier”  can  be 
stored  (along  with  the  physical  ad¬ 
dress  of  the  data)  in  the  hash  bucket. 
The  qualifier  can  be  derived  by  a 
function  similar  to  the  hash  function. 
When  the  hash  bucket  is  searched 
for  the  desired  key,  the  qualifier  will 
most  likely  be  unique  for  each  differ¬ 
ent  key.  A  physical  read  of  the  actual 
data  item  is  then  required  to  ensure  a 
correct  match;  however,  a  hash  miss 
becomes  far  less  likely  using  this 
method. Because the data is kept separate
from the index, it has to be read
anyway  (especially  in  the  case  of  a 
duplicate  key),  so  this  method  in¬ 
volves  extra  overhead  only  in  the 
case  of  a  hash  miss  or  because  of  the 
presence  of  a  duplicate  key.  The  ma¬ 
jor  advantages  of  this  method  are 
that  the  index  space  required  is 
greatly  reduced  and  the  search  time 
for  lengthy  keys  is  minimized.  For 
32-byte  keys,  for  example,  a  linear  in¬ 
dex  (for  binary  search)  would  re¬ 
quire  at  least  34  bytes  per  entry  (32 
for  the  key,  2  for  the  data  address). 
Assuming  a  4-byte  hash  code  qualifi¬ 
er,  however,  an  index  entry  in  a 
hash  bucket  would  occupy  a  mere  6 
bytes,  with  the  search  taking  corre¬ 
spondingly  less  time. 

2.  Hashing  techniques  used  as  a  disk 
database  indexing  scheme  are  not 
much  different  from  the  RAM-resi¬ 
dent  symbol  table  implementation 
presented  by  Floyd.  Depending  upon 
the  size  of  the  hash  table,  all  or  part 
of  the  table  can  be  kept  in  memory, 
with  some  kind  of  buffering  tech¬ 
nique  being  an  integral  part. 
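
The qualifier scheme in item 1 can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, the table size, and the use of CRC-32 and Adler-32 as the two hash functions are choices made for the example, not details taken from the letter, and a Python list stands in for the data file.

```python
import zlib

NUM_BUCKETS = 64  # illustrative table size, not from the letter


def bucket_of(key: bytes) -> int:
    # Primary hash function: selects the hash bucket.
    return zlib.crc32(key) % NUM_BUCKETS


def qualifier_of(key: bytes) -> int:
    # Second function "similar to the hash function": a 4-byte qualifier.
    return zlib.adler32(key) & 0xFFFFFFFF


class QualifiedIndex:
    """Hypothetical index whose buckets hold (qualifier, address) pairs."""

    def __init__(self):
        self.records = []  # stands in for the data file: (key, value)
        self.buckets = [[] for _ in range(NUM_BUCKETS)]
        self.reads = 0     # number of simulated physical reads

    def insert(self, key: bytes, value) -> None:
        address = len(self.records)
        self.records.append((key, value))
        self.buckets[bucket_of(key)].append((qualifier_of(key), address))

    def lookup(self, key: bytes):
        wanted = qualifier_of(key)
        for stored, address in self.buckets[bucket_of(key)]:
            if stored != wanted:
                continue             # rejected without touching the data
            self.reads += 1          # qualifier matched: read the record
            stored_key, value = self.records[address]
            if stored_key == key:    # the full key is still verified
                return value
        return None
```

Each index entry carries only the qualifier and an address, which is where the 6-byte figure (4 + 2) in the letter comes from; the full key is compared only after a qualifier match.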

A  quick-access,  hash-indexed  data¬ 
base  using  the  technique  described 
above  is  available  from  us.  It  is  a  disk 
database  offering  hash- 
keyed  and  associative  (rela¬ 
tional)  access  simultaneous¬ 
ly,  which  hints  at  the 
extreme  versatility  of  hash 
indexing  techniques.  Al¬ 
though  it  does  not  make  use 
of  the  move-to-front  tech¬ 
nique,  there  is  no  reason 
why  this  optimization  could 
not  be  used.  It  seems  to  me 
that  MTF  optimization 
would  be  especially  useful 
when  duplicate  keys  are 
involved. 

Dave  Joy 
Joy Research & Development
9403  Wallingham  Dr. 
Spring,  TX  77379 

Dear  DDJ, 

I  was  pleased  to  see  hashing 
discussed  in  two  places  in 
the  February  issue.  It  is  my 
experience  in  developing 
practical commercial programs
that indirect hashing consistently
outperforms such often discussed
techniques as binary trees as a
means  for  organizing  data  that  is  ac¬ 
cessed  by  name  (that  is,  by  the  value 
of  a  string  associated  with  the  data). 

Although  not  explicitly  stated,  the 
techniques  described  by  Floyd  and 
Holub  are  types  of  indirect  hashing; 
indirect  hashing  consists  of  the  use  of 
a  fixed-size  array  of  pointers  (the 
hash  table)  pointing  to  chained  lists 
of  buckets,  each  bucket  containing  a 
data  item  and  its  name.  The  buckets 
are  allocated  dynamically  from  a 
heap.  The  name  of  each  bucket  in  a 
given  chain  hashes  to  the  index  of 
the  array  element  that  points  to  the 
beginning  of  the  chain.  (Floyd  incor¬ 
rectly  defines  buckets  as  the  ele¬ 
ments  of  the  hash  table  array.) 

Any  bucket  is  reached  by  getting  a 
pointer  to  the  first  bucket  in  a  chain 
from  the  hash  table  and  then  travers¬ 
ing  the  chain  from  the  first  bucket.  It 
is  thus  easy  to  keep  a  pointer  to  the 
previous  bucket  as  each  bucket  is 
probed.  Therefore,  it  is  not  necessary 
to  associate  a  back  pointer  with  each 
bucket,  as  Holub  does. 

Because  only  one  pointer  need  be 
associated  with  each  bucket,  the 
space  overhead  associated  with  indi¬ 
rect  hashing  can  be  held  to  H  +  N 
pointers,  where  H  is  the  hash  table 
size  and  N  is  the  number  of  data 
items  (that  is,  the  number  of  buckets). 
The average number of probes is 1
+ N/(2H). The space overhead for a
binary  tree  with  N  data  items  is  2N 
pointers,  and  the  average  number  of 
probes  is  at  least  log(N/2)  (base  2 
logarithm). 

As  a  specific  example,  with  1,000 
data  items  and  a  hash  table  size  of 
250,  the  indirect  hashing  technique's 
space  overhead  is  1,250  pointers  and 
the  average  number  of  probes  is  3. 
The  binary  tree’s  space  overhead  is 
2,000  pointers  and  the  best  average 
number  of  probes  is  9. 
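
The arithmetic behind this example can be checked with a few lines of code. The sketch below reads the probe estimate for indirect hashing as 1 + N/(2H), which is the reading that reproduces the figure of 3 quoted above; the function names are made up for the illustration.

```python
import math


def chained_hash_cost(n_items: int, table_size: int):
    """Return (pointer overhead, average probes) for indirect hashing."""
    pointers = table_size + n_items           # H + N
    probes = 1 + n_items / (2 * table_size)   # assumed reading: 1 + N/(2H)
    return pointers, probes


def binary_tree_cost(n_items: int):
    """Return (pointer overhead, best-case average probes) for a binary tree."""
    pointers = 2 * n_items                    # 2N
    probes = math.log2(n_items / 2)           # log2(N/2)
    return pointers, probes


print(chained_hash_cost(1000, 250))
print(binary_tree_cost(1000))
```

With 1,000 items and a 250-entry table the first call gives 1,250 pointers and 3 probes, and the second gives 2,000 pointers and a little under 9 probes.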

Floyd  states  that  hashing  is  effec¬ 
tive  when  ordering  (for  example,  al¬ 
phabetical  ordering)  is  not  impor¬ 
tant.  In  fact,  I  have  been  using  a 
hashing  algorithm  that  permits  al¬ 
phabetical  ordering:  the  hashing  al¬ 
gorithm  consists  of  taking  256  times 
the  ASCII  value  of  the  first  character 
of  the  name,  adding  the  ASCII  value 


of  the  second  character,  and  dividing 
the  result  by  16.  (This  works  with  a 
4,096-element  hash  table.)  Each  chain 
of  buckets  is  maintained  in  alphabeti¬ 
cal  order  by  inserting  each  new 
bucket  at  the  proper  place  in  the 
chain. 

To  read  out  all  data  items  in  alpha¬ 
betical  order,  then,  you  simply  loop 
over  the  hash  table  elements  and,  for 
each  nonnull  element,  read  out  the 
buckets  of  the  chain  it  points  to  as  the 
chain  is  traversed.  Our  products — 
Source  Print  and  Tree  Diagrammer — 
both  use  this  hashing  technique.  The 
simple  hash  code  gives  rise  to  a  mod¬ 
erate  amount  of  clustering,  but  I  do 
not  think  the  clustering  is  severe 
enough  to  seriously  degrade  probing 
speed.  In  fact,  Source  Print  can  index 
all  variables  within  a  source  code  file 
at  about  6,000  lines  per  minute,  and 
very  little  of  this  time  is  spent  dealing 
with  the  hashing  operations. 
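
A minimal sketch of the order-preserving scheme described in this letter is given below. It assumes names are non-empty 8-bit ASCII strings and treats a missing second character as zero; both assumptions, the class name, and the list-based bucket layout are illustrative and do not come from Source Print or Tree Diagrammer.

```python
TABLE_SIZE = 4096


def ordered_hash(name: str) -> int:
    # 256 * first character + second character, divided by 16.
    first = ord(name[0])
    second = ord(name[1]) if len(name) > 1 else 0  # assumption for 1-char names
    return (256 * first + second) // 16


class OrderedHashTable:
    """Hypothetical indirect hash table; each bucket is [name, item, next]."""

    def __init__(self):
        self.table = [None] * TABLE_SIZE

    def insert(self, name: str, item) -> None:
        slot = ordered_hash(name)
        prev, node = None, self.table[slot]
        # Walk the chain while remembering the previous bucket,
        # so no back pointer has to be stored in the bucket itself.
        while node is not None and node[0] < name:
            prev, node = node, node[2]
        bucket = [name, item, node]
        if prev is None:
            self.table[slot] = bucket
        else:
            prev[2] = bucket

    def items_in_order(self):
        # Loop over the table; each chain is already sorted.
        for head in self.table:
            node = head
            while node is not None:
                yield node[0], node[1]
                node = node[2]
```

Because the hash value never decreases as names increase in ASCII order, walking the table from slot 0 upward and reading each sorted chain yields every name in ASCII order (which differs from dictionary order when upper and lower case are mixed).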

Larry  R.  Miller 

Aldebaran  Labs. 

3339  Vincent  Rd. 

Pleasant  Hill,  CA  94523 

Baby  Ducks 

Dear  DDJ, 

The  wish  list  for  the  ultimate  editor 
[“Text  Editors,”  February  1987]  sur¬ 
prised  me  a  bit,  as  the  one  I  use  for 
word  processing  (simple  stuff,  such 
as  this  letter)  and  mostly  for  pro¬ 
gramming  has  virtually  everything 
on  the  list.  The  package  is  KEDIT  from 
Mansfield  Software  Group.  It  has  ev¬ 
erything  from  the  Must  Haves  list, 
and  then  some.  In  the  Not  Neces¬ 
sary  .  .  .,  it  also  covers  all.  In  fact,  in 
the  entire  list  of  wishes,  the  only 
thing  it  cannot  do  is  edit  files  larger 
than  memory. 

I  program  in  C,  and  I  have  macros 
to  do  things  such  as  comment  or  un¬ 
comment  a  line  (same  key)  and  my 
Can't  Live  Without:  given  an  if  or  for 
(while,  until  .  .  .)  statement,  I  press  a 
key  combo  (Ctrl-F2)  and  it  indents 
and  puts  an  open  bracket,  blank  line, 
and  close  bracket  followed  by  a  com¬ 
ment  (the  first  20  characters  of  the  if 
or  loop  statement),  then  positions  me 
at  the  first  indented  position  of  the 
blank  line.  I  go  from: 

for(i=0; i<100; i++)

to:

for(i=0; i<100; i++)

{

} /* for(i=0; i<100; i++)... */

in one keystroke.

KEDIT  has  many  other  features, 
and  for  $125  it  is  one  of  the  best  buys 
around.  No,  I  don't  sell  KEDIT  or 
know  the  firm,  but  I  love  to  share 
knowledge  about  excellent  software 
with  others. 

Flip  Nehrt 
1209  N.  Topeka 
Wichita, KS 67214

Dear  DDJ, 

I  read  with  great  interest  the  article 
on word processing [February 1987].
I  agree  with  Thomas  and  Turner 
about  the  baby  duck  syndrome.  I 
wish  to  offer  a  wish  list  in  reply: 

1.  Text  buffered  to  mass  storage.  Wait 
time  should  be  minimal.  Anticipa¬ 
tory  loading  and  dumping  could  be 
implemented. 

2.  Response  to  keyboarding  should 
be  adequate,  so  fast  typists  do  not  feel 
impeded  by  the  computer. 

3.  Reduced  command  set  that  works 
quickly. 

4.  Wordwrap  and  left  and  right 
justification. 

5.  Screen  should  display  the  printed 
page.  Special  fonts  such  as  italics  are 
unnecessary.  After  all,  who  wants  to 
change a daisy for two words? Cute
code  is  expensive. 

6.  Block  move,  copy,  delete,  fetch, 
and  store  to  disk. 

7.  Search  and  replace  globally  with 
case  insensitivity. 

8.  Multifile  and  multiwindow. 

What  I  really  want  is  an  editor  that 
can  be  tailored  to  the  task  at  hand. 

As  the  good  Doctor  has  always 
been  at  the  forefront  of  new  devel¬ 
opments — providing  us  with  Tiny 
BASIC  and  Small-C — perhaps  it  or  its 
readers  could  provide  us  with  an  edi¬ 
tor  that  is  machine  independent  and 
extensible. 

Robert  B.  McCormick 
11  East  Chestnut  St. 

Bordentown,  NJ  08505 

DDJ 




FORUM 


VIEWPOINT 


Things Computers
Can Never Do

by Philip J. Erdelsky

Philip J. Erdelsky, Data/Ware Development
Inc., 4204 Sorrento Valley
Blvd., San Diego, CA 92121. Philip is a
software manager.

Anyone who has witnessed the enormous
improvements in computers in
the last 40 years may get the impression
that computers will eventually
be able to solve every well-defined
problem. Progress in language understanding
and other forms of artificial
intelligence has been disappointing,
but human language is full of
ambiguities, so that's not a well-defined
problem. Chess, on the other
hand, is very well defined. Although
it was once considered the epitome
of intelligent activity, computers can
now play chess better than all but a
few human players.

Some problems, although well defined,
are too large to be solved in a
reasonable time even on our largest
computers. But surely, if a computer
could be freed from all limitations on
time and memory, couldn't it solve
any well-defined problem?

The surprising answer to this question,
which was known to mathematicians
even before the first real computers
were constructed, is no. There
are some things no computer can
ever do because it can be proved that
there are no algorithms to do them —
just as there is no way to square a circle
with a compass and straightedge.

These things are not mere mathematical
curiosities. They are things
that programmers would like to have
their computers do for them and
things that the suppliers of software
development tools would like to incorporate
into their debuggers. Computer
science curricula usually include
the subject of uncomputable
functions, but programmers who are
not computer science majors sometimes
ask for the impossible without
realizing it.

Alan  Turing  in  1935  asked  whether 
there  is  a  method  by  which  a  com¬ 
puter  program  can  determine 
whether  any  other  computer  pro¬ 
gram  will  halt.  This  is  the  famous 
“halting  problem.”  Turing  showed 
that  it  has  no  solution. 

A  debugger  with  this  ability  would 
certainly  be  useful.  Failure  to  halt 
normally  is  a  common  form  of  pro¬ 
gram  failure.  Moreover,  the  debug¬ 
ger  could  be  applied  successively  to 
parts  of  the  failed  program  to  isolate 
the  part  that  is  hanging  up. 

It  is  not  obvious  that  such  a  debug¬ 
ger  is  impossible.  Of  course,  the  de¬ 
bugger  can’t  just  single-step  the  pro¬ 
gram  to  see  if  it  halts.  If  the  program 
doesn't  halt,  the  debugger  could  run 
forever  without  determining  that 
this  is  the  case.  Or  it  might  give  up  just 
as  the  program  is  about  to  terminate, 
as  human  programmers  sometimes 
do.  At  some  point,  the  debugger 
would  have  to  be  able  to  say,  "Aha! 
This  loop  is  infinite!”  It  seems  as 
though  a  cleverly  written  debugger, 
having  all  the  tools  of  modern  high- 
level  languages  at  its  disposal,  might 
be  able  to  do  that. 

The  impossibility  proof  is  based  on 
the  following  argument.  If  you  have 
a  debugger  that  can  solve  the  halting 
problem,  given  unlimited  time  and 
memory,  then  you  can  use  the  same 
code  to  make  the  debugger  do  other 
things,  some  of  which  are  self-contra¬ 
dictory  and  hence  impossible. 

The  particular  computer  language 
is  not  important.  If  you  can  solve  the 
halting  problem  for  one  language, 
you  can  solve  it  for  another.  Just  use  a 
compiler  or  other  translation  pro¬ 
gram  before  solving  the  halting  prob¬ 
lem.  Notice  that  translating  an  assem¬ 
bly-language  program  to  a  higher- 
level  language  is  quite  easy,  although 
the  object  program  is  bound  to  be  in¬ 
efficient.  The  goal,  however,  is  to 
show  that  a  solution  to  the  halting 
problem is impossible, not merely
inefficient.

Turing  himself  proposed  a  mini¬ 
mal  machine  that  has  come  to  be 
called  the  Turing  Machine.  Its  memo¬ 
ry  was  supposed  to  be  infinitely  long 
but  only  one  bit  wide,  and  the  ma¬ 
chine  had  only  sequential  access  to  it, 
as  with  a  tape.  The  programming 
language  was  essentially  a  flowchart, 
with  only  a  few  basic  commands. 
Nevertheless,  Turing  showed  that  his 
machine  was  able  to  emulate  any 
other  machine,  given  enough  time 
and  a  suitable  program.  Such  a  con¬ 
struction  is  not  necessary  for  our 
purposes — you  can  imagine  that  the 
computer  is  programmed  in  some  fa¬ 
miliar  high-level  language. 

Now  consider  the  problem  of  de¬ 
termining  whether  a  program  can 
print  out  a  specified  string  S  (with  or 
without  other  output).  If  you  can 
solve  the  halting  problem,  you  can 
solve  this  problem.  Just  replace  every 
print  statement  in  the  program  with 
a  routine  that  does  not  send  the  out¬ 
put  to  the  printer  but  keeps  track  of 
the  output  and  halts  when  the  string 
S  appears.  Then,  to  keep  the  program 
from  halting  for  any  other  reason,  re¬ 
place  all  the  halt  statements  in  the 
program  with  endless  loops.  Then 
solve  the  halting  problem  for  the 
result. 
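
The transformation just described can be sketched in a few lines. The model used here is an assumption made for illustration: a "program" is a Python callable that yields each piece of text it would have printed, and the name `halts_iff_prints` is hypothetical.

```python
from typing import Callable, Iterable

# Hypothetical model: a program yields the strings it would print.
Program = Callable[[], Iterable[str]]


def halts_iff_prints(program: Program, target: str) -> Callable[[], None]:
    """Build a program that halts exactly when `program` would print `target`."""

    def transformed() -> None:
        printed = ""
        for chunk in program():   # each chunk replaces one print statement
            printed += chunk      # keep track of the output instead of printing
            if target in printed:
                return            # the string S has appeared: halt
        while True:               # the original halted without printing S,
            pass                  # so its halt becomes an endless loop

    return transformed


def greeter():
    yield "hel"
    yield "lo, world"


# Terminates, because "lo, w" appears in the accumulated output.
halts_iff_prints(greeter, "lo, w")()
```

A solver for the halting problem applied to `transformed` would therefore answer the question "does the original program ever print S?", which is the reduction the text relies on.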

Such  a  program  would  be  useful  in 
itself  because  many  run-time  errors 
produce  distinctive  messages,  and  it 
would  be  helpful  to  predict  in  ad¬ 
vance  that  such  errors  will  occur. 

Because  this  applies  to  any  string  S, 
you  can  also  determine  whether  a 
program  prints  out  a  copy  of  itself. 
This  is  not  as  curious  as  it  appears  at 
first  glance.  It  is  easy  to  write  a  1,000- 
character  program  that  prints  out  all 
combinations of 1,000 characters including
itself. In fact, 1,000 characters
is probably an overestimate of
the number of characters required
in most high-level languages.

Now  you  can  write  a  program  to  do 
the  following  things.  First,  generate, 
one  by  one,  all  possible  programs. 
The  easiest  way  to  do  this  is  to  gener¬ 
ate  all  strings  and  check  each  one  to 
see  whether  it  is  a  program.  Compil¬ 
ers  do  this  when  they  check  syntax. 
Then  check  each  program  to  see 
whether  it  prints  out  a  copy  of  itself. 
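
The "generate all strings and keep the ones that are programs" step is easy to make concrete. The sketch below uses Python's built-in `compile` as the syntax check and a deliberately tiny alphabet; the alphabet and the function names are assumptions chosen to keep the example small.

```python
import itertools

ALPHABET = "x=1+"  # hypothetical four-character alphabet


def is_program(text: str) -> bool:
    # A string counts as a program if it passes the syntax check.
    try:
        compile(text, "<candidate>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False


def all_programs(alphabet: str = ALPHABET):
    # Enumerate every string, shortest first, and keep the valid ones.
    for length in itertools.count(1):
        for chars in itertools.product(alphabet, repeat=length):
            text = "".join(chars)
            if is_program(text):
                yield text


# The generator never ends, so take only a finite prefix.
first_few = list(itertools.islice(all_programs(), 8))
```

Every syntactically valid program over the alphabet eventually appears in this enumeration, which is all the argument needs; efficiency is beside the point.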

(continued  on  page  140) 




ARTICLES 


by  David  Levitt 


This  year  heralds  an 
exciting  time  for  mu¬ 
sic  software.  Devel¬ 
opments  during  the  past  few 
years  have  pushed  the  first 
grand  visions  of  music  soft¬ 
ware  enthusiasts  rapidly  to¬ 
ward  reality.  Moreover,  the 
wide  adoption  of  MIDI  and 
decreasing  hardware  costs  mean  that  discoveries  move 
quickly  from  laboratories  into  recording  studios  and 
homes. In fact, today's music algorithms originate both in
research  labs  and  in  the  dens  of  readers  of  journals  such 
as  this  one. 

This  article  includes  a  brief  history  of  computers  in  mu¬ 
sic  and  then  focuses  on  recent  developments  in  several 
areas:  MIDI,  sampling,  transient-oriented  synthesis  meth¬ 
ods  such  as  the  Karplus-Strong  algorithm,  and  programs 
that  compose  and  collaborate  on  original  music. 

Ada Augusta's Vision

Countess  Ada  of  Lovelace  speculated  thus  on  musical  ap¬ 
plications  of  the  first  computer — Charles  Babbage’s  un¬ 
finished  Analytical  Engine — in  the  1840s:  "Supposing,  for 
instance,  that  the  fundamental  relations  of  pitched 
sounds in the science of harmony and of musical composition
were susceptible of such expression and adaptations,
the Engine might compose and collaborate scientific
pieces of music of any degree of complexity or extent."

It  took  more  than  a  century  before  Babbage's  vision 
was  realized  in  the  first  electronic  computers;  today,  af¬ 
ter  more  than  30  years  of  experiments  with  computer¬ 
generated  sound,  Ada’s  vision  is  finally  close  at  hand.  In 
the interim, "computer music" has taken on the role
played by atonal and "experimental" music in the first
half of this century.

David Levitt, 117 Harvard #3, Cambridge, MA 02139. David
is a research scientist in the Entertainment Group at the
Massachusetts Institute of Technology's Media Laboratory.

Timbre 

In  the  1950s,  Max  Mathews  of 
Bell  Laboratories  was  among 
the  first  to  explore  the  com¬ 
puter  as  a  generator  of  as  yet 
unheard  sounds.  At  night, 
Bell  Labs’  computers  became 
generalized  timbre  generators,  and  Mathews’  FORTRAN- 
based  MUSIC  V  language  became  the  first  digital  signal  gen¬ 
eration  language  intended  for  composers. 

Mathews  experimented  with  composing  algorithms, 
too.  In  one  funny  piece  the  computer  "interpolated”  be¬ 
tween  two  traditional  melodies,  gradually  replacing  the 
notes  of  one  melody  with  notes  from  the  other.  Still,  MUSIC 
V  was  primarily  a  timbre-generation  language;  it  wasn't 
useful  to  people  who  wanted  to  deal  primarily  with  pitch 
and  meter,  such  as  professional  composers  and  musicians 
who  use  standard  music  notation.  For  decades,  MUSIC  V 
and  its  relatives  (including  MUSIC  360  and  MUSIC  11  from 
MIT’s  Barry  Vercoe)  defined  a  field  by  providing  semipor¬ 
table  software  laboratories  for  exploring  timbre. 

Timbre  experiments  fall  into  two  categories:  efforts  to 
imitate  (or  somehow,  improve  upon)  familiar  sounds — 
for  example,  the  realistic  synthetic  piano  or  violin — and 
efforts  to  create  new  kinds  of  sounds,  or  transitions  be¬ 
tween  sounds,  that  are  unfamiliar  but  intrinsically  inter¬ 
esting.  It  was  harder  than  people  thought  to  approximate 
some  natural  sounds  and  not  especially  easy  to  build  a 
synthetic  orchestra  that  sounded  as  good  as  a  real  one.  On 
the  other  hand,  the  possibilities  for  new  sounds,  and 
pieces  based  on  them,  beckoned.  Computer  music  quick¬ 
ly  became  the  new  digital  branch  of  electronic,  experi¬ 
mental  music. 

These programs
make it easier
for musical novices
to make music.

This trend continued into the 60s and 70s when Stanford's
Center for Computer Research in Music and Acoustics
(CCRMA) laboratory gained prominence. John Chowning,
cofounder of CCRMA (pronounced "karma"), showed
that frequency-modulating a simple audio signal with a
second  signal  in  the  audio  band  resulted  in  rich  timbres — 
the  idea  that  became  Yamaha’s  DX/TX  FM  synthesizers  al¬ 
most  20  years  later.  More  recently,  CCRMA  scientist  Marc 
LeBrun  generalized  the  FM  algorithm,  creating  a  more 
powerful  method  known  as  nonlinear  waveshaping.  In 
the  70s,  former  CCRMA  students  such  as  Andy  Moorer  and 
F.  Richard  Moore  founded  the  IRCAM  music  research  lab 
in  Paris,  Lucasfilm's  audio  lab,  and  University  of  Califor¬ 
nia  at  San  Diego's  computer  music  program.  These  groups 
developed  special  signal-processing  hardware  to  synthe¬ 
size  timbres  quickly,  often  in  real  time. 

Despite  such  advances,  many  people  still  liked  sounds 
made  by  banging,  plucking,  and  blowing  into  physical 
objects  as  much  as  or  more  than  synthetic  sounds.  This 
wasn't  simply  a  matter  of  familiarity  or  cost;  something 
was  missing  from  most  of  the  synthetic  sounds  and  still  is. 


Gesture and Graphics

Part of the problem is the absence of gesture. Today's
timbre software is best at controlling the loudness, pitch,
and timing of a sound independently of timbral parameters.
In natural instruments, the timbral details are often
nonlinearly coupled with loudness, pitch, and even timing.
How would you describe the different ways and
speeds in which a violinist can move a bow? Or the different
ways a flautist blows a flute? Most software doesn't
try. Not only do most input devices have too few degrees
of freedom but also the software doesn't provide easy
access to the right parameters.

Other natural instruments have more high frequencies
when you strike them harder. So even a percussive instrument
such as the piano has a nonlinear, velocity-coupled
component that will be missed in a synthetic instrument
with a linear loudness model. The gestural
subtleties in natural instruments are poorly understood,
but evidently they are important to many listeners, and
they make it hard for synthetic timbres to compete.

Gesture has always been a central concern for William
Buxton and his students at The University of Toronto. In
the 70s, Buxton's lab gained prominence by focusing on
the quality of interaction with the computer while making
music with it. Buxton's work made extensive use of
pointing devices, graphics, and other innovations.

In the same period, Alan Kay's group at Xerox PARC,
which also held user interaction sacred, made a music
system with FM, graphical timbre design, and simple
score editing. This was followed by Mockingbird, an interactive
music editing program by John Maxwell III and
Severo Ornstein, also from Xerox PARC. Mockingbird allowed
users to edit a rough transcription of a keyboard
performance with the mouse, using traditional music notation.
Its capabilities still exceed those of today's commercial
music editing software. Meanwhile, Don Byrd at
the University of Indiana produced SMUT, the most versatile
program for printing standard musical notation.


Transients 

Lack  of  gesture  is  only  part  of  the  problem  with  today’s 
synthetic  timbres.  Most  timbre  algorithms  provide  control 
over  the  steady-state  frequency  spectrum  of  the  sound,  but 
that  is  not  the  best  way  to  control  its  subjective  effects. 
Listeners  are  especially  sensitive  to  transients — short  (50 
milliseconds  or  less)  periods  of  sudden  change — especially 
in  the  initial  attack.  We  have  no  trouble  recognizing  a  pi¬ 
ano  on  cheap  speakers,  though  most  of  the  low  and  high 
frequencies  are  missing,  because  the  changes  in  the  parts 
we  do  hear  are  correct.  In  fact,  experiments  show  that,  if  a 
steady  violin  tone  is  preceded  by  the  attack  from  a  trum¬ 
pet,  most  people  hear  a  trumpet.  Transients  dominate.  But 
traditional  timbre  creation  methods  such  as  additive  syn¬ 
thesis  (adding  up  harmonics  with  different  amplitudes) 
and  FM  don't  provide  direct  control  over  them. 

In  1983,  Stanford  student  Alex  Strong  invented  a  simple 
method  for  controlling  attack  transients.  He  didn't  use  the 
fancy  signal  processors  from  up  the  hill  at  the  CCRMA  lab; 
instead  he  hunted  for  a  way  to  make  more  interesting 
sounds on his 8080 homebrew computer, the size of a
shoebox. He toggled in an 8080 program (the machine had
no  keyboard),  and  in  real  time  it  generated  what  sounded 
just  like  a  plucked  guitar  string. 

Strong teamed up with Kevin Karplus, another Stanford
student, to create the Karplus-Strong algorithm and a
new  approach  to  synthetic  timbre  design.  The  algorithm 
is  simple:  one  cycle  of  the  sound  is  stored  in  a  memory 
buffer  and  is  played  repeatedly  through  a  digital/analog 
converter.  The  pitch  is  determined  by  the  size  of  the  buff¬ 
er  and  the  rate  at  which  samples  are  played: 

pitch = samprate/bufsize 

This  is  the  way  wave  tables  work  in  many  simple  synthe¬ 
sis  programs.  But  with  Karplus  and  Strong’s  scheme,  each 
time  the  waveform  is  played,  a  filtered  version  is  comput¬ 
ed  and  stored  back  where  the  previous  waveform  was. 
Thus  the  algorithm  defines  a  feedback  loop  in  which  a 
filter  is  used  over  and  over  to  transform  the  waveform 
data. 

If  the  buffer  waveform  is  initially  white  noise,  and  the 
filter  is  a  simple,  first-order  low-pass  filter,  we  have 
Strong’s  basic  guitar  algorithm.  Intuitively,  we  see  that 
each  time  through  the  loop,  more  and  more  of  the  high 
frequencies  are  filtered  out,  so  the  sound  changes  quickly 
from  a  sharp  scratch  to  a  pure  tone  at  the  frequency 
samprate/bufsize.  This  is  the  plucklike  guitar  attack. 
Strong  implemented  the  low-pass  filter  by  averaging  ad¬ 
jacent  samples — simply  adding  and  right-shifting  pairs  of 
samples  on  the  8080 — and  was  able  to  produce  two  guitar 
voices  in  real  time  with  no  special  hardware.  With  slight 
changes  in  parameters,  the  algorithm  also  produced  con¬ 
vincing  banjo  sounds. 
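
The basic plucked-string loop described above can be sketched as follows. This is a floating-point illustration rather than Strong's 8080 code: the function name, the seeded random source, and the in-place update order are assumptions, and the averaging step stands in for the add-and-shift of the original.

```python
import random


def karplus_strong(samprate: int, pitch_hz: float, n_samples: int, seed: int = 0):
    """Return n_samples of a plucked-string tone as floats in [-1, 1]."""
    rng = random.Random(seed)
    # pitch = samprate / bufsize, so the buffer length sets the pitch.
    bufsize = max(2, round(samprate / pitch_hz))
    # Start with one cycle of white noise.
    buf = [rng.uniform(-1.0, 1.0) for _ in range(bufsize)]
    out = []
    for i in range(n_samples):
        j = i % bufsize
        out.append(buf[j])                      # play the current sample
        neighbor = buf[(j + 1) % bufsize]
        buf[j] = 0.5 * (buf[j] + neighbor)      # low-pass: average adjacent samples
    return out


tone = karplus_strong(samprate=44100, pitch_hz=441.0, n_samples=44100)
```

Each pass through the buffer removes a little more high-frequency energy, so the output moves from a noisy scratch toward a nearly pure tone near `samprate / bufsize`. Writing `tone` to a sound file or a digital/analog converter is left out of the sketch.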

Algorithms  such  as  Karplus-Strong  have  not  yet  been 
fully explored. Still, the essential idea is important: designers
can gain control over the attack and other transients
by creating a filter that describes relative rates of
change  in  the  sound's  frequency  components.  A  low-pass 
filter  in  the  feedback  loop  means  the  higher  frequencies 


fade,  as  in  a  "pluck”  attack;  a  high-pass  filter  means  the 
higher  frequencies  grow,  as  in  a  horn  or  other  brass  in¬ 
strument  attack.  (I  am  not  aware  of  attempts  to  produce 
brass  sounds  with  Karplus-Strong;  this  is  a  research  topic.) 

So,  timbre  designers  are  still  learning  new  ways  to  cre¬ 
ate  synthetic  sounds  that  listeners  find  realistic  or  appeal¬ 
ing.  Some  day  we  may  yet  have  synthetic  sounds  that 
unquestionably  improve  on  their  physical  predecessors. 

Sampling 

Until  then,  we're  being  rescued  by  falling  memory  costs. 
If  we  can’t  easily  improve  on  natural  sounds,  we  can  re¬ 
cord  them  into  digital  samples,  then  resample  (scale  the 
frequency)  and  loop,  playing  them  back  with  a  different 
pitch,  loudness,  and  duration.  The  digital  recording 
means  the  whole  sound — including  transients — is  faith¬ 
fully  captured.  We  still  have  to  account  for  nonlinear  scal¬ 
ing,  usually  by  recording  several  samples  over  a  range  of 
pitch  and  loudness  values.  To  make  an  instrument  based 
on  a  human  voice,  for  example,  we  sample  several  voices 
in  different  ranges;  then  when  we  play  a  high  note,  it  will 
sound  like  a  soprano,  not  like  Alvin  the  Chipmunk. 
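
Resampling a recorded sound to change its pitch can be sketched as below. Linear interpolation is an assumption made for the example (the article does not say which interpolation method samplers use), and looping and loudness scaling are left out.

```python
def resample(samples, ratio: float):
    """Play `samples` back `ratio` times faster, using linear interpolation.

    A ratio of 2.0 raises the pitch by an octave and halves the duration;
    a ratio of 0.5 lowers it by an octave and doubles the duration.
    """
    out = []
    pos = 0.0
    last = len(samples) - 1
    while pos < last:
        i = int(pos)
        frac = pos - i
        # Blend the two recorded samples on either side of the read position.
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
        pos += ratio
    return out


octave_up = resample([0.0, 1.0, 2.0, 3.0, 4.0], 2.0)
octave_down = resample([0.0, 2.0, 4.0], 0.5)
```

Because the whole recording is scaled in time, its transients are compressed or stretched along with the pitch, which is why several recordings across the range are needed for a natural result.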

Sampling  memory  is  at  the  heart  of  the  Kurzweil,  Emu¬ 
lator  II,  Prophet  2000,  and  many  other  recent  synthesiz¬ 
ers;  the  samplers  are  sure  to  decrease  in  cost  and  increase 
in  popularity.  Today  8-bit  samplers  are  available  for  less 
than  $2,000.  Within  a  few  years  the  16-bit,  44.1-kHz  per 
channel  format  used  in  audio  compact  discs  is  likely  to 
become  the  standard  for  computer  sound  sampling,  too. 

MIDI 

Of  all  these  innovations,  the  introduction  of  the  MIDI  (Mu¬ 
sical  Instrument  Digital  Interface)  standard  in  1984  is  hav¬ 
ing  the  greatest  effect  on  modern  music  software.  MIDI 
has  several  limitations,  covered  in  more  detail  in  Mark 
Gavin's  MIDI  article  in  this  issue,  but  the  very  existence  of 
an  interface  standard  has  helped  programmers  focus  on 
issues  that  are  independent  of  the  synthesis  technology — 
that  is,  on  music  composition  and  user  interaction  rather 
than  timbre  design. 

On  the  plus  side,  MIDI  has  the  necessary  rudiments  for 
gestural  control:  velocity  and  aftertouch  for  keyboard  in¬ 
struments  and  other  linear  control  options  (for  example, 
for  pedal  and  breath  control).  MIDI  even  includes  space 
for  instrument  makers  to  invent  their  own  new  codes. 
MIDI’s  demerits  include  no  standard  codes  for  polyphonic 
pitch  bending  (so  most  MIDI  synthesizers  have  a  fixed 
chromatic  tuning)  and  a  31.25-kilobit/sec  data  rate  that  is 
simply  too  slow  for  some  dense  polyphonic  pieces. 
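
A rough calculation shows why the data rate matters. The framing figures used here (ten bits on the wire per byte, three bytes per note-on message) come from the MIDI 1.0 specification rather than from this article, and the sketch ignores running status and all other message types.

```python
MIDI_BAUD = 31_250     # bits per second on the wire
BITS_PER_BYTE = 10     # start bit + 8 data bits + stop bit
NOTE_ON_BYTES = 3      # status, key number, velocity


def note_on_seconds(notes: int) -> float:
    """Time needed to transmit `notes` note-on messages back to back."""
    return notes * NOTE_ON_BYTES * BITS_PER_BYTE / MIDI_BAUD


one_note_ms = note_on_seconds(1) * 1000.0     # just under 1 ms
ten_note_ms = note_on_seconds(10) * 1000.0    # close to 10 ms
```

Under these assumptions a single note-on takes about a millisecond, so the last note of a ten-note chord leaves the cable roughly ten milliseconds after the first, and several instruments sharing one cable make the delay worse.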

Despite  its  flaws,  MIDI  is  making  event-level  recording 
and  computer  control  the  rule  rather  than  the  exception 
in  new  instruments.  Companies  such  as  Yamaha  have 
even  begun  to  include  MIDI  outputs  in  their  acoustic  pi¬ 
anos. 

This  means  a  welcome  end  to  computer  music,  which 
we  needn’t  distinguish  from  "real”  music  for  much  long¬ 
er.  Presumably  other  media  suffered  an  early  period  in 
which  the  technology  was  so  peculiar  it  dominated  the 
perceptions  of  both  artists  and  audiences.  Still,  we  don’t 
consider  the  hardcover  novel  a  separate  art  form,  and  it’s 
hard  to  imagine  a  book  so  dull  that  our  first  comments 
would be about the paper it's printed on. Now that virtually
every music studio has a MIDI synthesizer and will
soon  have  a  computer,  computer  music  will  fade  into  the 
past  and  innovative  composers  will  find  new  ways  to 
distinguish  themselves.  Thus  MIDI  is  nudging  us  nerds 
away  from  timbre  design  and  toward  composition  soft¬ 
ware  — the  rhythm,  melody,  and  harmony  problems 
that  have  remained  the  backbone  of  music  theory,  the 
music  business,  and  musical  communication. 

Algorithmic  Composition 

The  rest  of  this  article  touches  on  the  current  state  of  this 
exciting  musical  field:  representing  the  knowledge  of  the 
composer  and  improvisor  in  software.  This  is  what  the 
countess dreamed of when she wrote that "the Engine
might compose elaborate and scientific pieces of music . . ."
Algorithmic composers write programs that determine
what notes to play and when. This has little in
common  with  timbre  design;  algorithmic  composers 
might  even  dislike  synthetic  timbres,  preferring  to  print 
the  scores  their  programs  generate  so  they  can  be  read  and 
performed  on  natural  instruments  by  human  musicians. 
Thus,  algorithmic  composers  are  a  small,  almost  separate 
subculture  of  music  programmers — often  working  inde¬ 
pendently  in  the  same  laboratories  as  their  timbral  coun¬ 
terparts. 

I  have  been  writing  programs  that  improvise  jazz  and 
arrange  a  melody  in  a  given  musical  style  for  about  ten 
years.  My  graduate  work  in  the  MIT  Artificial  Intelligence 
Lab  entailed  writing  LISP  programs  that  represent  chords, 
melodies,  and  relationships  between  them — much  as  I  do 
when  I  improvise  jazz,  invent  a  new  exercise,  or  solve  a 
musical  problem  at  the  piano.  I  have  developed  style  tem¬ 
plates  that  use  a  given  melody  and  chord  progression  and 
create  simplified  parodies  of  bass  players,  bebop,  rag¬ 
time,  and  New  Orleans  jazz  ensemble  improvisation. 

Others  in  the  field  include  Fry,  who  also  wrote  jazz 
improvisation programs at MIT. His most advanced programs
could write an additional chorus for John Coltrane's
famous Giant Steps solo or Iron Butterfly's "In-A-Gadda-Da-Vida."
David Wessel, now in Paris at IRCAM, writes programs that
produce convincing, often overimaginative, blues solos.
Peter Langston of Bellcore has
produced  three-part  rock  harmonies  and  pieces  based  on 
organic  “growth”  algorithms. 

Laurie Spiegel was a composer/scientist in Max Mathews'
area at Bell Labs and today makes her living as a
composer  in  New  York  City.  Her  programs  reflect  her  in¬ 
terest  in  folk  melodies  and  traditional  harmony.  Spiegel's 
programs  are  sufficiently  subtle  and  musical  that  people 
assume  her  pieces  weren't  written  with  a  computer. 

Bill Schottstaedt developed the Pla composing language
at CCRMA, where he and others use it for algorithmic composition
of relatively traditional pieces. For instance,
when CCRMA student David Jaffe extended the Karplus-Strong
algorithm to simulate a particularly realistic banjo
and  guitar,  he  used  Pla  algorithms  to  build  a  parody  of 
Kentucky  bluegrass  arpeggios  in  an  arrangement  he 
called  "Silicon  Valley  Breakdown.” 

Roger  Dannenberg  of  Carnegie  Mellon  University  has 
been  pioneering  a  different  area:  software  that  accompa¬ 
nies  a  live  performer,  following  the  tempo  and  playing 
all the other parts. Barry Vercoe and Miller Puckette of
MIT have a similar program, which also learns to anticipate
the performer on subsequent rehearsals.

In  short,  we  are  seeing  a  renaissance  of  knowledge- 
based  music  composition  and  collaboration  software.  Al¬ 
though  most  of  these  projects  are  based  in  university  re¬ 
search  labs,  the  software  can  also  run  on  personal 
computers.  For  instance,  we  have  several  such  programs 
running  on  the  Macintosh  at  the  MIT  Media  Lab.  More¬ 
over,  several  algorithmic  music  software  titles  have  ap¬ 
peared  on  the  market  in  1986  and  1987. 

In  Laurie  Spiegel's  Music  Mouse  for  the  Macintosh,  a 
performer  controls  the  overall  motion  of  up  to  four  voices 
using  the  mouse,  while  the  program  fits  the  performer’s 
gestures  into  a  selected  scale  (for  example,  Major,  Penta¬ 
tonic,  or  Middle  Eastern).  Notes  are  also  constrained  to 
change  only  on  metrical  time  boundaries,  so  every  ges¬ 
ture  is  automatically  both  tonal  and  rhythmically  "musi¬ 
cal.”  Typewriter  keyboard  switches  provide  control  over 
a  range  of  other  musical  parameters. 
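
The two constraints described above, pitch fitted to a scale and changes held to metrical boundaries, can be sketched in a few lines of Python. This is only an illustration and not Music Mouse's code: the scale tables, the function names, and the choice to resolve ties downward are all assumptions.

```python
# Hypothetical sketch; nothing here is taken from Music Mouse itself.
SCALES = {
    "major": [0, 2, 4, 5, 7, 9, 11],
    "pentatonic": [0, 2, 4, 7, 9],
}

def snap_to_scale(note, scale, tonic=60):
    """Return the scale note nearest to `note` (ties resolve downward)."""
    steps = SCALES[scale]
    candidates = [tonic + 12 * octave + step
                  for octave in range(-6, 7) for step in steps]
    in_range = [c for c in candidates if 0 <= c <= 127]
    return min(in_range, key=lambda c: (abs(c - note), c))

def snap_to_beat(time_ticks, grid_ticks):
    """Delay a note change to the next metrical boundary (ceiling division)."""
    return -(-time_ticks // grid_ticks) * grid_ticks
```

With the major scale rooted on note 60, a gesture that lands on note 61 is pulled to 60, and a change requested at tick 130 on a 120-tick grid waits until tick 240.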

In  Bob  Campbell's  Instant  Music  for  the  Amiga,  the  per¬ 
former  can  control  pitch  with  the  mouse  while  seeing 
harmonic  and  rhythmic  information  on  the  screen.  Im¬ 
portant  harmonic  relationships  are  indicated  cleverly  by 
color.  In  one  mode,  you  collaborate  with  the  computer  on 
a  lead  line:  you  control  pitch  while  the  program  controls 
rhythm.  When  your  ideas  conflict,  you  can  hear  and  al¬ 
most  feel  the  tension — again  without  losing  "musicality.” 

David  Zicarelli's  Jam  Factory,  also  for  the  Mac,  segments 
your  MIDI  performance  into  phrases  and  plays  them 
back — either  with  new  accents  or  stochastically  scram¬ 
bled  (using  a  Markov  algorithm)  into  a  sort  of  musical  wall¬ 
paper  that’s  like  what  you  played.  For  each  of  up  to  four 
performances,  you  control  about  a  dozen  playback  param¬ 
eters  with  the  mouse  and  can  play  along  with  them. 
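
To give a feel for the Markov idea, here is a rough Python sketch of a first-order chain over note numbers. It is not Jam Factory's algorithm: the order of the chain, the restart rule at a dead end, and every name below are assumptions made for illustration.

```python
import random
from collections import defaultdict

def build_transitions(notes):
    """Record, for each note, every note that followed it in the performance."""
    table = defaultdict(list)
    for current, following in zip(notes, notes[1:]):
        table[current].append(following)
    return table

def scramble(notes, length, rng=None):
    """Generate `length` notes by walking the transition table at random."""
    rng = rng or random.Random()
    table = build_transitions(notes)
    current = notes[0]
    out = [current]
    while len(out) < length:
        choices = table.get(current)
        if choices:
            current = rng.choice(choices)
        else:
            current = notes[0]   # dead end: restart from the opening note
        out.append(current)
    return out
```

Because each step reuses a transition that actually occurred in the input, the output keeps the local flavor of the performance while wandering away from its overall shape.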

Zicarelli  also  contributed  to  Joel  Chadabe’s  M,  an  inter¬ 
active  composing  and  performing  system.  Chadabe  has 
been  making  composing  algorithms  for  years;  I  haven't 
had  a  chance  to  try  his  program  yet. 

Most  of  these  programs  make  it  much  easier  than  ever 
for  musical  novices  to  make  a  piece  of  music  that  sounds 
good.  The  knowledge  in  the  program  provides  a  substi¬ 
tute  for  musical  virtuosity.  As  computer  scientist  Alan 
Kay  likes  to  ask:  Is  it  an  intelligence  amplifier  or  just  a 
prosthetic?  To  the  author  of  the  program,  it's  an  amplifi¬ 
er;  she  or  he  already  knows  the  musical  concepts  and  is 
controlling  them  with  greater  leverage.  For  some  users,  it 
might  be  a  crutch;  but  programs  that  provide  graphic  or 
other  feedback  can  show  novices  what  the  computer  is 
doing  and  what  knowledge  it  is  using.  Then  the  programs 
can  be  more  like  "training  wheels,”  temporary  stabilizers 
that  can  eventually  come  off.  Ultimately,  users  them¬ 
selves  will  decide. 

Notes

1. Robert Taylor, ed., Scientific Memoirs, Vol. III: Ada Augusta,
Countess of Lovelace (Johnson, 1943).



Dr.  Dobb's  Journal,  May  1987 




ARTICLES 


Designing a Music Recorder

by  Mark  Garvin 


Imagine  designing  products  for  a 
changing  microcomputer  mar¬ 
ket  without  hardware  and  soft¬ 
ware  standards.  Standard  architec¬ 
tures  such  as  IBM  PCs  and  Apples  have 
led  to  a  proliferation  of  new  micro¬ 
computer  products,  and  the  develop¬ 
ment  of  MIDI  (Musical  Instrument  Dig¬ 
ital  Interface)  has  resulted  in  a  similar 
revolution  in  the  music  industry.  By 
providing  common  ground  for  com¬ 
munication,  MIDI  allows  software  en¬ 
gineers  and  musicians  to  access  a 
wide  range  of  synthesizers,  comput¬ 
ers,  and  musical  controllers. 

MIDI  has  become  virtually  uncon¬ 
tested  as  a  means  to  link  synthesizers 
and  computer  equipment  and  is  now 
finding  acceptance  in  many  related 
industries  such  as  lighting  control, 
film  editing,  and  automated  audio 
mixing.  By  supporting  real-time  ac¬ 
cess  to  so  many  devices,  MIDI  has 
opened  new  dimensions  for  record¬ 
ing,  composition,  and  live  perform¬ 
ance.  Now  musicians  can  generate 
orchestral  scores  with  banks  of  rack¬ 
mounted  synthesizers,  coordinate 
sound  and  visual  effects,  and  even 
transmit  stored  musical  data  over 
phone  lines. 

In  this  article  I  outline  several  ap¬ 
plications  of  MIDI  and  I  suggest  sever¬ 
al  ways  to  get  started  writing  your 
own  MIDI  software.  I  have  listed  some 
of  the  current  products  and  manu¬ 
facturers,  but  these  listings  are  by  no 
means  complete.  They  are  given  only 
to provide a starting point for obtaining additional information.

Mark Garvin, Xymetric Productions,
211 W. Broadway, New York, NY
10013. Mark has worked in both music
and electronics for 15 years. He recently
designed an eight-port MIDI controller
that runs on the IBM PC/AT.

Overview 

The  MIDI  specification  has  remained 
reasonably  intact  since  it  was  first 
proposed  by  synthesizer  manufac¬ 
turer  Sequential  Circuits  in  1982.  It  is 
sufficiently  specialized  to  handle 
most  direct  musical  communication, 
yet  its  generality  has  allowed  it  to 
adapt  to  applications  that  were  not 
foreseen  when  it  was  originally 
drafted. 

MIDI  entails  both  a  hardware  and  a 
software  specification:  the  hardware 
consists  of  a  relatively  fast  optically 
isolated  serial  loop  with  separate  ca¬ 
bles  for  send  and  receive;  the  soft¬ 
ware  provides  detailed  methods  for 
transmitting  note-control  data  and 
looser  specifications  for  handling  in¬ 
teraction  between  products  from  dif¬ 
ferent  manufacturers.  MIDI  works  in 
much  the  same  way  as  an  RS-232  mo¬ 
dem  protocol,  but  it  is  optimized  for 
musical  data.  Modern  MIDI  record/ 
playback  systems  could  be  regarded 
as  a  type  of  multiprocessor  network 
because  an  intelligent  master  (key¬ 
board  or  computer)  controls  a  series 
of  devices,  each  with  its  own  on¬ 
board  intelligence. 

MIDI  commands  include  NOTE-ON 
and  NOTE-OFF,  response  MODE  for  fil¬ 
tering  received  signals,  REAL-TIME 
messages  for  coordinating  events, 
and SYSTEM COMMON and EXCLUSIVE
commands for setting up songs or addressing
a particular brand of synthesizer.
This simple instruction set
allows  enough  flexibility  to  accom¬ 
modate  most  synthesizer  architec¬ 
tures  while  providing  much  needed 
universal  music  commands.  As  more 
manufacturers  have  realized  the  ad¬ 
vantages  of  communicating  with  a 
wide  array  of  musical  equipment, 
MIDI's  popularity  has  mushroomed. 
Few  synthesis  instruments  are  sold 
today  without  MIDI  interfaces. 

All  MIDI  documentation  is  now 
handled  by  the  International  MIDI  As¬ 
sociation  (IMA).  (See  the  box  on  page 
48.) 

Products 

Fortunately,  the  high  level  of  compe¬ 
tition  among  musical  instrument 
manufacturers  has  been  offset  by 
specialization:  small  companies  can 
offer  transmit-only  devices,  such  as 
high-quality  keyboards  with  no 
sound  output,  or  receive-only  de¬ 
vices,  such  as  sound  generators  that 
respond  only  to  MIDI  input.  Undoubt¬ 
edly,  the  broadest  new  field  is  that  of 
software-based  controllers.  These 
usually  connect  between  a  keyboard 
(or  other  source)  and  a  sound  genera¬ 
tor,  where  they  monitor  and  control 
MIDI  communication.  Some  of  the 
newer  MIDI-based  products  include 
guitar,  voice,  and  even  xylophone-to- 
MIDI  converters  (Roland,  Fairlight), 
software-hardware  retrofits  for 
playing  digitized  notes  from  personal 
computers  (Hybrid  Arts),  MIDI-con¬ 
trolled  reverb  and  echo  units  (Lexi¬ 
con,  KORG),  and  MIDI-controlled  audio 
mixing  consoles  (AKAI). 

Synthesizers 

Early synthesizers, used for scoring
so many old science-fiction movies,
were  assembled  from  several  mod¬ 
ules  that  were  interconnected  man¬ 
ually  by  patchcords.  These  and  other 
voltage-controlled  instruments  have 
now  attained  a  certain  vintage  status. 

New  instruments  use  computer¬ 
ized  signal  routing,  and  modern 
sound-generating  techniques  range 
from  additive  synthesis  (Kawai),  FM 
synthesis  (Yamaha),  and  phase-distor¬ 
tion  synthesis  (Casio)  to  actual  digital 
recording,  or  sampling,  of  natural 
sounds  (Sequential  Circuits,  E-Mu, 
Kurzweil). 

Additive  synthesis  uses  the  addi¬ 
tion  of  several  sine-wave  compo¬ 
nents  to  produce  an  output  wave¬ 
form.  Theoretically,  any  waveform 
can  be  broken  down  into  sine-wave 
components  by  using  a  process 
known  as  Fourier  analysis.  It  follows, 
then,  that  any  waveform  can  be  re¬ 
created  by  adding  these  same  sine- 
wave  components.  In  actual  applica¬ 
tion,  the  process  is  not  so  simple;  the 
human  ear  quickly  becomes  bored 
with  the  static,  or  unchanging,  wave¬ 
form  that  is  created.  It  is  this  static 
characteristic  of  early  synthesizers 
that  contributed  to  the  stereotyped 
monotonous  or  bland  sound. 

The  waveforms  generated  by  tra¬ 
ditional  acoustic  instruments  change 
as  notes  are  being  played,  so  recreat¬ 
ing  the  natural  timbres  of  these  in¬ 
struments  requires  real-time  control 
over  the  amplitudes,  or  envelopes,  of 
the  sine-wave  components.  In  early 
synthesizers,  this  was  approximated 
by  the  use  of  an  envelope  generator 
that  cycled  through  Attack-Decay- 
Sustain-Release  (ADSR)  states  in  re¬ 
sponse  to  key-down  and  key-up 
events.  Patching  the  ADSR  generator 
into  voltage-controlled  filters  and 
voltage-controlled  amplifiers  provid¬ 
ed  a  primitive  level  of  control  over 
the  harmonic  structure  of  the  wave¬ 
form.  Newer  machines  sometimes 
use  a  separate  programmable  enve¬ 
lope  for  each  sine-wave  component 
of  the  waveform.  With  sufficient 
control  of  the  amplitudes,  any  con¬ 
ceivable  sound  can  be  recreated 
without  having  to  use  digitized  wave 
samples.  This  is  the  objective  of  the 
resynthesis  or  adaptive  synthesis  ma¬ 
chines,  such  as  Roland's  digital  piano. 
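
As an illustration of the two ideas in this section, the Python sketch below sums sine-wave partials and shapes each one with its own ADSR envelope. The piecewise-linear envelope, the default times, and the function names are assumptions, and the envelope assumes the key is held at least as long as the attack and decay stages.

```python
import math

def adsr(t, note_length, attack=0.01, decay=0.1, sustain=0.7, release=0.2):
    """Piecewise-linear envelope (0..1) at time t seconds; key-up at note_length."""
    if t < 0:
        return 0.0
    if t < attack:
        return t / attack
    if t < attack + decay:
        return 1.0 - (1.0 - sustain) * (t - attack) / decay
    if t < note_length:
        return sustain
    if t < note_length + release:
        return sustain * (1.0 - (t - note_length) / release)
    return 0.0

def additive_sample(t, fundamental, partials, note_length):
    """Sum sine partials given as (harmonic_number, peak_amplitude, decay_seconds)."""
    total = 0.0
    for harmonic, amplitude, decay in partials:
        envelope = adsr(t, note_length, decay=decay)
        total += amplitude * envelope * math.sin(
            2 * math.pi * harmonic * fundamental * t)
    return total
```

Giving the higher partials a shorter decay than the fundamental makes the tone start bright and mellow as it sustains, which is the kind of time-varying spectrum a single static waveform cannot provide.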

FM synthesis uses a limited number
of sine-wave oscillators with individual
envelopes, so in this respect it
bears some resemblance to resynthesis
methods. One significant departure
is in the way the oscillators are
configured: different algorithms can
be chosen to allow oscillators to intermodulate,
creating rich and sometimes
inharmonic frequencies. The
resulting  output  can  range  from  clan¬ 
gorous-sounding  bells  to  human¬ 
sounding  voices,  but  FM  machines 
are  relatively  unpredictable  and  dif¬ 
ficult  to  program. 
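
A two-oscillator case shows why the results are hard to predict. The Python sketch below is an assumption-laden illustration, not any manufacturer's algorithm: it adds the modulator's output to the carrier's phase and lists the sideband frequencies, which fall at the carrier frequency plus or minus whole multiples of the modulator frequency.

```python
import math

def fm_sample(t, carrier_hz, modulator_hz, index):
    """Two-operator FM: the modulator's output is added to the carrier's phase."""
    modulator = math.sin(2 * math.pi * modulator_hz * t)
    return math.sin(2 * math.pi * carrier_hz * t + index * modulator)

def sideband_frequencies(carrier_hz, modulator_hz, orders=3):
    """Frequencies at carrier +/- k * modulator, reflected to positive values."""
    freqs = {abs(carrier_hz + k * modulator_hz)
             for k in range(-orders, orders + 1)}
    return sorted(freqs)
```

A 200-Hz carrier with a 100-Hz modulator yields sidebands that are all multiples of 100 Hz, so the tone is harmonic. Change the modulator to 141 Hz and the sidebands land at 59 and 341 Hz, which share no simple ratio with the carrier; that is where the bell-like clang comes from.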

Phase-distortion  synthesis  involves 
scanning a simple (usually sine) wave
and varying the scan rate as the wave
is  being  replayed.  In  other  words,  the 
leading  edge  of  the  sine  wave  can  be 
scanned  rapidly  so  it  appears  to  be  a 
nearly  vertical  edge;  the  trailing  edge 
can  be  scanned  more  slowly  so  it  has 
a  tapered  slope.  The  resulting  saw¬ 
tooth  waveform  is  much  richer  in 
harmonics  than  the  flute-like  sine 
wave.  Dynamic  variation  of  the  scan 
rate  can  change  the  shape  and  timbre 
of  the  waveform  as  a  note  is  being 
played. 
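
The scanning idea can be sketched with a two-segment phase map. The Python below is an illustration under stated assumptions: the `knee` parameter, the cosine-shaped source wave, and the function names are hypothetical, and `knee` must lie strictly between 0 and 1.

```python
import math

def distorted_phase(phase, knee):
    """Map a linear phase (0..1) through a two-segment curve.

    The first half of the source wave is read during the first `knee`
    of the cycle; the second half is read during the remainder.
    """
    if phase < knee:
        return 0.5 * phase / knee
    return 0.5 + 0.5 * (phase - knee) / (1.0 - knee)

def pd_sample(phase, knee):
    """Read a cosine-shaped source wave at the distorted phase."""
    return -math.cos(2 * math.pi * distorted_phase(phase, knee))
```

With `knee` at 0.5 the map is a straight line and the output is the unmodified source wave. With `knee` at 0.1 the wave climbs from its minimum to its maximum in the first tenth of the cycle and falls slowly over the rest, approaching a sawtooth; sweeping `knee` during a note changes the timbre as the note sounds.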

Some  of  these  methods  of  sound 
generation  may  seem  to  make  the  use 
of  waveform  sampling  unnecessary, 
but  in  fact  samplers  can  usually  do 
much  more  than  recreate  natural 
sounds.  For  example,  precise  (up  to 
16-bit)  digitizations  of  orchestras, 
drums,  waterfalls,  or  human  voice 
can  be  altered  and  played  back  at  any 
pitch.  The  elite  of  the  sampling  syn¬ 
thesizers  (Fairlight,  New  England  Dig¬ 
ital)  can  cost  hundreds  of  thousands 
of  dollars,  but  they  eliminate  the 
need  for  a  lot  of  expensive  equip¬ 
ment  in  a  recording  studio.  New  Eng¬ 
land  Digital  even  advertises  a  tapeless 
studio  that  records  audio  tracks  di¬ 
rectly  to  a  high-capacity  hard  disk. 


Optical  Media  offers  prerecorded 
sound  libraries  on  CD  ROMs.  Such  sys¬ 
tems  are  becoming  more  affordable 
as  the  cost  of  mass  storage  continues 
to  drop. 

Patch  Editors  and  Librarians 

These  diverse  methods  of  sound  syn¬ 
thesis  have  introduced  a  new  class  of 
problems:  a  few  potentiometers  on  a 
front  panel  are  no  longer  sufficient  to 
program  (that’s  right,  program )  a  syn¬ 
thesizer.  One  manufacturer  claims 
that,  if  all  of  its  front-panel  functions 
were  to  be  made  available  at  once,  its 
synthesizer  would  be  17  feet  long.  In¬ 
stead  of  overlaying  the  limited  set  of 
front-panel  controls  with  multiple 
modes,  synthesizer  front-panel  func¬ 
tions  can  usually  be  accessed  by  send¬ 
ing  a  special  set  of  MIDI  SYSTEM-SPE¬ 
CIFIC  commands  from  a  computer 
outfitted  with  MIDI  ports. 

Synthesizer  programming  has  be¬ 
come  an  art  in  the  same  sense  as  actu¬ 
al  performance  of  music,  and  album 
covers  frequently  give  credit  to 
sound  programmers,  even  if  they  do 
not  perform  on  the  album.  The 
sound  programs,  which  can  make  or 
break  the  sound  of  a  synthesizer,  are 
known  as  patches.  Good  program¬ 
mers  frequently  make  a  living  solely 
by  selling  their  patches,  either  in  the 
form  of  instruction  sheets  that  sim¬ 
ply  explain  how  to  recreate  the  origi¬ 
nal  sounds  or  in  a  downloadable  bi¬ 
nary  form  on  floppy  disks.  Disk- 
based  systems  require  the  use  of  a 
patch  librarian  program  to  organize 
and  access  patch  files.  Most  patch  li¬ 
brarians  allow  two-way  communica¬ 
tion  between  the  synthesizer  and 
computer,  so  patches  can  be  sent  to  a 
synthesizer,  modified  with  the  use  of 
the  synthesizer's  controls,  and  sent 
back  to  be  archived  on  disk.  Some  so¬ 
phisticated  librarians  (Voyetra)  can 
handle  several  different  types  of  syn¬ 
thesizers  or  even  download  rhythm 
patterns  to  electronic  drum 
machines. 

Patch  editor  programs  differ  from 
patch  librarians  in  that  they  can  actu¬ 
ally  alter  the  sound  of  a  patch.  Just  as 
the  term  implies,  MIDI  system-specif¬ 
ic  messages  usually  are  specific  to  a 
certain  brand  and  type  of  synthesiz¬ 
er,  so  most  patch  editors  are  designed 
to  work  with  only  one  or  two  types 
of  machines.  The  system-specific 
messages that are sent by these editors
are prefixed by an address code
that  tells  other  types  of  machines  to 
ignore  the  data  that  follows,  so  patch¬ 
es  can  be  sent  to  one  device  in  a  MIDI 
network  without  problems  resulting 
from  other  machines  misinterpret¬ 
ing  the  data.  System-specific  codes 
are  usually  published  along  with 
other  technical  data  in  a  synthesiz¬ 
er’s  instruction  manual. 

Some  vendors,  such  as  Bacchus 
Software,  specialize  in  patch  editor 
programs.  Bacchus'  IBM  PC-based  edi¬ 
tor  is  a  SideKick-style  memory-resi¬ 
dent  program  that  pops  up  over  an¬ 
other  running  music  program.  Its 
main  function  is  editing  and  archiv¬ 
ing  patches  for  the  popular  but  diffi¬ 
cult  to  program  Yamaha  DX-7  synthe¬ 
sizer.  It  is  usually  in  the  best  interests 
of  manufacturers  to  either  write 
their  own  micro-based  editors  or  to 
hire  third-party  software  writers  to 
support  their  products.  For  example, 
DigiDesign’s  waveform-editing  pro¬ 
grams  allow  data  to  be  downloaded 
from  sampling  synthesizers,  dis¬ 
played,  altered,  and  sent  back  to  a 
sampler  to  be  played.  They  support 
several  brands  of  synthesizers,  and 
manufacturers  value  their  support 
because  they  make  the  samplers 
much  more  accessible  and  market¬ 
able.  Some  manufacturers  even  fea¬ 
ture  DigiDesign's  software  in  their 
own  ads  and  exhibits. 

Sequencers 

One  of  the  many  advantages  MIDI  af¬ 
fords  is  the  ability  to  intercept  and 
record  musical  events.  The  recorded 
data  can  be  manipulated  in  ways  that 
were  impossible  using  standard  au¬ 
dio  tape  recorders.  Transposition 
(pitch  shift),  copying,  and  rearrang¬ 
ing  can  be  accomplished  with  soft¬ 
ware  alone,  and  errors  made  during 
the  recording  process  can  be  correct¬ 
ed  without  having  to  do  retakes. 
Compositions  can  even  be  replayed 
while  changing  synthesizer  patches, 
so  a  part  that  was  originally  written 
for  a  cello-type  voicing  could  be  tried 
out  with  a  piano  sound. 

MIDI-based  recorders  or  composi¬ 
tion  programs  are  sometimes  known 
as  sequencers  (a  carry-over  from  old 
analog  instruments).  In  the  modern 
context, a sequencer might be visualized
as being much like an audio tape
machine.  Concepts  such  as  multi¬ 
tracking,  fast-forward,  and  rewind 
can  be  translated  to  software  to  ease 
the  transition  from  conventional 
tape-oriented  studios.  New  concepts 
such  as  quantization  (automatic  tim¬ 
ing  correction),  real-time  transposi¬ 
tion,  and  complex  looping  constructs 
would  be  nearly  impossible  with  the 
use  of  audio  tape  alone. 
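
Once a performance is stored as events rather than audio, quantization and transposition reduce to simple arithmetic. The Python sketch below assumes each event is a (tick, note, velocity) tuple; that layout and the function names are hypothetical.

```python
def quantize(events, grid):
    """Move each (tick, note, velocity) event to the nearest grid line.

    Events exactly halfway between two grid lines move to the later one.
    """
    return [(((tick + grid // 2) // grid) * grid, note, velocity)
            for tick, note, velocity in events]

def transpose(events, semitones):
    """Shift every note number, dropping notes that leave the 0..127 range."""
    shifted = [(tick, note + semitones, velocity)
               for tick, note, velocity in events]
    return [event for event in shifted if 0 <= event[1] <= 127]
```

On a 24-tick grid, notes played at ticks 5, 13, and 47 snap to ticks 0, 24, and 48. A real sequencer would probably offer partial correction as well, since pulling every note onto the grid can make a performance sound mechanical.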

Sequencers  are  currently  available 
for  the  IBM  PC  (Jim  Miller,  Voyetra), 
Macintosh  (Southworth,  Mark  of  the 
Unicorn),  Amiga  (Mimetics),  Atari  ST 
(Hybrid  Arts),  and  Commodore-64 
and  Apple  II  (Doctor  T,  Passport).  One 
of  the  original  entries  in  the  IBM  field 
is  Jim  Miller’s  Personal  Composer, 
which  can  record  a  performance  in 
real  time  or  let  you  enter  data  with  a 
mouse  or  keyboard.  Musical  data  can 
then  be  rearranged  and  edited  using 
traditional  staff-line  notation  and 
scores  can  be  printed  on  common 
dot-matrix  printers.  Personal  Com¬ 
poser  includes  a  built-in  patch  editor 
for  DX-7  synthesizers,  a  small  graph¬ 
ics  editor,  and  even  a  user-accessible 
LISP  interface. 

Personal  Computer  Interfaces 

If  you  already  own  one  of  the  com¬ 
puters  mentioned,  getting  started  in 
MIDI  composition  is  as  simple  as  pur¬ 
chasing  the  standard  interface.  You 
will  then  have  access  to  a  broad 
range  of  software  packages  that  you 
can  expand  and  update  via  disk  (just 
like  you  do  compilers  and  word  pro¬ 
cessors).  Some  of  the  more  common 
interfaces  come  from  Passport  (Com¬ 
modore-64  and  Apple  II),  Opcode 
(Macintosh),  Mimetics  (Amiga),  and 
Roland  (IBM  PC  and  Apple  II).  The 
Atari  ST  series  has  a  built-in  MIDI  in¬ 
terface.  Most  of  these  interfaces  con¬ 
sist  of  little  more  than  a  UART  for  han¬ 
dling  serial  communication.  The 
Roland  MPU-401,  however,  includes 
an  on-board  microprocessor  and  tim¬ 
er  for  providing  you  with  prepro¬ 
cessed,  buffered  data  packets. 

What MIDI Does
(and Doesn't Do)

First  of  all,  MIDI  does  not  solve  all  ex¬ 
isting  problems  in  interfacing  sound- 
generating  equipment.  For  example, 
it  is  easy  to  send  instructions  for  turn¬ 
ing  particular  notes  on  and  off,  and 
sending a NOTE-ON command usually
results in a predictable pitch from
any  two  synthesizers.  There  is  no 
standard  for  a  violin  sound  or  an 
oboe  sound,  however.  Patches  that 
are  stored  internally  in  the  synthesiz¬ 
er  can  be  requested  via  MIDI,  but  it  is 
up  to  the  individual  programmer  or 
manufacturer  to  devise  the  violin  or 
oboe  sound  and  assign  a  patch  num¬ 
ber  to  it.  Manufacturers  of  lighting 
controls  or  mixing  boards  are  on 
their  own;  system-specific  com¬ 
mands  tell  other  devices  to  ignore 
codes  they  won't  understand,  but 
there  is  no  standard  spec  for  how 
lights  should  respond  to  music 
commands. 

Some  parameters  are  standardized 
by  MIDI — for  example,  voice  mes¬ 
sages  include  NOTE-ON  and  NOTE-OFF 
events.  Notes  are  assigned  numbers 
from  0-7fh  and  simply  turned  on  and 
off.  Velocity  information  is  included 
with the command and can be interpreted
as loudness. Channel addresses
designate a particular oscillator or
synthesizer that will respond to the
command: bytes from 90h to 9fh
turn on notes for channels 0-0fh;
bytes from 80h to 8fh turn the same
notes off. For example, the 3-byte
command to turn on note number
32h, on channel 3, at velocity 22h is
93h, 32h, 22h; the command to turn
off the same note with a turn-off velocity
of 10h is 83h, 32h, 10h. Velocity
is  usually  not  relevant  when  turning 
a  note  off,  so  sometimes  a  NOTE-ON 
command  with  a  velocity  of  zero  is 
used  as  a  NOTE-OFF.  Other  voice  com¬ 
mands  include  key  pressure  and 
pitch  bend,  but  these  are  not  sent  as 
part  of  the  NOTE-ON  or  NOTE-OFF 
message  packets. 
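
The byte layout above can be checked with a short Python sketch. The function names and the default turn-off velocity are assumptions; the status values and the example bytes are the ones given in the text.

```python
NOTE_OFF = 0x80
NOTE_ON = 0x90

def channel_message(status, channel, note, velocity):
    """Build a 3-byte voice message: status with channel, note, velocity."""
    if not 0 <= channel <= 0x0F:
        raise ValueError("channel must be 0..0fh")
    if not (0 <= note <= 0x7F and 0 <= velocity <= 0x7F):
        raise ValueError("data bytes must be 0..7fh")
    return bytes([status | channel, note, velocity])

def note_on(channel, note, velocity):
    return channel_message(NOTE_ON, channel, note, velocity)

def note_off(channel, note, velocity=0x40):
    return channel_message(NOTE_OFF, channel, note, velocity)

def is_note_off(message):
    """True for an explicit NOTE-OFF or a NOTE-ON with velocity zero."""
    kind = message[0] & 0xF0
    return kind == NOTE_OFF or (kind == NOTE_ON and message[2] == 0)
```

Calling `note_on(3, 0x32, 0x22)` produces the bytes 93h, 32h, 22h, and `note_off(3, 0x32, 0x10)` produces 83h, 32h, 10h. A receiver that uses `is_note_off` handles both ways of ending a note.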

MODE  MESSAGES  control  the  re¬ 
sponse  characteristics  of  the  receiv¬ 
ing  device.  Synthesizers  can  be  told  to 
listen to a specific channel (OMNI-OFF)
or to respond to messages on all channels
(OMNI-ON). In addition, provision
is made to address one oscillator per
channel (MONO MODE) or to allow the
synthesizer's internal software to assign
its own voices so messages can be
sent over one channel (POLY MODE).
Combinations  of  these  yield  four 
modes. 

SYSTEM-REAL-TIME  and  CLOCK  MES¬ 
SAGES  allow  synchronization  of  all 
machines  in  the  chain  by  sending  a 
periodic  software  clock  tick  over  the 
MIDI  bus. 



SYSTEM-SPECIFIC  commands  ad¬ 
dress  synthesizers  from  a  specific 
manufacturer.  These  commands  are 
used  to  send  patches  and  other  data 
that  has  meaning  only  to  certain 
types  of  synthesizers.  Each  manufac¬ 
turer  must  apply  for  its  own  identifi¬ 
cation  byte  (Sequential  Circuits  =  1, 
Kawai  =  40h,  and  so  on).  An  END  OF 
EXCLUSIVE (EOX) or another status
byte tells deselected devices to resume
listening to bus data.
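
A sketch of the framing, in Python, may make the addressing clearer. The opening status byte F0h and the EOX value F7h come from the MIDI specification rather than from the text above, the function names are hypothetical, and a real receiver would work on a byte stream rather than on complete messages.

```python
SYSEX_START = 0xF0   # status byte that opens a manufacturer-specific message
EOX = 0xF7           # END OF EXCLUSIVE

def frame_exclusive(manufacturer_id, payload):
    """Wrap 7-bit payload bytes with the start byte, the ID, and EOX."""
    if any(b > 0x7F for b in [manufacturer_id, *payload]):
        raise ValueError("ID and payload must be 7-bit data bytes")
    return bytes([SYSEX_START, manufacturer_id, *payload, EOX])

def payload_for(message, my_id):
    """Return the payload if the message is addressed to my_id, else None."""
    if len(message) < 3 or message[0] != SYSEX_START or message[-1] != EOX:
        return None
    if message[1] != my_id:
        return None
    return bytes(message[2:-1])
```

A patch framed with the Kawai ID (40h) is unpacked by a device listening for 40h and ignored by one listening for the Sequential Circuits ID (1).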

SYSTEM  COMMON  messages  ad¬ 
dress  all  synthesizers  on  line.  They 
are  used  primarily  for  setup  informa¬ 
tion,  such  as  selecting  songs  or  telling 
synthesizers  to  tune  their  oscillators. 

The  first  byte  of  all  MIDI  commands 
has  its  most-significant  bit  (MSB)  set  to 
1, so command bytes are always between
80h and 0ffh. Data bytes have
their MSBs reset (0), so their range is 0
to 7fh. If a given block of data contains
values greater than 7fh, all
bytes in the block are broken into 7-bit
nibbles and sent in two parts. This
keeps  the  MSBs  reset  so  that  receivers 
can  always  stay  in  sync — even  if  a 
byte  is  lost. 
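
One possible way to split 8-bit data is sketched below in Python. The layout (low seven bits first, then the remaining top bit) and the function names are assumptions; manufacturers are free to pack their data differently.

```python
def is_status(byte):
    """Command (status) bytes have the MSB set; data bytes do not."""
    return (byte & 0x80) != 0

def split_8bit(value):
    """Split one 8-bit value into two data bytes: low 7 bits, then the top bit."""
    return (value & 0x7F, (value >> 7) & 0x01)

def join_8bit(low, high):
    """Rebuild the original 8-bit value from its two data bytes."""
    return (high << 7) | low

def encode_block(data):
    """Encode a block of 8-bit values as a run of 7-bit data bytes."""
    out = []
    for value in data:
        out.extend(split_8bit(value))
    return bytes(out)
```

Because no encoded byte ever has its MSB set, a receiver that loses its place can discard bytes until `is_status` is true and pick up again at the next command.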

MIDI Hardware Specification

The  original  MIDI  specification  made 
some  trade-offs  between  speed,  econ¬ 
omy,  and  efficiency,  which  inevita¬ 
bly  resulted  in  performance  compro¬ 
mises.  It  is  easy  to  point  out 
shortcomings  now  that  the  interface 
has  become  widely  known,  but  the 
low  cost  had  a  lot  to  do  with  its  initial 
acceptance. 

MIDI  is  a  serial  protocol  that  com¬ 
municates  on  a  31.25-kHz,  optically 
isolated  current  loop.  The  odd  baud 
rate  resulted  from  the  reluctance  of 
earlier  manufacturers  to  install  spe¬ 
cial  crystals  when  they  usually  had  a 
1- to 4-MHz processor clock available
(31.25 kHz = 1 MHz/32). Use of a convenient
single-chip binary divider
yields  the  31.25-kHz  UART  clock. 

Optical  isolation  is  required  for 
eliminating  ground  loops  and  isolat¬ 
ing  sensitive  audio  equipment  from 
the  high  frequencies  in  computer 
gear.  An  optically  isolated  parallel  in¬ 
terface  could  have  been  specified, 
but  serial  interfacing  decreases  the 
cost  for  the  isolators  and  simplifies 
cabling. The penalty, of course, is
speed. Clock rates are limited by cable
capacitance and by response
times for economical optos and
UARTs. Some manufacturers are now
using 62.5-kHz (double frequency)
rates  for  downloading  waveform 
samples  or  other  data-intensive  ap¬ 
plications,  but  it  is  unlikely  that  the 
baud  rate  standard  will  change  in  the 
near  future. 

The  optos  at  the  receiving  end  of 
each  MIDI  link  require  about  5  mA  to 
turn  on.  Because  the  loop  sends  a  cur¬ 
rent  (like  old  Teletype  machines  do), 
it  is  relatively  immune  to  noise  as 
long  as  cables  don’t  extend  more  than 
50  feet.  The  built-in  current  limit  re¬ 
sistors  prevent  star-network  configu¬ 
rations  (only  one  receiver  can  be 
hooked  to  a  transmitter),  so  a  third 
port  (the  MIDI  THRU  port)  is  included 
on  most  synthesizers  to  allow  daisy- 
chaining  of  receivers.  The  MIDI  THRU 
port  duplicates  the  data  coming  from 
the  MIDI  IN  port  and  retransmits  it  to 
the  next  machine  in  the  chain. 

Speed  Considerations 

Most  MIDI  NOTE-ON  or  NOTE-OFF  com¬ 
mands  require  3  bytes  at  320  micro¬ 
seconds  per  byte,  so  turning  on  a  note 
on  the  synthesizer  takes  approxi¬ 
mately  1  millisecond.  The  human  ear 
is  more  sensitive  to  starting  (attack) 
transients  than  to  ending  (decay)  tim¬ 
ings — for  psychoacoustic  reasons 
and  simply  because  the  played  notes 
usually  start  off  sharply  and  taper  off 
before  their  final  decay.  This  means 
that  even  in  best-case  circumstances 
when  no  other  events  are  being  sent 
over  the  MIDI  bus,  starting  transients 
for  ten  NOTE-ON  events  will  be 
spread  apart  by  10  milliseconds.  This 
approaches  the  threshold  of  audible 
delay,  and  additional  notes  may  have 
a  slap-echo  effect.  Similarly,  large 
chords  may  be  audibly  arpeggiated. 
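
The arithmetic behind these figures is short enough to write out. The Python sketch below assumes each byte travels as 10 bits on the wire (1 start bit, 8 data bits, 1 stop bit), which is what gives 320 microseconds per byte at 31.25 kbaud; the function names are hypothetical.

```python
BAUD = 31250
BITS_PER_BYTE = 10   # assumption: 1 start bit + 8 data bits + 1 stop bit

def byte_time_us():
    """Time to transmit one byte, in microseconds."""
    return BITS_PER_BYTE * 1_000_000 / BAUD

def transmit_time_ms(messages, bytes_per_message=3):
    """Time to transmit a run of back-to-back messages, in milliseconds."""
    return messages * bytes_per_message * byte_time_us() / 1000
```

One 3-byte NOTE-ON takes 0.96 milliseconds, and ten of them take 9.6 milliseconds, which matches the rounded figures of 1 and 10 milliseconds quoted above.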

To  avoid  objectionable  delays, 
some  MIDI  hardware  now  features 
multiple  MIDI  OUT  ports.  The  comput¬ 
er  sends  parallel  commands  to  each 
port  to  avoid  daisy-chaining  delays. 
These  should  not  be  confused  with 
MIDI  THRU  ports,  which  track  the  MIDI 
IN  port.  Multiple  MIDI  IN  ports  are  less 
common  but  certainly  helpful  when 
recording  events  coming  from  more 
than  one  source.  Because  MIDI  is  a 
multibyte  protocol,  merging  and  re¬ 
cording  two  sources  can  be  fairly 
complex if only one input port is
available.

Designing  a  Sequencer 

MIDI  control  software  can  be  written 
in  any  language,  but  fast  queuing  of 
serial  data  is  important.  The  majority 
of  software  writers  I  know  use  a  mix¬ 
ture  of  C  and  assembly  language.  It 
may  be  convenient  to  use  a  compiler 
such  as  Wizard  C,  which  allows  drop¬ 
ping  into  assembly  language  for  I/O 
access  or  speed. 

Small  computers  can  be  used,  but 
be  aware  that  RAM  can  be  used  up 
quickly.  If  storage  for  a  single  note 
uses  8  bytes,  an  eight-finger  chord 
will  use  64  bytes,  and  playing  eight  of 
these  chords  in  one  measure  will  use 
0.5K  RAM.  Linear  address  space  is 
easy  to  allocate  and  control,  and  seg¬ 
mented  architectures  present  no 
large  problems  because  a  segment  is 
usually  more  than  enough  to  record 
any  single  sequence  of  note  events  (a 
track  in  recording  terminology). 
Bank-select  RAM  is  usually  a  problem 
when  several  tracks  are  played  back 
in  parallel.  If  the  tracks  are  stored  in 
separate  banks,  the  switching  over¬ 
head  may  be  cumbersome. 

Most  of  my  current  designs  use  IBM 
PCs  and  ATs,  so  some  of  the  examples 
focus  on  8088  designs,  but  the  design 
principles  will  adapt  easily  to  other 
computers. 

Designing  Hardware 

Custom  hardware  affords  some  mea¬ 
sure  of  software  security  and  may 
provide  functions  not  available  on 
existing  interfaces.  Because  some  de¬ 
signers  prefer  using  their  own  hard¬ 
ware,  I  will  provide  a  few  guidelines. 

MIDI’s  serial  protocol  requires  no 
hardware  handshake  signals,  so  40- 
pin  UARTs  are  not  necessary.  The 
only  small  UART  that  may  cause  trou¬ 
ble  is  the  Intel  8251.  I  have  used  Mo¬ 
torola  68B50s  in  several  designs  with 
good  results.  On  8088  systems,  Motor¬ 
ola's  E  (enable)  signal  can  be  devel¬ 
oped  by  ANDing  the  port  read  and 
write  signals  together. 

Timers  get  tricky  when  multiple 
devices  are  hooked  to  one  interrupt 
line.  I  have  used  Intel  8253s,  but  I  usu¬ 
ally  connect  the  output  line  to  a  flip- 
flop  so  that  the  output  pulse  can  be 
trapped  and  identified.  Flip-flops  are 
not  necessary  if  the  timer  interrupt  is 
isolated  because  the  8259  interrupt 
controller  has  edge-triggering. 


Dr. Dobb's Journal, May 1987


Most  MIDI  programs  that  run  on  the 
IBM PC use Roland's MPU-401 interface.
Timers are not required on IBM
PCs  that  use  this  interface,  but  if  you 
are  designing  your  own,  try  to  in¬ 
clude  an  on-board  timer.  The  PC's  in¬ 
ternal  timers  do  not  provide  the  accu¬ 
racy  necessary  to  deal  with  high- 
resolution  music  timing.  For  low- 
resolution  applications,  IRQ  0  from 
the  PC  motherboard  can  be  readjust¬ 
ed  to  run  at  a  multiple  (X)  of  its  nor¬ 
mal  speed.  Then,  every  X  pulses,  the 
old  interrupt  service  routine  is  run. 
The  interrupt  acknowledge  should 
be  skipped  when  jumping  to  the  old 
routine.  Remember  to  reset  the  inter¬ 
rupt  vectors  before  exiting  to  DOS. 
Disk  head  loads  use  timer  0  for  time¬ 
outs,  so  problems  with  this  routine 
usually  cause  the  drive  light  to  stay 
on. 
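
The divide-down logic for a speeded-up IRQ 0 can be modeled in plain C. This is a simulation only, not a real PC interrupt handler: SPEEDUP stands for the multiple X, old_timer_isr() is a stand-in for the original BIOS routine (which sends its own acknowledge), and the counters exist only so the behavior can be observed.

```c
/* Hypothetical multiple (X) of the normal timer rate. */
#define SPEEDUP 8u

unsigned fast_ticks;      /* ticks seen by the new handler        */
unsigned old_isr_calls;   /* times the old routine was chained to */
unsigned eoi_sent;        /* acknowledges sent to the controller  */

/* Stand-in for the original timer routine, which acknowledges
   the interrupt itself before returning. */
static void old_timer_isr(void)
{
    old_isr_calls++;
    eoi_sent++;
}

/* New handler, entered X times as often as the old one expects. */
void fast_timer_isr(void)
{
    static unsigned divider = 0;

    fast_ticks++;                 /* sequencer timing work goes here */

    if (++divider >= SPEEDUP) {
        divider = 0;
        old_timer_isr();          /* chain: skip our own acknowledge */
    } else {
        eoi_sent++;               /* acknowledge the interrupt here  */
    }
}
```

Each interrupt is acknowledged exactly once, either by the new handler or by the old routine, and the old routine still runs at its normal rate.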

Slower  clock  rates  for  hardware 
timers  obviously  result  in  decreased 
resolution.  Not  so  obvious,  though,  is 
the  way  that  tempo  resolution  and 
note-timing  accuracy  combine  to 
make  even  tougher  demands  on  the 
system  timer’s  crystal  frequency.  I 
try  to  clock  the  timer  at  around  2 
MHz  so  that  I  can  record  with  a  reso¬ 
lution  of  about  96  divisions  per  quar¬ 
ter  note  while  maintaining  sufficient 
accuracy  in  specifying  tempos. 

In  most  applications,  the  divisor 
sent  to  the  hardware  timer  is  used  to 
trim  the  tempo,  which  is  set  in  incre¬ 
ments  of  beats-per-minute  (BPM).  An 
easy  way  to  control  tempo  is  to  index 
into  a  table  of  divisors  by  the  desired 
number  of  BPMs.  I  normally  generate 
the  tables  with  a  C  program  that  does 
the  calculations  and  prints  out  the  ta¬ 
bles  exactly  as  they  should  appear  in 
the  sequencer  program.  When  the 
values  are  verified,  I  redirect  the  out¬ 
put  of  the  program  into  a  file  that  is 
then  compiled. 
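
A table generator of the kind described might look like the following sketch. The 2-MHz timer clock and 96 ticks per quarter note come from the text; the 16-bit divisor limit, the rounding, and all names are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdint.h>

#define TIMER_HZ           2000000UL  /* assumed timer input clock   */
#define TICKS_PER_QUARTER  96UL       /* recording resolution        */

/* Divisor that makes the timer fire TICKS_PER_QUARTER times per
   beat at the given tempo, rounded to the nearest integer.
   Returns 0 if the tempo is 0 or the divisor needs more than 16 bits. */
uint16_t tempo_divisor(unsigned bpm)
{
    unsigned long ticks_per_min, div;

    if (bpm == 0)
        return 0;
    ticks_per_min = (unsigned long)bpm * TICKS_PER_QUARTER;
    div = (TIMER_HZ * 60UL + ticks_per_min / 2UL) / ticks_per_min;
    return (div > 0xFFFFUL) ? 0 : (uint16_t)div;
}

/* Print the table as C source, ready to be compiled into the
   sequencer. Entry [bpm - lo] holds the divisor for that tempo. */
void print_tempo_table(FILE *out, unsigned lo, unsigned hi)
{
    unsigned bpm;

    fprintf(out, "/* timer divisors, indexed by (BPM - %u) */\n", lo);
    fprintf(out, "static const unsigned short tempo_div[] = {\n");
    for (bpm = lo; bpm <= hi; bpm++)
        fprintf(out, "    %5u%s  /* %u BPM */\n",
                (unsigned)tempo_divisor(bpm), bpm < hi ? "," : " ", bpm);
    fprintf(out, "};\n");
}
```

Calling print_tempo_table(stdout, 20, 250) from a small main() and redirecting the output gives a file that can be included directly. With the values assumed here, 120 BPM works out to a divisor of 10,417, and tempos below 20 BPM do not fit in 16 bits.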

Interrupts 

The interrupt service routine (ISR) is
responsible  for  prioritizing  inter¬ 
rupts  and  coordinating  all  incoming 
data.  Obscure  problems  with  the  ISR 
can  propagate  through  the  entire  sys¬ 
tem.  A  flowchart  will  probably  help 
to  clarify  possible  timing  errors  or 
bottlenecks  before  the  ISR  is  coded. 

It  might  appear  that  a  system  using 
only  serial  ports  would  pose  no  prob¬ 
lems  in  I/O  handling,  but  MIDI  baud 
rates are high, and there may be multiple
ports. Input interrupts should
be  serviced  as  quickly  as  possible  be¬ 
cause  losing  a  byte  may  result  in  los¬ 
ing  an  important  NOTE-OFF  message. 
When  the  recorded  events  are  re¬ 
transmitted  on  playback,  the  synthe¬ 
sizer  will  play  a  stuck  note  until  the 
operator  can  figure  out  a  way  to  gen¬ 
erate  a  NOTE-OFF  (sometimes  leaving 
an  embarrassed  performer  desper¬ 
ately  groping  for  the  power  switch). 
Output  interrupts  are  less  critical;  a 
momentary  delay  is  the  only  penalty 
for  slow  output  response  times.  Be 
particularly  careful  how  these  two 
interrupt sources are handled when
UART hardware lines are shared.

On  the  IBM  PC,  the  first  instruction 
in  the  ISR  can  be  an  STI  (the  BIOS  does 
this)  for  reenabling  higher-priority 
interrupts.  Lower-priority  interrupts 
can  be  enabled  by  masking  the  pre¬ 
sent  interrupt  and  sending  an  ac¬ 
knowledge  (EOI)  to  the  interrupt  con¬ 
troller.  The  pending  interrupt 
cannot  be  retriggered  until  the 
source  of  the  interrupt  is  reset  (UART 
is  read  and  so  on).  Make  sure  that  the 
pending  interrupt  line  is  high  (active) 
when sending the EOI because obscure
problems can result with the
8259 when it cannot find the source
of the interrupt being acknowledged.
It may also help to poll the
8259 registers to make sure that the
interrupt line is low before exiting
from the ISR. This will help to avoid
difficulty with the PC's edge-triggered
hardware.

Timer  interrupts  are  important, 
but  they  can  generally  be  considered 
a  lower  priority  than  UART  inter¬ 
rupts.  The  timer  routine  itself  is  usu¬ 
ally  short,  consisting  of  little  more 
than  incrementing  a  series  of  soft¬ 
ware  counters,  but  at  some  point 
compares  must  be  made  with  target 
values  and  a  long  series  of  events 
may  be  triggered. 

The  timer  interrupt  conveys  no  ac¬ 
tual  data  aside  from  a  flag,  so  normal 
queues  are  not  necessary.  The  best 
way  to  enqueue  timer  interrupts  is 
with  a  semaphore,  which  allows  the 
ISR  itself  to  be  interrupted.  When  the 
timer  tick  occurs,  the  semaphore  is 
incremented  and  timer  interrupts 
are  re-enabled.  If  the  semaphore  has 
been  incremented  to  1,  no  interrupts 
are  nested  and  normal  processing 
can  resume.  If  the  value  is  greater 
than  1,  this  means  that  the  timer  in¬ 
terrupts  have  stacked  up,  so  the  inter¬ 
rupt  is  exited  and  control  is  returned 
to  the  timer  ISR  that  was  interrupted. 
Before  the  timer  ISR  exits,  the  sema¬ 
phore  is  decremented,  and  if  the  val¬ 
ue  is  still  nonzero,  the  interrupts 
must  have  been  nested.  In  this  case, 
control  is  returned  to  the  top  of  the 
timer  ISR,  which  loops  until  the  sema¬ 
phore  returns  to  zero.  This  allows  the 
ISR  to  catch  up  with  lost  interrupts 
without  losing  any  timer  pulses  or 
UART  interrupts. 
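
The semaphore logic can be tried out in C before committing it to assembly language. The following is a simulation under stated assumptions: nesting is faked by having the main routine call the tick handler again, and pending_nested and the counters are hypothetical names used only for the experiment.

```c
int t_sem;            /* the semaphore; must start at zero            */
int routine_runs;     /* how many times the main timer routine ran    */
int pending_nested;   /* ticks that will "interrupt" the main routine */

void timer_tick(void);

/* Stand-in for the main timer chain. While it runs, any pending
   ticks arrive as nested interrupts. */
static void timer_routine(void)
{
    routine_runs++;
    while (pending_nested > 0) {
        pending_nested--;
        timer_tick();             /* simulated nested interrupt */
    }
}

/* Timer interrupt handler. */
void timer_tick(void)
{
    int sampled = t_sem;          /* sample before incrementing */

    t_sem++;
    /* (interrupts would be acknowledged and re-enabled here) */

    if (sampled != 0)
        return;                   /* nested: the outer call catches up */

    do {
        timer_routine();
        t_sem--;
    } while (t_sem != 0);         /* loop once per stacked-up tick */
}
```

If three ticks arrive while the main routine is running, the routine ends up running four times in total and the semaphore returns to zero, so no timer pulses are lost.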

The  flowchart  in  Figure  1,  left,  is 
simplified;  it  may  help  to  refer  to  the 
code  listing  (Example  1,  page  31)  for 
more  subtle  details.  Make  sure  the 
semaphore  is  initialized  to  zero  at 
start-up  or  the  timer  ISR  will  never 
run. 


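[Figure 1: Flowchart for timer interrupt using semaphore.
Box labels recovered from the figure: Service for Other Process;
Increment Semaphore; Get a Copy of Semaphore; Re-enable Interrupts;
Semaphore Copy == 1?; Event Ready?; Decrement Semaphore;
Semaphore == 0?; Exit from Interrupt.]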


Timing  Notes 

The  MIDI  sequencer  usually  uses  a 
hardware  clock  for  its  main  timing 
reference,  but  it  may  be  necessary  to 
synchronize  to  an  external  software 
clock  provided  by  a  drum  machine 
or  timing  converter.  I  will  refer  to 
these as internal and external sync,
respectively. MIDI software clocks occur
at a standard rate of 24 per quarter
note, which provides musical resolution
to within a 64th note triplet.
This  sounds  as  though  it  would  keep 
up  with  even  the  fastest  musicians, 
but  remember  that  you  are  dealing 
with  timing  edges.  Trimming  these 
edges  to  the  nearest  24th  of  a  quarter 
note  may  cause  the  recorded  notes  to 
sound too symmetrical, or mechanical.

Conversely,  it  may  seem  that  slow¬ 
ing  the  system  clock  could  correct 
the  timing  of  inaccurate  note  values. 
This  rounding  (quantization)  is  some¬ 
times  used  to  advantage,  but  overuse 
removes  the  human  signature  and 
creates  a  metronomic,  mechanical 
sound.  Uneven  qualities  are  most  of¬ 
ten  missed  on  parts  such  as  solos, 
which  appear  up  front  in  a  composi¬ 
tion.  Obviously,  all  devices  synchro¬ 
nized  with  MIDI  real-time  clock  sig¬ 
nals  will  be  quantized  to  some  extent. 

Time  Stamps 

In  order  to  maintain  precise  timing  of 
events  while  allowing  interrupts  to 
proceed  at  full  speed,  I  store  a  time 
stamp  with  each  event  coming  into 
the  UART  receive  queue.  Very  simply, 
if  a  received  byte  is  greater  than  7fh 
(MSB  is  set  on  all  commands),  the  cur¬ 
rent  time  is  enqueued  after  the  byte. 
This  provides  freeze-frame  timing; 
the  time  record  travels  with  the  re¬ 
ceived  data  until  the  program  is 
ready  to  process  it.  Only  the  leading 
byte  needs  to  be  time-stamped.  Fol¬ 
low-up  data  (with  MSBs  reset)  are  as¬ 
sumed  to  have  been  received  at  the 
same  time.  This  technique  can  pro¬ 
vide  accuracy  better  than  that  obtain¬ 
able  by  waiting  for  the  complete  re¬ 
cord  and  processing  it  instantly. 
Usually  the  sending  device  intends 
that  all  bytes  be  received  simulta¬ 
neously,  so  stamping  the  leading  byte 
will  more  accurately  reflect  the  actu¬ 
al  event  timing,  even  if  the  transmit¬ 
ter  lags  in  sending  the  follow-up  data. 
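
A receive queue with leading-byte time stamps might be sketched in C as follows. The queue size, the two 7-bit time-stamp bytes (chosen so that stamp bytes never have their MSB set), and all names are assumptions for illustration; overflow checking is left out to keep the sketch short.

```c
#include <stdint.h>

#define RXQ_SIZE 256u             /* hypothetical size, power of two */

static uint8_t  rxq[RXQ_SIZE];
static unsigned rxq_head;         /* next position to write */
static unsigned rxq_tail;         /* next position to read  */

volatile uint16_t system_time;    /* advanced by the timer interrupt */

static void rxq_put(uint8_t b)
{
    rxq[rxq_head] = b;
    rxq_head = (rxq_head + 1u) & (RXQ_SIZE - 1u);
}

/* Returns the next queued byte, or -1 if the queue is empty. */
int rxq_get(void)
{
    int b;

    if (rxq_tail == rxq_head)
        return -1;
    b = rxq[rxq_tail];
    rxq_tail = (rxq_tail + 1u) & (RXQ_SIZE - 1u);
    return b;
}

/* Called from the UART receive interrupt with the byte just read.
   Bytes greater than 7fh are followed by the current time. */
void uart_rx_byte(uint8_t b)
{
    rxq_put(b);
    if (b > 0x7Fu) {
        uint16_t now = system_time;
        rxq_put((uint8_t)(now & 0x7Fu));          /* low 7 bits  */
        rxq_put((uint8_t)((now >> 7) & 0x7Fu));   /* next 7 bits */
    }
}
```

The main program can then dequeue at its leisure: whenever it pulls a byte with the MSB set, the next two bytes give the time at which that byte arrived.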

; TIMER_ISR
; This partial listing will help to model a timer int. based on semaphores

dseg    segment para public 'dseg'
t_sem   dw      0               ; make sure this value always starts at zero
dseg    ends

cseg    segment para public 'cseg'
        assume  cs:cseg, ds:dseg

        public  int_vector
int_vector proc far
        sti
        push    si              ; save all registers
        push    di
        push    ax
        push    bx
        push    cx
        push    dx
        push    ds
        push    es
        mov     di,dseg         ; set up for access to data seg in this module
        mov     ds,di
again:                          ; loop point for retries if more ints hit
        mov     si,[t_sem]      ; 'sample' the semaphore before increment!
        inc     [t_sem]
        mov     al,20h          ; this will acknowledge interrupts on IBM PC
        out     20h,al
        call    reset_timer_ff  ; toggle flip-flop hooked to the timer chip
                                ; Interrupts are now re-enabled
        or      si,si           ; check the 'sampled' semaphore
        je      i_loop          ; If semaphore was zero, execute tmr routine
; Skip the main timer routine if this is a nested int - check for ovfl
        cmp     si,200          ; Too many interrupts stacked up? (overflow?)
        jc      skip            ; If not, exit directly.
; Interrupts have overflowed - stop and check timer values
        call    emergency_stop  ; semaphore overflow! timers are set too fast
        jmp     skip
i_loop:
; Execute the main timer routine - then check for stacked interrupts
        call    timer_routine   ; run main timer chain
        dec     [t_sem]         ; if sem dec's to zero, no ints are nested
        jnz     i_loop          ; if ints ARE nested, loop back to catch up
skip:
        pop     es              ; restore registers and return
        pop     ds
        pop     dx
        pop     cx
        pop     bx
        pop     ax
        pop     di
        pop     si
        iret
int_vector endp
cseg    ends

Example 1: A MIDI interrupt service routine with a semaphore

Real Time

The MIDI specification calls for two
basic types of software clocks. Clock-
in-stop (0fch) allows receivers to
phase-lock to the clock frequency
prior to start-up. When the transmitter
switches to clock-in-play (0f8h), all
synchronized receivers switch to
their active state (usually playback or
record). To maintain accuracy, real-time
messages are transmitted at any
time, even in the middle of other
multibyte messages. Receivers must
account for this possibility even if the
real-time messages are not used. Both
clocks are always sent at the rate of
24 per quarter note. Altering the
clock frequency changes tempo, not
accuracy.

Other real-time messages include
START-FROM-BEGINNING (0fah),
which resets internal song pointers;
CONTINUE (0fbh), which tells receivers
to resume from the current location;
and ACTIVE SENSING (0feh),
which just lets the receivers know
that the transmitter is still there. The
latter is optional and used notably by
the Yamaha DX-7 synthesizer. When
using a DX-7, you will probably want
to discard the 0feh bytes because
they will be received constantly,
even when not recording. They can
fill up the input queues if the input
interrupts are enabled.

The  last  real-time  message,  SYSTEM 
RESET (0ffh), is dangerous because it
could  start  a  regenerating  condition 
in  which  every  component  in  the 
system  sends  resets  to  each  other.  It  is 
usually  reserved  for  linkage  to  a 
hardware  reset  switch  or  used  judi¬ 
ciously  by  the  master  controller. 
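
A receiver that tolerates real-time bytes in the middle of other messages can be sketched as a small state machine. This is an illustration under assumptions, not a complete MIDI parser: it treats every byte from 0f8h upward as a one-byte real-time message, discards ACTIVE SENSING, collects only NOTE-ON and NOTE-OFF messages, and uses made-up names throughout.

```c
#include <stdint.h>

unsigned clocks_seen;     /* 0f8h clock bytes received        */
unsigned notes_seen;      /* complete NOTE-ON/NOTE-OFF events */
uint8_t  last_note[3];    /* most recent complete note event  */

static uint8_t  msg[3];   /* message being assembled          */
static unsigned have;     /* bytes of msg collected so far    */

/* Feed one received byte to the parser. */
void midi_parse(uint8_t b)
{
    if (b >= 0xF8u) {                 /* real-time: may arrive anywhere */
        if (b == 0xF8u)
            clocks_seen++;
        /* 0feh (ACTIVE SENSING) and the others are dropped here;
           msg and have are deliberately left untouched. */
        return;
    }

    if (b & 0x80u) {                  /* any other command byte */
        uint8_t kind = b & 0xF0u;
        if (kind == 0x80u || kind == 0x90u) {
            msg[0] = b;
            have = 1;
        } else {
            have = 0;                 /* not handled in this sketch */
        }
        return;
    }

    if (have == 0)
        return;                       /* data byte with no command */

    msg[have++] = b;
    if (have == 3) {
        last_note[0] = msg[0];
        last_note[1] = msg[1];
        last_note[2] = msg[2];
        notes_seen++;
        have = 1;                     /* keep the command byte for reuse */
    }
}
```

Feeding the parser 94h 46h f8h feh 32h yields one clock and one intact NOTE-ON, even though two real-time bytes arrived between the note number and the velocity.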

Storage  Formats 

There  is  no  standard  yet  for  either 
RAM-  or  disk-based  storage  for  MIDI 
events.  I  have  heard  rumors  of  a 
standard  for  disk  storage  that  would 
allow  one  manufacturer’s  software 
to  read  files  written  by  someone  else, 
but  intermediate  RAM  storage  is  an¬ 
other  story.  The  storage  formats  used 
throughout  the  industry  are  diverse 
and  usually  so  complex  that  chang¬ 
ing  internal  formats  would  require 
extensive  rewrites. 

Most  internal  storage  methods  fall 
into one of four categories that I call
end-point-relative, end-point-absolute,
single-point-absolute, and
bar-and-note storage. All storage formats
involve  storing  data  in  a  linear  data 
stream.  Relative  timing  implies  that 
timing  is  encoded  as  a  distance  from 
the  previous  event.  Absolute  timing 
uses  a  global  time  reference,  such  as 
beats  and  bars.  End-point  storage  re¬ 
fers  to  separate  storage  locations  for 
NOTE-ON  and  NOTE-OFF  events  (usual¬ 
ly  with  their  own  time  stamps).  Sin¬ 
gle-point  storage  requires  that  a 
pointer  be  aimed  at  the  NOTE-ON  re¬ 
cord  in  the  data  stream,  and  when 
the  NOTE-OFF  event  is  received,  it  is 
stored  at  the  same  location  (better 
yet,  the  note  duration  can  be  comput¬ 
ed  and  stored).  The  bar-and-note 
method  parallels  the  way  music  is 
normally  notated.  Each  method  has 
advantages,  and  there  is  a  lot  of  over¬ 
lap  between  categories. 

I  try  to  carry  MIDI’s  philosophy  of 
setting  MSBs  of  leading  bytes  when 
encoding  data  for  storage  in  the 
stream.  This  allows  resyncing  if  a 
byte  is  missed  and  makes  data 
streams  easier  to  edit.  Many  software 
writers  use  this  method  for  internal 
storage. 

End-Point-Relative  Storage 

MIDI  data  is  received  as  a  stream  of 
bytes  with  high  bits  (MSBs)  set  on  com¬ 
mands  and  reset  on  data.  Why  not 
store  the  bytes  just  as  they  are  re¬ 
ceived?  Embedded  MIDI  clock  mes¬ 
sages  provide  the  proper  spacing  for 
NOTE-ON,  NOTE-OFF,  and  other 
events.  Replaying  the  data  requires 
setting  up  a  series  of  play  pointers 
into  the  data  stream  (one  for  each 
track  to  be  played).  When  the  start 
command  is  received,  the  data  is  sent 
to the output UARTs just as it was received,
waiting for a 24th of a beat
every time a clock command is
encountered in the stream. Unfortunately,
this method uses RAM storage
even if no events are being transmitted,
and multiple channels store multiple
copies of all unnecessary timing
bytes.

In  Example  2,  below,  it  is  assumed 
that  an  external  MIDI  clock  provides 
the  timing.  Even  if  an  internal  timer 
is  used,  0f8h  bytes  can  be  inserted 
into  the  input  queue  to  simulate  the 
MIDI  software  clock.  A  refinement  of 
this  method  conserves  storage  by 
counting  all  received  clock  bytes. 
When  an  event  is  received,  the  accu¬ 
mulated  count  is  stored  before  the 
event  and  then  the  counter  is  reset. 
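
The clock-counting refinement might be sketched like this in C. The track buffer size, the choice of two 7-bit count bytes stored ahead of each command byte, and the names are all assumptions; other real-time bytes are assumed to have been filtered out before this routine is called.

```c
#include <stdint.h>
#include <stddef.h>

#define TRACK_MAX 1024u           /* hypothetical track buffer size */

uint8_t track[TRACK_MAX];
size_t  track_len;

static uint16_t clock_count;      /* clocks since the last event */

static void track_put(uint8_t b)
{
    if (track_len < TRACK_MAX)
        track[track_len++] = b;
}

/* Record one incoming byte. Clock bytes (0f8h) are counted rather
   than stored; the count is written out ahead of the next command
   byte and then reset. */
void record_byte(uint8_t b)
{
    if (b == 0xF8u) {
        if (clock_count < 0x3FFFu)        /* stay within 14 bits */
            clock_count++;
        return;
    }
    if (b & 0x80u) {                      /* command byte: new event */
        track_put((uint8_t)(clock_count & 0x7Fu));
        track_put((uint8_t)((clock_count >> 7) & 0x7Fu));
        clock_count = 0;
    }
    track_put(b);
}
```

Two clocks followed by a NOTE-ON, then one clock followed by a NOTE-OFF, are stored as 02h 00h 94h 46h 32h 01h 00h 84h 46h 16h. Long silences now cost two bytes instead of one byte per clock.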

End-Point-Absolute Storage

Time  stamping  requires  a  system 
timer  that  is  incremented  every  time 
a  MIDI  clock  or  timer  interrupt  is  re¬ 
ceived.  When  MIDI  data  is  enqueued, 
the  system  timer  is  copied  into  the 
queue  as  the  time  stamp.  When  the 
data  is  dequeued  for  storage,  the  time 
stamp  is  stored  with  the  data  record 
to  provide  an  accurate  absolute  tim¬ 
ing  reference.  Unlike  relative  timing, 
this  allows  you  to  locate  any  spot  in 
the  data  stream  without  counting  all 
embedded  timing  bytes.  To  replay 
time-stamped  data,  restart  the  system 
timer  and  wait  until  it  matches  the 
timing  bytes  of  the  first  item  in  the 
data  stream.  Then  send  the  data 
(without  the  time  stamp),  and  ad¬ 
vance  the  stream  pointers. 

Example  3,  page  37,  uses  only  2 
bytes  for  the  time  stamp  and  for  the 
system  timer.  More  practical  systems 
use  3  bytes  to  allow  more  than  two 
hours'  recording  time  before  the  tim¬ 
er  overflows  (at  96  pulses  per  quar¬ 
ter).  The  low-order  byte  is  increment¬ 
ed  until  it  reaches  96  (or  24  for  MIDI 
clock  timing).  Then  the  byte  is  reset 
and  the  count  propagates  through 
the  other  two  timer  bytes.  The  top 
timer  bytes  turn  over  at  127,  so  the 
MSBs  are  always  zero. 
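
The 3-byte timer can be modeled in C as below. The structure and function names are invented for the sketch, and the upper bytes are assumed to count from 0 through 127 before wrapping.

```c
#include <stdint.h>

#define TICKS_PER_QUARTER 96u     /* use 24 for MIDI clock timing */

/* Three-byte system timer; every byte keeps its MSB clear. */
typedef struct {
    uint8_t tick;                 /* 0 .. TICKS_PER_QUARTER - 1 */
    uint8_t mid;                  /* 0 .. 127 */
    uint8_t high;                 /* 0 .. 127 */
} SongTime;

/* Advance the timer by one tick, carrying into the upper bytes. */
void song_time_tick(SongTime *t)
{
    if (++t->tick < TICKS_PER_QUARTER)
        return;
    t->tick = 0;

    t->mid = (uint8_t)((t->mid + 1u) & 0x7Fu);
    if (t->mid != 0)
        return;

    t->high = (uint8_t)((t->high + 1u) & 0x7Fu);
}
```

At 96 ticks per quarter note the timer wraps after 128 x 128 = 16,384 quarter notes, which at 120 beats per minute is a little over two and a quarter hours.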

 f8h ----- f8h 94h 46h 32h ------- f8h 84h 46h 16h ------- f8h
 |         |                       |
 wait      wait 24th, then         wait 24th, then
 24th      output a NOTE-ON        output a NOTE-OFF
           on ch. 4                on ch. 4
           for note no. 46h        for note no. 46h
           velocity = 32h          OFF velocity = 16h

Example 2: End-point-relative storage

Single-Point-Absolute Storage

Single-point storage requires maintaining
a list of pointers to access active
notes in the data stream. When a
NOTE-ON is received, it is enqueued
and stored in the data stream in the
same way as in end-point-absolute
storage, but zeros are stored afterward
in a compartment reserved to
keep track of the note's duration. A
pointer to the zero bytes is stored in a
list to allow access to the duration. Every
time a MIDI clock byte or timer
interrupt is received, the list is
checked and the pointers are used to
increment duration bytes. When the
NOTE-OFF event is received, the pointer
is removed from the list; no other
action is necessary. The duration will
be frozen as part of the note record.
Single-point-absolute storage provides
advantages in editing (notes can
be moved easily) and display (it is
easy to tell whether a note is a quarter
note, half note, and so on).

In  Example  4,  below,  during  play¬ 
back  NOTE-ONs  are  derived  in  the 
usual  way  (wait  until  the  system  tim¬ 
er  reaches  the  embedded  time 
stamp),  but  this  time  the  duration 
bytes  are  retrieved  and  stored  in  a 
time-out  list.  They  are  decremented 
with  every  timer  tick,  and  when 
they  reach  zero,  a  NOTE-OFF  is  trans¬ 
mitted.  The  time-out  list  must  keep  a 
copy  of  the  note  number  so  that  the 
proper  note  can  be  turned  off.  The 
single-point method affords another
advantage: it is unlikely that notes
will  get  stuck.  NOTE-OFF  events  can¬ 
not  be  missed.  Any  reasonable  value 
in  the  time-out  list  will  eventually 
decrement  to  zero  and  cause  a  NOTE- 
OFF  to  be  sent. 
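
A playback time-out list along these lines could be sketched in C as follows. The list size, the fixed NOTE-OFF velocity of 40h, and the midi_send3() stand-in (which only records what would be transmitted) are assumptions for illustration.

```c
#include <stdint.h>

#define MAX_TIMEOUTS 32u          /* hypothetical list size          */
#define OFF_VELOCITY 0x40u        /* assumed fixed NOTE-OFF velocity */

typedef struct {
    uint8_t  in_use;
    uint8_t  channel;
    uint8_t  note;                /* copy of the note number */
    uint16_t remaining;           /* ticks until NOTE-OFF    */
} Timeout;

static Timeout timeouts[MAX_TIMEOUTS];

unsigned note_offs_sent;
uint8_t  last_off[3];

/* Stand-in for the UART output routine: records the message. */
static void midi_send3(uint8_t a, uint8_t b, uint8_t c)
{
    last_off[0] = a;
    last_off[1] = b;
    last_off[2] = c;
    note_offs_sent++;
}

/* Called when a NOTE-ON is played; returns 0 if the list is full. */
int timeout_add(uint8_t channel, uint8_t note, uint16_t duration)
{
    unsigned i;

    for (i = 0; i < MAX_TIMEOUTS; i++) {
        if (!timeouts[i].in_use) {
            timeouts[i].in_use    = 1;
            timeouts[i].channel   = channel & 0x0Fu;
            timeouts[i].note      = note & 0x7Fu;
            timeouts[i].remaining = duration ? duration : 1u;
            return 1;
        }
    }
    return 0;
}

/* Called on every timer tick: count down and send NOTE-OFFs. */
void timeout_tick(void)
{
    unsigned i;

    for (i = 0; i < MAX_TIMEOUTS; i++) {
        if (timeouts[i].in_use && --timeouts[i].remaining == 0) {
            midi_send3((uint8_t)(0x80u | timeouts[i].channel),
                       timeouts[i].note, OFF_VELOCITY);
            timeouts[i].in_use = 0;
        }
    }
}
```

Because every entry counts down to zero, a NOTE-OFF is always generated eventually, which is the stuck-note protection described above.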

Bar-and-Note  Storage 

So  far,  I  have  discussed  playing  notes 
but  not  rests.  Of  course,  rests  are  sim¬ 
ply  the  spaces  between  notes,  but  the 
storage  formats  outlined  do  not  pro¬ 
vide  a  way  of  tracking  down  these 
spaces  for  correlation  with  written 
music.  Rests  can  be  stored  in  the  same 
way  as  notes,  but  note  numbers  and 
velocities  are  not  needed.  This  seems 
to  be  a  reversion  to  the  original  rela¬ 
tive  timing  method  when  you  consid¬ 
er  that  rests  are  similar  to  the  old  em¬ 
bedded  relative  timing  markers.  Now 
the  NOTE-ON  time  stamps  become  re¬ 
dundant — where  one  note  or  rest 
stops,  the  next  will  start.  Without 
some  frame  of  reference,  though,  it  is 
difficult  to  find  a  designated  spot  in 
the  data  stream.  I  have  borrowed  an¬ 
other  device  from  written  music:  bar 
lines. There is no MIDI equivalent for
a bar line, so I use 0bah. The bar
marker is followed by 1- or 2-byte bar
numbers to allow absolute locations
to be found. This combines some of
the better features from all the methods
outlined earlier. I use 0a0h as the
token for a rest (see Example 5,
below).

Some  storage  formats  are  better 
suited  to  certain  approaches  to  edit¬ 
ing  or  to  certain  looping  constructs  or 
display  formats.  Choose  a  format  that 
suits  your  application,  but  remember 
to  take  as  general  an  approach  as  pos¬ 
sible.  You  will  undoubtedly  want  to 
expand  later  to  incorporate  new 
ideas. 

 ---- 94h 09h 03h 46h 32h --------- 84h 08h 04h 46h 16h ----
      |                             |
      output a NOTE-ON on ch. 4     output a NOTE-OFF on ch. 4
      on the ninth clock tick       on the eighth clock tick
      of the third beat             of the fourth beat
      for note no. 46h              for note no. 46h
      velocity = 32h                OFF velocity = 16h

Example 3: End-point-absolute storage

 ---- 94h 09h 03h 46h 32h --------- 04h 01h ----
      |                             |
      output a NOTE-ON on ch. 4     These are duration bytes that
      on the ninth clock tick       travel with the note record and are
      of the third beat             incremented by each timer tick
      for note no. 46h              until NOTE-OFF is received.
      velocity = 32h

Example 4: Single-point-absolute storage

 -- 0bah 32h ---- 0a0h 02h 14h ---- 94h 46h 32h ------- 01h 02h --
    |             |                 |                   |
    At bar #32h   rest              output a NOTE-ON    duration
                  for two beats     on ch. 4            is 1 beat
                  and 14h ticks     for note no. 46h    and 2 ticks
                                    velocity = 32h

Example 5: Bar-and-note storage

Quantization

I first mentioned quantization in the
context of scaling down the system
clock. If a section of music was recorded
using a clock with 96 pulses
per quarter note, the system clock
could simply be slowed to 24 pulses
per quarter note. Each timer interrupt
would then increment the system
timer by 4 instead of 1. This
method works, but it has one main
drawback. If there is no frame of reference
(an unquantized track, for instance),
it may not be noticed, but all
notes will effectively be shifted late
by the quantize interval. This is like
truncating a number when the real
intention is to round it.

To quantize events so that the notes
fall on the beat rather than after the
beat, the timing target must be anticipated
by half the amount of the
quantization period. This causes
events to be processed in the center
of a quantize window. To accomplish
this, the system timer contents are
copied to a look-ahead timer and incremented
so that it leads the actual
system time by half the quantize interval.
Using look-ahead timers for
compares will trigger events ahead
of time. Remember just to run the
clock routine every Nth clock tick
and to add N/2 to the current time to
derive the look-ahead timer.


The  same  result  can  be  accom¬ 
plished  by  first  running  the  clock 
routine  N/2  times  consecutively  (this 
accounts  for  look-ahead).  After  this 
initial  look-ahead,  the  clock  cycle 
consists  of  waiting  for  N  clock  periods 
and  then  running  the  clock  routine  N 
times  consecutively. 
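
The difference between the two approaches comes down to truncating versus rounding, which is easy to see in C. The function names below are illustrative only, and n (the quantize interval in ticks) is assumed to be greater than zero.

```c
#include <stdint.h>

/* One-sided quantization: snap the time down to a multiple of n.
   This corresponds to simply slowing the clock. */
uint32_t quantize_truncate(uint32_t t, uint32_t n)
{
    return (t / n) * n;
}

/* Centered quantization: add the N/2 look-ahead first, so the time
   snaps to the nearest multiple of n instead. */
uint32_t quantize_nearest(uint32_t t, uint32_t n)
{
    return ((t + n / 2u) / n) * n;
}
```

With n = 4 (96 pulses per quarter note reduced to 24), a note played at tick 7 is placed on tick 4 by the first function but on tick 8, the nearer grid point, by the second. A note at tick 5 is placed on tick 4 by both.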

Quantization  cleans  up  notes 
whose  timing  is  slightly  frayed  at  the 
edges,  but  some  mistakes  can  actually 
be accentuated. If the mistake is severe
enough to fall outside the intended
quantize window, note timing
will be rounded in the wrong
direction, as in note 3 of Figure 2, left.

I  mentioned  that  the  trailing  edges 
of  notes  are  usually  much  less  timing 
critical  than  the  leading  edges.  One  of 
my  favorite  techniques  for  avoiding  a 
mechanical  sound  is  quantizing  the 
leading  edges  of  notes  but  leaving  the 
lengths  intact.  This  can  be  accom¬ 
plished  by  using  a  look-ahead  on  the 
clock  that  pulls  the  NOTE-ON  event 
and  note  length  from  the  data 
stream.  The  unprocessed  note  length 
is  stored  in  a  time-out  list,  where  it  is 
decremented  on  each  tick  of  the 
high-resolution  clock.  The  appropri¬ 
ate  NOTE-OFF  is  sent  when  it  reaches 
zero. 

The  timing  of  recorded  music  is 
easily  altered  or  quantized,  but  ran¬ 
dom  timing  information  from  hu¬ 
man  input  is  difficult  to  add  later  (you 
could  try  it).  Some  synthesizer  sys¬ 
tems,  such  as  Fairlight,  use  high-per¬ 
formance  hardware  to  derive  timing 
clocks  as  high  as  384  clocks  per  quar¬ 
ter  note,  but  be  wary  of  micro-based 
software  that  claims  this  level  of  ac¬ 
curacy.  By  attempting  performance 
beyond  the  capability  of  the  ma¬ 
chine,  the  software  can  actually  sac¬ 
rifice  accuracy. 

Pilot  Track 

Most  musical  compositions  have 
verses,  choruses,  and  other  types  of 
sections.  Sections  are  usually  record¬ 
ed  or  written  separately.  When  all 
the  sections  are  completed,  they  are 
rearranged,  repeated,  and  so  on  by 
the  use  of  what  I  call  a  pilot  track. 
The  pilot  track  in  this  type  of  system 
operates  outside  the  time  frame  of 
the  composition.  In  fact,  it  may  be  the 
only  timing  stream  that  moves  in  a 
linear  fashion.  Macro  commands, 
such  as  play  section  3,  five  times,  can 
pilot  the  interpreter  through  the  ap¬ 
propriate  series  of  pointer  loads, 
plays,  and  reloads  to  accomplish  the 
task.  (See  Figure  3,  page  39.) 

The  start  of  each  section  now  be¬ 
comes  the  main  point  of  reference 
for  note  timing.  At  the  end  of  a  sec¬ 
tion,  the  system  timer  is  usually  reset 
and  the  pilot  track  is  consulted  to  find 
the  next  section  to  be  played.  Each 
section  is  treated  as  if  it  were  a  sepa¬ 
rate  composition,  so  each  can  contin¬ 
ue  up  to  the  maximum  length  al¬ 
lowed  by  the  system  timer. 
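
At its simplest, a pilot track is a list of section numbers with repeat counts. The following C sketch only expands such a list into the order in which sections would be played; the PilotStep structure and pilot_expand() are hypothetical, and a real interpreter would load the play pointers and reset the system timer at each step instead of writing to an array.

```c
#include <stddef.h>

/* One pilot-track command: "play this section this many times". */
typedef struct {
    int section;
    int repeats;
} PilotStep;

/* Expand the pilot track into a flat play order.
   Returns the number of entries written (at most out_max). */
size_t pilot_expand(const PilotStep *steps, size_t n_steps,
                    int *out, size_t out_max)
{
    size_t written = 0;
    size_t i;
    int r;

    for (i = 0; i < n_steps; i++) {
        for (r = 0; r < steps[i].repeats; r++) {
            if (written == out_max)
                return written;
            /* Real playback: reset the system timer, point the
               track pointers at this section, and play it. */
            out[written++] = steps[i].section;
        }
    }
    return written;
}
```

A pilot track of section 1 once, section 3 five times, and section 2 once expands to the play order 1, 3, 3, 3, 3, 3, 2.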

Some pilot-track schemes use a
Forth-type stack to hold reiterative
constructs and section or data references.
I have even seen a sequencer
that allows English statements such
as SECTION 3 = VERSE 1 + CHORUS. In
any case, the section is treated as a
subroutine, with control returning to
the pilot track when the section has
completed.

[Figure 2: Effects of quantization on note durations.
Panels: loose timing on a 3-note chord against the quantize
window; output with quantized NOTE-ON and NOTE-OFF; output with
quantized NOTE-ON only (duration unchanged).]


[Figure 3: Relating the pilot track to previously stored sections of a composition]

Advanced Features

If you've made it this far, you may be
interested in some of the extensions
to MIDI protocol that allow more rapid
transmission of event triggers.
Running status says that the receiver
keeps the most recently received
command byte. If additional data
bytes are received without a leading
command byte, the old command
byte is used.

To transmit NOTE-ONs on channel 3
for note numbers 22h, 33h, and 44h
with velocities of 55h, 66h, and 77h,
the bytes shown in Example 6, page
41, could be sent. Duplicating the
same series with an 83h as the first
byte turns the notes off again.

To  make  this  feature  more  useful, 
the  special  condition  NOTE-ON  with 
velocity  =  0  is  reserved  to  signal  a 
NOTE-OFF  operation.  Its  function  is 
identical  to  the  normal  NOTE-OFF,  but 
because  it  is  actually  a  NOTE-ON  com¬ 
mand,  the  running  status  rule  ap¬ 
plies.  The  same  notes  could  be  turned 
on  and  off  again  by  the  bytes  shown 
in  Example  7,  page  41. 

Remember  that  NOTE-OFFs  won’t 
actually  be  sent  immediately  after 
NOTE-ONs.  If  any  other  command 
bytes  are  sent  in  between,  the  run¬ 
ning  status  is  interrupted  and  the 
command  byte  must  be  retrans¬ 
mitted. 
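
On the transmitting side, running status amounts to remembering the last command byte sent and omitting it when it repeats. The C sketch below shows the idea; the MidiOut structure and function names are made up for illustration, and a real transmitter would also clear the remembered byte whenever some other command is sent in between.

```c
#include <stdint.h>
#include <stddef.h>

/* Transmitter state: the last command byte sent (0 = none yet). */
typedef struct {
    uint8_t running;
} MidiOut;

/* Encode a 3-byte message into buf, leaving out the command byte
   when it matches the running status. Returns bytes written. */
size_t midi_encode(MidiOut *o, uint8_t status,
                   uint8_t d1, uint8_t d2, uint8_t *buf)
{
    size_t n = 0;

    if (status != o->running) {
        buf[n++] = status;
        o->running = status;
    }
    buf[n++] = d1 & 0x7Fu;
    buf[n++] = d2 & 0x7Fu;
    return n;
}

size_t midi_note_on(MidiOut *o, uint8_t ch, uint8_t note,
                    uint8_t vel, uint8_t *buf)
{
    return midi_encode(o, (uint8_t)(0x90u | (ch & 0x0Fu)), note, vel, buf);
}

/* NOTE-OFF expressed as NOTE-ON with velocity 0, so that running
   status carries over from preceding NOTE-ONs. */
size_t midi_note_off(MidiOut *o, uint8_t ch, uint8_t note, uint8_t *buf)
{
    return midi_note_on(o, ch, note, 0, buf);
}
```

Turning the three notes on and then off again this way takes 13 bytes (one 93h followed by six pairs of data bytes) instead of the 18 needed when every message carries its own command byte.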

Display  Methods 

Storage  methods  can  be  related  to  dis¬ 
play  methods  in  that  the  note  events 
can  be  displayed  at  one  spot  (as  a  con¬ 
ventional  music  note)  or  they  can  be 
displayed  as  a  (usually  horizontal) 
band  on  the  screen  with  the  positions 
of  the  start  and  end  points  represent¬ 
ing  the  start  and  end  times  of  the 
note.  The  latter  method,  sometimes 
known  as  piano-scroll  notation  (be¬ 
cause  of  the  similarity  to  a  player  pi¬ 
ano  scroll),  is  analogous  to  the  end¬ 
point  storage  method  I  have  outlined. 
So,  do  software  designers  who  use 
single-point  storage  use  conventional 
notation  and  designers  who  use  end¬ 
point  storage  use  piano-scroll  nota¬ 
tion?  Of  course  not.  Strangely 
enough,  some  of  the  more  popular 
software  packages  use  exactly  the  op¬ 
posite  techniques  from  those  you’d 
expect. 

Both display methods have advantages: piano-scroll notation can be
read by musicians who are not accustomed to reading music, whereas
conventional notation maintains high information density. Also,
conventional notation always requires graphics capability, whereas
piano-scroll can usually be done with text-mode graphics.

Displaying notes on staff lines requires at least 4 or 5 vertical
pixels per line on the staff, for a total of 17-21 pixels for a
complete staff line (4 line spacings of 4-5 pixels each, plus one
extra line). I have seen sheet music with notes printed as far as six
spaces above or below the staff, so it is wise to allow a lot of blank
space on each side of a staff line. Allowing 60 vertical pixels total
per staff should yield 5 or 6 staff lines on a high-resolution screen.

Horizontal  formats  are  usually  best 
handled  and  allocated  by  the  byte.  A 
screen  640  pixels  across  would  divide 
into  80  horizontal  compartments 
with  screen  objects  treated  some¬ 
what  like  ASCII  characters  but  with 
variable  widths  (a  G  clef  requires  two 
or  three  character  widths). 
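The screen arithmetic in the last two paragraphs can be checked with a few lines of Python. This is only a sketch: the function names are invented here, the 350-row screen height is an assumed example of a high-resolution display, and 8 pixels per byte assumes a one-bit-per-pixel layout.

```python
def staff_height(pixels_per_spacing):
    """Rows covered by one five-line staff: four line spacings plus the last line."""
    return 4 * pixels_per_spacing + 1


def staves_per_screen(screen_rows, rows_per_staff=60):
    """Whole staves that fit when each gets rows_per_staff rows, margins included."""
    return screen_rows // rows_per_staff


def compartments_per_row(screen_width, pixels_per_byte=8):
    """Byte-wide horizontal compartments across one row of the screen."""
    return screen_width // pixels_per_byte


heights = (staff_height(4), staff_height(5))   # the 17-21 pixel range
staves = staves_per_screen(350)                # assumed 350-row display
columns = compartments_per_row(640)            # the 80 compartments
```

Allocating by the byte keeps symbol placement to whole-byte writes, at the cost of restricting every symbol to a multiple of 8 pixels in width.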

Despite  the  popularity  of  IBM  com¬ 
puters,  the  CGA's  unfortunate  lack  of 
adequate  screen  resolution  has  limit¬ 
ed  its  use  for  staff-line-oriented  edi¬ 
tors  or  forced  earlier  sequencers  to 
use  Hercules  or  other  non-IBM  graph¬ 
ics  boards.  The  CGA  screen  produces 
square-looking  notes,  so  convention¬ 
al  notation  on  this  screen  may  not  be 
worthwhile.  I  find  the  EGA  graphics 
board  extremely  slow,  but  color  is  a 
valuable  tool  for  displaying  music. 
Notes  on  a  staff  line  can  be  color-cod¬ 
ed  to  designate  channel  numbers  (to 
allow  more  than  one  channel  per 
staff  line).  Monochrome  editors 
sometimes  allow  selection  between 
two  channels  per  staff  by  directing 
note  stems  up  or  down. 

Display  formats  will  depend  large¬ 
ly  on  the  display  devices  you  have  on 
hand  or  wish  to  support.  Although 
EGA  boards  have  finally  made  the  IBM 
PC  competitive  with  other  computers 
when  displaying  color,  you  may  only 
want  to  enter  musical  notes  and  then 
print  them  out  on  paper.  Some  music 
transcribers  are  using  Jim  Miller's 
software  without  ever  buying  MIDI 
hardware  for  their  computers. 
When doing black-and-white printouts, a Hercules mono graphics card
will obviously suffice. The Hercules board has 720 horizontal pixels,
which in practice shows noticeably more music than the EGA's 640
pixels (sometimes a full bar more).

Note-event  editors  take  many  dif¬ 
ferent  forms,  but  list-oriented  editors 
are  the  easiest  to  design,  followed  by 
piano-scroll  editors.  List  editors  can 
simply  convert  a  list  of  note  numbers 
into  NOTE-ON  and  NOTE-OFF  events. 
This  may  be  adequate  if  the  main 
function  of  the  software  is  to  record 
live  music  being  played  on  a 
synthesizer. 




Most  piano-scroll  editors  use  strict 
binding  between  screen  position  and 
note  events.  The  screen  is  sometimes 
partitioned  into  an  even  bar  of  music, 
for  example.  The  x-axis  position  of 
the  cursor  then  relates  directly  to  the 
music’s  time  domain. 
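A strict binding of this kind amounts to a pair of integer conversions. The sketch below assumes one bar per screen, an 80-column display, and 96 timing ticks per bar (four quarter notes at 24 ticks each); all of these values and both function names are illustrative assumptions rather than details of any particular editor.

```python
def column_to_tick(column, screen_columns, ticks_per_bar, bar_index=0):
    """Map a cursor column to an absolute time in ticks.

    The screen shows exactly one bar, so each column covers an equal
    slice of that bar's duration.
    """
    if not 0 <= column < screen_columns:
        raise ValueError("cursor is off the screen")
    return bar_index * ticks_per_bar + column * ticks_per_bar // screen_columns


def tick_to_column(tick, screen_columns, ticks_per_bar):
    """Map an absolute time in ticks to the column where it is drawn."""
    return (tick % ticks_per_bar) * screen_columns // ticks_per_bar


halfway = column_to_tick(40, 80, 96)                 # middle of the first bar
third_bar = column_to_tick(40, 80, 96, bar_index=2)  # same column, third bar
```

Because the mapping is linear, no table of pointers between screen and data is needed, which is a large part of why piano-scroll editors are simpler to write than staff-line editors.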

Staff-line  editors  usually  require  a 
complex  series  of  pointers  to  corre¬ 
late  records  in  the  data  stream  with 
locations  on  the  screen.  Using  a  cur¬ 
sor  to  locate  and  change  a  point  with¬ 
in  the  music  data  stream  can  be  a  dif¬ 
ficult  task  because  the  screen  spacing 
may  be  nonlinear  when  related  to 
the  time  domain  (a  bar  consisting  of  a 
single  whole  note  will  be  shorter 
than  a  bar  that  holds  a  series  of  16th 
notes). Many staff-line editors do require vertical alignment of
synchronized events, such as left- and right-hand piano parts that
are written on separate staffs. This allows the screen
to  be  swept  from  left  to  right  with  a 
single  x-axis  pointer,  but  strange  tim¬ 
ing  errors  are  introduced  by  the  con¬ 
verter  when  vertical  alignment  is  not 
maintained.  If  you  are  writing  this 
type  of  editor/converter,  I  recom¬ 
mend  keeping  a  separate  x-axis 
pointer  for  each  staff  line.  Staff-ori¬ 
ented  editors  are  further  complicat¬ 
ed  by  the  need  for  multiple  data  rep¬ 
resentations.  The  on-screen  symbols 
usually  must  be  transformed  into  a 
format  that  is  quite  different  in  order 
to  play  them  as  MIDI  notes. 

A piano-scroll editor is a good starting point for experienced
software writers who have limited knowledge of traditional musical
concepts. Only the most experienced writers should attempt to write a
staff-line editor, as it requires a thorough knowledge of both music
theory and programming.

Writing  Your  Own 
Patch  Editor 

One  of  my  current  projects  is  a  patch 
editor  for  the  Kawai  K-3  synthesizer, 
so  I  can  explain  exactly  how  patch 
editing  works.  The  K-3  generates 
sound  by  building  waveforms  from 
sine-wave  harmonics.  These  wave¬ 
forms  can  be  manipulated  from  the 
K-3's  front  panel,  but  the  addition  of  a 
computer  screen  offers  an  enormous 
advantage  in  visualizing  sounds  as 
waveforms  are  being  altered.  In  my 
wave  editor,  color-coded  bars  move 
to  indicate  harmonic  numbers  and 
amplitudes.  As  the  operator  tailors 
the  waveform  by  adjusting  the  on¬ 
screen  representation,  an  internal  ta¬ 
ble  of  values  is  also  adjusted.  Just  as  in 
a  text  editor,  different  versions  of  the 
data  can  be  saved  to  disk  as  the  wave¬ 
forms  are  being  edited.  The  K-3  holds 
only  1  internal  user-defined  wave, 
but  I  hold  up  to  100  waves  within  one 
file  so  they  can  be  compared  and 
interchanged. 

When a patch or wave is ready to be sent to the K-3, a MIDI SYSTEM
EXCLUSIVE command tells the synthesizer to expect new patch
information. No acknowledgment is necessary; the harmonic numbers and
amplitudes are sent out immediately. Transmitted data can be of any
length, but 8-bit data must be sent as two separate nybbles to ensure
that the MSB of every data byte is reset (0). Because of the
potentially long data stream, Kawai requires a checksum for
confirmation, but this is entirely up to the manufacturer. The data
packet is closed by an END OF EXCLUSIVE byte, which lets the
synthesizer get back to music processing. The bytes sent to the K-3
to set up a wave look like those shown in Example 8, below.
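The nybble splitting and framing can be sketched in Python as follows. The header bytes are copied from Example 8 and their interpretation follows that example's labels; the function names are hypothetical, and the checksum is deliberately left out because its algorithm is not given in this article.

```python
SYSEX = 0xF0   # start of system exclusive
EOX = 0xF7     # end of exclusive

# Bytes that follow SYS-EX in Example 8, labeled there as: Kawai ID,
# channel, function, group, K-3 ID, internal wave.
K3_WAVE_HEADER = [0x40, 0x00, 0x20, 0x00, 0x01, 0x64]


def to_nybbles(value):
    """Split one 8-bit value into two data bytes, high nybble first."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in 8 bits")
    return [value >> 4, value & 0x0F]


def build_wave_message(pairs):
    """Frame (harmonic, amplitude) pairs as a system-exclusive packet.

    Sketch only: the checksum the article mentions is omitted.
    """
    body = []
    for harmonic, amplitude in pairs:
        body += to_nybbles(harmonic) + to_nybbles(amplitude)
    message = [SYSEX] + K3_WAVE_HEADER + body + [EOX]
    # Every byte between SYS-EX and EOX must have its top bit clear.
    assert all(b < 0x80 for b in message[1:-1])
    return message


message = build_wave_message([(0x01, 0x05), (0x19, 0x1F)])
```

Splitting each value costs twice the transmission time, but it guarantees that no data byte can be mistaken for a command byte.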

Data  formats  for  send  and  for  re¬ 
ceive  are  identical,  so  two  synthesiz¬ 
ers  can  be  hooked  together  for  trad¬ 
ing  waveforms  or  waves  can  be  sent 
to  the  computer,  modified,  and  then 
sent  back  to  the  K-3. 

Every  manufacturer  and  synthe¬ 
sizer  has  a  different  format  for  sys¬ 
tem-specific  commands.  Consult  syn¬ 
thesizer  manuals  for  details. 

Choosing a Synthesizer

If you will be testing your MIDI software by recording music in real
time, you will need a keyboard or other source of MIDI note messages.
If cost is a factor, look at Casio's CZ-101 (miniature keyboard) or
CZ-1000 (larger version). In a slightly higher price bracket, the
Kawai K-3 has a good performance-to-cost ratio. Yamaha has developed
a wide range of FM instruments, and its DX-7 series is one of the
best-selling series of synthesizers in the over-$1,000 range. KORG
also makes several affordable machines.


  93h        22h    55h     33h    66h     44h    77h
  NOTE-ON    note#  vel     note#  vel     note#  vel
  cmd, Ch 3  first note     second note    third note

Example 6: Transmitting individual notes over MIDI

  93h      22h   55h   33h   66h   44h   77h   22h   0     33h   0     44h   0
  NOTE-ON  note# vel   note# vel   note# vel   note# vel   note# vel   note# vel
  Ch 3     first_note  second_note third_note  first_note  second_note third_note

Example 7: A MIDI note stream


  0f0h    40h    0h    20h   0h   1h   64h
  SYS-EX  KAWAI  CHAN  FUNC  GRP  K3   INTRN WAVE

  0h 1h   0h 5h   ...          1h 9h     1h 0fh    0f7h
  HARM=1  AMP=5   (more data)  HARM=19h  AMP=1fh   EOX

Example 8: Example data stream to initialize Kawai wave table



If  you  are  interested  only  in  patch 
editors,  or  if  you  already  own  a  MIDI 
keyboard,  consider  using  a  stand¬ 
alone  sound  generator.  These  are  be¬ 
coming  more  popular  because  one 
MIDI  keyboard  can  control  several 
slaved  sound  generators.  Kawai  has 
introduced  a  keyboardless  version  of 
its  K-3  synthesizer,  and  Yamaha  has 
just  introduced  a  module  (the  FB-01) 
for  less  than  $400.  Low-cost,  rack¬ 
mounted  sampling  units  include  the 
AKAI  model  S612  and  the  Ensoniq 
Mirage. 

The  eight-voiced  FB-01  and  the 
four-voiced  Casio  CZ-101  can  both  as¬ 
sign  separate  tone  patches  to  each 
voice.  In  other  words,  a  piano  sound, 
a  violin  sound,  and  a  horn  sound  can 
all  be  played  at  once.  This  is  an  ad¬ 
vantage  to  software  designers  be¬ 
cause  the  module  can  be  made  to  re¬ 
spond  like  multiple  synthesizers.  It  is 
difficult  to  assign  or  prioritize  multi¬ 
ple  voices  with  a  single  keyboard,  so 
the  Casio  allows  this  MONO-MODE  op¬ 
eration  only  when  sounds  are  played 
and  assigned  by  an  external  comput¬ 
er.  The  FB-01,  of  course,  has  no 
keyboard. 

Synthesizer  keyboards  generally 
have  nonweighted  plastic  keys.  If 
you  prefer  a  keyboard  that  feels 
more  like  an  acoustic  piano,  you  may 
want  to  invest  in  one  of  the  higher- 
quality  MIDI  keyboard  controllers, 
such  as  the  Roland  MKB-1000  or  Ya¬ 
maha  KX-88.  Because  these  are  out¬ 
put-only  keyboards  (they  have  no 
sound-generation  electronics),  they 
must  be  used  in  conjunction  with  an 
external  MIDI-controlled  sound 
generator. 

The  sound  of  a  synthesizer  de¬ 
pends  both  on  its  electronics  and  on 
its  programming,  so  it  is  impossible  to 
categorize  every  type  of  instrument. 
There  are  some  guidelines,  however. 
Oscillators  can  be  analog  or  digital. 
The  difference  is  somewhat  like 
comparing  records  and  compact 
discs:  digital  oscillators  are  precise 
and  they  can  produce  a  wide  range 
of  timbres,  but  some  say  that  they 
lack  the  warmth  of  analog  oscillators. 
Analog  machines,  such  as  the  Roland, 
Moog,  or  Oberheim  synthesizers,  are 
known  for  producing  rich  string 
patches  or  resonant  brass  sounds. 


Digital  machines,  such  as  the  Yamaha 
line,  excel  at  more  percussive  sounds, 
like  pianos  or  bells.  Samplers  can  cap¬ 
ture  breathy,  human  or  flutelike 
voices.  Musicians  often  use  different 
types  of  synthesizers  to  cover  differ¬ 
ent  ranges  of  tonalities,  but  the 
ranges  overlap  quite  a  bit.  Listen  to  as 
many  patches  as  possible  before 
making  your  choice. 

SMPTE  (and  Other 
Time  Codes) 

Computerized  recording  is  becoming 
popular,  but  the  accepted  medium 
for  interchange  of  completed  music 
is  still  audio  tape.  Tape  is  necessary 
for  recording  voices,  guitars,  and 
acoustic  instruments  and  for  trans¬ 
porting  sounds  between  studios  that 
have  different  types  of  synthesizers 
or  drum  machines.  Synchronizing 
tape-based  and  computer-based  re¬ 
cording  media  can  be  difficult.  Fortu¬ 
nately,  a  solution  to  many  of  the 
problems  already  exists  in  the  form 
of  time  code. 

Time  codes  of  various  types  have 
been  in  use  for  many  years.  They  are 
used  to  lock  audio  recorders  to  video 
machines  for  movie  sound  tracks  and 
for  time  stamping  video  tape  for  TV 
news.  One  of  the  most  popular  forms 
of  time  code  was  devised  by  NASA  as  a 
simple,  fixed  time  reference  for  its 
experiments.  It  used  recordable  au¬ 
dio-range  pulses  in  a  format  known 
as  biphase  modulation.  Biphase  uses 
clock  transitions,  rather  than  states  of 
polarity,  to  encode  binary  data.  This 
means  that  the  output  will  never  be  a 
nonrecordable  DC  voltage.  The  80-bit 
serial  data  stream  encodes  time  as 
hours,  minutes,  seconds,  frames,  and 
subframes.  The  last  two  increments 
are  arbitrary  values  that  vary,  de¬ 
pending  on  usage,  but  even  this 
vague  specification  was  sufficient  to 
merit  acceptance  in  a  wide  range  of 
applications, especially video and film. It has come to be known as
SMPTE code, after the Society of Motion Picture and Television
Engineers.
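The idea of encoding data in transitions rather than levels can be sketched in a few lines of Python. The sketch assumes the biphase-mark convention (a transition at every bit boundary, plus a mid-bit transition for a 1) and represents the signal as two half-cell levels per bit; the function names are invented for this example.

```python
def biphase_encode(bits, level=0):
    """Encode bits as biphase-mark half-cell levels (two levels per bit)."""
    out = []
    for bit in bits:
        level ^= 1          # transition at the start of every bit cell
        first_half = level
        if bit:
            level ^= 1      # a 1 adds a second transition in mid-cell
        out += [first_half, level]
    return out


def biphase_decode(levels):
    """Recover bits: a cell whose two halves differ is a 1, otherwise a 0."""
    return [int(levels[i] != levels[i + 1]) for i in range(0, len(levels), 2)]


bits = [1, 0, 0, 1, 1, 0]
levels = biphase_encode(bits)
inverted = [1 - x for x in levels]   # the same signal with polarity swapped
```

Decoding the inverted signal gives the same bits, and even a long run of 0s keeps toggling once per cell, which is why the output never settles into a steady DC level.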

SMPTE  code  is  based  on  a  fixed  time- 
of-day  clock,  rather  than  a  variable 
rate,  so  it  is  not  the  ideal  code  for  re¬ 
solving  the  fine  nuances  of  musical 
timing.  Hardware  synchronizers,  in¬ 
corporating  complex  frequency  mul¬ 
tipliers  and  phase-locking  schemes, 
must  be  used  to  correlate  MIDI  tempo 
timing  and  SMPTE  absolute  timing. 
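For the simple case of a fixed tempo, the relationship between the two time bases is plain arithmetic, as the following Python sketch shows. It assumes 24 MIDI timing clocks per quarter note, an integer tempo in beats per minute, and 30 frames per second; the function name and the frame rate are assumptions for illustration. Real music changes tempo, which is why hardware synchronizers are needed in practice.

```python
from fractions import Fraction


def clocks_to_smpte(clocks, tempo_bpm, frames_per_second=30,
                    clocks_per_quarter=24):
    """Convert a count of MIDI timing clocks at a fixed tempo to
    (hours, minutes, seconds, frames), truncating to a whole frame."""
    # Exact elapsed seconds: clocks / clocks_per_quarter beats at tempo_bpm.
    seconds = Fraction(clocks * 60, tempo_bpm * clocks_per_quarter)
    total_frames = int(seconds * frames_per_second)
    frames = total_frames % frames_per_second
    total_seconds = total_frames // frames_per_second
    return (total_seconds // 3600,
            total_seconds // 60 % 60,
            total_seconds % 60,
            frames)


one_beat = clocks_to_smpte(24, 120)          # one quarter note at 120 BPM
one_minute = clocks_to_smpte(24 * 120, 120)  # 120 quarter notes at 120 BPM
```

Using exact fractions avoids the drift that accumulates when tick counts are repeatedly converted through floating-point seconds.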




Some  of  the  first  sync  boxes  for 
SMPTE-MIDI  conversion  were  from 
Roland  (SBX-80)  and  from  Garfield 
Electronics.  It  is  difficult  to  calculate 
rates  and  match-up  points  for  the 
two  time  codes,  so  both  of  these  units 
take  the  more  practical  approach  of 
building  a  map  of  alignments  as  the 
time  codes  are  received.  The  map  is 
then  stored  to  tape  or  to  disk  via  MIDI. 

An important step in the development of SMPTE-to-MIDI standards is
the map formats being proposed by SMPTE synchronizer manufacturers
such as Adams-Smith. Adams-Smith's new Zeta 3 system allows commands
from MIDI or RS-232 to control a tape machine or lets time codes from
the tape machine be translated back to MIDI format. Two tape
transports and a variety of sequencers and computers can be operated
from a single Zeta 3 synchronizer. In actual operation, machines are
synchronized by striping one of the tape tracks on each machine with
SMPTE code. The tape controller reads these tracks and fine-tunes
motor speeds on the transports. The Zeta 3 controller also outputs
MIDI timing bytes to keep sequencers in step with the tape.

Another type of time code in common use is FSK, or frequency shift
keying, which encodes 0s and 1s as two different frequencies. A major
drawback to FSK is the lack of enough resolution to provide any form
of embedded absolute time reference. FSK tapes must always be started
from a known reference point because FSK is a relative timing
reference.

Incorporation of SMPTE control or provision for some kind of
sync-to-tape can be a big advantage when marketing software. It may
only be necessary to stay compatible with support hardware marketed
by other companies.

Summary

By providing a bridge between the music and computer industries, MIDI
has sparked new interest in the design of innovative musical
instruments. It has, in fact, created its own industry. Many
competent software engineers are becoming interested in music because
of this accessibility, and better products are being introduced every
day.

Make sure you look at a few of the commercially available sequencers
or patch editors before you start writing software. If you see
something conspicuously absent on all the sequencers you encounter
(such as a built-in universal patch editor), chances are that it is
difficult to design. There are some imaginative designers working
with MIDI, and some impossible things can be accomplished with a new
approach or just a lot of work. Visit one of the larger music stores
to find out what is currently on the market.




Availability

All the source code for articles in this issue is available on a
single disk. To order, send $14.95 to Dr. Dobb's Journal, 501
Galveston Dr., Redwood City, CA 94063 or call (415) 366-3600, ext.
216. Please specify the issue number and format (MS-DOS, Macintosh,
Kaypro).

Resources

IMA (Intl. MIDI Association)
12543 Hortense Ave., Studio City, CA

IMUG (Intl. MIDI User's Group)
P.O. Box 593, Los Altos, CA 94022

SMPTE (Society of Motion Picture and Television Engineers)
595 W. Hartsdale Ave., White Plains, NY 10607

Electronic Musician
2608 Ninth St., Berkeley, CA 94710
Subscription Dept.: 5615 Cermak Rd., Cicero, IL 60650

Keyboards Computers and Software
299 Main St., Northport, NY 11768

Keyboard Magazine
P.O. Box 2110, Cupertino, CA 95015

Vendors

Adams-Smith
34 Tower St., Hudson, MA 01749; (617) 562-3801

AKAI Professional Div.
P.O. Box 2344, Fort Worth, TX 76113

Casio Inc.
15 Gardner Rd., Fairfield, NJ 07006; (201) 575-7400

Doctor T's Music Software
220 Boylston, Ste. 306, Chestnut Hill, MA 02172; (617) 224-6954

E-Mu Systems Inc.
1600 Green Hills Rd., Scotts Valley, CA 95066; (408) 438-1921

Ensoniq Corp.
263 Great Valley Pkwy., Malvern, PA 19355; (215) 647-3930

Fairlight Instruments
2945 Westwood Blvd., Los Angeles, CA 90064; (213) 470-6280

Garfield Electronics
P.O. Box 1941, Burbank, CA 91507; (213) 434-6643

Hybrid Arts Inc.
11920 W. Olympic Blvd., Los Angeles, CA 90064; (213) 826-3777

Jim Miller (Personal Composer)
P.O. Box 648, Honaunau, HI 96726; (808) 328-9518

Kawai America Corp.
2055 E. University Dr., C.S. 9045, Compton, CA 90224-9045; (213) 534-2350

KORG USA Inc.
89 Frost St., Westbury, NY 11590; (516) 333-9100

Kurzweil Music Systems Inc.
411 Waverly Oaks Rd., Waltham, MA 02154-8464; (617) 893-5900

Lexicon Inc.
60 Turner St., Waltham, MA 02154; (617) 891-6790

Mark of the Unicorn
222 Third St., Cambridge, MA 02142; (617) 576-2760

Mimetics Corp.
P.O. Box 60238, Station A, Palo Alto, CA 94306; (408) 741-0117

Moog Electronics
2500 Walden Ave., Buffalo, NY 14225-4799; (716) 681-7242

New England Digital Corp.
49 N. Main, P.O. Box 546, White River Junction, VT 05001; (802) 295-5800

Optical Media International
P.O. Box 2107, Aptos, CA 94301; (408) 662-1772

Oberheim, A Division of ECC Development Corp.
11650 W. Olympic Blvd., Los Angeles, CA 90064; (213) 479-4948

Opcode Systems
444 Ramona St., Palo Alto, CA 94301; (415) 321-8977

Passport Designs Inc.
625 Miramontes St., Ste. 103, Half Moon Bay, CA 94019; (415) 726-0280

Roland Corp. US
7200 Dominion Circle, Los Angeles, CA 90040-36477; (213) 685-5141

Sequential Circuits Inc.
3051 N. First St., San Jose, CA 95134; (408) 946-5240

Voyetra Technologies
426 Mt. Pleasant Ave., Mamaroneck, NY 10543; (914) 698-3377

Yamaha International Corp.
6600 Orangethorpe Ave., Buena Park, CA 90622; (714) 522-9011




ARTICLES 


Dimensional 
Data  Types 


by  Do-While  Jones 


Back  in  the  days  when  memo¬ 
ry  was  expensive,  comput¬ 
ers  were  slow,  and  compil¬ 
ers  weren't  smart  enough  to 
optimize  code,  it  was  necessary  to 
write  concise,  clever  programs.  The 
resulting  programs  were  efficient, 
but  they  were  cryptic,  which  made  it 
difficult  to  modify  the  code  without 
adding  new  bugs.  Managers  prized 
their  few  brilliant  programmers 
who  could  write  and  maintain  effi¬ 
cient  code. 

But  then  memory  became  cheap¬ 
er,  computers  ran  faster,  compilers 
became  better,  and  genius  program¬ 
mers  started  changing  jobs  whenev¬ 
er  someone  offered  them  more  mon¬ 
ey.  Managers  discovered  they  were 
spending  more  money  on  software 
than  they  were  on  hardware. 

The  old  values  have  changed.  Com¬ 
pact  code  is  no  longer  necessary  or 
desirable.  Instead,  managers  are 
looking  for  code  that  can  be  easily  un¬ 
derstood.  This  makes  the  code  cheap¬ 
er  to  validate  and  maintain  and 
makes  it  possible  for  a  team  of  pro¬ 
grammers  to  work  together  efficient¬ 
ly  and  get  their  software  product  to 
the  market  (or  battlefield)  first.  Peo¬ 
ple  who  write  programs  that  are  so 
complex  that  nobody  else  can  under¬ 
stand  them  are  no  longer  an  asset  to 
an  organization.  The  big  money  is 
now  starting  to  go  to  people  who  can 
write  clear  code. 

Do-While Jones, 324 Traci Lane, Ridgecrest, CA 93553. Do-While is
currently teaching Ada programming. He is a columnist for the Journal
of Pascal, Ada, & Modula-2.

One way to make programs easier to understand is to use dimensional
units (centimeters, grams, seconds, and so on) as data types. The
bibliography lists some articles that show that there has been some
interest lately in extending the syntax of high-level languages to
include dimensional units.

You  don't  have  to  wait  for  someone 
to  invent  a  new  high-level  language 
with  built-in  dimensional  data  types 
because  one  already  exists.  The  Ada 
programming  language  has  features 
that  make  it  possible  to  invoke  di¬ 
mensional  data  types  simply  by  ref¬ 
erencing  a  library  unit.  But  before  I 
show  you  how,  let  me  take  a  little 
time  out  to  show  you  why  people 
feel  a  need  for  dimensional  units. 

The  Need  for  Dimensional 
Data  Types 

Example  1,  page  51,  shows  a  simple 
Ada  program  that  could  be  used  in  a 
microprocessor-controlled  radar 
speed  gun.  The  program  as  it  stands  is 
perfectly  legal,  and  presumably  it 
would  work  if  I  added  some  instruc¬ 
tions  that  really  measured  frequen¬ 
cies  and  showed  the  results  on  a  dis¬ 
play.  This  program  is  an  example  of 
bad  programming  practice,  though, 
because  it  is  difficult  to  validate  and 
maintain. 

If someone handed you Example 1 and asked you to determine if it were
correct, could you? If someone asked you to change it so it gave
answers in feet per second, how difficult would it be for you to make
the change? What makes this program difficult to validate and
maintain is the cryptic number 335,300,000. Where did it come from?
What does it mean? Is it correct?

Example  1  is  too  short  to  do  justice 
to  the  problem,  though,  as  it  contains 
only  one  equation.  You  can  concen¬ 
trate  your  attention  on  that  one  line, 
and  if  you  can't  figure  it  out,  you  can 
rewrite  the  whole  program  and  you 
haven't  lost  much.  In  the  real  world, 
a  tactical  embedded  computer  pro¬ 
gram  has  dozens,  or  hundreds,  or 
maybe  thousands  of  equations.  This 
complexity  makes  it  much  more  dif¬ 
ficult  to  figure  out,  and  rewriting  the 
program  from  scratch  is  out  of  the 
question.  Although  Example  1 
doesn't  show  the  magnitude  of  the 
problem,  I  hope  it  gives  you  an  idea 
of  the  kind  of  difficulties  that  you  can 
encounter  when  ambiguous  data 
types  such  as  float  are  used. 
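For readers who want to check the constant, one plausible reading (an assumption at this point in the article, not something stated so far) is that 335.30e6 is half the speed of light expressed in miles per hour, as the usual two-way radar Doppler relation would require. The Python sketch below rebuilds the number from named quantities; the function names are invented for this illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458   # exact, by definition of the meter
METERS_PER_MILE = 1609.344             # exact, international mile
SECONDS_PER_HOUR = 3600


def doppler_constant_mph():
    """Half the speed of light in miles per hour.

    For a reflected signal the Doppler shift is 2 * v * f_transmit / c,
    so v = (c / 2) * f_doppler / f_transmit.
    """
    c_mph = SPEED_OF_LIGHT_M_PER_S * SECONDS_PER_HOUR / METERS_PER_MILE
    return c_mph / 2


def speed_mph(doppler_hz, transmit_hz):
    """Target speed in miles per hour from the two measured frequencies."""
    return doppler_constant_mph() * doppler_hz / transmit_hz


constant = doppler_constant_mph()
relative_error = abs(constant - 335.30e6) / 335.30e6
```

Under that reading the magic number is correct to four significant figures and SPEED comes out in miles per hour, but nothing in the declaration of SPEED as float says so, which is exactly the author's complaint.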

How  Dimensional 
Data  Types  Help 

If  you  write  the  program  using  di¬ 
mensional  units  as  data  types,  the 
ambiguity  problem  goes  away.  Not 
only  that,  the  compiler  can  catch  ob¬ 
vious  mistakes  at  compile  time. 

Even though Example 1 uses descriptive variable names and there is no
question in your mind what SPEED represents, you don't know if it is
calculated in feet per second, miles per hour, or kilometers per
hour. On the other hand, if the objects in the program were declared
using dimensional units, you would know everything you needed to know
about them. For example:

    TRANSMIT_FREQUENCY,
    DOPPLER_FREQUENCY : Hertz;
    SPEED : Miles_per_hour;

The  Ada  programming  language 
allows  you  to  derive  new  data  types 
from  existing  ones.  Therefore  you 
could  say: 

type  Hertz  is  new  float; 

type  Miles_per_hour  is  new  integer; 

Then, assuming you had also declared RECEIVED_FREQUENCY to be of type
Hertz, Ada would let you write this statement:

    DOPPLER_FREQUENCY :=
        RECEIVED_FREQUENCY - TRANSMIT_FREQUENCY;

But if you had declared RECEIVED_FREQUENCY and TRANSMIT_FREQUENCY to
be of type Megahertz and DOPPLER_FREQUENCY to be of type Kilohertz,
then Ada would have rejected the preceding statement. The error
message would have shown a dimensional-unit error at compile time.

Simply  deriving  new  data  types 
from  existing  numeric  types  is  a  step 
in  the  right  direction,  but  it  doesn't 
get  you  as  far  as  you  want  to  go.  If  I 
merely  defined  Hertz  to  be  a  new 
kind  of  floating-point  number,  then 
Ada  would  think  it  could  multiply 
two  objects  of  type  Hertz  together 
and  get  Hertz  (instead  of  Hertz 
squared).  Similarly,  if  I  divided  Hertz 
by  Hertz,  I  should  get  a  dimensionless 
floating-point  number — not  a  result 
in  Hertz. 

The  preceding  paragraph  shows 
some  of  the  differences  between  di¬ 
mensional  quantities  and  simple  sca¬ 
lar  (dimensionless)  numbers.  Ada  lets 
me  define  additional  properties  of  di¬ 
mensional  data  types  that  go  beyond 
the  scalar  operations,  but  it  won't  let 
me  undefine  the  operations  that  are 
valid  for  scalar  objects  but  illegal  for 
dimensional  quantities.  There  is  no 
way  I  can  tell  Ada  that  Hertz  times 
Hertz  doesn’t  give  me  an  answer  in 
Hertz. 

Therefore,  it  isn’t  a  good  idea  to  de¬ 
rive  dimensional  data  types  from  nu¬ 
meric  data  types.  Fortunately,  there 
is  another  way. 


A  Better  Solution 

Listing  One,  page  58,  shows  an  Ada 
package  that  creates  dimensional 
data  types.  Packages  are  standard 
Ada  constructs  that  generally  come 
in  two  parts.  The  package  specifica¬ 
tion  defines  the  available  services, 
and  the  package  body  tells  Ada  how 
to  implement  those  services. 

The DIMENSIONAL_UNITS package specification defines two private data
types called Integer_unit and Float_unit. As these are private types,
they have only three legal operations:

•  Assignment —  You  can  assign  a  val¬ 
ue  to  an  object  of  this  type. 

•  Equality —  You  can  check  to  see  if 
two  objects  of  this  type  are  identical. 

•  Inequality —  You  can  check  to  see  if 
two  of  these  objects  differ  in  any 
way. 

All  three  of  these  operations  are  valid 
and  useful  for  dimensional  quanti¬ 
ties,  but  clearly  more  operations 
need  to  be  provided. 

You can add two objects measured in feet and you will get an answer
in feet, so addition must be defined. You can multiply an object in
feet by a dimensionless number, and the result will be in feet. If
you divide feet by feet you get a dimensionless number. You can check
to see if one variable measured in feet is shorter than a second
variable measured in feet. You will find all these operations (and
more) in the DIMENSIONAL_UNITS package specification.
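The set of operations described above can be sketched in Python, which is used here only for illustration and is not the article's Ada package. The class and method layout is an assumption of this sketch, and Python reports a unit mismatch at run time as a TypeError, whereas Ada rejects it at compile time.

```python
class Unit:
    """Base for dimensional quantities; each subclass names one unit."""

    def __init__(self, value):
        self.value = value

    def _require_same_unit(self, other):
        if type(other) is not type(self):
            raise TypeError(f"cannot combine {type(self).__name__} "
                            f"with {type(other).__name__}")

    def __eq__(self, other):
        return type(other) is type(self) and self.value == other.value

    def __add__(self, other):           # feet + feet -> feet
        self._require_same_unit(other)
        return type(self)(self.value + other.value)

    def __sub__(self, other):           # feet - feet -> feet
        self._require_same_unit(other)
        return type(self)(self.value - other.value)

    def __lt__(self, other):            # is one length shorter than another?
        self._require_same_unit(other)
        return self.value < other.value

    def __mul__(self, scalar):          # feet * number -> feet
        if isinstance(scalar, Unit):
            raise TypeError("unit * unit is deliberately not defined")
        return type(self)(self.value * scalar)

    __rmul__ = __mul__                  # number * feet -> feet

    def __truediv__(self, other):
        if isinstance(other, Unit):     # feet / feet -> dimensionless number
            self._require_same_unit(other)
            return self.value / other.value
        return type(self)(self.value / other)


class Feet(Unit):
    pass


class Hertz(Unit):
    pass


total = Feet(3) + Feet(4)
ratio = Feet(6) / Feet(3)
doubled = 2 * Feet(3)
```

Because the base class defines only the operations that make dimensional sense, Hertz times Hertz is simply unavailable rather than silently yielding Hertz.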

The DIMENSIONAL_UNITS package body is simple but lengthy.
Fortunately, you have to compile it only
once.  Then  it  becomes  part  of  your 
bag-of-tricks  library,  and  you  can  ac¬ 
cess  it  with  a  one-line  context  clause. 
(Modern  software  professionals  are 
trying  to  promote  this  kind  of  univer¬ 
sal,  reusable  code.) 

BADGUN.ADA

    procedure Bad_Example is

       TRANSMIT_FREQUENCY, DOPPLER_FREQUENCY, SPEED : float;

       function Xmit_Frequency_Measurement return float is
       begin
          return 1.0;  -- Machine specific code to measure frequency
                       -- goes here
       end Xmit_Frequency_Measurement;

       function Doppler_Frequency_Measurement return float is
       begin
          return 1.0;  -- Machine specific code to measure frequency
                       -- goes here
       end Doppler_Frequency_Measurement;

       procedure put(N : integer) is
       begin
          null;  -- Machine specific code to display an integer goes
                 -- here
       end put;

    begin
       TRANSMIT_FREQUENCY := Xmit_Frequency_Measurement;
       DOPPLER_FREQUENCY := Doppler_Frequency_Measurement;
       SPEED := 335.30e6 * DOPPLER_FREQUENCY / TRANSMIT_FREQUENCY;
       put(integer(SPEED));
    end Bad_Example;

Example 1: An Ada program that demonstrates bad programming practice

After I wrote this package, I started building a package called
WEIGHTS_AND_MEASURES that would derive all possible dimensional
units. It started out like this:

    with DIMENSIONAL_UNITS; use DIMENSIONAL_UNITS;
    package WEIGHTS_AND_MEASURES is
       type Inches is new Integer_unit;
       type Feet is new Integer_unit;
       type Yards is new Integer_unit;
       type Miles is new Integer_unit;
       type Centimeters is new Integer_unit;
    end WEIGHTS_AND_MEASURES;

My intention was to start every program that needed dimensional
quantities with a context clause invoking WEIGHTS_AND_MEASURES. A
typical program would have looked like this:

    with WEIGHTS_AND_MEASURES; use WEIGHTS_AND_MEASURES;
    procedure Main_Program is
       X, Y, Z : Feet;
    begin
       null;  -- do something with X, Y, and Z
    end Main_Program;

The problem was that there are too many dimensional units to list them all and define the relationship to every other related quantity. Just look through any physics reference book, and you will find pages of conversions from one kind of unit to another. Even after discarding units I knew I would never use (such as cubic furlongs), there were still far too many to make a universal WEIGHTS_AND_MEASURES package practical. So, I now derive just the units I need for each program.

An  Example 

Listing Two, page 61, shows the speed gun program rewritten using better style. It consists of five individual compilation units: SPEED_GUN_UNITS (specification), HARDWARE_CIRCUITS (specification), Speed_Gun (main program body), SPEED_GUN_UNITS (body), and HARDWARE_CIRCUITS (body).

The first compilation unit (SPEED_GUN_UNITS) is the small, customized version of WEIGHTS_AND_MEASURES for this application. It defines only three dimensional data types: Miles_per_hour, Hertz, and Miles_per_second. The main program computes a speed in Miles_per_second but the answer is desired in Miles_per_hour, so the Type_Convert function is provided in this package to make the conversion. I didn't want to clutter the main program with this type conversion, so I defined a special multiplication function that includes an automatic conversion from Miles_per_second to Miles_per_hour. A package specification of this type often contains special arithmetic operators that convert data types; for example, a division operator that divides Feet by Seconds and returns a value of type Feet_per_second is common.
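A conversion operator of this kind can be sketched in a few lines. The example below is a minimal illustration and not part of Listing Two: so that it stands alone, it derives its types directly from Float rather than from DIMENSIONAL_UNITS, and the names Rate_Demo, Feet, Seconds, Feet_Per_Second, Distance, Elapsed, and Rate are assumptions made for the sketch.

```ada
with Text_IO;

procedure Rate_Demo is

   -- Three distinct types (hypothetical names). Because they are
   -- separate derived types, they cannot be mixed by accident.
   type Feet            is new Float;
   type Seconds         is new Float;
   type Feet_Per_Second is new Float;

   -- The one sanctioned way to combine a distance and a time.
   function "/" (Left : Feet; Right : Seconds) return Feet_Per_Second is
   begin
      -- Drop to plain Float for the arithmetic, then tag the result.
      return Feet_Per_Second (Float (Left) / Float (Right));
   end "/";

   Distance : constant Feet    := 100.0;
   Elapsed  : constant Seconds := 4.0;
   Rate     : Feet_Per_Second;

begin
   Rate := Distance / Elapsed;        -- legal: Feet / Seconds

   -- Rate := Distance / Distance;    -- illegal: the result would be Feet
   -- Rate := Distance + Elapsed;     -- illegal: no "+" for Feet and Seconds

   Text_IO.Put_Line (Integer'Image (Integer (Rate)));
end Rate_Demo;
```

Because Feet, Seconds, and Feet_Per_Second are distinct types, the two commented-out assignments would be rejected at compile time if they were restored, which is the kind of dimensional check the article depends on.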

The second compilation unit (HARDWARE_CIRCUITS) separates all the implementation-dependent code from the main program logic. If this were a real project (rather than a classroom exercise), I could write and debug my speed calculation while being blissfully ignorant of how some other guy was writing the code to measure frequency and display numbers. (Two of us would have trouble doing this with Example 1.)

The third compilation unit is the main program, and it looks a lot like Example 1, only easier to read. If I gave you the procedure Speed_Gun instead of Example 1, you could see at a glance how I am computing SPEED. To validate it, you would only need to satisfy yourself that it is a correct rearrangement of the usual Doppler frequency equation, $f_d = (2 \times \text{speed} \times f_x) / (\text{speed of light})$, where $f_d$ is the Doppler frequency and $f_x$ is the transmitted frequency.
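The rearrangement can be checked by hand. In the working below, $v$ for the target speed and $c$ for the speed of light are symbols introduced here for brevity, and $c$ is taken as 186,280 miles per second, the value used in Listing Two.

```latex
\begin{align*}
f_d &= \frac{2\,v\,f_x}{c}
  \quad\Longrightarrow\quad
  v = \frac{c}{2}\cdot\frac{f_d}{f_x} \\[4pt]
v_{\text{mph}} &= \frac{186{,}280}{2} \times 60 \times 60 \times \frac{f_d}{f_x}
  = 335{,}304{,}000 \times \frac{f_d}{f_x}
  \approx 335.30 \times 10^{6}\,\frac{f_d}{f_x}
\end{align*}
```

The final factor agrees with the constant 335.30e6 that appears without explanation in Example 1. With a transmit frequency of 10.0e9 Hz and a Doppler frequency of 1600 Hz, the expression gives about 53.65, which the test run at the end of Listing Two reports as 54 MPH.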

If you compile the first three units in the order given, Ada will check for dimensional consistency automatically and tell you that there are no errors. Ada doesn't need units 4 and 5 until you ask it to link the modules to create an executable image.

The fourth compilation unit tells Ada how to implement the two functions defined in unit 1. Ada checks for consistency between the specification and the body to make sure they match. The functions are easily verified and could easily be tested separately from the Speed_Gun program.

Finally, the fifth unit shows how you can test software before the hardware is finished. I chose to simulate the inputs by prompting the user to enter some numbers from the terminal, but I could just as easily have made the unit read numbers from a disk file.

If someone were to really build the speed gun, all I would have to do would be to rewrite the HARDWARE_CIRCUITS package body (unit 5) and recompile it. Relinking Speed_Gun would then replace the terminal I/O with the new speed gun circuit interface. I would not have to change (or even recompile) the first four units. Note that the HARDWARE_CIRCUITS package could also be tested without Speed_Gun.

There  are  some  test  results  at  the 
end  of  Listing  Two. 

The Overhead Isn't as Bad as It Looks

Listing Two is longer than Example 1, and that might be a cause for some concern. To make the comparison fair, you have to ignore compilation unit 5 (because Example 1 leaves that part out), but even so it is clear there is some overhead (in terms of program lines) when using dimensional units. In this example the overhead is significant, but in practical programs it isn't as high. The overhead appears out of proportion here because the computational part of the example program is so trivial.

Consider this analogy. You might think it impractical to use a computer to keep track of your checkbook balance because the overhead would be too high. Every time you wrote a $10 check, you would have to remove the dustcovers, turn on the computer, wait for the CRT to warm up and the hard disk to come up to speed, boot the operating system, load the checkbook balancing program, enter the data, copy the result to your checkbook, turn off the computer, and put the dustcovers back on. It would be easier just to subtract the $10 on paper! But if you were running a bank, you wouldn't dream of keeping track of every customer's checking account with pencil and paper. The same overhead would still exist (you would still have to take off the dustcovers and turn on the computer), but it would now be a small part of the whole process.

Practical embedded computer programs are so much more complicated than the speed gun example that the few extra lines needed to invoke a WEIGHTS_AND_MEASURES package are negligible.

Listing  Two  Can  Be 
as  Efficient  as  Example  1 

Example 1 simply multiplies a ratio by a predetermined conversion constant, but Listing Two appears to compute that conversion constant at run time by dividing the speed of light by 2 and then multiplying by 60 twice. If it really did that, Listing Two would run much slower than Example 1. An old FORTRAN-IV compiler might have optimized Listing Two by storing the speed of light and the numbers 2 and 60 in registers instead of memory to make the program faster. Modern compilers are smarter than that. A good Ada compiler can recognize that all those constants can be combined at compile time and should generate exactly the same code for Listing Two as it would for Example 1 (if the terminal I/O simulation code were added to Example 1). Programmers should no longer waste time combining constants because the compiler should do it anyway. Combining constants just obscures the source code.
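One way to keep the constants separate in the source while guaranteeing that they are combined before run time is Ada's named numbers, which the language requires to be static expressions evaluated by the compiler. The sketch below is a variation offered for illustration and is not the technique Listing Two uses (Listing Two leaves typed constants for the optimizer to fold). It uses plain Float, and the names Folding_Demo, Speed_Of_Light, Seconds_Per_Hour, and Scale are assumptions; the sample inputs are taken from the first test run in Listing Two.

```ada
with Text_IO;

procedure Folding_Demo is

   -- Named numbers (hypothetical names). Each is a static expression,
   -- so the compiler computes Scale; nothing is multiplied at run time
   -- to build it.
   Speed_Of_Light   : constant := 186_280.0;      -- miles per second
   Seconds_Per_Hour : constant := 60.0 * 60.0;
   Scale            : constant := (Speed_Of_Light / 2.0) * Seconds_Per_Hour;

   -- Sample inputs from the first test run in Listing Two.
   Transmit_Frequency : constant Float := 10.0e9;   -- Hertz
   Doppler_Frequency  : constant Float := 1600.0;   -- Hertz

   Speed : Float;                                   -- miles per hour

begin
   -- Same arithmetic as Example 1, but the source still shows where
   -- the conversion factor comes from.
   Speed := Scale * Doppler_Frequency / Transmit_Frequency;

   Text_IO.Put_Line (Integer'Image (Integer (Speed)));
end Folding_Demo;
```

With these inputs the program displays 54, agreeing with the first test run, and the derivation of the conversion factor remains visible in the declarations instead of being buried in a literal such as 335.30e6.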

Conclusion 

Listing Two probably seems radical to those who first learned to program a computer in the 60s (as I did). But we have to realize that the old values have changed. Now the most important feature of a program is that it be easily understood so it can be easily debugged, validated, and maintained. The use of dimensional units as data types is an important technique to adopt because it helps preserve the sense of the program.

Availability 

All the source code for articles in this issue is available on a single disk. To order, send $14.95 to Dr. Dobb's Journal, 501 Galveston Dr., Redwood City, CA 94063 or call (415) 366-3600, extension 216. Please specify the issue number and format (MS-DOS, Macintosh, Kaypro).

Bibliography 

Jones, Do-While. "Readers' Forum." Journal of Pascal, Ada, & Modula-2, vol. 4, no. 1 (January/February 1985).

Manner, R. SIGPLAN Notices (March 1986).

Swaine, Michael. "Swaine's Flames." DDJ (August 1986).

DDJ 


(Listings  begin  on  page  58.) 



Listing One (Text begins on page 50.)

--                     DUNITS.ADA
--                     VERSION 1
--                     1 JANUARY 1986
--                     DO-WHILE JONES
--                     324 TRACI LANE
--                     RIDGECREST, CA 93555
--                     (619) 375-4607

package DIMENSIONAL_UNITS is

   -- This package provides useful parent types for derived
   -- dimensional units. That is, it makes it possible to
   -- do this:
   --    type Feet is new Integer_Unit;
   --    type Radians is new Float_Unit;
   --    PI : constant Radians := Type_Convert(3.14159);
   --    TARGET_RANGE : Feet;
   --    ANGLE : Radians;
   --    REVOLUTIONS : integer;
   -- These derived data types will inherit all the operations
   -- in the package below. These are all the operations which
   -- make sense for dimensional quantities.
   -- The modulo operation for Float_Units is provided to make
   -- it easy to normalize angular measurements.
   --    ANGLE := ANGLE mod (2.0 * PI);
   -- The division operator for Float_Units which returns INTEGERS
   -- truncates toward zero (rather than rounding) to make it
   -- consistent with integer division, and it lets you do this:
   --    REVOLUTIONS := ANGLE / (2.0 * PI);

   type Integer_Unit is private;

   function Type_Convert(X : integer) return Integer_Unit;
   -- Lets you assign values to dimensional objects.
   -- For example,
   --    TARGET_RANGE := Type_Convert(587);

   function "+"(RIGHT : Integer_Unit)
      return Integer_Unit;
   function "-"(RIGHT : Integer_Unit)
      return Integer_Unit;
   function "abs"(RIGHT : Integer_Unit)
      return Integer_Unit;
   function "+"(LEFT, RIGHT : Integer_Unit)
      return Integer_Unit;
   function "-"(LEFT, RIGHT : Integer_Unit)
      return Integer_Unit;
   function "*"(LEFT : integer; RIGHT : Integer_Unit)
      return Integer_Unit;
   function "*"(LEFT : Integer_Unit; RIGHT : integer)
      return Integer_Unit;
   function "/"(LEFT : Integer_Unit; RIGHT : integer)
      return Integer_Unit;
   function "/"(LEFT, RIGHT : Integer_Unit)
      return integer;
   function "/"(LEFT, RIGHT : Integer_Unit)
      return float;
   function "rem"(LEFT, RIGHT : Integer_Unit)
      return Integer_Unit;
   function "mod"(LEFT, RIGHT : Integer_Unit)
      return Integer_Unit;
   function Dimensionless(LEFT : Integer_Unit)
      return integer;
   function Dimensionless(LEFT : Integer_Unit)
      return float;

   -- "=" and "/=" are already defined for private types
   function "<"(LEFT, RIGHT : Integer_Unit)
      return boolean;
   function "<="(LEFT, RIGHT : Integer_Unit)
      return boolean;
   function ">"(LEFT, RIGHT : Integer_Unit)
      return boolean;
   function ">="(LEFT, RIGHT : Integer_Unit)
      return boolean;

   type Float_Unit is private;

   function Type_Convert(X : float) return Float_Unit;
   -- Lets you assign values to dimensional objects.
   -- For example,
   --    ANGLE := Type_Convert(3.14159);

   function "+"(RIGHT : Float_Unit)
      return Float_Unit;
   function "-"(RIGHT : Float_Unit)
      return Float_Unit;
   function "abs"(RIGHT : Float_Unit)
      return Float_Unit;
   function "+"(LEFT, RIGHT : Float_Unit)
      return Float_Unit;
   function "-"(LEFT, RIGHT : Float_Unit)
      return Float_Unit;



   function "*"(LEFT : integer; RIGHT : Float_Unit)
      return Float_Unit;
   function "*"(LEFT : Float_Unit; RIGHT : integer)
      return Float_Unit;
   function "*"(LEFT : float; RIGHT : Float_Unit)
      return Float_Unit;
   function "*"(LEFT : Float_Unit; RIGHT : float)
      return Float_Unit;
   function "/"(LEFT : Float_Unit; RIGHT : integer)
      return Float_Unit;
   function "/"(LEFT : Float_Unit; RIGHT : float)
      return Float_Unit;
   function "/"(LEFT, RIGHT : Float_Unit)
      return integer;  -- truncates toward zero
   function "/"(LEFT, RIGHT : Float_Unit)
      return float;
   function "rem"(LEFT, RIGHT : Float_Unit)
      return Float_Unit;
   function "mod"(LEFT, RIGHT : Float_Unit)
      return Float_Unit;
   function Dimensionless(LEFT : Float_Unit)
      return integer;
   function Dimensionless(LEFT : Float_Unit)
      return float;

   -- "=" and "/=" are already defined for private types
   function "<"(LEFT, RIGHT : Float_Unit)
      return boolean;
   function "<="(LEFT, RIGHT : Float_Unit)
      return boolean;
   function ">"(LEFT, RIGHT : Float_Unit)
      return boolean;
   function ">="(LEFT, RIGHT : Float_Unit)
      return boolean;

   -- The following don't have any application to dimensional
   -- problems. I almost hid them in the package body, but I
   -- thought that since I needed them to derive some of the
   -- Float_Unit operations someone else might need them, too.

   function "/"(LEFT, RIGHT : float) return integer;
   -- divide and truncate toward zero
   function "rem"(LEFT, RIGHT : float) return float;
   function "mod"(LEFT, RIGHT : float) return float;

private

   type Integer_Unit is new integer;
   type Float_Unit is new float;

end DIMENSIONAL_UNITS;


End  Listing  One 


Listing Two

--                     DUEX.ADA

-- This is an example of how the use of dimensional units as data
-- types improves program clarity.

-- Compilation Unit 1

with DIMENSIONAL_UNITS; use DIMENSIONAL_UNITS;
package SPEED_GUN_UNITS is

   type Miles_per_hour is new Integer_Unit;
   type Hertz is new Float_Unit;
   type Miles_per_second is new Float_Unit;

   function Type_Convert(X : Miles_per_second)
      return Miles_per_hour;

   function "*"(LEFT : Miles_per_second; RIGHT : float)
      return Miles_per_hour;

end SPEED_GUN_UNITS;

-- Compilation Unit 2

with SPEED_GUN_UNITS; use SPEED_GUN_UNITS;
package HARDWARE_CIRCUITS is

   function Xmit_Frequency_Measurement return Hertz;
   function Doppler_Frequency_Measurement return Hertz;
   procedure put(X : Miles_per_hour);

end HARDWARE_CIRCUITS;

-- Compilation Unit 3

with HARDWARE_CIRCUITS; use HARDWARE_CIRCUITS;
with SPEED_GUN_UNITS; use SPEED_GUN_UNITS;
procedure Speed_Gun is

   TRANSMIT_FREQUENCY, DOPPLER_FREQUENCY : Hertz;
   SPEED : Miles_per_hour;
   C : constant Miles_per_second
     := Type_Convert(186_280.0);  -- speed of light

begin
   TRANSMIT_FREQUENCY := Xmit_Frequency_Measurement;
   DOPPLER_FREQUENCY := Doppler_Frequency_Measurement;
   SPEED := (C / 2.0) * (DOPPLER_FREQUENCY / TRANSMIT_FREQUENCY);
   put(SPEED);
end Speed_Gun;

-- Compilation Unit 4



package body SPEED_GUN_UNITS is

   function Type_Convert(X : Miles_per_second)
      return Miles_per_hour is
      MPH : Miles_per_second;
   begin
      MPH := X * 60 * 60;
      return Type_Convert(Dimensionless(MPH));
   end Type_Convert;

   function "*"(LEFT : Miles_per_second; RIGHT : float)
      return Miles_per_hour is
   begin
      return Type_Convert(LEFT * RIGHT);
   end "*";

end SPEED_GUN_UNITS;

-- Compilation Unit 5

with TEXT_IO; use TEXT_IO;
package body HARDWARE_CIRCUITS is

   -- The statements below are standing in for code which would
   -- read the frequency directly from hardware circuits and
   -- would display speed on an LCD or LED display. Since I'm
   -- using a terminal as a substitute input device I used
   -- TEXT_IO to get and put data.

   package INT_IO is new INTEGER_IO(integer); use INT_IO;
   package F_IO is new FLOAT_IO(float); use F_IO;

   function Xmit_Frequency_Measurement return Hertz is
      F : float;
   begin
      put("What is the Transmit Frequency (in Hertz)? ");
      get(F);
      skip_line;  -- TEXT_IO quirk
      return Type_Convert(F);
   end Xmit_Frequency_Measurement;

   function Doppler_Frequency_Measurement return Hertz is
      F : float;
   begin
      put("What is the Doppler Frequency (in Hertz)? ");
      get(F);
      skip_line;  -- TEXT_IO quirk
      return Type_Convert(F);
   end Doppler_Frequency_Measurement;

   procedure put(X : Miles_per_hour) is
      I : integer;
   begin
      I := Dimensionless(X);
      put("The speed is "); put(I); put_line(" MPH.");
   end put;

end HARDWARE_CIRCUITS;

-- Test Results

$ run speed_gun

What is the Transmit Frequency (in Hertz)? 10.0e9
What is the Doppler Frequency (in Hertz)? 1600.0
The speed is 54 MPH.

$

$ run speed_gun

What is the Transmit Frequency (in Hertz)? 10.0e9
What is the Doppler Frequency (in Hertz)? 1000.0
The speed is 34 MPH.

$

$ run speed_gun

What is the Transmit Frequency (in Hertz)? 10.0e9
What is the Doppler Frequency (in Hertz)? 2000.0
The speed is 67 MPH.

$


End Listings




C  CHEST 


Listing  Twenty-six 

(Text in April)


/*------------------------------------------------------------------
 *  NRPROCS.C
 *  Copyright (c) 1986, Allen I. Holub. All rights reserved.
 *  Routines for processing individual commands, the following
 *  commands cause a break unless * is used as the command
 *  character: .bp .br .ce .fi .in .nf .sp .ti
 */

#include <stdio.h>
#include <ctype.h>
#include "nr.h"

extern int      mgetc(), fgetc();
extern char     *skipspace(), *skipto(), *cpy();
extern char     *strsave( char* );
extern double   parse( char** );

static Nestlev = 0;     /* .{/.} Nesting level */

/*------------------------------------------------------------------*/


setnum( target, num, offset )
int *target, num, offset;
{
    /*  If offset is true, set target to its
     *  previous value + num, otherwise set
     *  it to num. Numbers are not allowed
     *  to go negative.
     */

    if( !offset )
        *target = (num < 0) ? 0 : num ;
    else
    {
        if( ( *target += num ) < 0 )
            *target = 0;
    }
}


/*------------------------------------------------------------------
 *  Routines to process individual commands: (note that the
 *  comment command .\" is processed by expand() in nr.c.
 *  The \" is the actual comment delimiter and a dot on a
 *  line by itself is treated as a comment.
 */

sblock()
{
    /*  .{      - NOT AN NROFF COMMAND -
     *  Starts a block for an .if, .ie, or .el.
     *  Bumps the process level up a notch. This
     *  routine can not be inhibited by "Inhibit".
     *  A \{\ or \{ is mapped to a .{ for nroff
     *  compatibility.  (}})
     */

    ++Nestlev;
    process( Ifile, Ifilename, Ismacro, Macv );
    return 0;
}

/*------------------------------------------------------------------*/

eblock()
{
    /*  .}      - NOT AN NROFF COMMAND -
     *
     *  Terminate a .{ block
     *  Forces process() to terminate, bumping the
     *  nesting level back down a notch. This routine
     *  can't be inhibited by Inhibit. A \} is treated
     *  like a comment in terms of escape processing
     *  but escape() will call eblock() in this
     *  case too.  ({)
     *
     *  This command is not inhibitable
     */

    if( --Nestlev >= 0 )
        return 1;
    else
    {
        Nestlev = 0;
        err("Mis-matched .} (No corresponding .{ )\n");
        return 0;
    }
}

/*------------------------------------------------------------------*/

ad(lstr)
unsigned char *lstr;
{
    /*  .ad [b n l r c]
     *  Turn on adjusting. If *lstr is null then BOTH is
     *  used, otherwise the indicated adjustment mode is
     *  set.
     */

    Adjusting = 1;

    switch( *lstr )
    {
    case '\0':
    case BOTH:
    case ALT_BOTH:  Adjmode = BOTH;
                    break;

    case LEFT:
    case RIGHT:
    case CENTER:    Adjmode = *lstr;
                    break;

    default:
        err("Bad mode: use (l)eft (r)ight (c)enter (b)oth-(n)ormal.\n");
    }
}

/*------------------------------------------------------------------*/

cm( str )
char *str;
{
    /*  .cm [on]    - NOT AN NROFF COMMAND -
     *  enable nroff-style copy mode inside macro
     *  definitions. If no argument, nroff copy
     *  mode is disabled. In normal copy mode only
     *  \" and \<CR> are recognized. In nroff mode
     *  the following are recognized:
     *      \"  \<cr>  \n  \*  \$  \\  \.  \t  \a
     */

    Nr_cpmode = *str ;
}

/*------------------------------------------------------------------*/

af(lstr, rstr)
char *lstr, *rstr;
{
    /*  .af R [1 001 i I a A e E]
     *  Alter format of number register R to the
     *  indicated mode. Default is arabic.
     */

    register int c;

    if( *lstr )
    {
        switch( c = *rstr )
        {
        case PADDED:
            while( isdigit(*rstr) && c < '9' )
            {
                rstr++;
                c++;
            }
            if( isdigit( *rstr ) )
                err("Only 9 digits of zero fill allowed\n");
            break;

        case '\0':      c = ARABIC;     break;
        case LC_ROMAN:
        case UC_ROMAN:
        case LC_ALPHA:
        case UC_ALPHA:
        case LC_ENG:
        case UC_ENG:
        case ARABIC:                    break;

        default:
            err( "Illegal number register format <%c>\n", c );
        }

        putnreg( lstr, c, 0, -1, 0, 0 );
    }
}

/*------------------------------------------------------------------*/


am(lstr, rstr)
char *lstr, *rstr;
{
    /*  .am xx yy
     *  Append text to the macro named xx until
     *  either .. or .yy (in rstr) is found at the
     *  start of the line.
     */

    if( *lstr )
        mappend( lstr, rstr );
    else
        err("Missing macro name to .am\n");
}

/*------------------------------------------------------------------*/

as(lstr, rstr)
char *lstr, *rstr;
{
    /*  .as lstr rstr
     *  append rstr to end of string named in lstr
     */

    if( *lstr )
        sappend( lstr, rstr );
    else
        err("Missing string name\n");
}


bd( lstr, rstr )
char *lstr, *rstr;
{
    /*  .bd on off  - MODIFIED NROFF COMMAND -
     *  Initialize bold face mode.
     *  Send lstr to the printer to put it into bold mode,
     *  send rstr to turn off boldface. Maximum length of
     *  either string is 80 characters. Use \x to send
     *  control characters.
     */

    static char on[81], off[81] ;

    on[80] = off[80] = 0;

    strncpy( on,  lstr, 80 );
    strncpy( off, rstr, 80 );

    Bd_on  = on ;
    Bd_off = off;
}


bo( num, str, offset )
char *str;
int num, offset;
{
    /*  .bo [ [+-]N ]   - NOT AN NROFF COMMAND -
     *  Put the next N input lines into boldface.
     */

    setnum( &Num_bold, num, offset );
}


bp(  num,  str,  offset,  dobreak  ) 
char  *str; 

{ 

/*  .bp  t  [+-]  n  ] 

begin  new  page,  having  number  N.  If  N  is 
absent,  use  the  current  page  number  +  1; 
Note  that  N  is  applied  to  the  new  page, 

*  not  the  current  one,  so  a  footer  on  the 

current  page  will  reflect  the  old  number. 


/•  Re-enable  spacing 


if (  num  ) 

Nospace 

if(  dobreak  ) 
brk(); 


prblank (  (PGLEN-OLINE)  +  1  );  /•  Finish  page 

if(  num  ) 

PAGE  -  offset  ?  PAGE  +  num  :  num  ; 


br(  str,  dobreak  ) 
char  *str; 

{ 

/•  .br  -  stop  filling  and  print  current  buffer. 

if (dobreak) 

brk() ; 


c.2  (  lstr  ) 
char  *lstr; 

{ 

/•  .c2  [c]  -  change  nobreak  character  to  c  (*lstr) 

•  if  c  isn't  defined,  use 

*/ 

Nobreak  -  *lstr  ?  •lstr  :  ' ‘ '  ; 


make  c.  the  command  character.  If 
missing,  use  dot  (.). 




*lstr  ?  *lstr  : 


ce(  num,  str,  offset,  dobreak  ) 


*str; 

num,  offset  ; 


if (  dobreak  ) 
brk ( ) ; 


—  Center  the  next  N  input  lines 
without  filling.  Default  N  is  1. 


Num  center  -  offset  ?  (Num  center  +  num)  :  num  ; 


cf( str )
char *str;
{
    /*  .cf file    - copy file directly to standard
     *                output. Useful for downloading
     *                fonts.
     */

    FILE *fd;
    register int c;

    if( !*str )
        err( "Missing filename in .cf\n" );
    else
    {
        if( !(fd = fopen(str, "rb")) )
            err(".cf %s. Can't open file\n", str);
        else
        {
            while( (c = getc(fd)) != EOF )
                putc( c, stdout );

            fclose( fd );
        }
    }
}


ch(  num,  str,  offset  ) 

char  *str; 

int  num,  offset; 

/*  .c.h  xx  [+-] N  —  Change  trap  postion  for  macro  xx 

*  to  N.  Any  existing  trap  at  that 

*  position  is  destroyed  (NROFF  will  shadow  the 

*  earlier  trap,  not  destroy  it) .  If  N  is  absent, 

*  the  trap  is  removed. 


if (  *str  ) 

movetrap(  str,  num,  offset  ); 

else 

errC'Missing  macro  name  in  .ch  command\n")  ; 


cu(  num,  str,  offset  ) 

char  *str; 

int  num,  offset; 


.cu  [ [ h —  ] N ]  —  Continuous  underline  next  N 

input  lines.  All  characters  are 
underlined,  even  spaces. 


setnum(  &Cont_ul  ,  num,  offset  ); 


da (  lstr  ) 
char  *lstr; 


/*  .da  [xx] 


—  Append  to  diversion  xx.  Stop 
appending  when  a  .da  or  . di  without 
an  argument  is  encountered. 


if(  *lstr  ) 

dappend(  lstr  ); 

else 

endiv ( )  ; 


db(  lstr  ) 
char  *lstr; 


—  NOT  AN  NROFF  COMMAND  — 

Enable  debugging  mode  (same  as 
-v  -c  on  the  command  line)  if  an 




argument  is  present,  else  disable 
debugging  mode. 


1  \0 ' 


de(lstr,  rstr) 


*lstr,  *rstr; 


.de  xx  yy  —  ENHANCED  NROFF  COMMAND  — 

Define  a  macro  called  xx.  Definition  stops 
when  a  .<rstr>  is  found  at  the  beginning  of 
If  rstr  is  missing  . .  is  used.  If  both 
arguments  are  missing,  all  currently  defined 
macros  are  printed  (like  .pm  in  real  nroff 
except  contents  of  macro  are  printed  too) . 

If  the  macro  already  exists,  it  is  deleted. 
Two  copy  modes  are  supported  (see  .cm). 


if(  *lstr  ) 

mcreate (  lstr,  rstr  ); 

else 

printm() ; 


df(  lstr,  rstr  ) 
char  *lstr,  •rstr; 

{ 

/*  • df  F  <start>  <end>  <cwidths> 

—  NOT  AN  NROFF  COMMAND  — 

*  Define  a  font.  F  is  a  font  name  (one  character), 

*  <start>  is  a  macro  to  invoke  when  font  is  invoked. 

*  <end>  is  a  macro  to  invoke  when  you  switch  out  of 

*  the  font.  <cwidths>  is  the  name  of  a  file  that  holds 

*  the  character-width  tables  (up  to  255  char-sized 

*  numbers  delimited  by  whitespace  or  blank  lines) . 

*  If  no  font  name  is  specified  then  existing  fonts 

*  are  printed  to  standard  output. 

*  Lastfont  (below)  points  at  the  most  recently 

*  added  font.  This  routine  assumes  that  main()  will 

*  call  it  to  initialize  the  roman  font  before  any 

*  other  fonts  are  defined.  The  behaviour  is  a  little 

*  strange  though.  findfontO  always  returns  0  when 

*  the  'R'  font  is  requested.  Consequently 

*  findfontO  won't  return  -1,  for  a  nonexistant  font, 

*  when  'R'  is  defined.  Be  careful. 

•/ 


register  FONT 

register  char 

static  FONT 

int 

FILE 

UCHAR 


•fp; 

*p; 

•lastfont  -  fiFonts [0] ; 
i,  existing  ; 

•stream; 

•mallocO ; 


if <  !*lstr  ) 

( 

for (  fp  -  Fonts-1;  ++fp  <-  lastfont;  ) 


{ 


printf ("Font  %c:  start  with  <%s>,  end  with  <%s>, 

widths; ", 

fp->name,  fp->smac,  fp->emac  ) ; 

p  -  fp->widths; 

for(  i  -  0;  i  <  MAX_CHARS_IN_FONT;  i++  ) 


( 


if(  i  %  8  —  0  ) 
printf ("\n") ; 


if(  i  <  •  '  ) 

printf ("*%c:%-3d  ",  i+ ' 0 • ,  p(i]); 

else 

printf ("%2c:%-3d  ",  i  ,  p[i]); 


printf  ("\n - \n"); 

) 

else 

{ 

existing  -  findfont ( •lstr) ; 


if (  lastfont  >-  &  Fonts [  NUMFONTS-1  ]  &&  existing  <  0  ) 

err ("May  not  define  more  than  %d  fonts\n",  NUMFONTS); 

else 

( 

if(  existing  <  0  )  /*  Font  doesn't  exist  */ 

fp  -  ++lastfont; 


else 

< 


/•  Redefining  existing  font  •/ 


fp  -  & Fonts [  existing  ] ; 
if(  fp->left  ) 

free (  fp->left  ); 


fp->name  -  *lstr  ; 

fp->resolution  -  Hs_amt; 



p  -  skipto ( 1  1 ,  rstr  ,  Esc  ) ; 
if(  *p  ) 

*p+  +  -  ' \0  '  ; 


fp->smac[0]  -  rstr[0]  ; 
fp->smac[l]  -  rstr[l]  ; 
fp->smac[2]  -  ' \0 • ; 

p  -  skipspace(  p,  Esc  ); 

fp->emac[0]  -  *p  ?  *p++  :  1 \0 '  ; 

fp->emac [1]  -  ("p  &&  *p  !-  '  ')  ?  *P++  :  ' \0 •  ; 
f p->emac. [2]  -  1  \0  ' ; 

p  -  skipspac.e(  p,  Esc  );  /*  and  then  go  to  following  word  */ 

fp->left  -  malloc (  MAX_C H AR S_ I N_F ON T  + 
strlen  (Right_str)  + 
strlen(Left_str)  +  2  ); 

if(  !f  pole  ft  ) 

errC'.df:  Not  enough  memory  for  width  tables\n"); 
return; 

} 


fp->right  -  cpy(  fp->left,  Left_str  )  +  1; 
fp->widths  -  c.py(  fp->right,  Right_str  )  +  1; 

memset (  fp->widths,  l,  MAX_CHARS_IN_FONT  ); 


} 

/ 


} 


if (  *P  ) 

if(  ! (stream  -  fopen (p, "rb") )  ) 

err (" .df . . .%s.  Can't  open  file\n",  p) ; 

else 

{ 

p  -  fp->widths; 
i  -  MAX_CHARS_IN_FONT; 

while  (  fscanf ( stream,  "%d",  p)  —  1  &&  — i  >"  0  ) 
p++; 

fclose (  stream  ); 

) 

} 


■*/ 


di(lstr)
char *lstr;
{
    /*  .di xx  - divert output to macro xx.
     *            terminate the diversion with a .di
     *            or .da (see) without an argument.
     */

    if( *lstr )
        dcreate( lstr );
    else
        endiv();
}


/"■ 


"/ 


ds(lstr, rstr)
char *lstr, *rstr;
{
    /*  .ds xx str  - define string xx to hold the
     *                indicated string. If the string
     *                exists, it is deleted. See also: .as
     */

    if( *lstr )
        screate( lstr, rstr );
    else
        err("Missing string name\n");
}

/* - - - . - — . -*/ 


dt (  num,  str,  offset) 

char  "str; 

int  num,  offset; 

*  /*  .dt  [ +-] N  xx  Set  a  diversion  trap  that  will 

*  be  sprung  after  N  lines  have  been 
"  processed  in  the  current  diversion.  Only  one 

*  diversion  trap  may  be  active. 

*/ 

if (  llsdiv  ) 

err ("No  diversion  currently  active\n"); 
return; 

) 

if(  ! num  ||  ! "str  )  /"  Clear  existing  trap  "/ 

{ 

Divtrap  -  -1; 

Dtrap_name[0]  -  0; 

else  if (  num  >-  VERT  )  /*  Set  a  diversion  trap  "/ 

Dtrap_name[0]  -strjoj:  (continued  on  page  74) 

Dtrap_name [1 ]  -  str[l];  “  ° 




setnum(  fiDivtrap  ,  num,  offset  ); 


err("Passed  diversion  trap  when  trap  set\n") ; 


ec( lstr )
char *lstr;
{
    /* .ec c -- Change escape character from \
     *          to c. Use \ if c is missing.
     */

    Esc = *lstr ? *lstr : '\\';
}


el(  str) 
char  *str; 


    /* .el -- else clause part of .ie. This command
     *        is normally processed as part of the .ie
     *        command. If we get here, there is no
     *        corresponding .ie statement.
     *        This command is not inhibitable.
     */

    err(".el not associated with .ie\n");
    return 0;


    /* .em xx -- Define xx as the end macro, executed
     *           after all output has been processed.
     */

    Endm = strsave( str );


    /* .eo -- NOT AN NROFF COMMAND --
     *
     * Disable the escape mechanism entirely. It
     * can be restored again with a .ec command.
     */


    /* .ev N -- MODIFIED NROFF COMMAND --
     *
     * This command pushes various commonly used
     * variables on an environment stack. Nroff
     * supports several environments and the shell
     * supports only one. If an argument is present,
     * the current environment is saved. If no
     * argument is present, a previously saved
     * environment is popped from the stack. See
     * push_env and the definition of the
     * environment structure (both in nrmsc.c)
     * for more information. The stack is five
     * environments deep.
     */

    if( *str )
        push_env();
    else
        pop_env();


    /* .ex -- exit back to the operating system just
     *        as if input had ended.
     */


fi( str, dobreak )
char *str;
{
    /* .fi -- enable line filling */


if(  dobreak  ) 
brk(); 




            Inhibit = (PAGE & 1);

        else if( *expr == 'o' )
            Inhibit = !(PAGE & 1);


ft(  str  ) 
char  *str; 


/*  .ft  F  —  Change  font  to  F  at  the 

*  beginning  of  the  next 

*  input  text  line.  Font  changes  can  also  be 

*  imbedded  with  a  \fF  escape  sequence.  Note  that 

*  if  font  F  doesn't  exist,  the  error  won't  be 

*  flagged  until  the  output  routines  try  to  process 

*  the  font  change  request.  F  may  be  a  number  that 

*  was  fetched  from  the  \n(.f  number  register  at 

*  some  earlier  time.  .ft  0  is  the  same  as  .ft  R. 
*/ 

chgfont (  *str  ); 


hd(  num,  str,  offset,  dobreak,  tail  ) 
char  *str; 

char  *tail; 

    /* .hd <left str> N <right str> -- NOT NROFF --
     *
     * Define strings to send printer cursor left or
     * right by 1/N spaces. The width of a space is
     * taken from the currently active font width
     * table. It will be 1 in the default, non-
     * proportionally spaced font. N determines the
     * minimum resolution for the space between
     * characters in proportional spacing mode.
     */

    static char lstr[81];
    static char rstr[81];

    strncpy( lstr, str,  80 );  lstr[80] = 0;
    strncpy( rstr, tail, 80 );  rstr[80] = 0;

    Left_str  = lstr;
    Right_str = rstr;
    Hs_amt    = num;


hy(  num  ) 
{ 


    /* .hy [N] -- MODIFIED NROFF COMMAND --
     *
     * Enable hyphenation. N is ignored.
     */

    Hyphenate = 1;


id(  lstr,  rstr  ) 
char  *lstr,  *rstr; 


    /* .id on off -- NOT AN NROFF COMMAND --
     *
     * Send "on" to the printer to put it into italics
     * (underline) mode, rstr to take it out. Maximum
     * length of either string is 80 characters. Use
     * \x<two hex digits> to send a control character.
     */

    static char on[81], off[81];
    on[80] = off[80] = 0;

    strncpy( on,  lstr, 80 );
    strncpy( off, rstr, 80 );

    Ul_on  = on;
    Ul_off = off;


            Inhibit = 1;

            err("Illegal expression\n");


    process( action, Ifilename, 2, Macv );
return  rval; 


iff( lstr, rstr )
char *lstr, *rstr;
{
    /* .if condition action
     * Simple if statement (doesn't take an else
     * clause). The expression parser used to
     * evaluate the <condition> is more powerful
     * than NROFF's. Multi-line blocks can be
     * used by using a .{ as an action (be sure
     * to terminate the block with a .}).
     * If input is inhibited we want to process
     * the tail without modifying the inhibit status
     * (in case the tail is a .{ command); otherwise
     * we set inhibit based on the value of the
     * expression and then process the tail.
     *
     * This command is not inhibitable.
     */

    if( doif(lstr, rstr) )
        Inhibit = 0;
}


char *iselse( str )
char *str;
{
    /* Used by ie() (below). Returns 0 if str doesn't
     * hold a legal .el command, otherwise returns
     * a pointer to just past the 'l'.
     */

    if( !ISCMD( *str++ ) )
        return 0;

    while( isspace(*str) )
        str++;

    return ( str[0] == 'e' && str[1] == 'l' ) ? str+2 : 0;
}


ie( lstr, rstr )
char *lstr, *rstr;
{
    /* .ie condition action -- MODIFIED NROFF COMMAND
     * if part of an if/else. Is only non-standard in
     * that the expression parser is more powerful
     * than NROFF's.
     *
     * This command is not inhibitable.
     */

    static char line[MAXSTR], lnum;
    int         did_something;

    /* Remember the current line number for the
     * sake of the error message printed when we can't
     * find an else clause.
     */


doif( expr, action )
char *expr, *action;
{
    /* Test an expression and do an if statement (or
     * the if part of a .ie). Set Inhibit as appropriate.
     * Modify Inhibit to reflect the expression.
     * We call process if input is inhibited in order
     * to handle nested if's and blocks. Return 1 if
     * input was not inhibited and we executed the
     * expression, otherwise return 0.
     */


    lnum = INLINES;

    did_something = !Inhibit;
    doif( lstr, rstr );

    if( !getline(line, 0, Ismacro ? mgetc : fgetc) ||
                                !(rstr = iselse(line)) )

        /* Complain if the line that should contain
         * the .el isn't there.
         */


    if( !Inhibit )
    {
        if( start_expr(expr) )
            Inhibit = !(int) parse( &expr );

        else if( *expr == 'e' )


        err("Missing .el for .ie on line %d\n", lnum );


        if( did_something )
            Inhibit = !Inhibit;



        /* Initialize rstr to point just past the
         * .el, skip any white space and quotes if
         * necessary. Then call process with the
         * command tail as its argument.
         */

        rstr = skipspace( rstr, Esc );
        if( *rstr == '"' )
            *skipto( '"', ++rstr, Esc ) = 0;

        process( rstr, Ifilename, 2, Macv );
    }

    if( did_something )
        Inhibit = 0;

    return 0;
}


/ 


*/ 


ig( str )
char *str;
{
    /* .ig -- Ignore input by writing to a dummy macro
     */

    mwrite( (char *)0, str );
}


/*■ 


in( num, str, offset, dobreak )
char *str;
{
    /* .in [+-]N -- Change the indent level
     */

    if( dobreak )
        brk();

    setnum( &INDENT, num, offset );
}


/ 


*/ 


it( num, str, offset )
char *str;
int  num, offset;
{
    /* .it [+-]N xx  Input line trap. Spring macro
     *               xx after N lines of input have been
     *     read. Only one input line trap may be active.
     *     A .it destroys a previous trap if one exists.
     */

    if( !num || !*str )
    {
        /* Remove current input trap */

        Itrap = -1;
        Itrap_name[0] = 0;
        return;
    }

    setnum( &Itrap, num, offset );

    if( Itrap <= 0 )
    {
        err(".it xx N: N may not be negative\n");
        Itrap = -1;
    }
    else
    {
        Itrap_name[0] = str[0];
        Itrap_name[1] = str[1];
    }
}


/*■ 


lc( str )
char *str;
{
    /* .lc C -- Change leader character from . to C
     */

    Leader = *str;
}

/* - 


ll( num, str, offset )
char *str;
{
    /* .ll [+-]N -- Change line length to N
     */

    setnum( &LINLEN, num, offset );
}


/* 


ls( num, str, offset )
char *str;
{
    /* .ls N -- Change line spacing to N spaces
     */

    if( num >= 1 )
        setnum( &LSPACE, num, offset );
    else
        err("\".ls N\", N must be >= 1\n");
}


/* 


/ 


lt( num, str, offset )
char *str;
{
    /* .lt [+-]N -- Change the length of a 3-part title
     */

    setnum( &Title_len, num, offset );
}


- - - 


mc( num, str )
char *str;
{
    /* .mc Str [N] -- ENHANCED NROFF COMMAND --
     *
     * Print the string, str, N spaces to the right
     * of the right margin. This differs from nroff,
     * which uses a single character rather than
     * a string. If str and N are both missing, the
     * margin character is disabled. The string is
     * limited to 20 characters (including any spaces
     * implied by N). If N is missing or 0, 2 is used.
     */

    static UCHAR buf[21];
    UCHAR        *p;

    if( num <= 0 )
        num = 2;

    for( p = buf; --num >= 0; *p++ = ' ' )
        ;

    strncpy( p, str, &buf[21] - p );
    Rmarg_str = buf;
}

/* - - 

mf( macro, file )
char *macro, *file;
{
    /* .mf xx file -- NOT AN NROFF COMMAND --

* 

*  Copy  the  contents  of  the  macro  xx  to  the 

*  indicated  file.  The  macro  may  not  be  in 

*  use  at  the  time.  This  command  is  particularly 

*  useful  for  saving  a  collected  index  that  hasn't 

*  been  sorted  yet. 

*/ 

dump_mac(  macro,  file  ); 


} 

- - - 

ml( str )
char *str;
{
    /* .ml Str -- NOT AN NROFF COMMAND --
     *
     * Like .mc but prints the string at the left
     * margin rather than the right margin. The page
     * offset must be at least as large as the string,
     * which is limited to 21 characters.
     */

    static char buf[21];

    strncpy( buf, str, 20 );
    Lmarg_str = buf;
}

/* - */ 

na( str )
char *str;
{
    /* .na -- Turn off adjusting
     */

    Adjusting = 0;
}

/* - V 

nb( str )
char *str;
{
    /* .nb -- NOT AN NROFF COMMAND --
     *
     * Used in conjunction with a .nm, will cause blank
     * lines to be numbered as well as nonblank lines.
     * Useful if you're using nr to format listings.
     */

    Nm_blanks = *str;
}

/* - */ 



ne( num )
{
    /* .ne N -- If the distance between the current
     *          line and the next output line trap is
     *          less than N, skip forward to the next trap.
     *          The assumption is that the trap will be an
     *          end of page trap.
     */

    register int i;

    if( (i = TOTRAP) < num )
        prblank( i );
}


nf( str, dobreak )
char *str;
{
    /* .nf -- Disable line filling, flushing the
     *        buffer first.
     */

    if( dobreak )
        brk();

    FILL = 0;
}


nh()
{
    /* .nh -- Turn off hyphenation (that was
     *        turned on with a .hy command).
     */

    Hyphenate = 0;
}


nm( str )
char *str;
{
    /* .nm N M S -- MODIFIED NROFF COMMAND --
     *
     * N - first line number
     * M - only even multiples of M are printed
     * S - print string after number
     *
     * If you need to change M without changing N, use
     * .nm x M S where x is any non-number. Same goes for
     * .nr x x S.
     * If no arguments are specified turn off numbering
     * but remember current line number etc. Use .nm x
     * to resume where x is any non-numeric argument.
     *
     * Bugs: The arrays gotten from malloc()
     *       for the S argument are never free()ed.
     */

    char          *p;
    extern double parse();

    if( Nm_on = *str )
    {
        splitfields( &str, &p );            /* Do N argument */
        if( isdigit(*str) )
            LINE = (int) parse( &str );

        splitfields( &p, &str );            /* Do M argument */
        if( isdigit(*p) )
            Nm_mult = (int) parse( &p );

        if( *str )                          /* Do S argument */
        {
            if( !(Nm_str = strsave(str)) )
            {
                err("Can't get enough memory for .nm\n");
                Nm_str = " ";
            }
        }
    }
}


nr( num, str, offset, dobreak, tail )
char *str;
char *tail;
{
    /* .nr R [+-]N [[-]M] -- ENHANCED NROFF COMMAND --
     * create or modify number register R by (to) N. If
     * M is present, it is incremented when invoked
     * with \n+x, \n+(xx, \n-x or \n-(xx. If M is absent,
     * 1 is used. Unlike nroff, .nr, with no arguments,
     * prints all currently defined number registers.
     */

    if( *str == '\0' )
        pr_nregs();
    else



        putnreg( str, 0, num, offset, 1, atoi(tail) );




    setnum( &OFFSET, num, offset );


    /* .ns -- Inhibit line spacing: Don't print
     *        newlines until some text is
     *        printed or a .bp N (the N is required) or a
     *        .rs is executed.
     */

    No_space = 1;


od( lstr, rstr )
char *lstr, *rstr;
{
    /* .od on off -- NOT AN NROFF COMMAND --
     *
     * Define two strings, one to enter overstrike
     * mode on the printer (on) and a second to
     * exit (off). Maximum string length is
     * 80 characters.
     */


pr_traps ( )  ; 


/*  Print  all  the  traps  */ 


rd( lstr )
char *lstr;
{
    /* .rd prompt
     *
     * Read insertion from standard input until two
     * newlines in a row are encountered. .rd creates
     * a macro called " ". It fills the macro from
     * standard input, expands the macro, and then
     * deletes the macro. lstr is a prompt which will
     * be printed to stderr. A BEL is output as a
     * prompt whether or not lstr is specified.
     */


    static char on[81], off[81];
    on[80] = off[80] = 0;

    strncpy( on,  lstr, 80 );
    strncpy( off, rstr, 80 );

    Os_on  = on;
    Os_off = off;


    fprintf( stderr, "\007" );
    if( *lstr )
        fprintf( stderr, "\n%s", lstr );

    oifile = Ifile;
    Ifile  = stdin;
    mcreate( " ", "\n" );
    Ifile  = oifile;

    expand_macro( " " );
    munlink( " " );


os( num, str, offset )
{
    /* .os [N] -- NOT AN NROFF COMMAND --
     *
     * Just like .ul except it overstrikes the next
     * N input lines rather than underlining them.
     */

    setnum( &Num_os, num, offset );


rm( lstr )
char *lstr;
{
    /* .rm xx -- Remove the named macro or string.
     *           If the macro is on the disk, the
     *           file will be deleted.
     */


ou( str )
char *str;
{
    /* .ou str -- NOT AN NROFF COMMAND --
     *
     * Output string directly to the current output,
     * without going through the normal text processing
     * mechanism. Line number, adjusting, etc. will
     * not be affected. This command is for sending control
     * sequences directly to the printer (i.e., for
     * initializations, etc.). Use \x<2 hex digits> to send

*  non-printing  characters.  The  top  bit  of  the 

*  character  is  trimmed  off  before  transmitting,  so 

*  you  can  send  an  ASCII  null  as  a  "\x80".  Also  note 

*  that  the  -c  flag  (which  causes  control  characters 

*  to  be  printed  in  readable  form)  has  affect  on  the 

*  output  of  this  command. 


ots(  str  ); 


if (  *lstr  ) 

munlink(  lstr  ); 

else 

err ("Missing  macro  name\n"); 


rr( lstr, rstr )
char *lstr, *rstr;
{
    /* .rr xx  Remove number register xx. Non-
     *         existent number registers evaluate
     *         to zero when used in an expression.
     */

    register int i;

    if( *lstr )
        rm_nreg( lstr );
    else
        err("Missing number register name\n");
}


/*  .pc  C  —  Change  the  character  used  to  indicate 

*  a  page  number  in  a  3-part  title  (.tl)  from 

*  %  to  C. 

*/ 


rs( str )
char *str;
{
    /* .rs -- Restore line spacing turned off with
     *        a previous .ns command. Note that
     *        .bp N (the N is required) also works,
     *        as does printing some text.
     */


    Page_ch = *lstr;


pl(  num,  str,  offset  ) 
char  *str; 


    /* .pl [+-]N -- Set page length to N
     */

setnum(  &PGLEN,  num,  offset); 


/*  .so  file  —  get  (source)  input  from  named 

*  file.  The  position  in  the  current 

*  file  is  remembered  and  processing  will 

*  continue  when  the  sourced  file  is  exhausted. 
*/ 


po(  num,  str,  offset  ) 
char  *str; 


/*  .po  [+-]N  —  Set  page  offset  to  N 

*/ 


    register FILE *fd;

    if( !(fd = fopen(lstr, "r")) )
        err("Can't open %s\n", lstr );
    else
    {
        process( fd, lstr, 0, 0 );



fclose(  fd  ); 


sp( num, str, offset, dobreak )
char *str;
{
    /* .sp [N] -- Space down N lines. Default N is 1
     */

    if( dobreak )
        brk();

    if( num > 0 )
        prblank( num );

    else if( num < 0 )
        go_up( num );
}


ss( num )
{
    /* .ss N -- Change the width of a space in the
     *          currently active font to N. Default
     *          N is 1.
     */

    Fonts[ CURFONT ].widths[' '] = num;
}


ta( str )
char *str;
{
    /* .ta [A,B,...Z] -- If no argument, clear all tab
     *     [+]B[RCL]     stops. The argument is a list
     * of tabstops. Each argument can be a specific
     * column (e.g., 9), or an offset from the previous
     * number (e.g., +8). In addition, each number can
     * be followed by a tab type:
     *
     *     R - Right justified in field
     *     C - Centered in field
     *     L - Left justified in field
     */

    if( *str )
        tabset( str );
    else
        tabclr();
}

/*----------------------------------------------------------------------*/

tc( str )
char *str;
{
    /* .tc C -- MODIFIED NROFF COMMAND --
     *
     * Set tab expansion character to C.
     * If no C, tab expansion is disabled. This lets
     * us get a ^I or ^A through to the printer control
     * tp.
     */

    if( !*str )
        Tabs_enabled = 0;
    else
    {
        Tabs_enabled = 1;
        Tab = *str;
    }
}

/* - 

ti( num, str, off, dobreak )
char *str;
{
    /* .ti N -- Temporary indent is set to N spaces.
     *          A temporary indent applies only to
     *          the next line of output.
     */

    if( dobreak )
        brk();


Tempi n 


tl( str )
char *str;
{
    /* .tl /A/B/C/  Print a 3-part title with A left
     *              justified, B centered, and C right
     * justified. The / can be any character. The
     * title is printed at the current page offset
     * but indent is ignored. The length is defined
     * with the .lt command.
     */

    title( str );



/* - */ 


tm( lstr )
char *lstr;
{
    /* .tm str -- Print string to standard error.

*/ 

fprintf (stderr,  lstr  ); 

} 

/ - - 


tp() 

{ 

/*  .tp  —  NOT  AN  NROFF  COMMAND  — 

*  prints  the  current  tab  stops  the  current 

*  line  length 
*/ 

tabprint  ()  ; 

} 

/* - - 


ul( num, str, offset )
char *str;
int  num, offset;
{
    /* .ul [N] -- Underline the next N input lines,
     *            1 if N is absent. Only alphanumeric
     *            characters are underlined; punctuation,
     *            spaces, etc., will not be.
     */

    setnum( &Num_under, num, offset );
}

- - - 

vd( num, str, offset, dobreak, tail )
char *str;
char *tail;
{
    /* .vd <up str> N <down str> -- NOT NROFF COMMAND --
     *
     * defines strings to send printer cursor up or
     * down by 1/N lines.
     */

    static char dnstr[81];
    static char upstr[81];

    strncpy( Up_str = upstr, str,  80 );
    strncpy( Dn_str = dnstr, tail, 80 );

    Vs_amt = num;
}

- - - 


wa( num )
{
    /* .wa [N] -- NOT AN NROFF COMMAND --
     * Waits for about N seconds (at most N + 1).
     * If N == 0 or if no argument a prompt is
     * printed and the program waits for you to
     * type Enter.
     */

    int sec, osec, garbage;

    if( !num )
    {
        fprintf( stderr, "\n\007\nType CR to continue..." );
        fprintf( stderr, "%c\n", getch() );
    }
    else
    {
        num++;

        fprintf( stderr, "\n" );
        while( --num >= 0 )
        {
            fprintf( stderr, "waiting: %02d\r", num );
            time( &garbage, &garbage, &osec, &garbage );
            do {
                time( &garbage, &garbage, &sec, &garbage );
            } while( osec == sec );
        }

        fprintf( stderr, "\n" );
    }
}

- - - 


wh( num, str )
char *str;
{
    /* .wh N X -- Set output line trap. The macro X
     *            is executed immediately after
     * printing line N on the current page. If N==0
     * the trap is sprung at the top of page
     * (above line 1). If X is absent, the trap at
     * line N is removed. If N is negative then the
     * trap is set relative to the bottom of the
     * page length (as set with .pl) that was in
     * effect when the .wh was executed. The macro will
     * replace any previously installed macro; macros do not
     * shadow one another as in the real nroff.
     */


/* If the "inhib" field is set then the command won't be
 * executed when input has been inhibited by if/else
 * processing. Only control-flow commands are uninhibited.
 * To change a command name you need only modify the table.
 * The exception is .{ because this command is generated
 * explicitly when a \{ is encountered.
 *
 * Consult command() in nrcmd.c for more details on how the
 * table is used.
 */


set_linetrap(str,  num) ; 


CTAB Cmdtab[] =
{


ws( num )
{
    /* .ws N -- NOT AN NROFF COMMAND --
     *
     * Enables wordstar-mode output:
     *
     * N == 0 (or missing) Wordstar mode disabled
     * N == 1 all single newlines mapped to Wordstar
     *        soft carriage returns. \n\n is printed as
     *        two hard carriage returns, however.
     * N == 2 like N==1 except that single carriage
     *        returns are replaced with space characters.
     *
     * Note that you'll also want to do the following:
     *
     *     .po 0             \" No page offset
     *     .bd \x02 \x02     \" ^B for bold
     *     .ud \x13 \x13     \" ^S for underline
     *     .od \x18 \x18     \" ^X for overstrike
     */

    Wordstar = num;
}


/* List of legal commands. Command names must be listed in
 * alphabetical (ASCII) order. The subroutine pointed to by
 * the "rout" field is called when the command is recognized.
 * Command format types (column labeled "tp" below) are:
 *
 *   Type 0:  .xx [<str>]              default used for <str>
 *   Type 1:  .xx N [<str>]            default used for N
 *   Type 2:  .xx <str> N [<tail>]     default used for N
 *   Type 3:  .xx <lstr> <rstr>        default used for <lstr>
 *
 * If the default field is empty ("") it is passed to the
 * subroutine as an empty string in type 0 and 1 commands
 * and as a 0 in type 1 and 2 commands.
 */


/*  name      rout,  tp, inhib, default  */

    "ad" ,    ad
    "af" ,    af
    "am" ,    am
    "as" ,    as
    "bd" ,    bd
    "bo" ,    bo
    "bp" ,    bp
    "br" ,    br
    "c2" ,    c2
    "cc" ,    cc
    "ce" ,    ce
    "cf" ,    cf
    "ch" ,    ch
    "cm" ,    cm
    "cu" ,    cu
    "da" ,    da
    "db" ,    db
    "de" ,    de
    "df" ,    df
    "di" ,    di
    "ds" ,    ds
    "dt" ,    dt
    "ec" ,    ec
    "el" ,    el
    "em" ,    em
    "eo" ,    eo
    "ev" ,    ev
    "ex" ,    ex
    "fi" ,    fi
    "ft" ,    ft
    "hd" ,    hd
    "hy" ,    hy
    "id" ,    id
    "ie" ,    ie
    "if" ,    iff
    "ig" ,    ig
    "in" ,    in




    "it" ,    it     , 1, 1 , ""
    "lc" ,    lc     , 0, 1 , "."
    "ll" ,    ll     , 1, 1 , "80"
    "ls" ,    ls     , 1, 1 , "1"
    "lt" ,    lt     , 1, 1 , "80"
    "mc" ,    mc     , 2, 1 , ""
    "mf" ,    mf     , 3, 1 , ""
    "ml" ,    ml     , 0, 1 , ""
    "na" ,    na     , 0, 1 , ""
    "nb" ,    nb     , 0, 1 , ""
    "ne" ,    ne     , 1, 1 , "1"
    "nf" ,    nf     , 0, 1 , ""
    "nh" ,    nh     , 0, 1 , ""
    "nm" ,    nm     , 0, 1 , ""
    "nr" ,    nr     , 2, 1 , "0"
    "ns" ,    ns     , 0, 1 , ""
    "od" ,    od     , 3, 1 , ""
    "os" ,    os     , 1, 1 , "1"
    "ou" ,    ou     , 0, 1 , ""
    "pc" ,    pc     , 0, 1 , "%"
    "pl" ,    pl     , 1, 1 , "66"
    "po" ,    po     , 1, 1 , "0"
    "pt" ,    pt     , 0, 1 , ""
    "rd" ,    rd     , 0, 1 , "\007"
    "rm" ,    rm     , 0, 1 , ""
    "rr" ,    rr     , 0, 1 , ""
    "rs" ,    rs     , 0, 1 , ""
    "so" ,    so     , 0, 1 , ""
    "sp" ,    sp     , 1, 1 , "1"
    "ss" ,    ss     , 1, 1 , "1"
    "ta" ,    ta     , 0, 1 , ""
    "tc" ,    tc     , 0, 1 , " "
    "ti" ,    ti     , 1, 1 , "0"
    "tl" ,    tl     , 0, 1 , ""
    "tm" ,    tm     , 0, 1 , "\007"
    "tp" ,    tp     , 0, 1 , ""
    "ul" ,    ul     , 1, 1 , "1"
    "vd" ,    vd     , 2, 1 , ""
    "wa" ,    wa     , 1, 1 , ""
    "wh" ,    wh     , 1, 1 , ""
    "ws" ,    ws     , 1, 1 , ""
    "{"  ,    sblock , 0, 0 , ""
    "}"  ,    eblock , 0, 0 , ""

(sizeof (Cmdtab)  /  sizeof ( *Cmdtab) ) ; 
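The requirement that command names appear in ASCII order suggests that the table is searched with a binary search, although the listing does not show the lookup itself (that code lives in command() in nrcmd.c). The sketch below illustrates the idea under that assumption. The cmd_entry structure, the find_cmd function, and demo_tab are hypothetical names invented for this example; the four rows are copied from legible entries in the table above, with the routine pointer left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified table entry (the real CTAB also holds "rout"). */
typedef struct {
    const char *name;    /* two-character command name               */
    int         tp;      /* command format type, 0 through 3         */
    int         inhib;   /* nonzero: skip command while inhibited    */
    const char *dflt;    /* default argument, as a string            */
} cmd_entry;

/* A few rows in ASCII order, as the listing's comment requires. */
static const cmd_entry demo_tab[] = {
    { "ne", 1, 1, "1"  },
    { "pl", 1, 1, "66" },
    { "po", 1, 1, "0"  },
    { "sp", 1, 1, "1"  },
};

/* Binary search over a table sorted by name; returns NULL if not found. */
const cmd_entry *find_cmd(const cmd_entry *tab, size_t n, const char *name)
{
    size_t lo = 0, hi = n;              /* half-open range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int    cmp = strcmp(name, tab[mid].name);

        if (cmp == 0)
            return &tab[mid];
        if (cmp < 0)
            hi = mid;                   /* name sorts before tab[mid] */
        else
            lo = mid + 1;               /* name sorts after tab[mid]  */
    }
    return NULL;
}
```

A call such as find_cmd(demo_tab, 4, "pl") returns the "pl" row. If the rows were out of order, the search could miss entries that are present, which would explain why the comment insists on sorted names.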


End Listing Twenty-six


Listing One


/* STAT.C   Statistics routines:
 *
 * newsample( n )    Add a new sample to the mean/average
 *                   totals.
 * running_mean()    Returns the running mean of the samples.
 * true_mean()       Returns the true mean of the samples.
 * deviation()       Returns the standard deviation from the
 *                   running mean.
 * reset_mean(n)     Resets everything to 0. 'n' is the boxcar
 *                   length that will be used for subsequent
 *                   samples. If n is 0, the default length of
 *                   4 is used.
 */


#define DEF_BOXLEN 4

static unsigned long  Average    = 0;
static unsigned long  Numnums    = 0;
static unsigned long  Mean_total = 0;
static unsigned long  Mean       = 0;
static unsigned long  Dev_total  = 0;
static unsigned long  Dev        = 0;
static unsigned int   Boxlen     = DEF_BOXLEN;


void newsample( n )
{
    /* Add a new point into the various mean and deviation
     * variables.
     */

    register unsigned long dif;

    Average += n;
    Numnums++;

    Mean_total -= Mean;                 /* find running mean  */
    Mean_total += n;
    Mean        = Mean_total >> Boxlen;

    dif  = abs( Mean - n );             /* Distance to point  */
    dif *= dif;                         /* square it.         */

    Dev_total -= Dev;                   /* find average       */
    Dev_total += dif;                   /* difference         */
    Dev        = Dev_total >> Boxlen;
}



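/*----------------------------------------------------------------------*/

int running_mean()          /* Return the current running mean */
{
    return Mean;
}

/*----------------------------------------------------------------------*/

int true_mean()             /* Return the current true mean */
{
    return Average / Numnums;
}

/*----------------------------------------------------------------------*/

int deviation()             /* Return the current standard       */
{                           /* deviation from the running mean.  */
    extern double sqrt();

    return (int) sqrt( (double)Dev );
}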


void reset_mean( boxcar_val )
{
    /* Reset various global variables to their initial
     * values. "boxcar_val" is used for the boxcar
     * width. It is a shift value, not a true width. If
     * it's 0, the default value of 4 is used instead.
     */

    Average    = 0;
    Numnums    = 0;
    Mean_total = 0;
    Mean       = 0;
    Dev_total  = 0;
    Dev        = 0;
    Boxlen     = boxcar_val ? boxcar_val : DEF_BOXLEN;
}


#ifdef MAIN


#define NUMSAMPLES


test( how )
{
    /* how = 0   Straight line
     * how = 1   Triangle
     * how = 2   Random
     */

    int i, j, count, m, a, d, dir = 1;

    for( count = 0; count >= 0; count += dir )
    {
        if( count == NUMSAMPLES )
            dir = -1;

        newsample( i = (( how == 2 ) ? (rand() % NUMSAMPLES) :
                        ( how == 1 ) ? (count)               :
                                       (NUMSAMPLES / 2)) );
        m = running_mean();
        d = deviation();
        a = true_mean();

        for( j = 1; j <= NUMSAMPLES; j++ )
        {
            if( j>i && j>m && j>a )
                break;

            if     ( j == i )  printf("*");
            else if( j == m )  printf("m");
            else if( j == a )  printf("a");
            else               printf(" ");
        }

        printf("\n");
    }
}


main()
{
    char buf[80];
    int  how;

    reset_mean( 4 );
    test( 0 );
    printf("Straight line with length 16 boxcar\n\f");

    reset_mean( 2 );
    test( 1 );
    test( 1 );
    printf("Triangle wave with length 4 boxcar\n\f");

    reset_mean( 4 );
    test( 1 );
    test( 1 );
    printf("Triangle wave with length 16 boxcar\n\f");

    reset_mean( 6 );
    test( 1 );
    test( 1 );
    printf("Triangle wave with length 64 boxcar\n\f");

    reset_mean( 2 );
    test( 2 );
    printf("Random input with length 4 boxcar\n\f");

    reset_mean( 4 );
    test( 2 );
    printf("Random input with length 16 boxcar\n\f");

    reset_mean( 6 );
    test( 2 );
    test( 2 );
    printf("Random input with length 64 boxcar\n\f");

#ifdef NEVER
    while( 1 )
    {
        printf( "triangle, random, or linear (r/t/l)?" );
        gets( buf );

        how = ( *buf == 'r' ) ? 2 : ( *buf == 't' ) ? 1 : 0;

        printf("Boxcar length? ");
        gets( buf );
        test( atoi(buf), how );
    }
#endif
}

#endif


End  Listing  One 
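The heart of Listing One is the three-line running-mean update in newsample(). A stripped-down sketch of that update follows, under the assumption that the state is packaged in a small structure rather than in file-level statics. The names smoother, smoother_reset, and smoother_add are hypothetical and do not appear in the listing. As in the listing, the boxcar value is a shift count, so a value of 2 behaves like a window of four samples.

```c
/* Hypothetical container for the running-mean state used in Listing One. */
struct smoother {
    unsigned long total;   /* leaky sum of recent samples            */
    unsigned long mean;    /* current running mean                   */
    unsigned int  shift;   /* boxcar length expressed as a shift     */
};

/* Clear the state; a shift of 0 selects the listing's default of 4. */
void smoother_reset(struct smoother *s, unsigned int shift)
{
    s->total = 0;
    s->mean  = 0;
    s->shift = shift ? shift : 4;
}

/* Fold one sample into the running mean and return the new mean. */
unsigned long smoother_add(struct smoother *s, unsigned long sample)
{
    s->total -= s->mean;               /* leak one "average" sample out */
    s->total += sample;                /* let the new sample in         */
    s->mean   = s->total >> s->shift;  /* divide by 2^shift             */
    return s->mean;
}
```

Feeding a constant input shows the smoothing behavior: the returned mean starts well below the input and climbs toward it over successive calls, which is what you would expect from a low-pass filter responding to a step.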


(Softstrips  are  on  page  101. ) 


FILES: STAT.C

Both of these Softstrips by Cauzin Systems contain complete
versions of Listing One for this month's C Chest. The strip on the
left is in high-density format, and the one on the right is in
medium density.


C CHEST


Statistical Applications of Digital Low-Pass Filters,
Exec Bug in Microsoft C

by Allen Holub


In keeping with this issue's dual theme of music and scientific
programming, this month's C Chest looks at a set of subroutines that
have applications in both worlds.

Two of the more useful statistical functions are the arithmetic mean
and the standard deviation from that mean. Given some set of data
points, the mean is just the average value of the points. The
standard deviation is the average distance from the mean to the
various data points. To be more precise, it's the square root of the
arithmetic mean of the squares of the distance of each data point
from the mean. The standard deviation is usually used as a measure of
the dispersion of the various points around the mean, a sort of
average-error indication: the higher the standard deviation, the
larger the average error. In other words, it tells you how close the
real data is to the average, or expected, value.

Computing a true mean and standard deviation can be quite difficult,
especially if the number of data points is large or if these points
have large values. It's easy to run out of precision because you have
to sum all the points in the input sample before you can divide.
Moreover, you can't compute either the true mean or deviation until
you've collected all the data points; you can't do it on the fly.
Fortunately, there are ways to approximate the mean and deviation
that don't have these limitations. I'll look at one of these here: an
exponential smoothing function, or digital low-pass filter.

All sounds, regardless of the waveshape, can be broken up into the sum
of a series of sine waves. The lowest-frequency component is the
fundamental, and the higher-frequency components are called harmonics,
or partials. A low-pass filter removes the higher-frequency components
from a particular sound and leaves the lower-frequency ones intact.
Most stereos have a low-pass filter built into their tone controls.
When you turn the treble control all the way down, all the
high-frequency components of the sound are removed, leaving only the
low-frequency components.

Mathematically, a low-pass filter is a "leaky integrator." It computes
the integral, not of a whole curve, but of the most recently seen
parts of the curve. You can look at an integral as the area under the
curve: the area of a shape bounded on one side by some function and on
the other by the x axis. Consequently, you can get a good
approximation of the integral from a set of discrete, equally spaced
points by summing all the distances from the x axis to the points.
That is, if you consider each point to represent a box whose height is
the distance to the curve and whose width is 1, the area of each box
is the height and the total area is just the sum of the heights. If
the distance between the points is not 1, you can compensate by
multiplying the sum by the actual distance.

To use the integral as a low-pass filter, you limit the range of the
integration, including in the sum only those points in an n-point-wide
window. The wider the window, the lower the cutoff frequency of the
filter. That is, when a larger part of the curve is included in the
sum, the parts of the curve that change fastest (the higher-frequency
components) tend not to affect the sum as much as the components that
change more slowly. Returning to statistics, the value of the
arithmetic mean is just the integral divided by the number of points.

For those of you who are electronically inclined, a standard low-pass
filter circuit is shown in Figure 1, page 105. If Rleak is removed,
this circuit is an integrator. That is, the various voltages present
in the input will all be summed in the capacitor, C. If the input is
positive, the capacitor is charged; if the input is negative, the
capacitor is discharged. The integrator has been modified, however, by
putting a leakage resistor (Rleak) around the capacitor. This resistor
causes the capacitor to discharge slowly, even when there's no input.
The value of the resistor determines the rate of discharge, which in
turn determines the width of the window.

Digital low-pass filters can be implemented in several ways. The most
straightforward is actually to average some set of contiguous input
points. Every time you acquire a new input point, you discard the
oldest point, add the new one, and refigure the average. The more
points that are included in the average, the lower the cutoff
frequency. A fixed-length circular queue is a good choice of data
structure. You can treat it as a queue when you insert new points and
as an array when you average the points. This method, called a boxcar
average, isn't too useful in practice, however. It's just too hard to
keep all those points around, add them up, and then divide by the
number of points every time you get a new input sample.
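
The true boxcar average just described can be sketched in C roughly as follows. This is a minimal illustration, not code from Listing One: the names boxcar_t, boxcar_reset(), and boxcar_add(), and the window length of 4, are all assumptions made for the example.

```c
#include <stddef.h>

#define BOXCAR_LEN 4            /* window length (illustrative choice) */

/* Fixed-length circular queue holding the most recent samples. */
typedef struct {
    long   points[BOXCAR_LEN];  /* the window, read as an array when averaging */
    size_t next;                /* slot that holds the oldest point */
    size_t count;               /* number of valid points, at most BOXCAR_LEN */
} boxcar_t;

/* Empty the window. */
void boxcar_reset(boxcar_t *b)
{
    b->next = 0;
    b->count = 0;
}

/* Overwrite the oldest point with the new one, then re-average the
 * whole window. The result is truncated to an integer. */
long boxcar_add(boxcar_t *b, long sample)
{
    long   sum = 0;
    size_t i;

    b->points[b->next] = sample;            /* queue view: replace oldest */
    b->next = (b->next + 1) % BOXCAR_LEN;
    if (b->count < BOXCAR_LEN)
        b->count++;

    for (i = 0; i < b->count; i++)          /* array view: sum everything */
        sum += b->points[i];
    return sum / (long)b->count;
}
```

Every call walks the whole window and divides, which is the per-sample cost the paragraph above complains about.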

A more practical method that is similar to a true boxcar average is
exponential smoothing. It's best explained with an example. Say you
have a length-16 boxcar. Every time you get a new input sample, you
subtract 1/16th of the current value from the average and then add
1/16th of the new sample's value to the average. In pseudocode:

    Width = 16;
    while(1)
    {
        Boxcar -= Boxcar / Width;
        Boxcar += input() / Width;
    }

Because you're always subtracting 1/16th of the current boxcar, the
contents change exponentially. (Think about what happens if all the
input samples suddenly go to zero; the value of Boxcar will decrease
by 1/16th of its former value every time through the loop.) Figure 2,
below, shows a simple case that demonstrates the exponential aspect of
the algorithm going in the other direction. I'm using a length-16
boxcar, and the input set defines a straight, horizontal line (marked
with asterisks). The mean (marked with m's) starts out at 0 and
gradually (exponentially) converges on the input line. If the boxcar
had been smaller, it would have converged faster.
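
The convergence on a constant input can be tried out with a short C sketch of the naive update. The function names smooth_step() and smooth_constant() are hypothetical, and doubles are used here for clarity; Listing One itself is not reproduced.

```c
/* One pass of the naive update from the pseudocode: remove 1/width of
 * the current value, then add 1/width of the new sample. */
double smooth_step(double boxcar, double sample, double width)
{
    boxcar -= boxcar / width;
    boxcar += sample / width;
    return boxcar;
}

/* Feed the same sample n times, starting from 0, and return the result.
 * After n steps the value is level * (1 - (1 - 1/width)^n). */
double smooth_constant(double level, double width, int n)
{
    double boxcar = 0.0;
    int    i;

    for (i = 0; i < n; i++)
        boxcar = smooth_step(boxcar, level, width);
    return boxcar;
}
```

With a width of 16, smooth_constant() is at roughly 96 percent of the input level after 50 samples; a width of 4 gets there far sooner, which matches the remark that a smaller boxcar converges faster.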

You'll notice a few similarities between this algorithm and the
electronic equivalent in Figure 1. In particular, Width takes the
place of Rleak in the circuit. That is, changing the width of the
window effectively changes the cutoff frequency of the filter. If
Width is 1, there will be no filtering; the input will just pass
through to the output. When Width is 16, no change in the input that
happens in fewer than 16 samples will make it through to the output.
That is, only those changes in the input that happen more slowly than
the rate at which the average changes will make it through to the
output.

Exponential smoothing can be implemented naively using the above
algorithm, but as in most such algorithms, a little thought gives you
both more efficiency and more accuracy. There are two changes that are
easy to make. First, by limiting the boxcar length to a power of 2,
you can replace the divides with right shifts. Second, rather than
have the Boxcar variable hold the mean itself, you can make it hold 16
times the mean, thereby eliminating one of the divides and giving you
better precision at the same time. The modified algorithm is:

    while(1)
    {
        Boxcar -= Mean;
        Boxcar += input();
        Mean = Boxcar >> 4;
    }

The boxcar is updated with every input sample. The mean is computed by
dividing Boxcar by 16. (A right shift of 4 bits is a divide by 16.)
Because Boxcar holds 16 times the mean, subtracting Mean from Boxcar
effectively subtracts 1/16th of the boxcar from itself. You don't lose
any precision here, though, as you would if you divided before
subtracting.
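
As a compilable version of the modified loop, here is a minimal sketch. The names BOX_SHIFT, boxcar, mean, and smooth() are chosen for this example and are not taken from Listing One; unsigned long is assumed because the column says the listing uses that type.

```c
#define BOX_SHIFT 4         /* log2 of the boxcar length; 4 gives 16 */

unsigned long boxcar = 0;   /* holds 16 times the running mean */
unsigned long mean   = 0;   /* current running mean */

/* Fold one sample into the boxcar and return the updated mean.
 * mean never exceeds boxcar, so the subtraction cannot wrap. */
unsigned long smooth(unsigned long sample)
{
    boxcar -= mean;                 /* remove 1/16th of the boxcar */
    boxcar += sample;               /* add the whole sample, undivided */
    mean = boxcar >> BOX_SHIFT;     /* divide by 16 with a shift */
    return mean;
}
```

Fed a constant input of 1000, the first call returns 62 (1000 shifted right by 4), and repeated calls climb until the mean settles at exactly 1000.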

Now let's apply the boxcar algorithm in a digital-filter application.
Figure 3, page 106, shows the algorithm being used on a triangle wave.
The input data is the same in all three parts of the figure, but the
length of the boxcar is different. In all three cases, the input is
marked with asterisks, the boxcar average with sideways m's, and the
true average (arithmetic mean) with sideways a's. The input data
contains 100 points per cycle (200 points in the entire graph). Figure
3a shows a boxcar of length 4. Here there's not much to see. The
points on the input triangle are slightly rounded, but that's about
it. It's interesting to note the slight phase shift between the input
and output. This phase shift is also a characteristic of analog
low-pass filters (built with capacitors, resistors, and such). It's
caused by the amount of time required to charge the capacitor in the
analog filter. In the digital equivalent, it's the amount of time
needed for the mean to ramp up to the output value. So, you'd expect
the phase shift to increase as the width of the boxcar (or the value
of the capacitor) increases.

Figure 1: Analog low-pass filter or leaky integrator

Figure 3b shows the same input, now being filtered with a length-16
boxcar. Here the effect of the filter is noticeable. Most of the
high-frequency harmonics of the triangle have been removed, leaving
something very close to a sine wave representing the fundamental. Note
that the phase shift has increased with the amount of filtering, just
as expected.

In Figure 3c, I'm using a length-64 boxcar. Pretty much all the
harmonics now have been removed, and I've just about eliminated the
fundamental, too. If I were to increase the boxcar length to 100 (the
number of points per cycle in the input), then I'd filter out the
fundamental entirely, leaving a straight horizontal line. It makes
sense if you think about it. If you average a complete cycle of a sine
wave, there will be a negative point to match every positive point, so
the average over the entire cycle will be zero.

So the cutoff frequency of the low-pass filter is a function of the
sample rate (in data points) and the width of the boxcar. The wider
the boxcar, the lower the cutoff frequency. When the boxcar has the
same number of points in it as a sine wave of a particular frequency,
that frequency won't make it through the filter at all. Computing the
actual cutoff frequency of a digital filter (one that's not a nice
power of 2, for instance) is nontrivial, however. If you're
interested, there's a good description in Hal Chamberlin's book (cited
in the bibliography), pages 481-495.
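
The claim that a window exactly one cycle wide removes that frequency can be checked with a small C sketch. It uses a true boxcar (a plain sum over the window), not the exponential approximation, and an integer triangle wave so that no math library is needed. The names triangle() and window_sum() and the amplitude of 25 are assumptions made for this illustration.

```c
#define PERIOD 100   /* points per cycle, as in the Figure 3 input */

/* One sample of a triangle wave that climbs from -25 to +25 and back
 * down, PERIOD points per cycle (the amplitude is arbitrary).
 * n must not be negative. */
int triangle(int n)
{
    int phase = n % PERIOD;                 /* position in the cycle */
    return (phase < PERIOD / 2) ? phase - 25 : 75 - phase;
}

/* Sum of the len samples ending at sample number last: a true boxcar,
 * left undivided. Requires last >= len - 1. */
long window_sum(int last, int len)
{
    long sum = 0;
    int  i;

    for (i = last - len + 1; i <= last; i++)
        sum += triangle(i);
    return sum;
}
```

With len equal to PERIOD the sum is zero wherever the window ends, so the output is a flat line; with len set to 16 the sum still swings positive and negative as the wave goes by.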

Now, let's bring all this back into the realm of statistics. Looking
again at Figure 2, because the input data is a horizontal straight
line, the arithmetic mean is the same straight line. Consequently, in
this example the boxcar average is converging on the true mean
exponentially and reaches it after about 50 points are processed
(there are 100 points in the graph). Looking at Figure 3c, the same
thing is happening. The boxcar average is converging gradually on the
true mean (toward the right side of the graph). In fact, this
characteristic is generally true. A boxcar average will converge on
the arithmetic mean given enough input points. The shorter the boxcar,
the faster it converges. Note, however, that if the boxcar is too
short, you end up with a digital filter that never converges, as in
Figure 2a. Here the boxcar just tracks the input data without ever
approaching the mean. So you have to do a balancing act. The longer
the boxcar, the closer you'll get to the true arithmetic mean. On the
other hand, a long boxcar takes more time to converge than a short
one. If the boxcar is too short, you'll never converge.

Figure 4, page 110, shows the boxcar algorithm being applied to a set
of input points randomly distributed around a straight line. As
before, asterisks are used to mark the points, sideways a's are the
arithmetic mean, and sideways m's are the boxcar average. Figure 4a
shows a length-4 boxcar. Here the boxcar output jumps around almost as
much as the input data, so it's not all that useful. In Figure 4b, a
length-16 boxcar is applied. The boxcar converges on the mean after
about 50 points. It still jumps around a bit, though. Figure 4c shows
a length-64 boxcar. It tracks the arithmetic mean very closely after
about 150 samples. On the other hand, it takes 150 samples to get
close enough to be useful. Note that this last example could be made
to converge faster if you initialized the boxcar to the arithmetic
mean of the first few points rather than to zero.

Implementation 

All this stuff is implemented by the short set of routines in Listing
One, page 97. In fact, Figures 3 and 4 are output from the program in
Listing One. There are six subroutines here. Newsample() passes a new
sample into the boxcar. It is called for every input point.
Running_mean() returns the current value of the exponential boxcar
average, true_mean() returns a true arithmetic mean, and deviation()
returns an approximation of the standard deviation (also computed with
an exponential boxcar). Given the distance (call it D) between any
given sample and the true mean, the standard deviation is the square
root of the average D^2.

The true_mean() function is here mostly to check the algorithm. It
won't work if the sum of the input samples requires more precision
than a long can muster.

The boxcar average is calculated on lines 39-41 of Listing One, using
the algorithm described earlier. The standard deviation is computed in
a similar way. A boxcar average of the squares of the differences
between the boxcar mean and the current input sample is kept by the
code on lines 43-48. The difference is figured on line 43. It's
squared on line 44 and then added into the running average on lines
46-48. The square root is taken on line 71, when the standard
deviation is requested. This delay saves you the overhead of taking a
square root with every sample. On the down side, you can overflow the
boxcar if the samples are too big.
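
Since Listing One is not reproduced here, the following C sketch only illustrates the scheme the paragraph describes: one boxcar for the mean, a second for the squared differences, and a square root taken on request. Every name in it (add_sample(), smoothed_deviation(), isqrt(), and the variables) is hypothetical, and the simple integer square root is a stand-in for whatever the listing actually calls.

```c
#define BOX_SHIFT 4            /* boxcar length is 2 to this power */

unsigned long mean_box = 0;    /* 16 times the running mean */
unsigned long mean     = 0;    /* running mean */
unsigned long dev_box  = 0;    /* 16 times the running mean square */
unsigned long mean_sq  = 0;    /* running mean of squared differences */

/* Integer square root: the largest r with r * r <= n. */
unsigned long isqrt(unsigned long n)
{
    unsigned long r = 0;

    while ((r + 1) <= n / (r + 1))  /* same test as (r+1)^2 <= n */
        r++;
    return r;
}

/* Fold one sample into both boxcars. */
void add_sample(unsigned long sample)
{
    unsigned long diff;

    mean_box -= mean;
    mean_box += sample;
    mean = mean_box >> BOX_SHIFT;

    /* distance between the sample and the running mean */
    diff = (sample > mean) ? sample - mean : mean - sample;

    dev_box -= mean_sq;
    dev_box += diff * diff;         /* can overflow if samples are big */
    mean_sq = dev_box >> BOX_SHIFT;
}

/* The square root is deferred until the deviation is requested. */
unsigned long smoothed_deviation(void)
{
    return isqrt(mean_sq);
}
```

Feeding samples that alternate between 900 and 1100 leaves the running mean near 1000 and the smoothed deviation a little under 100, because the running mean itself wobbles slightly toward each new sample.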

The final subroutine of interest is reset_mean() on lines 76-85. It
resets all the boxcars to 0, and its parameter can be used to set the
boxcar length. Note that because Boxlen is used to do a right shift
rather than a divide, the boxcar length is actually 2 raised to the
power given by the parameter to reset_mean(). You may want to add a
second argument to this routine: an initial value of the boxcar. You
can then take the true_mean() of the first few samples and use that
value to initialize the running mean.

You may want to make several other changes, depending on your
application. First, because the true mean is not all that reliable,
you'll probably just want to remove it from the routines. Delete the
Average and Numnums variables and all references to them, including
the true_mean() subroutine. Next, all my variables are unsigned longs.
Consequently, I can't keep a fractional mean around, and the range of
the data is limited to the precision of a long. You may want to change
all these to doubles. Finally, Boxlen is the number of bits to shift
rather than a true divisor. This means that the boxcar length is
limited to powers of 2, which may not be enough resolution for you. On
the other hand, it lets you use an efficient implementation of the
boxcar. If you go to something other than a power of 2, you'll need to
use something closer to the naive algorithm, introducing an extra
divide into the algorithm and slowing it down. One final easy-to-do
improvement helps with the start-up time and was mentioned earlier. As
you've seen, when left to its own devices, the algorithm converges
exponentially. You can improve this behavior by taking the true mean
of the first few samples and then using the value of the true mean as
the algorithm's starting point, rather than 0.
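
Two of the suggested changes (doubles with an arbitrary boxcar width, and a start-up value taken from the true mean of the first few samples) might be combined along the following lines. This is a sketch under stated assumptions, not a modification of Listing One: smoother_t, smoother_init(), smoother_add(), and the seed_len parameter are all invented for the example.

```c
/* Running-mean state for a boxcar of arbitrary width. */
typedef struct {
    double width;     /* boxcar length; need not be a power of 2 */
    double mean;      /* current running mean */
    double seed_sum;  /* sum of the samples seen during start-up */
    int    seed_len;  /* how many initial samples to average exactly */
    int    seen;      /* start-up samples received so far */
} smoother_t;

void smoother_init(smoother_t *s, double width, int seed_len)
{
    s->width = width;
    s->mean = 0.0;
    s->seed_sum = 0.0;
    s->seed_len = seed_len;
    s->seen = 0;
}

/* Fold in one sample and return the running mean. The first seed_len
 * samples are averaged exactly; after that the naive update takes
 * over, starting from that true mean instead of from 0. */
double smoother_add(smoother_t *s, double sample)
{
    if (s->seen < s->seed_len) {
        s->seed_sum += sample;
        s->seen++;
        s->mean = s->seed_sum / s->seen;
    } else {
        s->mean -= s->mean / s->width;   /* the extra divide */
        s->mean += sample / s->width;
    }
    return s->mean;
}
```

With seed_len set to 0 the routine behaves like the unseeded naive algorithm and has to climb up from 0; with a seed of a few samples it starts out at the mean of those samples.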

Microsoft Bug of the Month

A quick note about a bug I found in the Microsoft C compiler, Version
4.0. The exec() functions don't work correctly when putenv() is also
used in the same program. Memory gets fragmented in ways that make a
second exec() call return with an "out of core" error status (ENOMEM).
One way around this bug is to mimic Unix and write your own putenv().
Start out in main() by copying the environment strings into a static
array. Envp, an argv-like pointer to the environment strings, is
passed as the third argument to main(). There's no count, however;
Envp has a NULL in the last entry. Once the environment strings are
copied, you can add new strings to your own static array instead of
calling putenv(). You can then exec to another program with the
execve() function, which is passed an environment that it in turn
passes to the child process. Of course, if you use this method, you'll
need to write your own getenv(), too (because the default routine
doesn't know about your static array).
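
The workaround might look something like the sketch below. The names env_copy(), my_putenv(), my_getenv(), my_env, and the table size are all hypothetical; the code stores pointers rather than copying the strings, and it makes no attempt to reproduce any particular compiler's putenv() semantics.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ENV 128             /* capacity of the private table (arbitrary) */

char *my_env[MAX_ENV + 1];      /* NULL-terminated, like envp */
int   my_env_count = 0;

/* Copy the envp pointers passed to main() into the private table.
 * Returns the number of strings copied. */
int env_copy(char **envp)
{
    my_env_count = 0;
    while (envp[my_env_count] != NULL && my_env_count < MAX_ENV) {
        my_env[my_env_count] = envp[my_env_count];
        my_env_count++;
    }
    my_env[my_env_count] = NULL;
    return my_env_count;
}

/* Length of the name part of a "NAME=value" string. */
size_t name_len(const char *s)
{
    const char *eq = strchr(s, '=');
    return eq ? (size_t)(eq - s) : strlen(s);
}

/* Stand-in for putenv(): add or replace a "NAME=value" string.
 * The string itself is stored, not a copy. Returns 0, or -1 if full. */
int my_putenv(char *string)
{
    size_t n = name_len(string);
    int    i;

    for (i = 0; i < my_env_count; i++)
        if (name_len(my_env[i]) == n && strncmp(my_env[i], string, n) == 0) {
            my_env[i] = string;         /* replace an existing entry */
            return 0;
        }
    if (my_env_count >= MAX_ENV)
        return -1;
    my_env[my_env_count++] = string;
    my_env[my_env_count] = NULL;
    return 0;
}

/* Stand-in for getenv(): look a name up in the private table. */
char *my_getenv(const char *name)
{
    size_t n = strlen(name);
    int    i;

    for (i = 0; i < my_env_count; i++)
        if (strncmp(my_env[i], name, n) == 0 && my_env[i][n] == '=')
            return my_env[i] + n + 1;
    return NULL;
}
```

In use, main() would call env_copy(envp) once at start-up, and my_env would later be handed to execve() as its environment argument in place of the original envp.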

Nifty  Stuff 

I get a lot of stuff in the mail. Most of it's not very interesting,
but occasionally something useful comes along. Hitherto, I've just
mentioned the products I've liked without giving much additional
commentary, but I've decided to start writing slightly longer reviews
for products I particularly like. Two such are Custom Software
Systems' implementation of the Unix vi editor (PC/VI, Version 1.11)
and Lattice's version of the Unix make utility (LMK, Version 2.20b).

PC/VI 

PC/VI is a full implementation of vi, and I do mean full. From the
user's perspective, it is identical to the Unix program. If you're
familiar with the real vi, you can install PC/VI and then use it
immediately, without ever looking at the manual. Unlike other vi
implementations I've seen, PC/VI supports full visual and ex modes,
macros (using both map and abbreviate), and full regular expressions
(in both searches and ex-mode substitutions), and it can edit very
large files (though it slows down when the file gets too large). It
even has a LISP mode. Moreover, Version 1.11 is bug free as far as I
can tell. (Z, the vi shipped with the Aztec compiler, has an annoying
tendency to go off into outer space occasionally, taking your work
with it.) The PC/VI shell escape works perfectly with both my own
shell and COMMAND.COM; it gets the shell's name from COMSPEC. Because
Version 1.11 uses unique file names for temporary files, you can even
invoke PC/VI from within a shell that was created from within PC/VI,
assuming you've enough memory. It supports all the vi command-line
switches (+number, +/pattern, -t tag, and so forth) and an EXINIT
environment variable, too.

Figure 4: The effect of exponential smoothing on randomly distributed
data. Figure 4a: Length-4 window. Figure 4b: Length-16 window.
Figure 4c: Length-64 window.

PC/VI also supports tags, a feature particularly useful to
programmers. Tags give you the ability to edit a large program by
subroutine rather than by file. You first run a program called CTAGS,
which creates a tags file, a cross-reference of your C program. You
can then ask PC/VI to show you a particular subroutine (either with a
:ta <file> command or with a Ctrl-]), and it will automatically save
the current file, read in the file that holds the subroutine, and
position the cursor at the first line of the subroutine itself. PC/VI
comes with a CTAGS, though this program works only with C and
assembly-language source files. It's not too difficult to make your
own tags file using a program such as grep, however.

And if all this weren't enough, PC/VI is terminal independent; it's
not tied down to the IBM PC. It uses TERMCAP files for configuration
purposes and even comes with a complete TERMCAP interface library that
you can use in your own programs (it's an object-module library that
can link only to Microsoft C compiled programs, though).

Several versions of PC/VI are on the distribution disk (for a vanilla
PC, for a non-PC MS-DOS computer, for a PC/AT, and so forth), so you
can choose the version that's most appropriate for your application.
It also comes with CTAGS; the TERMCAP libraries; a TERMCAP
configuration file with support for about 15 terminals (IBM PC,
VT-100, ANSI, H-19, D4XX, and so forth, as well as support for Hersey
Micro Consulting's FANSI-CONSOLE driver); and SPLIT, a utility that
breaks up large files into smaller chunks. PC/VI does have one
failing: it doesn't support the !! command from visual mode (which
executes a command from a subshell and then inserts the standard
output from that command directly into the document). This can be
accomplished by redirecting to a file from a normal :! shell escape
and then reading the file with a :r, however.

Make 

The Lattice make, called LMK, is an enhanced version of the Unix make
utility. It has all the functionality of the Unix program, and to my
knowledge, it's the most powerful on the market (though, at $125, it's
also one of the most expensive). The only make that comes close is
Polymake (from Polytron Software), but Polymake has two limitations
that grate after a while. I should preface my complaints by saying
that the version of Polymake I've been using is more than a year old,
though I haven't had any notifications of an update.

The first problem with Polymake is its inability to deal with
subdirectories correctly; at least, it gets very confused when the
dependencies, the file being made, and the makefile itself are all in
different directories. The ability to deal with subdirectories is
useful when you keep a large library directory and you want the
sources in one subdirectory, the object modules in a different
subdirectory, and the library itself somewhere else again. LMK has
none of Polymake's limitations; it happily makes anything anywhere.

The second problem with Polymake is the large amount of memory it uses
by itself when it's working. I need this memory for my own programs.
Of the three makes I have around, Polymake performs the worst in this
department (requiring almost 96K for itself); next (at 84K) is the
make that comes with the Microsoft C compiler. The best performer,
however, is LMK, which uses only 35K of core. (I got these numbers by
running chkdsk from within a makefile.) A lot of memory is saved by
the Lattice product's simply not executing a shell unless it actually
has to. If you tell LMK to run a normal program, it does so without
the extra baggage of a second command interpreter in memory.

I should add as an aside that I'm using my own shell, rather than
COMMAND.COM, as a subshell to LMK. It has no problems with this
configuration, as it reads the shell's name from the COMSPEC
environment rather than assuming that the shell is called COMMAND.COM.
This capability lets me use my own shell's script files from within
make (rather than normal COMMAND.COM batch files). Unfortunately
there's one unnecessary, shell-related problem: LMK wastes memory by
invoking a shell to do redirection. It ought to do the redirection
itself.

Availability 

All the source code for articles in this issue is available on a
single disk. To order, send $14.95 to Dr. Dobb's Journal, 501
Galveston Dr., Redwood City, CA 94063 or call (415) 366-3600 ext. 216.
Please specify the issue number and format (MS-DOS, Macintosh,
Kaypro).

Bibliography

Chamberlin, Hal. Musical Applications of Microprocessors. 2d ed.
Hasbrouck Heights, N.J.: Hayden, 1985. This is a great introduction to
electronic music in general. Chapter 14 has a lucid description of
digital filter theory that's understandable to nonmathematicians.

Programs for Digital Signal Processing. New York: IEEE Press, 1979.
This book has a wealth of FORTRAN programs for digital signal
processing, including filter programs. It assumes you know enough
theory to understand the programs, however; it's pretty dense.

Electronotes Newsletter, 1 Pheasant Lane, Ithaca, NY 14850, is a small
but meaty periodical for electronic music hackers: people who want to
actually build the stuff as well as play it.


Flotsam  and  Jetsam 


Books

Several people have written requesting a reading list that covers both
books about C and books about programming generally. This month's
Flotsam and Jetsam contains such a list.

First, if you don't know a structured programming language, you're
better off starting out by learning Pascal rather than C. Though
Pascal isn't as powerful a language as C, it holds your hand quite a
bit and forces you into good programming practices. A very good
introduction to Pascal is Doug Cooper and Mike Clancy's Oh! Pascal
(New York: Norton, 1982). Also of interest if you're coming to C from
FORTRAN is Brian Kernighan and P. J. Plauger's The Elements of
Programming Style, 2d ed. (New York: Yourdon, 1978). This book teaches
structured programming techniques entirely in FORTRAN, an inherently
unstructured language. You'll also need to know a little assembly
language to learn C. In the IBM PC world, a good introduction to 8086
assembly language is Robert Lafore's Assembly Language Primer for the
IBM PC and XT (New York: Plume/Waite, 1984). The rudiments of assembly
language are also covered in The C Companion, discussed later.

In addition to programming languages, you'll need to know about data
structures: binary trees, queues, linked lists, hash tables, and the
like. Two excellent books are Robert L. Kruse's Data Structures and
Program Design (Englewood Cliffs, N.J.: Prentice-Hall, 1984) and Aaron
M. Tenenbaum and Moshe J. Augenstein's Data Structures Using Pascal,
2d ed. (Englewood Cliffs, N.J.: Prentice-Hall, 1986). Both books
contain extensive examples written in Pascal. Kruse's explanations
(and his programs) are a little more clear than Tenenbaum and
Augenstein's, and I prefer his book for this reason. Tenenbaum and
Augenstein's book is more comprehensive, however. Also of interest is
Robert Sedgewick's Algorithms (Reading, Mass.: Addison-Wesley, 1983),
which contains algorithms for doing just about everything imaginable
(at least in the realm of programming). Everything from splines to
Gaussian elimination to sorting routines to fast Fourier transforms is
covered. It's an invaluable reference.

As for C itself, the C language was originally brought to the world's
attention by Brian Kernighan and Dennis Ritchie in The C Programming
Language (Englewood Cliffs, N.J.: Prentice-Hall, 1978). The book is
usually called K & R. Unfortunately, this book is dense to the point
of unreadability in places. I don't recommend it unless you're a very
experienced programmer, but if you fall into that category, it's very
good. K & R is just a language description; it assumes that you know
how to program. The best general introduction to C that I know of is
Bryan Costales' C from A to Z (Englewood Cliffs, N.J.: Prentice-Hall,
1985). Herbert Schildt's C Made Easy (Berkeley, Calif.:
Osborne/McGraw-Hill, 1985) is also good. Both of these books are much
more readable than K & R. Neither covers the advanced parts of the
language in depth, however.

There are several nontextbooks that are good aids to learning C. My
own book The C Companion (Englewood Cliffs, N.J.: Prentice-Hall, 1986)
was developed as supplementary class notes for a C class I teach. It
covers many of the topics that are left out of most C textbooks: both
basic topics such as binary arithmetic and assembly language and
advanced topics such as the complex uses of pointers and writing
subroutines with a variable number of arguments. Another book worth
having is Rex Jaeschke's Solutions in C (Reading, Mass.:
Addison-Wesley, 1986). Rex's book is a collection of C programming
tips. He explains many of the more advanced parts of the language with
numerous short examples.

Two good books of exercises are available. Clovis L. Tondo and Scott
E. Gimpel's The C Answer Book (Englewood Cliffs, N.J.: Prentice-Hall,
1985) contains answers for all the exercises in K & R. It's quite
useful if you're using that text. Another good exercise book is Alan
R. Feuer's The C Puzzle Book (Englewood Cliffs, N.J.: Prentice-Hall,
1982). The problems in this book address virtually every aspect of the
C language. Moreover, the problems are designed to familiarize you
with common errors that will probably show up as bugs in your
programs. Every exercise is accompanied by a detailed solution. I
could have saved myself weeks of debugging time had this book been
available when I first learned the language.

The best reference to the C language is Samuel P. Harbison and Guy L.
Steele, Jr.'s C: A Reference Manual (Englewood Cliffs, N.J.:
Prentice-Hall, 1984). The second edition (1987), which incorporates the
ANSI extensions, should be available by the time you read this column.
Of course, the ANSI standard itself (X3-J11) is a good reference. The
public review period for the draft standard ended March 7, so the real
standard should be available any time now. For more information, contact
the X3 Secretariat: Computer and Business Equipment Manufacturers
Association, 311 First St. NW, Ste. 500, Washington, D.C. 20001-2178;
(202) 737-8888. Unfortunately, the standard is very expensive ($65 for a
few hundred Xeroxed pages). Check the computer science library of a
local university before acquiring a copy for yourself.

Dr. Dobb's Journal, May 1987

It's invaluable if you're in that category.

DDJ

(Listings begin on page 64.)

The final category of books is the
useful-examples-of-nontrivial-programs category. Several good books are
available. Joe Campbell's Crafting C Tools for the IBM PC (Englewood
Cliffs, N.J.: Prentice-Hall, 1986) is packed with useful subroutines and
programs for the IBM environment. It is distinguished by numerous Notes
on C Usage sections that discuss the C programming issues involved in
the programs themselves. William J. Hunt's The C Toolbox (Reading,
Mass.: Addison-Wesley, 1985) is also packed with useful stuff, including
a complete B-tree database management package. It's not as tied into the
IBM PC as is Campbell's book. Both of the operating system design books
that were reviewed in the December DDJ, Ted J. Biggerstaff's Systems
Software Tools (Englewood Cliffs, N.J.: Prentice-Hall, 1986) and Douglas
Comer's Operating System Design, the Xinu Approach (Englewood Cliffs,
N.J.: Prentice-Hall, 1984), are good resources. And of course, DDJ
itself sells the source code for several large C programs (such as my
own MS-DOS shell) on disk.

So, I've just skimmed the surface of what's available and have already
spent several hundred dollars of your hard-earned money. Hopefully,
you'll find more extensive reviews of some of these books in future
issues of DDJ, but until then you'll at least have a place to start when
you go into a bookstore or library. Good luck.




16-BIT  SOFTWARE  TOOLBOX 


Quality of Life Tools

by Ray Duncan

Today's software market is rife with "user-friendly" MS-DOS shells such
as the Norton Commander (which incidentally was written by John Socha,
not Peter Norton), XTree, and KeepTrack. These little rascals allow
naive users to navigate through the hierarchical directory structure,
copy files, initialize disks, and the like as long as they can find the
arrow keys. I suspect that most experienced programmers stay as far away
from these programs as I do: user interfaces that clutter up the screen
and constantly ask "Are you sure?" just get in the way during the
day-long cycles of edit, compile, link, and debug.

Of course, the default MS-DOS command processor (COMMAND.COM) is a long
way from the last word in user interfaces, too, and there is certainly a
place in programmers' hearts for new shells that can enhance that
interface without altering it past recognition. I'd like to discuss
briefly two such products this month and solicit information on other
such products from DDJ readers and software vendors.

Command Plus

Command Plus is an alternative shell from ESP Software Systems that
completely replaces MS-DOS' COMMAND.COM. Command Plus offers
significantly enhanced COPY, DEL, and DIR commands and adds a
high-performance file BROWSE command, command aliasing, the ability to
accept multiple commands on the same line (separated by & delimiters),
command logging, and a directory stack. It also includes an extensive
shell programming language called SCRIPT that is upward compatible from
the normal MS-DOS batch file commands. SCRIPT supports integer, long
integer, and string variables; various operations on those variables;
sophisticated control structures, such as CASE; keyboard input into a
string variable; cursor positioning; and a bevy of operations on
environment variables and file names, extensions, dates, and attributes.

But for me, the two most useful features of Command Plus are its support
for the command history and for regular expressions. Plain vanilla
MS-DOS allows you to specify sets of files with the wildcard characters
* and ?. Command Plus' regular expressions give you much more flexible
control over file names, with the ability to specify or exclude single
characters or ranges of characters in any position. The file
specification [abcd4-9]*.asm, for example, matches any file with the
extension .asm and whose name begins with one of the letters a through d
or with one of the numerals 4 through 9.
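
The matching rule can be tried out in Python, whose standard fnmatch
module happens to accept the same [abcd4-9]*.asm specification. This is
only a sketch of the rule described above, not Command Plus' own
implementation: the file names are invented, names are lowercased on the
assumption that matching ignores case as MS-DOS file names do, and the
[!...] form used for exclusion is fnmatch's syntax (the column does not
say how Command Plus spells exclusion).

```python
from fnmatch import fnmatchcase

PATTERN = "[abcd4-9]*.asm"   # the file specification quoted in the text


def matching(names, pattern=PATTERN):
    """Return the names that satisfy the bracket-style file specification."""
    # Lowercase each name first: an assumption that matching ignores case.
    return [name for name in names if fnmatchcase(name.lower(), pattern)]


# Hypothetical directory contents, for illustration only.
names = ["a1.asm", "DISK.ASM", "7seg.asm", "echo.asm", "3com.asm", "boot.c"]

print(matching(names))                      # names starting a-d or 4-9
print(matching(names, "[!abcd4-9]*.asm"))   # fnmatch's way to exclude the set
```

The first call keeps a1.asm, DISK.ASM, and 7seg.asm; the second keeps the
two .asm files whose first character falls outside the set.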

The history feature of Command Plus pushes each command onto a history
list as it is entered. The default size of this list is 10 commands (the
oldest one is simply lost as each new command is entered), but you can
configure it to hold as many as 48 commands.

You can recall and display a previous command by using the arrow keys to
traverse the list, edit the command if necessary, then press the Enter
key to carry out the command again. For example, if four commands ago
you entered:

LINK FOO,,,SLIBW + LIBH,FOO.DEF

and you want to repeat the same link with a different library, you would
just hit the up arrow four times to redisplay the original LINK command,
move to the position to be changed with the left and right arrow keys,
type the new library name, and hit Enter. Incidentally, more editing
functions are available than in normal MS-DOS (such as word tabbing
right or left), and the editing keys are configurable. You can also view
the entire history list and select old commands for editing by their
number on the list if you wish. I have found that the history feature
saves hundreds of keystrokes and mistyped commands daily.
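
The history list behaves like a fixed-size queue, which the following
Python sketch models. The class and method names are hypothetical; only
the default size of 10, the ceiling of 48, the loss of the oldest entry,
and recall by position or by list number come from the description
above.

```python
from collections import deque


class History:
    """Fixed-size command history; the oldest entry is dropped when full."""

    def __init__(self, size=10, max_size=48):
        if not 1 <= size <= max_size:
            raise ValueError("history size must be between 1 and %d" % max_size)
        self._items = deque(maxlen=size)

    def push(self, command):
        """Record a command as it is entered."""
        self._items.append(command)

    def recall(self, steps_back):
        """Return the command reached by pressing up-arrow steps_back times."""
        return self._items[-steps_back]

    def listing(self):
        """Return (number, command) pairs, oldest first, numbered from 1."""
        return list(enumerate(self._items, start=1))


history = History()
for i in range(12):              # twelve commands into a ten-entry list
    history.push("cmd%d" % i)

print(history.listing()[0])      # cmd0 and cmd1 have already been lost
print(history.recall(4))         # the command entered four commands ago
```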

Now, I know Unix partisans are going to write in and tell me once again
that Unix shells have had these features since the Dark Ages. I am aware
of this already, I too am pleased that Unix has at least one or two
redeeming features, and I'll be the first to agree that Unix software
comes from the Dark Ages. But it's nice to have histories, regular
expressions, and a decent script language without having to sacrifice a
megabyte of RAM, 10 megabytes of fixed disk, and half my CPU cycles on
the altar of Unix.

The only significant deficit I see in the current release of Command
Plus is the lack of support for replaceable parameters in command
aliases. We can always hope that the vendor will see fit to add this in
a future version! You can obtain more information on Command Plus from
ESP Software Systems Inc., 11965 Venice Blvd., Ste. 309, Los Angeles, CA
90066; (213) 306-7408.

ProCED 

ProCED, written by Chris Dunford, is an extremely powerful command-line
editor for MS-DOS. It is loaded as a Terminate and Stay Resident (TSR)
utility and is not a complete replacement for COMMAND.COM. What it does
offer is vastly improved command-line editing, support for synonyms
(aliases) with replaceable parameters, command chaining, command
logging, and inspection/recall/editing of previous commands via a
history list similar to that described earlier for Command Plus (note
that ProCED predates Command Plus by a considerable margin, however).
Nearly all ProCED's special characters, buffers, and stacks are
configurable by the user, and lists of synonyms and other configuration
information can be automatically loaded from a file at system start-up.

ProCED works by capturing and replacing the MS-DOS buffered keyboard
input function (int 21h, function 0Ah); therefore, its capabilities are
available in any MS-DOS program that performs its input through this
function, including most debuggers. By the same token it does not work
with some programs, such as the Norton Commander, that perform their
keyboard input character by character.

A particularly nice aspect of ProCED is that it contains "hooks" that
allow programmers to write and install new memory-resident commands,
called "user commands." These are loaded under the control of ProCED
like miniature TSRs and behave as though they were COMMAND.COM
"intrinsic" commands. The ProCED package includes several examples,
including ATTR1R (displays or alters file attributes), COIR (a sorted
directory), and SEND (transmits an arbitrary string of data to any file
or device).

Chris recently sent me a development (prerelease) version of ProCED that
has two terrific new features. The first is called command
extrapolation: if you type a few letters of a command and then press
Ctrl-X (^X), ProCED searches the history buffer for the first matching
command and displays it for editing. If it doesn't find the one you
want, you simply hit ^X repeatedly until the command you desire appears.
The second new feature is a built-in file-name search, which is similar
to command extrapolation.
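
Command extrapolation amounts to a prefix search over the history
buffer. The Python sketch below is one way to model it, not ProCED's
actual code: the function name is hypothetical, and it assumes that the
search runs from the most recent command backward, ignores case, and
skips exact duplicates, none of which the description above specifies.

```python
def extrapolate(history, prefix):
    """Yield past commands that begin with prefix, most recent first."""
    wanted = prefix.lower()
    seen = set()
    for command in reversed(history):
        if command.lower().startswith(wanted) and command not in seen:
            seen.add(command)    # do not offer the same command twice
            yield command


# Hypothetical history buffer, oldest command first.
history = ["masm foo;", "link foo;", "dir", "masm bar;", "link foo;"]

presses = extrapolate(history, "ma")
print(next(presses))   # first ^X: the most recent command starting with "ma"
print(next(presses))   # second ^X: the next older match
```

Each call to next() stands in for one press of ^X; when the generator is
exhausted there are no further matches to display.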

At any point in a command line, you can type a partial file
specification, or a file name that contains wildcards, and then press
the Tab key to have ProCED replace the tentative file spec with matches
from the current directory. You can also move to a new directory by
pressing the space bar when a directory name appears and then press the
Tab key again to see further file names. Command extrapolation and
file-name search will be present in the next retail version of ProCED,
which may be available by the time you read this.

Time for a testimonial: I have been using ProCED for at least a year and
wouldn't want to live without it. I even carry a copy of it with me on a
floppy disk when traveling, so I won't feel deprived when using someone
else's PC! ProCED is available from the Cove Software Group, P.O. Box
1072, Columbia, MD 21044; (301) 992-9371.

80286  Resources 

In a recent column, I mentioned the book Inside the 80286 (by Ed
Strauss, published by Waite Group/Brady) as an excellent source of
information on the 80286's protected mode, virtual memory management,
task switching, and interrupt handling. DDJ readers have also brought to
my attention the following book:

Morse, Stephen P.; and Albert, Douglas J. The 80286 Architecture. New
York: Wiley, 1986. 279 pages including index. ISBN 0-471-83185-9.

The architectural and systems-level discussion that occupies nearly all
of Strauss' book is compressed into about 40 pages here. Most of the
remainder of the book is a primer on the 80286's opcodes and addressing
modes, including the traditional explanation of bits, bytes, and hex
arithmetic and some rehashed tables from Intel manuals. The last chapter
of the book is entitled "286 Hardware: Building a Computer," but its
discussion is generic and somewhat vague, whereas in Strauss' book the
hardware discussion is extremely specific and includes schematics and
software listings that you could use to breadboard your own primitive
80286 machine. All in all, although Morse and Albert's credentials are
good, I don't think this book makes the grade. I'd recommend you spend
your money on Strauss' book and a copy of the Intel 80286 Programmer's
Reference instead.

Pretty  Pictures 
Department 

I suspect at least half of the readers of this column reacted to the
Scientific American article on Mandelbrot sets by writing their own
programs to plot these intriguing images. Most of us, however, also
found that plots in the 350-line, 16-color modes of the EGA take nearly
forever unless the program is painstakingly optimized. Relax, someone
else has done the work! A slick, fast program to plot Mandelbrot images,
called FractalMagic-EGA, is available for $25; the source code (Turbo
Pascal) is also available for an additional $50. Contact Sintar
Software, P.O. Box 3746, Bellevue, WA 98009; (206) 455-4130.

Performing  SETs 
from  a  Program 

Daniel Briggs of the Solar Astronomy Department at CalTech writes:
"Someone recently wanted to know about how to change directories in a
batch file and then return home again. There are several different
approaches, but here's the one I took.

"The GETDIR.ASM program [Example 1, right] illustrates a means of
providing string functions to batch files. In this case, the program
SETs the value of the environment variable DIR to be the current default
directory. This value can then be referenced later in a batch file in
the form %DIR%. The undocumented hook into the current copy of the
command processor, int 2eh, is used to do the SET; this is somewhat
unportable but by far the best method among the various alternatives.
This program can be altered easily for other uses by changing the value
at var_name to the desired variable name and the function get_value to
the desired string function.

"With the GETDIR program in hand, you can write sequences such as this
in batch files:

getdir                 ! equivalent to "set DIR=<current directory>"
cd <somewhere else>
<do stuff>
cd %DIR%               ! %<variable name>% replaced by its value

"The GETDIR approach has the disadvantage that it cannot be nested.
There is a set of shareware utilities out that includes the programs
PUSHDIR and POPDIR, which do what they sound as though they do. Try a
local BBS; they're fairly common. I like my approach, though, because it
provides a simple means of adding any string function that you can dream
up into a batch file."

Availability 

All the source code for articles in this issue is available on a single
disk. To order, send $14.95 to Dr. Dobb's Journal, 501 Galveston Dr.,
Redwood City, CA 94063 or call (415) 366-3600 ext. 216. Please specify
the issue number and disk format (MS-DOS, Macintosh, Kaypro).


DDJ 



GETDIR.ASM

; set environment variable DIR = current directory path
;
; Copyright (C) 1986 Daniel Briggs
;
; This program illustrates a means of providing string functions
; to batch files.  In this case, the program will set the value of
; the environment variable DIR to be the current default directory.
; This value can then be referenced later in a batch file by
; using the phrase %DIR%.  The undocumented hook into the current
; copy of the command processor, int 2eh is used to do the set.
;
; To assemble, link, and convert to an executable COM file:
;
;       MASM GETDIR;
;       LINK GETDIR;
;       EXE2BIN GETDIR.EXE GETDIR.COM

stk_size   equ     1000h                   ; a good healthy stack

cseg       segment para public 'CODE'
           assume  cs:cseg, ds:cseg, ss:cseg, es:cseg

           org     2ch
env_ptr    label   near                    ; points to local environment
           org     100h                    ; skip to the end of the PSP
DOS_entry  label   far
           jmp     set_str

parameter  db      0, 'set '               ; start of command to be passed
var_name   db      'DIR='                  ; counted string name
var_value  db      80 dup (?)              ; buffer for variable value
ISP        dw      offset end_code + stk_size

set_str    proc    near
           cld                             ; addressing already set by loader
           mov     sp, ISP                 ; set stack pointer
           mov     si, offset var_value
           call    get_value               ; fill value with desired string
           mov     al, 0
           mov     cx, -1
           mov     di, offset parameter+1
           repne   scasb                   ; find the null
           dec     di                      ; point to it
           mov     byte ptr ds:[di], 0dh   ; ascii CR
           not     cx                      ; create count byte
           mov     parameter, cl

           mov     bx, ISP                 ; top of program
           mov     cl, 4
           shr     bx, cl
           inc     bx
           mov     ah, 4ah                 ; es already set appropriately
           int     21h                     ; shrink down to memory needed

           mov     si, offset parameter
           int     2eh                     ; invoke the command processor

           push    cs                      ; reset stack
           pop     ss
           mov     sp, ISP                 ; put the stack pointer back

; Can you find a means of getting a status back from this technique?

           mov     ah, 4dh                 ; get the returned status <**wrong**>
           int     21h
           mov     ah, 4ch                 ; terminate w/status
           int     21h
set_str    endp

comment    /
           This is the routine that actually sets the value of the
           string.  In this example, it sets the buffer pointed to
           by DS:SI to the current pathname.  Whatever string is set
           must be terminated by a null.  The procedure
           assumes that the buffer starts initialized to 0s, so does
           not add the final null.
           /

get_value  proc    near
           mov     byte ptr ds:[si], '\'   ; not provided by DOS
           inc     si
           mov     ah, 47h                 ; get current directory
           mov     dl, 0
           mov     di, si
           int     21h
           ret
get_value  endp

end_code   label   near
cseg       ends
           end     DOS_entry

Example 1: Providing string functions to batch files




COLUMNS 


STRUCTURED  PROGRAMMING 


True BASIC Challenges Modula-2

by Namir Clement Shammas

In this issue I will discuss the implementation of modules in Version 2
of True BASIC, present two True BASIC modules, and compare True BASIC's
modules with those of Modula-2.

The dichotomous nature of True BASIC is marked by the support of
structured code on one hand and the lack of structured data on the
other. Modules as well as external libraries seem to add more code
structure constructs. The fact that they help create reusable software
libraries is welcome as an effective timesaving tool and the
implementation of modern software engineering methodology.

When I first heard about modules in True BASIC, I asked the technical
staff at True BASIC Inc. how different they were from external
libraries. They answered by pointing out that modules offer better
control over exported routines and a more advanced data interface. With
an external library, you can access all the routines, and in addition,
the local variables in library routines are invisible to other routines
both inside and outside the library. This means that the argument list
is the main path for sharing data, aside from using temporary data
files.

Modules modify the interaction of routines and data by supporting the
following sections:

• MODULE <name>

• PUBLIC <list of exported variables and arrays>

• PRIVATE <list of unexported routines>

• DECLARE DEF <list of functions>

• SHARE <list of variables and arrays common among the module routines
only>

• module initialization code

• functions and procedures

• END MODULE

The PUBLIC declaration is used to list the names of scalar variables and
arrays that are exported to client programs. This makes public variables
global and accessible to all parts of the module as well as to the
program that calls the module.

The PRIVATE declaration lists the functions and procedures that are
local to the module. Compared to Modula-2, this is the reverse of the
EXPORT list. All routines that are not listed as private can be called
by client programs.

The DECLARE DEF statement indicates the names of all the functions
defined inside the module.

The SHARE declaration lists the scalar variables and arrays that are
accessible to all the module routines but not to the client programs.
They need not and should not appear in the routines' argument lists. The
additional advantages of shared variables are:

1. They support data abstraction by enabling the access of data
structures while hiding the details.

2. Shared variables are static: they retain their values between calls
to routines in the module.

3. No conflict results from using the same variable names in a client
program or shared variables in other routines.

Module initialization is carried out automatically before the program
starts running. This means that any PUBLIC variable involved in the
initialization step must be assigned an initial value from within the
module itself, which makes the initialization step independent of the
client programs.

The new version of True BASIC has a powerful LOAD command, which enables
you to load libraries and modules and so extend the True BASIC language.
Loading a library that implements hyperbolic functions, for example,
enables you to type (in the command mode) PRINT SINH(2.4) and obtain a
result. The loaded libraries and modules cut down on the time required
to link programs with external libraries and modules.

Examples 

I will now discuss two examples of True BASIC modules. Example 1, page
125, contains a linear regression module. The PUBLIC declaration exports
the three statistics slope, intercept, and coefficient of determination.
The PRIVATE statement declares routines Mean and Sdev as local to the
module. DECLARE DEF points out that Mean and Sdev are local functions
used to calculate the mean and standard deviation for intermediate
results. The SHARE statement lists the statistical summations as shared
static variables.

The call to InitializeSum is the first statement in the module
initialization section. It is followed by assigning numeric codes for
missing data to the three regression statistics. These values can be
used by the client program to detect that no meaningful results are
available and so distinguish them from random junk data.

The InitializeSum subroutine is used to zero the statistical summations;
this is carried out automatically when the program starts running. For
the sake of clarity, I called the subroutine from the program.
Subroutine AcumData takes data from the two arrays X and Y and updates
the summations. Your program can call it repeatedly to process data in
batches. Since the summations are static variables, they maintain their
values between calls. Thus, when calling LineFit, the summations supply
the required information to calculate the regression statistics.
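
The batch-wise accumulation can be illustrated outside True BASIC as
well. The Python sketch below keeps the same six running sums and applies
the same slope, intercept, and coefficient-of-determination formulas as
the module in Example 1. It is an illustration only: the class and method
names are hypothetical, and it uses an object to hold the sums where the
True BASIC module uses SHARE variables.

```python
class Regress:
    """Running sums for a straight-line fit; data may arrive in batches."""

    def __init__(self):
        self.n = 0
        self.sum_x = self.sum_x2 = 0.0
        self.sum_y = self.sum_y2 = 0.0
        self.sum_xy = 0.0

    def accumulate(self, xs, ys):
        """Update the summations with one batch of (x, y) pairs."""
        for x, y in zip(xs, ys):
            self.n += 1
            self.sum_x += x
            self.sum_y += y
            self.sum_x2 += x * x
            self.sum_y2 += y * y
            self.sum_xy += x * y

    def line_fit(self):
        """Return (slope, intercept, rsqr) from the sums gathered so far."""
        n = self.n
        if n < 2:
            raise ValueError("need at least two data points")
        mean_x = self.sum_x / n
        mean_y = self.sum_y / n
        # Sample variances, i.e. the squares of the module's Sdev results.
        var_x = (self.sum_x2 - self.sum_x ** 2 / n) / (n - 1)
        var_y = (self.sum_y2 - self.sum_y ** 2 / n) / (n - 1)
        slope = (self.sum_xy - n * mean_x * mean_y) / (var_x * (n - 1))
        intercept = mean_y - slope * mean_x
        rsqr = slope ** 2 * var_x / var_y
        return slope, intercept, rsqr


fit = Regress()
fit.accumulate([1, 2], [3, 5])     # first batch
fit.accumulate([3, 4], [7, 9])     # second batch; the sums carry over
slope, intercept, rsqr = fit.line_fit()
print(slope, intercept, rsqr)
```

The four sample points lie exactly on the line y = 2x + 1, so the fit
should come out very close to a slope of 2, an intercept of 1, and a
coefficient of determination of 1.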

Example 2, page 126, shows a simple True BASIC program that uses the
regression module. The LIBRARY keyword is used to import it, and DECLARE
PUBLIC is used to import the public variables exported by the regression
module. Notice that the statistical summations are invisible to the
client program. If I want to write a version that uses an external True
BASIC library, I must pass the summations in the argument lists. Calling
LineFit involves no parameters because I have elected to make the
results public.

My second example involves sorting and indexing an array of strings. In
writing the program I made certain choices to demonstrate several
features of True BASIC modules. Example 3, pages 128-129, shows module
Sort. The PUBLIC statement indicates that the module declares the array
Item$( ) and the array counter NData as globally accessible to client
programs. The SHARE declaration lists two arrays: the first is an array
of pointers; the second is an index table. These arrays are shared
within the routines of the module. The module initialization consists of
assigning values to shared variables.

The Sort module consists of three exported routines. The first one
ensures that the sizes of public array Item$( ) and the shared pointer
array are adequate. The REDIM statement (one of the new features in True
BASIC, Version 2) expands the arrays as needed. The size of the index
table is independent of the size of array Item$( ). It maps indices for
the characters A through Z in uppercase only.

Subroutine Sort_and_Index performs two tasks: it sorts the array
Item$( ) using pointers and then sets up an index table. It calls the
local subroutine ShellSort to perform a pointer-based Shell sort on
array Item$( ). The second local subroutine, Set_Index, is called to set
up array Table( ). The first entry encountered starting with the letter
A is stored in location 1 of array Table( ), that of B in location 2,
and so on. The table index is initialized with 0s.

Function Search_Index is used to search for a specific occurrence of a
string and returns the index of array Item$( ) or 0 if not found. Using
the index table, this function is able to zoom in on a feasible search
range, knowing where to start and stop.
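
The index-table idea can be sketched in a few lines of Python. This is
not a translation of the Sort module (Example 3 is not reproduced here):
the function names are hypothetical, Python's built-in sorted() stands in
for the pointer-based Shell sort, and positions are 1-based with 0
meaning "no entry" to match the description above.

```python
def build_index(sorted_items):
    """Return a 26-slot table: slot k holds the 1-based position of the
    first item beginning with letter k (A=0 ... Z=25), or 0 if none."""
    table = [0] * 26
    for pos, item in enumerate(sorted_items, start=1):
        if not item:
            continue
        k = ord(item[0].upper()) - ord("A")
        if 0 <= k < 26 and table[k] == 0:
            table[k] = pos
    return table


def search_index(sorted_items, table, key):
    """Return the 1-based position of key, or 0 if it is not present."""
    if not key:
        return 0
    k = ord(key[0].upper()) - ord("A")
    if not 0 <= k < 26 or table[k] == 0:
        return 0
    start = table[k]
    stop = len(sorted_items)          # default: search to the end
    for later in table[k + 1:]:       # stop before the next letter in use
        if later:
            stop = later - 1
            break
    for pos in range(start, stop + 1):
        if sorted_items[pos - 1] == key:
            return pos
    return 0


# A few Pascal keywords, as in the client program described in the text.
items = sorted(["BEGIN", "END", "ARRAY", "WHILE", "CASE", "CONST", "AND"])
table = build_index(items)
print(search_index(items, table, "CONST"))   # position within the sorted list
print(search_index(items, table, "DO"))      # no entries start with D
```

Only the entries that share the key's first letter are examined, which is
the "feasible search range" the function zooms in on.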

Example  4,  page  130,  shows  a  cli¬ 


ent  program  that  uses  module  Sort.  It 
contains  DATA  statements  that  sup¬ 
ply  the  array  Item$()  with  some 
keywords  from  the  Pascal  language. 
The  DO . .  .  WHILE  loop  counts  the 
number  of  items  in  the  DATA  state¬ 
ments.  A  dummy  string  is  used,  in¬ 
stead  of  the  Item$()  array,  for  the 
READ  statement  to  avoid  a  possible  ar¬ 
ray-bound  error.  The  RESTORE  state¬ 
ment  resets  the  DATA  statement 
pointer.  The  program  calls  for  sub- 


MDDULE  Regress 

!  Simple  linear  regression  module 
PUBLIC  Slope,  Intercept,  Rsqr  !  Global  variables 
PRIVATE  Mean,  Sdev  !  Routines  local  to  module  only 
DECLARE  DEF  Mean,  Sdev 

SHARE  Sum,  SumX,  SumX2,  SumY,  SumY2,  SumXY  !  static  local  variable 

i -  Initialize  module  - 

CALL  InitializeSum 

; -  module  routine  definitions  - 

def  Missing  =  -1.0E+200 
sub  InitializeSum 

!  Set  statistical  summations  to  zero 
let  Sum,  SumX,  SumX2  =  0 
let  SumY,  SumY2,  SumXY  =  0 
let  Slope,  Intercept,  Rsqr  =  Missing 

end  sub 

sub  AcumData(Xi) ,  Y 0  ,  NData) 

!  Subroutine  to  accumulate  sta:t  summations 

FOR  I  =  1  TO  NData 
let  Xt  =  X(I) 
let  Yt  =  Y(I) 

let  Sum  =  Sum  +  1 
let  SumX  =  SumX  +  Xt 
let  SumY  =  SumY  +  Yt 
let  SumX2  =  SumX2  +  Xt  *  Xt 
let  SumY2  =  SumY2  +  Yt  *  Yt 
let  SumXY  =  SumXY  +  Xt  *  Yt 
NEXT  I 

end  sub 

! - define  internal  functions - 

def  Mean  (A,  B)  =  A  /  B 

def  Sdev(Sum2,  Sum,  N)  =  SQR((Sum2  -  SurrT2/N)  /  (N-l) ) 
sub  LineFit 

let  MeanX  =  Mean (SumX, Sum) 
let  MeanY  =  Mean (SumY, Sum) 
let  SdevX  =  Sdev (SumX2,  SumX, Sum) 
let  SdevY  =  Sdev (SumY2, SumY, Sum) 

!  calculate  sought  regression  results 

let  Slope  =  (SumXY  -  MeanX  *  MeanY  *  Sum)  /  SdevXA2  /  (Sum  -  1) 
let  Intercept  =  MeanY  -  Slope  *  MeanX 
let  Rsqr  =  (SdevX  /  SdevY  *  Slope) A2 

end  sub 

END  MODULE 


Example  1:  True  BASIC  source  code  for  linear  regression  module 


Dr.  Dobb's  Journal,  May  1987 


125 

405 


!  PROGRAM  Regress  demonstrates  calling  module  "regress" 
OPTION  BASE  1 

! -  Module  declarations  - 

Library  "REGRESS .MDL" 

DECLARE  PUBLIC  Slope,  Intercept,  Rsqr 
DIM  X (100) ,  Y (100) 

let  MAX_DATA  =  100 

CLEAR  !  clear  screen 
DO 

INPUT  PROMPT  "Enter  number  of  data  points  "  :  NData 
PRINT 

LOOP  UNTIL  (NData  >  2)  AND  (NData  <=  MAXIDATA) 

FOR  1=1  TO  NData 

PRINT  "For  data  point  #  ";I 
INPUT  PROMPT  "  enter  X  "  :  X(I) 

INPUT  PROMPT  ”  enter  Y  "  :  Y(I) 

PRINT 
NEXT  I 

Call  InitializeSum  !  initialize  stat  summations 
Call  AcumData (X,  Y,  NData) 

Call  LineFit 
CLEAR 

PRINT  USING  "Rsqr  =  #.######"  ;  Rsqr 

PRINT  USING  "Slope  =  +# .  #####**AAn  .  slope 

PRINT  USING  "Intercept  =  +# . #####AAAA"  :  Intercept 


Example  2:  True  BASIC  source  code  for  application  program  using  the 
regression  module  in  Example  1 


STRUCTURED  PROGRAMMING 


routine Set_Up in the module Sort. This ensures that the public array
Item$() and the pointer array (local to the module) have enough space.
The RESTORE statement is followed by a FOR...NEXT loop to read the DATA
statements into array Item$(). The module subroutine Sort_and_Index is
invoked to prepare the index table. I included a DO...UNTIL loop to
enable you to type in the Pascal keywords and find their location in
array Item$().
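
The idea behind the Sort module (sort an array of pointers rather than the
items, then keep a table giving the first sorted slot for each initial
letter) can be sketched in Python as follows. This is an illustration
under my own assumptions, not a translation of the listing: the names
build_index and search_index are hypothetical, positions are zero-based,
a failed search returns None rather than 0, and Python's built-in sort
stands in for the Shell sort.

```python
def build_index(items):
    """Return (ptr, table).

    ptr lists item positions in sorted order, leaving items untouched;
    table maps each initial letter to the first slot in ptr whose item
    starts with that letter.
    """
    ptr = sorted(range(len(items)), key=lambda i: items[i])
    table = {}
    for slot, pos in enumerate(ptr):
        table.setdefault(items[pos][:1].upper(), slot)
    return ptr, table

def search_index(items, ptr, table, datum, occur=1):
    """Return the position in items of the occur-th match, or None."""
    key = datum[:1].upper()
    slot = table.get(key)
    if slot is None:
        return None                      # no item starts with this letter
    while slot < len(ptr):
        pos = ptr[slot]
        if items[pos] == datum:
            occur -= 1
            if occur < 1:
                return pos
        elif items[pos][:1].upper() != key:
            return None                  # walked past this letter's run
        slot += 1
    return None
```

The index table pays off because a search only walks the run of items
that share the sought keyword's first letter instead of the whole array.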

The second example illustrates data hiding by using static shared arrays
within the Sort module. The array of pointers and index table remain
invisible to the client program. The limitation of shared variables is
that there can be only one instance of each variable. To use them as
arguments, the module must include routines to store, recall, and manage
the shared arrays to simulate and handle multiple instances.

True BASIC vs. Modula-2

How do True BASIC modules compare with those of Modula-2? Here are some
points of comparison:

• Using modules in True BASIC is optional. The core implementation has
enough constructs to enable you to avoid using modules altogether if you
write all the software you use. Attempting the same type of independence
is impossible with Modula-2, in which the core language is much smaller
and requires you to use modules for common operations such as I/O and
string handling.

• In Modula-2, a library module consists of two components: a definition
module and an implementation module. Modules in True BASIC are contained
in one file. The difference comes from the language design philosophy.
Modula-2 is aimed at large advanced software projects involving teams of
programmers. In a top-down software project, the specifications of each
module must first be determined, hence the need for a definition module
that declares module specifications. Modula-2 libraries can be
distributed with the source code for the definition module and the
symbol, link, and other files for the implementation module. In True
BASIC, if you use the run-time package to form .EXE files, you must
supply separate documentation. True BASIC module developers can agree on
using .DEF files that have routine headings and comments.

• You can include True BASIC modules after the END statement of a main
program. In Modula-2, modules are always in separate files.

• In True BASIC's PUBLIC declaration, variables are exported in a way
similar to having variables listed in the Modula-2 EXPORT list. True
BASIC does not export data types (transparent or opaque), however,
because it does not support Pascal-like data structures.

• The PRIVATE declaration is needed in True BASIC to specify local
routines because they coexist with the exported ones.

• Shared variables in True BASIC modules are somewhat similar to
variables in nested local modules in Modula-2 (that is, a module library
using a local module). Both types of variables are static and retain
their values between successive calls to the modules. Shared variables
in True BASIC are more flexible, however, because their scope extends
throughout the module. In Modula-2, the static variables in a local
module have a scope confined to the local module. This gives True BASIC
shared variables the best of both worlds: static variables that are
accessible to all the module's routines.

• Module initialization is similar for True BASIC and Modula-2.

• In Modula-2 you are able to resolve the problem of duplicated exported
identifiers (variables and routines) more easily. You IMPORT the entire
module and use the dot notation to tell the compiler exactly which
routine to use. Suppose, for example, that you have obtained a new I/O
module (call it NewIO) and you want to import the procedure
WriteString(). At the same time you need to use WriteString() from the
standard module InOut. You simply import both modules using:

IMPORT NewIO;
IMPORT InOut;


MODULE Sort

PUBLIC Item$(100), NData ! Global array and variable
PRIVATE Set_Up, ShellSort ! Routines local to the module
DECLARE DEF Search_Index

SHARE Ptr(100), Table(26), FALSE, TRUE, HI_CHAR, MAX_DATA

! - Module initialization -

let TRUE = 1
let FALSE = 0
let HI_CHAR = 26
let MAX_DATA = 100

! - Routines definition -

sub Set_Up
   ! Make sure that the arrays have enough space
   IF NData > MAX_DATA THEN ! adjust array sizes if needed
      MAT REDIM Item$(NData)
      MAT REDIM Ptr(NData)
   END IF
end sub

!

sub Set_Index
   ! Build index table
   MAT Table = ZER ! Initialize array

   FOR Char_Index = 1 TO HI_CHAR
      let C$ = CHR$(64 + Char_Index) ! --> 'A' to 'Z'

      IF Char_Index = 1 THEN ! Start searching at the beginning
         let Index = 1
      ELSE
         ! Search backwards
         let Index = 1 ! assume worst case as default
         let J = Char_Index ! use J as copy of index I
         DO WHILE J > 1
            IF Table(J-1) > 0 THEN ! found good 'last index'
               let Index = Table(J-1)
               let J = 0 ! zero to exit loop
            ELSE
               let J = J - 1 ! one step backward
            END IF
         LOOP
      END IF

      let Found = FALSE
      DO WHILE (Index <= NData) AND (Found = FALSE)
         let J = Ptr(Index)
         let S$ = Item$(J)[1:1]
         IF S$ = C$ THEN ! Match found
            let Found = TRUE
            let Table(Char_Index) = Index ! store entry index
         ELSE
            let Index = Index + 1 ! increment index for more search
         END IF
      LOOP
   NEXT Char_Index
end sub


sub ShellSort
   ! Sort the pointers and keep Item$() unchanged

   ! Initialize pointers
   FOR I = 1 TO NData
      let Ptr(I) = I
   NEXT I

   ! Start the Shell sort
   let Offset = NData

(continued on next page)


Example  3:  True  BASIC  source  code  for  module  Sort 




Whenever you want to write a string using the NewIO version, you write
NewIO.WriteString(), and to use that of InOut, you write
InOut.WriteString(). Of course, this forces you to use the dot notation
with every identifier imported from NewIO and InOut. Alternatively, you
can use the familiar import list for the most used module and keep using
the dot notation for the other ones.

In True BASIC modules, a conflict is present on two levels: public
variables and exported routines. True BA-


      let InOrder = TRUE
      FOR J = 1 TO (NData - Offset)
         let I = J + Offset
         IF Item$(Ptr(I)) < Item$(Ptr(J)) THEN
            let Tempo = Ptr(I)
            let Ptr(I) = Ptr(J)
            let Ptr(J) = Tempo
            let InOrder = FALSE
         END IF
      NEXT J
   LOOP UNTIL InOrder = TRUE


def Search_Index(Datum$, Occur)
   ! Search for the n'th Occur(ance) of Datum$ in array Item$()
   ! Use index table for faster search

   let S$ = UCASE$(Datum$[1:1]) ! pick first character in Datum
   let Index = Ord(S$) - 64 ! Get index for search table
   let Table_Index = Table(Index) ! get index entry
   let Occurance = ABS(INT(Occur)) ! assign Occur to local copy

   IF Table_Index > 0 THEN ! Yes there is an entry
      let Found = FALSE
      let More_Loop = TRUE
      DO WHILE (Table_Index <= NData) AND (Found = FALSE) AND (More_Loop = TRUE)
         let J = Ptr(Table_Index) ! store pointer in J
         IF Datum$ = Item$(J) THEN ! found a match
            let Occurance = Occurance - 1 ! Decrement occurance
            IF Occurance < 1 THEN ! Yes, this is the one we want
               let Found = TRUE
               let Search_Index = Ptr(Table_Index)
            ELSE ! No! keep searching!
               let Table_Index = Table_Index + 1
            END IF
         ELSE ! Should we keep searching?
            IF S$ = Item$(J)[1:1] THEN ! Yes
               let Table_Index = Table_Index + 1
            ELSE ! No, we have gone too far and not found
               let More_Loop = FALSE ! stop looping
               let Search_Index = 0 ! search has failed
            END IF
         END IF
      LOOP


END  MODULE 


Example  3:  Continued 




SIC uses the same storage location for identical public variables
declared in two or more modules. It is your responsibility to manage the
values in duplicated public variables. The solution is relatively
simple: reassign their values to program variables with different names.
This protects you from unwanted changes in the values of public
variables while calling different modules. Concerning duplicate
routines, True BASIC treats the first definition of a duplicated routine
as the valid one. The only remedy is to change the duplicate routine
names.

• All the current IBM PC Modula-2 implementations link entire library
modules without performing code optimization. Import one routine, and
the rest of the library follows! By contrast, linking a True BASIC main
program with libraries and modules is optimized.

The advanced software engineering features of True BASIC add more power
and punch to BASIC. The implementation of modules is indeed
controversial, and I expect it to generate great likes and dislikes.

Availability

All the source code for articles in this issue is available on a single
disk. To order, send $14.95 to Dr. Dobb's Journal, 501 Galveston Dr.,
Redwood City, CA 94063 or call (415) 366-3600 ext. 216. Please specify
the issue number and format (MS-DOS, Macintosh, Kaypro).



! PROGRAM Sort_and_Search

Library "Sort.mdl"

DECLARE DEF Search_Index
DECLARE PUBLIC Item$(), NData

CLEAR

let NData = 0

! Count items in DATA statements
DO WHILE MORE DATA
   let NData = NData + 1
   READ Dummy$ ! use dummy variable
LOOP

RESTORE ! DATA counter

CALL Set_Up ! Adjust Item$() if needed

! Read DATA into Item$(), now that we have enough space
FOR I = 1 TO NData
   READ Item$(I)
NEXT I

CALL Sort_and_Index ! Sort and prepare index table

let Occur = 1 ! Search for first occurance
DO
   INPUT PROMPT "Enter sought keyword or [Q] to exit ? " : Search$
   let Search$ = UCASE$(Search$)
   IF Search$[1:1] <> "Q" THEN
      PRINT Search$;" is item number ";Search_Index(Search$, Occur)
      PRINT
   ELSE
      PRINT
      PRINT "PRESS ANY KEY TO EXIT"
   END IF
LOOP UNTIL Search$[1:1] = "Q"

! DATA statements contain a list of Pascal keywords
DATA WRITE, READ, ASSIGN, SEEK, HI, LO, SQRT
DATA SQR, TAN, SIN, COS
DATA IFF, THEN, ELSE, WHILE, REPEAT, BEGIN
DATA FUNCTION, VAR, TYPE
DATA RECORD, SET, FOR, PROCEDURE, PROGRAM

END


Example  4:  True  BASIC  source  code  for  application  program  using 
module  Sort,  shown  in  Example  3 




COLUMNS 


ARTIFICIAL  INTELLIGENCE 


Object-Oriented LISP on PCs

by Ernest R. Tello

This month I am going to focus on a very powerful yet inexpensive
version of LISP for PCs and compatibles that offers an object-oriented
extension called SCOOPS and other interesting features.

PC Scheme 2.0

PC Scheme 2.0, from Texas Instruments, is one of the most impressive
programming tools currently available for the development of AI-oriented
software on PCs. One of the reasons for this is that it is a dialect of
LISP compact enough to be viable on the PC without hobbling the power of
the language. It is useful as a learning tool as well, both because of
its low price and because of its compatibility with the standard
textbook used for teaching undergraduate programming courses at MIT,
Structure and Interpretation of Computer Programs, by Abelson and
Sussman (MIT Press, 1985). PC Scheme has many features that previously
have existed only on expensive hardware such as LISP machines. There are
limits as to what such features can do when running on small machines,
but their very presence is highly welcome and a strong indication of the
power now available to programmers at an affordable price.

PC Scheme is a superset of the Scheme dialect of LISP, which was
developed ten years ago at MIT by Guy Steele and Gerry Sussman as a
medium for teaching new and powerful programming concepts. Today Scheme
is considered one of the most modern and progressive of the LISP
dialects. Because much of its power is available in a relatively small
size, Scheme has many advantages for small machines. This version of the
Scheme dialect is one of the relative newcomers in the family of LISP
implementations for PCs.

What is so different and so special about Scheme? And how does it
compare with Common LISP?

Scheme is like Common LISP in that it supports the lexical scoping of
variables, but it also offers several other important and progressive
ideas that still have not passed into the LISP mainstream, though it is
possible that in the future many of them will.

Although at first it seems as though Scheme is loaded with a kitchen
sink of various programming ideas, the underlying basis of it is
actually quite simple and well integrated. One important idea in Scheme
is that of an environment that can be saved as a context, allowing
control to shift temporarily to another such environment and then to
return again to the original environment. This may sound like the
context switching that is familiar to computer scientists working at the
systems level, but here it has a different meaning. Here, the context in
terms of an environment means a set of bindings of variables and named
functions taken as a unit.

A programming concept that is taken particularly seriously in Scheme is
that of first-classness. This idea did not originate with Scheme, but in
this dialect of LISP, it has been taken to its most complete expression.
Generally speaking, a first-class object is one that has no restrictions
on the way it can be used. More specifically, in most programming
languages, only numbers, characters, and strings at the most are
first-class objects. Even here, frequently only integers are really
first class, not numbers in general, and often there may be certain
restrictions on the use of one or more of these types of objects in
various respects. Usually, for example, you cannot pass arrays, records,
and functions as arguments to functions or store them in one another.
Even in conventional LISP dialects, some special handling is required
when a function is passed as an argument to another function.

In this respect, Scheme is quite radical. The idea is for everything in
Scheme to be a first-class object. In PC Scheme, for example, not only
procedures but also environments and two other things called
continuations and engines can be stored in compound data structures,
returned as values by a procedure, and bound to variables in three
distinct ways.

Functions in LISP are defined through lambda bindings. Because this is
strongly analogous to variable binding, some LISP aficionados have
wondered why functions are not declared just like variables and lists
as:

(SETQ [name] lambda [args] function-body)

Well, in Scheme, this is exactly what happens, although it is the SETQ
that is dropped and DEFINE is used for the binding of all objects
globally with lexical scoping. SET! is used only for changing the
binding of objects that have already been created. Scheme uses the
convention that functions ending in an exclamation point modify their
arguments and those that end in a question mark are predicates that
return true or false. So, for example, zerop in LISP becomes zero? in
Scheme.
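
To make first-classness concrete for readers without a Scheme system at
hand, here is a small Python sketch. Python is a stand-in here, not
anything the article or PC Scheme uses, and every name in it (square,
operations, compose, is_zero) is invented for illustration. It shows a
procedure being bound to a name like any other value, stored in a data
structure, passed as an argument, and returned as a result.

```python
import math

# A procedure bound to a name just like any other value,
# much as DEFINE binds a lambda expression in Scheme.
square = lambda x: x * x

# Procedures stored in a compound data structure.
operations = {"square": square, "root": math.sqrt}

def compose(f, g):
    """Return a new procedure that applies g, then f."""
    return lambda x: f(g(x))

# Procedures passed as arguments and returned as a result.
root_of_square = compose(operations["root"], operations["square"])

def is_zero(n):
    """Predicate in the spirit of Scheme's zero?."""
    return n == 0
```

Calling root_of_square(-3) first squares its argument and then takes the
square root, so the composed procedure behaves like an absolute-value
function on numbers.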

The commitment to making everything in Scheme a first-class object is
nothing short of revolutionary. It is staggering to think of the full
potential of a programming system with such capabilities. It is doubtful
whether anyone has attempted to take this feature of Scheme to the
limits of its power.

This is by no means the only radical concept in the Scheme design,
however. Another concept that is important in the Scheme dialect is that
of a continuation, a basic concept of control structures in programs.
Many of the more familiar LISP control structures such as catch and
throw can be regarded as exemplifying the idea of a continuation, but in
Scheme the more general construct is available that enables the more
specific ones to be custom-built. Essentially, a continuation is the
process to which a computation will progress at a future point in time
as has been specified through a programming construct. More
specifically, the continuation is the part of a program that can be
thought of as waiting for the result of a current computation, and in
Scheme such a continuation is a first-class object, just as any current
piece of data can be.
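
The notion of "the part of a program waiting for a result" can be
illustrated, though only approximately, in a language without Scheme's
first-class continuations. The Python sketch below uses
continuation-passing style instead: each function receives the rest of
the computation as an explicit argument. This is a substitute technique,
not how PC Scheme implements continuations, and the function names are
hypothetical.

```python
def divide_cps(a, b, on_result, on_error):
    """Divide a by b, handing the outcome to an explicit continuation."""
    if b == 0:
        # Escape route, comparable in spirit to a throw.
        return on_error("division by zero")
    return on_result(a / b)

def mean_cps(values, on_result, on_error):
    """Average a list; the rest of the program is passed in as on_result."""
    return divide_cps(sum(values), len(values), on_result, on_error)

# The continuation is "format the number"; the alternative is "report why".
def show(m):
    return f"mean = {m}"

def explain(why):
    return f"failed: {why}"

report = mean_cps([2, 4, 9], show, explain)
empty = mean_cps([], show, explain)
```

Here report holds the formatted mean of the three numbers, while empty
holds the failure message because the error continuation was invoked
instead of the normal one.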

Control  Structures 

Some LISP programmers might be shocked to learn that there are no PROGs
in Scheme. This is somewhat more of a policy statement than an absolute
exclusion, however. Scheme has some other special forms for control
that, although they do not specifically replace PROGs, certainly do
nearly anything that most PROG constructs can. One exception to this is
PROG2, because Scheme does not seem to have a control form that returns
the value of the second expression in a sequence. But obviously there
are other ways of doing what PROG2 does. Many stalwart LISP programmers
consider the wholesale use of PROG constructs to be a crutch to be
avoided whenever possible, much like the way structured programming
acolytes feel about GOTOs.

LETREC is an interesting variant of the LET macro that allows a
construct called mutually recursive functions, which by definition
typically come in pairs. The best way of understanding this is by a
specific example. The following one is from the PC Scheme manual; it
implements two interdependent functions, even? and odd?, within the same
binding environment, each of which recursively calls the other:

(define odd-r-even
  (lambda (n)
    (letrec
      ((even?
        (lambda (n)
          (if (zero? n)
              #!true
              (odd? (-1+ n)))))
       (odd?
        (lambda (n)
          (if (zero? n)
              #!false
              (even? (-1+ n)))))))))

The  LETREC  control  structure  allows 
two  or  more  lambda  procedures  to 
be  defined  in  the  same  environment. 
None  of  the  lambda  procedures  are 
self-sufficient,  but  collectively  they 
work  in  a  highly  efficient  manner  by 
calling  one  another  for  parts  of  their 
operation. 
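
The same pairing can be sketched in Python with two nested functions
that share one enclosing scope. This is an analogy, not a translation of
the manual's code: the names odd_or_even, is_even, and is_odd are mine,
and the final line that actually calls is_even is an addition so that
the outer function returns something.

```python
def odd_or_even(n):
    """Classify a non-negative integer with two mutually recursive helpers."""
    if n < 0:
        raise ValueError("n must be non-negative")

    # Both helpers live in the same scope, so each can see the other,
    # which is the property LETREC provides in Scheme.
    def is_even(k):
        return True if k == 0 else is_odd(k - 1)

    def is_odd(k):
        return False if k == 0 else is_even(k - 1)

    return "even" if is_even(n) else "odd"
```

One difference worth noting: Scheme implementations are expected to run
such tail calls in constant space, whereas Python does not, so this
sketch will hit the interpreter's recursion limit for large n.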

The  Full-Screen  Editor 

PC Scheme comes with a powerful Emacs-style editor that offers many
useful and convenient enhancements, though the speed of many of its
operations could be improved upon. One useful innovation is that, as you
enter right parentheses, not only does the matching parenthesis become
highlighted and blink, but the beginning of the expression enclosed by
these parentheses is also printed in a message area at the bottom of the
screen. This is considerably more than a cosmetic enhancement because in
those cases in which a long LISP function is used (larger than a single
screen) the blinking parenthesis approach alone is useless. This
enhancement virtually puts to an end any need for counting parentheses
needed at the end of a LISP function on a routine basis, though for
debugging, of course, it still never goes away.

Engines 

As I indicated briefly earlier, PC Scheme supports an extension that
includes a special construct that provides for resource-oriented
scheduling. An engine is a special procedure that is given a certain
time, measured in ticks that are based on hardware clock interrupts, to
complete its computation. It is supervised by two routines called the
success procedure and the failure procedure. If the engine's computation
is completed before its assigned time has expired, then the success
procedure is invoked and the result, together with the number of expired
ticks, is returned. If the time expires before the computation is
finished, the failure procedure is invoked with the creation of a new
engine, which continues the original computation. Although there is no
built-in support for allowing a running engine to invoke another engine,
the implementation of such nested engine mechanisms is possible using
certain special techniques. Engines are particularly well suited for
developing discrete time simulation applications, which ordinarily would
require multiprocessing support on the operating system level.
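
The control flow of an engine can be imitated in Python with a
generator, as in the rough sketch below. It rests on assumptions that
differ from PC Scheme: a tick here is one cooperative yield inside the
computation, not a hardware clock interrupt, and make_engine and
count_to are names I made up. The success procedure receives the result
and the ticks used in the current run; the failure procedure receives a
new engine that resumes the unfinished work.

```python
def make_engine(gen):
    """Wrap a generator as an engine: each yield inside gen costs one tick."""
    def run(ticks, on_success, on_failure):
        used = 0
        while used < ticks:
            try:
                next(gen)
            except StopIteration as done:
                # Finished within the allowance.
                return on_success(done.value, used)
            used += 1
        # Out of ticks: hand back a fresh engine that resumes the same work.
        return on_failure(make_engine(gen))
    return run

def count_to(limit):
    """Toy computation that yields once per step and returns the total."""
    total = 0
    for i in range(1, limit + 1):
        total += i
        yield
    return total
```

A caller might give an engine built from count_to(5) three ticks, receive
a replacement engine through the failure procedure, and then run that
replacement with a larger allowance to obtain the final total of 15. A
simple round-robin scheduler is a loop that keeps doing this over a
queue of engines.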

SCOOPS 

Although object-oriented programming with the message passing paradigm
is not a necessary part of the Scheme dialect, it seems to fit in well
with the Scheme approach. TI has included a powerful object-oriented
extension called SCOOPS with PC Scheme. The two things that make the
SCOOPS extension alone more than worth the price of the whole package
are its support of multiple inheritance and active values, things that
previously were only available on very expensive hardware.

The object-oriented approach used by the SCOOPS package is the
nonhierarchical mixins made popular by the Flavors system used on
Symbolics LISP machines. Like Smalltalk and most object-oriented
programming systems, SCOOPS has a class system and the ability of
classes to inherit variables and functions from other classes. But
unlike Smalltalk and like Flavors, SCOOPS classes are not limited to
inheritance from superclasses in a simple tree hierarchy. Mixins allow
classes to be defined that can inherit from any other classes the
programmer chooses.
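
Python's multiple inheritance gives a rough feel for mixins. The sketch
below is an analogy only; it does not use SCOOPS syntax, and the classes
Describable, Countable, and Body are hypothetical. Body inherits
behavior from two classes that have no relationship to each other.

```python
class Describable:
    """Mixin: adds describe() to any class that has a name attribute."""
    def describe(self):
        return f"{type(self).__name__}({self.name})"

class Countable:
    """Mixin: keeps a tally of named parts."""
    def add_part(self, part, count=1):
        # Create the tally on first use so the mixin needs no __init__.
        self.parts = getattr(self, "parts", {})
        self.parts[part] = self.parts.get(part, 0) + count
        return self.parts[part]

class Body(Describable, Countable):
    """Inherits from two unrelated classes at once."""
    def __init__(self, name):
        self.name = name
```

An instance of Body answers both describe() and add_part() even though
neither method is defined in Body itself, which is the essence of mixing
in behavior outside a single tree hierarchy.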

SCOOPS' support for active values (the capability often called
procedural attachments in frame-based systems) greatly extends the power
of its class system because active values provide a means to assign a
function, or even a complex program, to be evaluated whenever an active
value variable is accessed. This provides for association of complex
data structures to SCOOPS instance variables and the opportunity for
calculated values based on both initial assignments to an instantiated
object and to conditions in a dynamically changing environment.

To illustrate the use of active values, you can take a simplified
version of the difficult problem of composite objects as used in the
CommonLoops system at Xerox PARC. A composite object is an instance of a
class that is considered as composed of objects that are instances of
other classes, some of which may themselves be composite objects. For
example, the body is composed of a head, arms, legs, and torso. The
head, hands, and feet can also be represented as composed of other
objects such as eyes, ears, fingers, toes, and so on. How can composite
objects be implemented in a system such as SCOOPS? Ordinarily, SCOOPS
does not even support lists as values of slots or instance variables,
much less more complex data structures. At the minimum, a composite
object has to contain a list of all of its components. Active values
allow this to be done by assigning procedures to instance variables that
access external data structures and knowledge structures. The format for
specifying an active value is:

(instvars
  ([VARIABLE]
    (active [INITIAL-VALUE]
            [GET-FUN]
            [PUT-FUN])))

Here, GET-FUN and PUT-FUN represent two procedures (each of one argument
only) that are automatically evaluated when the usual get and put
methods for an instance variable are sent to an object. The
INITIAL-VALUE of the variable is the argument that is passed to these
functions.
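
For readers who think more easily in a mainstream language, the
mechanism resembles the following Python sketch, in which reads and
writes of an attribute are routed through a pair of one-argument
functions. This is an analogy built on Python's descriptor protocol, not
SCOOPS itself; ActiveValue, PARTS, and Composite are hypothetical names,
and having the put function return the value to store is my own
simplification.

```python
class ActiveValue:
    """Descriptor: reads and writes are routed through get_fun and put_fun."""
    def __init__(self, initial, get_fun, put_fun):
        self.initial = initial
        self.get_fun = get_fun
        self.put_fun = put_fun

    def __set_name__(self, owner, name):
        self.slot = "_" + name          # where the raw value is kept

    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        # The stored (or initial) value is the argument to get_fun.
        return self.get_fun(getattr(obj, self.slot, self.initial))

    def __set__(self, obj, value):
        setattr(obj, self.slot, self.put_fun(value))

# External data structure, kept outside the object itself.
PARTS = {"body-parts": ["head", "neck", "arms"]}

class Composite:
    # The stored value is only the *name* of a list; get_fun looks it up.
    part_names = ActiveValue("body-parts",
                             get_fun=lambda key: list(PARTS[key]),
                             put_fun=lambda key: key)
```

Reading part_names on a Composite instance returns the list stored under
the current key in PARTS, so the instance variable appears to hold a
list although it actually holds only a name.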

One way to make the functions access a list (so that when the usual get
method is used, it returns, for example, a list of the composite
object's component parts) is to make the ini-


;; (C) Copyright 1987 Ernest R. Tello

(define-class composite-object
  (classvars class-part-name class-part-num)
  (instvars (part-names (active partB get-parts add-part))
            (numbers-of-parts (active 'i-parts num-parts more-parts)))
  (options
    (gettable-variables class-part-name part-names numbers-of-parts)
    settable-variables
    inittable-variables))

(define human-body-parts '())
(putprop 'human-body-parts 1 'head)
(putprop 'human-body-parts 1 'neck)
(putprop 'human-body-parts 2 'arms)
(putprop 'human-body-parts 2 'hands)
(putprop 'human-body-parts 1 'trunk)
(putprop 'human-body-parts 2 'legs)
(putprop 'human-body-parts 2 'feet)

(define (num-parts p-list)
  (princ p-list))

(define part-map (proplist 'human-body-parts))

(define-class body
  (classvars (class-part-name 'body-parts)
             (class-part-num 'human-body-parts))
  (mixins composite-object))

(define body-parts '(head neck arms hands trunk legs feet))

(define-method (composite-object put-cpart-name) (new-part)
  (set! body-parts
        (append (eval (get-class-part-name))
                (list new-part))))

(define (add-part new-part)
  (append! (get-class-part-name) (list new-part)))

(define (get-parts val)
  (princ val))

(define my-body
  (make-instance body
                 'part-names body-parts
                 'numbers-of-parts part-map))

(compile-class composite-object)
(compile-class body)


Example  1:  Composite  objects  in  SCOOPS 


[26] (describe body)
CLASS DESCRIPTION

NAME            : BODY
CLASS VARS      : (CLASS-PART-NAME CLASS-PART-NUM)
INSTANCE VARS   : (NUMBERS-OF-PARTS PART-NAMES)
METHODS         : (GET-CLASS-PART-NAME SET-CLASS-PART-NUM
                   SET-CLASS-PART-NAME SET-PART-NAMES GET-PART-NAMES
                   SET-NUMBERS-OF-PARTS GET-NUMBERS-OF-PARTS
                   PUT-CPART-NAME)
MIXINS          : (COMPOSITE-OBJECT)
CLASS COMPILED  : #!TRUE
CLASS INHERITED : #!TRUE


Example 2: PC Scheme description of the class Body as displayed on the
screen




tial value of the active value variable the name of the list of parts
and define the get and put functions such that they can return and
append this list. The appending function is tricky to implement because
it needs the name of the list so that things can be appended to it.
Because it is desirable to make such a function as general as possible
so that it can be used with any composite object, it has to have a way
of finding the name of the specific list of parts that applies only to
the particular object in question. The problem is that the initial
value, which is the name of the list in question, is not returned by its
normal get function anymore but is passed as an argument to the active
value function. One way to get around this problem is to use another
variable that is not an active value variable to store the name of the
list where it may be easily accessed by a global function.

I'll now illustrate a successful application of this strategy in PC
Scheme. Two classes have been implemented, Composite-Object and Body,
where the first is a mixin, or superclass, for the second. The code for
these classes is supplied in Example 1. When you ask Scheme to describe
the class Body, you see the screen display shown in Example 2. The
variables of Body have all been inherited from Composite-Object. The
variable part-names is implemented as an active value that accesses a
list and prints its contents when called. The variable numbers-of-parts
is also an active value, but in this case, its get function returns and
prints a property list that contains a list of body parts each with the
property of how many such parts the body should contain. One possible
extension of this example would be to define various subclasses for
different types of organisms. So, for example, humans, horses, ants,
spiders, and centipedes would have different entries on their property
lists for the number of legs.

Using  the  current  example  I  creat¬ 
ed  an  instance  of  the  Body  class 
called  My-Body.  I  set  the  values  of  its 
variables  so  as  to  reference  appropri¬ 
ate  lists  and  property  lists  for  the 
names  and  numbers  of  its  parts. 
Then,  sending  it  the  messages  indi¬ 
cated  produced  the  results: 

[27] (send my-body get-part-names)
(HEAD NECK ARMS HANDS TRUNK LEGS FEET)

[28] (send my-body get-numbers-of-parts)
(FEET 2 LEGS 2 TRUNK 1 HANDS 2 ARMS 2 NECK 1 HEAD 1)
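
The same pattern can be sketched outside PC Scheme. The Python fragment below is only an analogy and is not the code of Example 1: the LISTS registry, the class names, and the attribute names are all assumptions made for illustration. Each slot stores just the name of an auxiliary structure in an ordinary variable, and a get function looks the structure up by that name whenever the slot is read.

```python
# Hypothetical registry standing in for globally named lists and property lists.
LISTS = {
    "body-parts": ["HEAD", "NECK", "ARMS", "HANDS", "TRUNK", "LEGS", "FEET"],
    "human-body-parts": {"FEET": 2, "LEGS": 2, "TRUNK": 1, "HANDS": 2,
                         "ARMS": 2, "NECK": 1, "HEAD": 1},
}


class CompositeObject:
    """Mixin whose slots hold the *names* of auxiliary structures."""

    def __init__(self, part_names_key, part_counts_key):
        # Plain (non-active) variables that remember which list to consult.
        self._part_names_key = part_names_key
        self._part_counts_key = part_counts_key

    @property
    def part_names(self):
        # Behaves like an active value: reading the slot runs a get function.
        return list(LISTS[self._part_names_key])

    @property
    def numbers_of_parts(self):
        return dict(LISTS[self._part_counts_key])


class Body(CompositeObject):
    """Inherits all of its variables from CompositeObject."""


my_body = Body("body-parts", "human-body-parts")
print(my_body.part_names)
print(my_body.numbers_of_parts["LEGS"])
```

A subclass for a different organism would only need to point at a different entry in the registry.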

One  of  the  nice  discoveries  I  made 
when  developing  this  use  of  active 
values  is  that,  once  this  interface  to 
more  complex  auxiliary  data  struc¬ 
tures  has  been  correctly  implement¬ 
ed,  the  values  will  then  actually  be 
displayed  as  if  they  are  part  of  the 
object  when  the  describe  function  is 
called.  So,  in  the  case  of  the  object 
My-Body,  which  is  an  instance  of  the 
Body  class,  the  following  result  is  re¬ 
turned  when  requesting  its 
description: 


[30] (describe my-body)

INSTANCE DESCRIPTION

Instance of Class BODY
Class Variables :
CLASS-PART-NAME : BODY-PARTS
CLASS-PART-NUM : HUMAN-BODY-PARTS

Instance Variables :
NUMBERS-OF-PARTS : (FEET 2 LEGS 2 TRUNK 1 HANDS 2 ARMS 2 NECK 1 HEAD 1)

In  the  full  implementation  of  the 
Composite-Object  class,  there  would 
be  various  additional  methods,  in¬ 
cluding  one  that  could  automatically 
initialize  the  objects  that  were  part  of 
any  instance  of  this  class  or  any  of 
those  of  which  it  was  a  mixin.  This 
method  would  include  a  recursive 
procedure  that  would  access  the 
part-names  slot  to  get  a  list  of  all  of  its 
components  and  then  the  numbers- 
of-parts  property  list  to  find  out  how 
many  of  each  were  needed.  You 
could  then  repeat  this  for  all  of  those 
parts  that  were  also  composite  ob¬ 
jects,  until  all  the  objects  being  instan¬ 
tiated  were  simple  and  not  compos¬ 
ite.  As  a  variant,  you  could  build  a 
subclass  of  Composite-Object,  like  the 
Perspective  class  in  KRL  and  Loops,  in 
which  the  objects  that  were  its  com¬ 
ponents  would  only  be  instantiated 
on  request.  The  idea  of  a  Perspective 
is  a  composite  whose  components 
are  not  exactly  parts  but  various  dif¬ 
ferent  roles  in  which  the  individual 
object  participates.  In  this  sense,  it 
would  be  as  if  the  individual  had  var¬ 
ious  distinct  aspects,  each  of  which 
could  independently  be  members  of 
different  classes. 
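
The recursive initialization described above might look roughly like the following Python sketch. It is not taken from the article's Scheme code: the PARTS table, the function names, and the sample parts are hypothetical, and any name missing from the table is assumed to be a simple, noncomposite object.

```python
# Hypothetical part tables: a composite maps each part name to a count.
# Any name absent from the table is treated as a simple object.
PARTS = {
    "body": {"head": 1, "arm": 2},
    "arm": {"hand": 1},
    "hand": {"finger": 5},
}


def instantiate(kind):
    """Recursively build `kind` and every part it is composed of."""
    node = {"kind": kind, "parts": []}
    for part, count in PARTS.get(kind, {}).items():
        for _ in range(count):
            node["parts"].append(instantiate(part))
    return node


def count_objects(node):
    """Count the objects in an instantiated tree, including the root."""
    return 1 + sum(count_objects(child) for child in node["parts"])


body = instantiate("body")
print(count_objects(body))
```

The recursion stops on its own once every remaining part is simple. A Perspective-style variant would defer the call to instantiate until a component is first requested.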


DDJ 

Vote  for  your  favorite  feature/article. 
Circle  Reader  Service  No.  8. 


Vendor 

PC  Scheme 

Texas  Instruments 

12501  Research  Blvd.,  MS  2223 

Austin,  TX  78769 

(512)  250-7533 

Reader  Service  No.  43 




Dr.  Dobb's  Journal,  May  1987 



FORUM 


VIEWPOINT 

(continued  from  page  14) 


Finally,  print  out  a  copy  of  every  pro¬ 
gram  that  does  not  print  out  a  copy  of 
itself. 

This  program,  in  the  process  of 
generating  all  programs,  will  eventu¬ 
ally  generate  itself.  Does  it  print  out  a 
copy  of  itself?  If  it  does,  it  is  breaking 
the  rule  by  printing  out  a  copy  of  a 
program  that  prints  out  a  copy  of  it¬ 
self.  If  it  does  not,  it  is  breaking  the 
rule  by  failing  to  print  out  a  copy  of  a 
program  that  does  not  print  out  a 
copy  of  itself.  This  fatal  contradiction 
proves  that  the  halting  problem  has 
no  solution. 

You  may  recognize  this  as  Russell's 
paradox  (the  set  of  all  sets  that  do  not 
contain  themselves)  or  as  the  barber 
paradox  (the  barber  who  shaves  ev¬ 
ery  man  who  does  not  shave 
himself). 

Any  problem  that  a  debugger  can 
convert  to  the  halting  problem,  such 
as  the  string-output  problem,  is 
equally  unsolvable.  Some  other  obvi¬ 
ous  examples  are: 

1. determining whether a program
will reach a specified point (Ada
programmers: this is why PROGRAM_ERROR
has to be a run-time error, not
a compile-time error)

2.  determining  whether  a  variable  is 
initialized  before  it  is  used 

3.  determining  whether  a  given  seg¬ 
ment  of  code  is  inaccessible  and  will 
never  be  executed 

4.  determining  whether  two  pro¬ 
grams  do  the  same  thing 

Of  course,  a  debugger  or  compiler 
can  sometimes  predict  such  errors — 
for  example,  inaccessible  code  can 
sometimes  be  identified  at  compile 
time.  But  universal  solutions  to  such 
problems  do  not  exist. 

The  impossibility  of  determining 
whether  two  programs  do  the  same 
thing  means  that  it  is  always  possible 
to  defeat  a  certain  kind  of  Trojan 
horse. In a lecture reprinted in the
Communications of the ACM (August 1984), Ken
Thompson described how he could put a
Trojan horse into a C compiler that
would miscompile the login program
to allow him access to any Unix
system compiled with it, and that
would miscompile the C compiler to
insert a copy of itself. The Trojan
horse  itself  would  not  appear  in  the 
source  code  for  the  C  compiler.  In  a 


letter  to  the  editor,  Steve  Draper  not¬ 
ed  that  such  a  Trojan  horse  can  be 
defeated  by  paraphrasing  the  C  com¬ 
piler  (writing  different  code  that  does 
the  same  thing)  and  then  recompiling 
it.  No  Trojan  horse  can  infallibly  rec¬ 
ognize  paraphrased  programs— 
hence  there  is  always  a  paraphrase 
that  will  defeat  the  Trojan  horse. 

My  own  opinion  in  this  matter  is 
that,  unless  the  Trojan  horse  were 
skillfully  written,  most  paraphrases 
would  defeat  it,  and  in  fact  it  would 
probably  be  defeated  eventually  by 
normal  software  maintenance.  Any 
Trojan  horse  smart  enough  to  recog¬ 
nize  most  paraphrases  would  proba¬ 
bly  be  much  larger  than  the  rest  of 
the  C  compiler.  You'd  never  get  it 
through  the  gates. 

The  halting  problem  is  intimately 
related  to  two  other  problems,  which 
were  posed  by  the  mathematician 
David  Hilbert  in  1900.  Is  there  a  for¬ 
mal  proof  or  disproof  for  every 
mathematical  statement?  Is  there  an 
algorithm  to  find  proofs? 

The  first  question  was  answered  in 
the  negative  by  Kurt  Goedel  in  1931. 
Goedel's  proof  was  complex,  but  if 
you  accept  the  unsolvability  of  the 
halting  problem,  it  can  be  proved 
simply.  Whether  a  particular  pro¬ 
gram  halts  is  a  mathematical  state¬ 
ment.  In  fact,  many  mathematical 
theorems  are  already  special  cases  of 
the  halting  problem  because  you  can 
write  a  program  to  search  for  coun¬ 
terexamples  and  halt  when  it  finds 
one.  The  theorem  is  equivalent  to  the 
assertion  that  the  program  never 
halts. 
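
As an illustration of a theorem recast as a halting question, the Python sketch below searches for a counterexample to Goldbach's conjecture (every even number from 4 upward is the sum of two primes). The choice of conjecture and the limit argument are additions made here so the example can be run safely; without a limit, the search halts only if the conjecture is false.

```python
def is_prime(n):
    """Trial-division primality test, adequate for a small illustration."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def goldbach_counterexample(limit=None):
    """Search even numbers >= 4 for one that is not a sum of two primes.

    With limit=None the search is unbounded: it halts only if a
    counterexample exists.
    """
    n = 4
    while limit is None or n <= limit:
        if not any(is_prime(p) and is_prime(n - p)
                   for p in range(2, n // 2 + 1)):
            return n  # counterexample found: the program halts
        n += 2
    return None  # gave up at the limit; this says nothing about larger n


print(goldbach_counterexample(limit=1000))
```

A bounded run that finds nothing proves nothing about the conjecture itself; only the unbounded program is equivalent to it.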

If  there  were  always  a  formal 
proof  or  disproof  of  the  assertion  that 
a  program  halts,  then  you  could  sim¬ 
ply  generate  all  proofs  (more  or  less 
as  the  program  described  earlier  gen¬ 
erated  all  programs)  until  you  found 
either  a  proof  or  a  disproof.  That 
would  solve  the  halting  problem.  Be¬ 
cause  the  halting  problem  is  in  gener¬ 
al  unsolvable,  there  must  be  at  least 
one  mathematical  statement  of  this 
kind  that  is  undecidable — that  is,  it 
cannot  be  formally  proved  or 
disproved. 

This  shows  that  it  is  impossible  in 
general  to  prove  that  a  program 
works.  Specific  programs  or  limited 
classes  of  programs  can  be  proved  to 
do  certain  things,  but  there  is  no  way 




to  do  this  for  every  program. 

Given  that  some  mathematical 
statements  are  undecidable,  is  there  a 
program,  the  "decidability  pro¬ 
gram,”  that  can  tell  whether  any 
mathematical  statement  is  decidable, 
even  without  deciding  whether  it  is 
true  or  false?  As  you  might  have 
guessed  from  the  tone  of  this  article, 
the  answer  is  again  no.  If  you  have  a 
decidability  program,  you  can  take 
any  program  and  ask  whether  it 
halts.  Then  apply  the  decidability 
program  to  this  question.  If  the  ques¬ 
tion  is  decidable,  a  search  of  all 
proofs  will  prove  it  or  disprove  it.  If 
the  question  is  undecidable,  then  the 
program  never  halts;  otherwise,  you 
could  prove  that  it  halts  by  simply 
running  it  until  it  halts. 

Therefore,  theorem-proving  pro¬ 
grams,  however  successful  they 
might  be  in  limited  areas,  can  never 
prove  everything.  Some  things  must 
always  remain  beyond  their  grasp. 

These  arguments  are  not  rigorous 
in  the  mathematical  sense  because 
too  much  has  been  left  out.  A  major 
part  of  Turing's  and  Goedel's  work 
involved  formalization  of  the  con¬ 
cepts  of  “computation”  and  “proof” 


to  the  point  at  which  their  arguments 
would  be  accepted  by  mathema¬ 
ticians. 

You  may  have  already  spotted  one 
tacit  assumption  that  does  not  corre¬ 
spond  to  reality.  The  programs  are 
not  constrained  by  memory  limita¬ 
tions.  If  a  program  does  have  a  mem¬ 
ory  limitation,  then  the  halting  prob¬ 
lem  can  in  theory  be  solved — but 
only  by  a  program  with  a  much  larg¬ 
er  memory. 

This  is  how  it  can  be  done.  A  pro¬ 
gram  with  a  memory  limitation  has 
only  a  finite  number  of  states.  A  de¬ 
bugger  can  single-step  it,  keeping 
track  of  the  states  it  has  occupied.  If  it 
occupies  the  same  state  twice  before 
halting,  it  will  repeat  the  same  se¬ 
quence  of  states  indefinitely  and  will 
never  stop. 
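
A minimal Python sketch of this single-stepping idea follows. The conventions are assumptions of the sketch: the program is modeled as a deterministic step function that returns the next state, or None when it halts, and a set of visited states stands in for the one-bit-per-state table.

```python
def halts(step, state):
    """Decide halting for a deterministic program with finitely many states.

    `step(state)` returns the next state, or None when the program halts.
    States must be hashable.
    """
    seen = set()
    while state is not None:
        if state in seen:
            return False  # a repeated state means the run loops forever
        seen.add(state)
        state = step(state)
    return True


# Two toy "programs" whose whole state is an 8-bit counter.
def count_down(s):
    return None if s == 0 else s - 1


def spin(s):
    return (s + 2) % 256  # cycles through the counter and never halts


print(halts(count_down, 10), halts(spin, 1))
```

The checker needs memory proportional to the number of states the program can reach, which is the catch for any realistic program.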

To  do  this,  the  debugger  needs 
enough  memory  to  keep  track  of 
which  states  the  program  has  occu¬ 
pied.  Only  one  bit  is  required  for 
each  possible  state,  but  the  number  of 
possible  states  for  even  a  simple  pro¬ 
gram  is  truly  mind-boggling.  Every 
combination  of  bits  in  the  memory  is 
a  different  state.  Hence  a  program 
with  only  1,024  bytes  of  memory  has 


at least 2^(1024 × 8) states due to memory
configuration alone, to say nothing
of flags and registers. This number
of flip-flops would not fit into the
entire  known  universe.  It  can  there¬ 
fore  be  said  that  the  halting  problem 
has  no  solution  even  in  this  case. 
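
To get a feel for the size of that number, the short Python calculation below counts the decimal digits of 2^(1024 × 8). The variable names are illustrative only.

```python
memory_bytes = 1024
states = 2 ** (memory_bytes * 8)  # one state per distinct bit pattern
digits = len(str(states))         # number of decimal digits in that count
print(digits)
```

The count of states runs to well over two thousand decimal digits, so a table with one bit per state is out of the question.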

It  should  be  clear,  then,  that  there 
are  definitely  some  limits  to  what  ar¬ 
tificial  intelligence  can  accomplish 
and  that  mathematicians’  and  pro¬ 
grammers'  jobs  can  never  be  com¬ 
pletely  automated.  (This  is  a  great 
comfort  to  me  because  I  am  a  math¬ 
ematician  and  programmer.) 

Only  perfect  solutions  are  impossi¬ 
ble,  however.  It  can  still  be  argued, 
and  it  is  argued  by  some,  that  artifi¬ 
cial  intelligence  programs  will  even¬ 
tually  be  able  to  solve  every  problem 
that  the  human  mind  can  solve,  with 
at  least  the  same  success  rate.  And  if 
the  only  requirement  is  practical  so¬ 
lutions,  not  perfect  solutions,  then 
many  interesting  but  theoretically 
unsolvable  problems  can  be  solved. 

DDJ 

Vote  for  your  favorite  feature/article. 

Circle Reader Service No. 1.




PROGRAMMER'S  SERVICES 


THE  STATE  OF  BASIC 


New  BASIC  Subroutines 
Continued 

In this issue, we resume the discussion
about subroutines and look at
how they are implemented in Summit
Software's BetterBASIC and
Borland's new Turbo BASIC.

Subroutines in BetterBASIC strongly
resemble Pascal procedures. As a
matter of fact, the PROCEDURE:
<name> syntax is used to declare a
BetterBASIC subroutine. Once you
type PROCEDURE: followed by a
subroutine name, BetterBASIC creates a
new workspace for that subroutine.
This gives subroutines a great degree
of freedom because they can have
their own line numbers (1 to 32,767)
and local variables. Unlike Pascal
procedure declarations, subroutine
argument lists in BetterBASIC are
declared on separate lines.

Example 1, below, shows a simple
subroutine, Increase, to increment an
integer. Notice that PROCEDURE: Increase
is typed on one line and its
arguments, variables I and J, are on the
following separate lines. BetterBASIC
requires the use of the keyword ARG
to distinguish between an argument
and a local variable. The /VAR suffix
declares that a variable is passed by
reference; otherwise, as in the case of
J, it is passed by value. The /OPT suffix is a
directive that assigns a default value
to an argument.

Example 2, below, shows the simple
main program that calls the
procedure Increase twice. In the first
call, Increase is supplied with one
argument (corresponding to variable I
in the declaration part of the
procedure Increase). This causes the
default value of 1 to be assigned to
variable J. In the second call to Increase,
two arguments are used, causing the
value of 2 to be passed to argument J.

The BetterBASIC feature of assigning
default values to arguments "omitted"
from a subroutine call has a
parallel in the Ada language.
To avoid chaos in subroutine calls
when using this feature, you must
observe the following rule: you cannot
"skip" arguments. In other
words, once you rely on the default
value of an argument, all the arguments
that follow must have default
values, and those defaults must be used. You
cannot pick and choose. To use the
default-value feature in a BetterBASIC
subroutine, divide your argument list
into two logical sets of parameters.
The first set always requires
values to be exchanged with the
subroutine; hence, these parameters must
be present in every subroutine call.
The second set consists of parameters
that should have logically related
default values (we say logically to stress
that these default values are all
attributes of a single default state). As a
result, these parameters are either
all present (to provide data for a
nondefault state) or all absent (to refer to the
default state). Another
approach applies when the default values
of the second parameter set cannot be
treated as attributes of one
default state. In this case,
arrange the declaration of the
subroutine parameters in a sorted order
based on the overall probability of
not using a default value. This places


PROCEDURE: Increase
INTEGER ARG: I/VAR
INTEGER ARG: J/OPT = 1
I = I + J
END PROCEDURE

Example 1: BetterBASIC procedure to
increment an integer variable


INTEGER: A, B
A = 10
B = 2
Increase A      ' increment with default value of 1
PRINT A         ' prints 11 = 10 + 1 (default)
Increase A, B
PRINT A         ' prints 13 = 11 + 2
END

Example 2: BetterBASIC demonstration program
using procedure Increase


SUB Stat(X#(2), Col%, Average#, Sdev#) STATIC
  LOCAL Sum#, SumX#, SumXX#, Row%
  Sum# = 0.0
  SumX# = 0.0
  SumXX# = 0.0
  FOR Row% = LBOUND(X#, 1) TO UBOUND(X#, 1)
    Sum# = Sum# + 1.0
    SumX# = SumX# + X#(Row%, Col%)
    SumXX# = SumXX# + X#(Row%, Col%) ^ 2
  NEXT Row%
  IF Sum# > 1.0 THEN
    Average# = SumX# / Sum#
    Sdev# = SQR((SumXX# - SumX# ^ 2 / Sum#) / (Sum# - 1.0))
  ELSE  ' code for insufficient data
    Average# = -1.0E+30
    Sdev# = -1.0E+30
  END IF
END SUB

Example 3: Turbo BASIC subroutine to obtain the
average and standard deviation of data stored in an array




parameters that seldom resort
to default values at the beginning
of the list, and vice versa.
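
The ordering rule is not peculiar to BetterBASIC. As a rough parallel only, here is a Python version of the Increase example. Python is used purely for illustration: it returns the new value instead of passing I by reference, and the lowercase names are assumptions of this sketch.

```python
def increase(i, j=1):
    """Return i + j; j falls back to 1 when the caller omits it."""
    return i + j


a = 10
b = 2
a = increase(a)     # j takes its default value of 1
print(a)
a = increase(a, b)  # j is supplied explicitly
print(a)
```

Python enforces the "required parameters first" half of the rule at definition time: a parameter without a default may not follow one that has a default. Unlike the positional scheme described above, it also lets a caller name an argument, which is one way to skip over earlier defaults.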

Looking at Example 2, you may
observe more differences in syntax
between BetterBASIC and the other
BASIC implementations we discussed in
the previous column; for example,
the CALL keyword and parentheses
are not used in BetterBASIC.

Because BetterBASIC supports
Pascal-like record structures, you can
use them to pass many variables and
still keep a short formal argument
list. This feature lets BetterBASIC
do without SHARED
variables (as in QuickBASIC) as a way to keep
the argument list small. Passing
record-type arguments to BetterBASIC
subroutines is even safer because you
maintain tighter control over shared
data and greatly reduce
undesirable side effects.

BetterBASIC does not offer built-in
functions  to  return  the  lower  and  up¬ 
per  bounds  of  arrays,  which  makes 
writing  general-purpose,  array-ma¬ 
nipulating  routines  a  bit  more  in¬ 
volved.  Because  the  lower  bound  of 
any  array  can  be  0  or  1,  you  can  write 
such  routines  to  start  at  index  1.  Your 
routines  must  rely  heavily  on  inte¬ 
ger-type  parameters  that  supply  the 
upper  array  bounds,  however.  Such 
reliance  makes  the  routines  extreme¬ 
ly  vulnerable  to  corrupted  upper 
bounds  values;  there  is  no  easy  way 
to  compare  these  parameters  with 
the  actual  array  bounds  they  repre¬ 
sent.  The  positive  side  of  using  such 
parameters  is  that,  when  arrays  are 
not  fully  populated  with  valid  data, 
you  still  need  to  supply  data 
counters.  Thus,  the  upper  array 
bound  parameters  frequently  end  up 
being  used  as  data  counters. 

Turbo  BASIC  implements  subrou¬ 
tines  in  a  manner  that  resembles 
QuickBASIC:  GOSUB  using  labels  to  di¬ 
rect  the  jumps  to  subroutines  and 
named  subroutines.  A  callable  sub¬ 
routine  (or  procedure  as  it  is  called  in 
Turbo  BASIC)  can  have  an  argument 
list  to  pass  arguments  that  are  scalar 
and/or  array  variables.  With  Turbo 
BASIC  you  must  specify  the  number 
of  dimensions  of  an  array  using  an 
integer  constant  enclosed  in  paren¬ 
theses  following  the  array  name. 


As in QuickBASIC and True BASIC,
variables are passed by reference and
expressions are passed by value. Turbo
BASIC supports STATIC and RECURSIVE
subroutines as well as LOCAL, STATIC,
and  SHARED  attributes  for  variables  in 
a  subroutine.  The  LOCAL  attribute  de¬ 
clares  a  variable  to  have  a  scope  and 
visibility  limited  to  the  subroutine. 
Local  arrays  must  be  dimensioned  as 
dynamic  arrays  (Turbo  BASIC  sup¬ 
ports  static  and  dynamic  array  di¬ 
mensioning).  Static  variables  retain 
their  values  between  subroutine 
calls,  whereas  shared  variables  be¬ 


come  global  to  the  rest  of  the  pro¬ 
gram.  Turbo  BASIC  also  implements 
EXIT  SUB  to  enable  program  flow  to 
return  to  the  caller  and  supplies 
built-in  functions  to  return  the  lower 
and  upper  bounds  of  an  array. 

Example 3, page 144, shows a Turbo
BASIC subroutine that returns the
basic statistics of data stored in an
array. It is similar to the QuickBASIC
and True BASIC versions presented in
the last column.
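
For readers who want to check the arithmetic, the following Python sketch uses the same running sums as Example 3. It is an illustrative restatement, not a translation endorsed by the column: the function name, the list-of-rows data layout, and the use of None in place of the -1.0E+30 sentinel are assumptions.

```python
import math


def stat(x, col):
    """Return (average, sdev) for column `col` of the list of rows `x`.

    Returns (None, None) when fewer than two rows are available.
    """
    n = 0
    sum_x = 0.0
    sum_xx = 0.0
    for row in x:
        n += 1
        sum_x += row[col]
        sum_xx += row[col] ** 2
    if n <= 1:
        return None, None
    average = sum_x / n
    # Sample standard deviation from the running sums.
    sdev = math.sqrt((sum_xx - sum_x ** 2 / n) / (n - 1))
    return average, sdev


data = [[2.0, 1.0], [4.0, 1.0], [4.0, 1.0], [4.0, 1.0],
        [5.0, 1.0], [5.0, 1.0], [7.0, 1.0], [9.0, 1.0]]
print(stat(data, 0))
```

The one-pass formula is compact, but it subtracts two large, nearly equal quantities when the data have a large mean and a small spread, so a two-pass calculation may be preferable for such data.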

DDJ 

Vote  for  your  favorite  feature/article. 

Circle  Reader  Service  No.  9. 




PROGRAMMER'S  SERVICES 

BOOKS 


Numerical  Recipes:  The  Art  of  Scien¬ 
tific  Computing.  Press,  William  H.; 
Flannery,  Brian  P.;  Teukolsky,  Saul 
A.;  and  Vetterling,  William  T.  Cam¬ 
bridge,  England:  Cambridge  Univer¬ 
sity  Press,  1986.  $40. 

In the beginning, there was Hamming
(1962). And Hamming begat Acton
(1970), who begat Dahlquist and
Björck (1974), who begat Ralston and
Rabinowitz (1978), who begat Stoer
and Bulirsch (1980). Now we have
Numerical Recipes, the latest addition to
a fine line. For those who have
worked  with  numerical  analysis,  it  is 
a  welcome  addition  to  your  arsenal; 
for  those  of  you  who  are  new  to  the 
subject  or  who  have  not  yet  begun  to 
build  your  library,  it  is  the  one  book 
to  buy  if  you  are  going  to  have  to 
solve  anything  numerically  on  a 
computer. 

Content  first — the  book  is  compre¬ 
hensive.  It  has  the  usual  chapters  on 
the  solution  of  linear  algebraic  equa¬ 
tions,  interpolation  and  extrapola¬ 
tion,  integration  of  functions,  evalua¬ 
tion  of  functions,  root  finding  and 
nonlinear  sets  of  equations,  minimi¬ 
zation  or  maximization  of  functions, 
eigensystems,  integration  of  ordi¬ 
nary  differential  equations,  two- 
point  boundary  value  problems,  and 
an  introduction  to  partial  differential 
equations.  In  addition,  it  contains  a 
collection  of  chapters  on  topics  not 
usually  found  in  other  books:  special 
functions,  random  numbers,  sorting, 
Fourier  transform  spectral  methods, 
statistical  description  of  data,  and 
modeling  of  data.  These  chapters,  to¬ 
gether  with  the  others,  group  in  one 
place  almost  all  the  techniques  that 
today’s  scientists  and  engineers  com¬ 


monly  use  to  get  the  job  done. 

Issues  of  style  come  next.  The 
knowledge  of  mathematics  required 
to  cope  with  the  text  is  university  lev¬ 
el — that  is,  you  should  have  some  fa¬ 
miliarity  with  linear  algebra  and  cal¬ 
culus.  The  authors  stay  away  from 
what  I  would  consider  an  overly  the¬ 
oretical  approach  in  both  the  text 
and  the  mathematical  notation,  al¬ 
though  you  do  have  to  understand 
the  normal  amount  of  symbolic  ma¬ 
nipulation  that  comes  when  dealing 
with  matrices,  sums,  and  integrals. 

In  each  subject  area,  the  authors 
present  several  methods  after  dis¬ 
cussing  the  problem  in  the  introduc¬ 
tory  section  of  the  chapter.  Each  sec¬ 
tion  covers  a  method,  with  text  and 
some  equations  as  appropriate.  Usu¬ 
ally  there  is  a  FORTRAN  subroutine  at 
the  end  of  the  section,  which  not 
only  illustrates  the  method  but 
which  can  also  be  used  to  get  real 
work  done.  (The  authors  have  also 
translated  all  the  subroutines — there 
are  more  than  200  of  them — into  Pas¬ 
cal  and  have  included  these  in  an  ap¬ 
pendix.)  As  they  discuss  each  alter¬ 
nate  method,  the  authors  give  you 
their  candid  opinion  of  the  strengths 
and  weaknesses  of  the  approach, 
usually  placing  it  in  some  historical 
context.  There  are  references  at  the 
end  of  each  section  and  a  great  bibli¬ 
ography  at  the  end  of  the  book.  The 
references  and  bibliography  alone 
are  an  invaluable  source  for  getting 
more  information  when  you  need  to 
go  further. 

The  book’s  layout  is  clean.  The 
type  is  easy  to  read,  the  choice  of  no¬ 
tation  is  excellent,  and  the  programs 
are  easy  to  follow  because  they  are 
well  commented.  It  is  a  little  unfortu¬ 
nate  that  FORTRAN  was  chosen  as  the 
in-text language; this reflects
FORTRAN's omnipresence in scientific
computing  but  carries  with  it  the  ter¬ 
rible  burden  of  short,  elided  names 
for  variables.  The  Pascal  versions  suf¬ 
fer  from  being  machine-translated 
from  the  FORTRAN.  The  authors 
promise  to  swap  the  FORTRAN  and 
Pascal  roles  in  the  next  edition  and 
would  like  to  hear  from  those  folks 
who  would  like  to  see  a  C  version. 
My  vote  would  be  for  a  clean  imple¬ 
mentation  in  Ada. 


The  authors  have  style.  They  chose 
the  name  of  the  book  to  be  reminis¬ 
cent  of  a  cookbook,  but  in  their 
words,  there  is  a  difference  between 
a  cookbook  and  a  restaurant  menu: 
"The  latter  presents  choices  among 
complete  dishes  in  each  of  which  the 
individual  flavors  are  blended  and 
disguised.  The  former — and  this 
book — reveals  the  individual  ingredi¬ 
ents  and  explains  how  they  are  pre¬ 
pared  and  combined.”  The  strength 
of  the  book  is  that  with  each  recipe 
(read:  computer  subroutine),  there  is 
enough  explanatory  underpinning 
so  that,  with  a  reasonable  amount  of 
care  and  intelligence  on  the  part  of 
the  reader,  the  proverbial  bullet  in 
the  foot  can  be  avoided.  The  authors' 
writing  style  makes  the  material 
easy  to  follow  and  not  dull  reading  at 
all;  it's  great  to  see  hard-earned  expe¬ 
rience  come  through  as  charm  and 
wit. 

Although  this  book  owes  a  debt  to 
all  its  predecessors — most  notably 
for  its  acknowledged  stylistic  similar¬ 
ity  to  Acton's — it  is  different  in  that  it 
takes  positions  and  makes  judg¬ 
ments.  It  is  a  guidebook,  where  oth¬ 
ers  are  compendia.  It  consciously 
does  not  spend  a  lot  of  time  on  meth¬ 
ods  that  the  authors  feel  have  been 
popular  in  the  past  but  have  now, 
perhaps  recently,  been  superseded 
by  others. 

Supplementary  materials,  which  I 
have  not  seen,  are  available — ma¬ 
chine  readable  source  for  the  sub¬ 
routines  in  either  language,  example 
books  that  show  how  to  use  the  sub¬ 
routines,  and  machine  readable  ver¬ 
sions  of  the  example  books.  To  get 
one  complete  set  (one  language)  of 
everything  costs  less  than  $60,  which 
when  coupled  with  the  price  of  the 
book,  is  less  than  $100.  When  you 
consider  the  amount  of  time  it  would 
take  you  to  get  a  routine  working 
when  you  need  one,  you  would  have 
to  have  an  extremely  low  hourly 
rate  not  to  be  able  to  easily  justify  the 
cost  of  these  materials. 

— Joe  Marasco 

DDJ 

Vote  for  your  favorite  feature/article. 

Circle Reader Service No. 10.




PROGRAMMER'S  SERVICES 


OF  INTEREST 


For  the  Amiga 

Metacomco,  the  author  of  Amiga- 
DOS,  has  released  an  improved  ver¬ 
sion  of  the  Amiga  command-line  in¬ 
terpreter  called  Shell.  Shell  is  a 
programming  environment  featur¬ 
ing  command-line  history  and  edit¬ 
ing,  aliases,  path  and  push/pop  di¬ 
rectories,  and  variables.  Shell  is 
compatible  with  all  standard  com¬ 
mand-line  interpreters  and  sells  for 
$69.95.  Reader  Service  No.  16. 
Metacomco

5353  Scotts  Valley  Dr.,  #E 
Scotts  Valley,  CA  95066 
(408)  438-7201 

The  latest  version  of  the  Lattice  Ami- 
gaDOS  C  compiler  has  a  new  library 
with  more  than  100  functions  and  in¬ 
creased  library  modularity.  Version 
3.10  also  features  new  addressing 
modes,  fast  pointer  and  integer  math, 
fast  IEEE  floating-point  routines,  and 
multitasking  support.  The  base-level 
compiler  sells  for  $225;  the  profes¬ 
sional  package  (which  includes  text 
management  and  make  utilities, 
screen  editor,  and  debugger)  costs 
$375.  Reader  Service  No.  17. 

Lattice  Inc. 

P.O.  Box  3072 
Glen  Ellyn,  IL  60138 
(312)  858-7950 

Central  Coast  Software  has  en¬ 
hanced  DOS-2-DOS,  a  disk-to-disk  file 
transfer  program  for  the  Amiga  that 
transfers  all  MS-DOS  file  types  to  and 
from AmigaDOS. DOS-2-DOS now
supports 3½-inch 720K disks, formats
both 3½- and 5¼-inch MS-DOS disks,
converts  ASCII  file  line-ending  char¬ 
acters,  and  provides  WordStar  com¬ 
patibility.  The  program  costs  $55. 


Reader  Service  No.  18. 

Central  Coast  Software 
268  Bowie  Dr. 

Los  Osos,  CA  93402 
(805)  528-4906 

For  the  Macintosh 

The  Complete  Book  of  Macintosh  As¬ 
sembly  Language  Programming,  Vol¬ 
ume  II,  by  Dan  Weston  features  a  col¬ 
lection  of  assembly-language  projects 
that  explore  advanced  topics  in  Mac 
programming.  Published  by  Scott, 
Foresman  &  Co.,  the  book  covers 
the  new  ROM  of  the  Mac  Plus,  how 
the  clipboard  is  used  and  how  it  is 
converted  by  Switcher,  and  how  to 
use  the  hierarchical  file  system  and 
includes  source  code  listings.  It  costs 
$22.95.  Reader  Service  No.  19. 

Scott,  Foresman  &  Co. 

1900  E.  Lake  Ave. 

Glenview,  IL  60025 
(312)  729-3000 

The APL*PLUS System for the Macintosh
from STSC is a full-featured APL
language  interpreter.  The  system  is 
compatible  with  STSC's  APL*PLUS  Sys¬ 
tem  for  the  IBM  PC  and  allows  exist¬ 
ing  applications  to  be  converted  and 
run  on  the  Mac.  The  package  takes 
advantage  of  standard  Mac  features, 
and  common  desk  accessories  can  be 
used  from  the  APL  environment.  The 
APL*PLUS System for the Macintosh
runs  on  a  Macintosh  with  at  least 
512K  RAM  and  one  disk  drive  and 
sells  for  $395.  Reader  Service  No.  20. 
STSC  Inc. 

2115  E.  Jefferson  St. 

Rockville,  MD  20852 
(800)  592-0050 

Miscellaneous 

Real-Time  Computer  Science 
Corp. (RTCS) is now shipping RTX286,
a  real-time,  multitasking,  multiuser 
operating  system  for  the  IBM  PC/AT. 
RTX286  is  a  complete  implementation 
of  Intel's  iRMX286  operating  system 
specifically  configured  for  the  AT 
and  its  peripheral  devices.  It  takes  ad¬ 
vantage  of  the  protected  mode  of  the 
iAPX286  processor,  offering  memory- 
access  protection  as  well  as  allowing 
users  to  access  up  to  16  megabytes  of 
memory  directly.  RTX286  is  priced  at 
$2,395,  and  RTX286-C  (a  configurable 


version)  costs  $2,795.  Reader  Service 
No.  21. 

Real-Time  Computer  Science  Corp. 
(RTCS) 

1390  Flynn  Rd. 

Camarillo,  CA  93010 
(805)  987-9781 

Syncra PC, from Eastman Communications,
is a software package that
lows  error-free  transfer  of  documents 
and  files  among  IBM  PCs  and  compati¬ 
bles.  The  program  must  be  operating 
on  both  the  transmitting  and  receiv¬ 
ing  computers  in  order  to  transfer 
data.  It  can  also  communicate  directly 
with  corresponding  Syncra  software 
packages  on  DEC  VAX  and  IBM  OS/DOS 
mainframes  and  System/36  minis. 
Syncra PC uses an asynchronous
protocol and can transmit at between
1,200 and 9,600 bps. An automated
feature allows completely unattended
operation.  A  data-compression/com- 
paction  feature  allows  document  and 
file  sizes  to  be  reduced  by  50  percent 
or  more.  Retail  price  is  $79.  Reader 
Service  No.  22. 

Eastman  Communications 
1099  Jay  St. 

Rochester,  NY  14650 
(716)  464-5500 

Boca  Research’s  BOCARAM  is  avail¬ 
able  for  IBM  PCs,  PC/XTs  (including  the 
XT 286), PC/ATs, and compatibles. BOCARAM
fits into the XT 286 box,
connecting to the 8-bit connector to
pand  its  memory  from  640K  to  2 
megabytes  per  board.  The  board  con¬ 
forms  to  EMS  3.2,  which  permits  the 
use  of  application  software  packages 
that  access  memory  up  to  8  mega¬ 
bytes.  BOCARAM  software  includes  a 
RAM  disk,  print  spooler,  and  memory 
diagnostic  program  in  addition  to  the 
Boca  Research  Expanded  Memory 
Manager  driver.  Prices  range  from 
$245  to  $740,  depending  on  the 
amount  of  memory.  Reader  Service 
No.  23. 

Boca  Research 
6104  Congress  Ave. 

Boca  Raton,  FL  33431 
(305)  997-6227 

Discovery  Systems  has  released  an 
audio-cassette  training  program  for 
Autodesk's AutoLISP, a training




course  for  AutoCAD  users.  The  eight 
lessons  provide  a  step-by-step  pro¬ 
gram  with  complete  instructions  to 
create  custom  AutoLlSP  functions, 
custom  menus,  and  other  timesaving 
utilities.  The  price  is  $179.  Reader  Ser¬ 
vice  No.  24. 

Discovery  Systems 
34  Autumnleaf 
Irvine,  CA  92714 
(714)  733-9890 

American Computer & Peripheral has introduced an accelerator card
that utilizes the 80386 microprocessor. The 386 TURBO board can bring
a 6-MHz IBM PC/AT or compatible up to 12 MHz and an 8-MHz computer up
to 16 MHz. Clock rates are switchable through software without
rebooting. Software written for the AT (including DOS, ROM BIOS, EGA
ROM, and so on) executes from a 1-megabyte cache memory. The board
sells for $1,995. Reader Service No. 25.
American Computer & Peripheral Inc.

2720  Croddy  Wy. 

Santa  Ana,  CA  92704 
(714)  545-2004 

TEFT  (Terminal  Emulator  and  File 
Transfer)  from  S.  M.  Vorkoetter 
Software  allows  an  IBM  PC,  PC/XT, 
PC/AT,  PCjr,  or  compatible  to  be  used 
as  an  intelligent  terminal  for  commu¬ 
nicating  with  a  host  computer  or  a 
BBS.  Features  of  TEFT  include  VT100 
terminal  emulation,  text  file  transfer, 
conversion  of  binary  files  to  text  files 
and  vice  versa,  a  batch  mode,  and 
baud  rates  from  50  to  9,600  baud  (50  to 
4,800  on  a  PCjr).  The  product  requires 
128K  RAM,  one  disk  drive,  and  an  IBM- 
compatible  serial  adapter  card.  TEFT 
is  not  copy-protected  and  is  priced  at 
$60.  Reader  Service  No.  26. 

S.  M.  Vorkoetter  Software 
P.O.  Box  872 
Waterloo,  Ont. 

Canada  N2J  4C3 

SK DOS, from Star-K Software Systems, is a single-user operating
system for 68xxx-based machines. This generic DOS is easily
implemented on a new system and allows programs written for one system
to run on many others. It includes more than 40 commands and system
programs, including a 6809 emulator. SK DOS sells for $125. Reader
Service No. 27.

Star-K  Software  Systems 

P.O.  Box  209 

Mt.  Kisco,  NY  10549 

(914)  241-0287 

Electronic  Specialists  has  released 
an  RS-232  bus-protection  device 
called  Kleen  Line.  The  Kleen  Line  se¬ 
curity  system  is  designed  to  suppress 
damaging  line  spikes  caused  by  light¬ 
ning  or  large  electrical  machinery. 
Units  can  be  configured  with  any,  or 
all,  of  the  RS-232  bus  lines  protected. 
Model  PDS-232  M/F,  which  guards 
lines  1,  2,  3,  and  7,  sells  for  $143.  Read¬ 
er  Service  No.  28. 

Electronic Specialists Inc.

171 S. Main St.

Natick, MA 01760
(617) 655-1532

Polytron  Corp.  has  introduced  Poly- 
Shell,  a  DOS  extender  and  command 
interpreter  that  adds  a  Unix  interface 
and  much  of  the  capability  and  flexi¬ 
bility  of  Unix  to  MS-DOS.  The  shell  con¬ 
sists  of  two  major  components:  the 
Command  Interpreter,  which  can  be 
used  instead  of  or  in  conjunction  with 
the  MS-DOS  command  interpreter;  and 
the  PolyShell  Utility  Set,  which  in¬ 
cludes  several  utilities  previously  as¬ 
sociated  only  with  the  Unix  operating 
system.  PolyShell  is  invoked  as  a  pro¬ 
gram  under  DOS,  and  any  MS-DOS  com¬ 
mands  can  be  called  from  within  the 
shell.  PolyShell  runs  on  the  IBM  PC, 
PC/XT,  PC/AT,  and  compatibles  with 
DOS  2.0  or  later  and  requires  at  least 
256K  RAM.  A  hard  disk  is  recommend¬ 
ed.  A  single-user  license  costs  $149. 
Reader  Service  No.  29. 

Polytron  Corp. 

1815 N.W. 169th Pl.

Ste.  2110 

Beaverton,  OR  97006 
(503)  645-1150 

Graphics 

Dynaware has released Dynaperspective, a 3-D solid modeling graphics
software package for PC-DOS machines. Dynaperspective currently has
drivers for the following graphics boards: EGA, Number Nine
Revolution, Artist1, and Artist2. In addition to the graphics boards,
Dynaperspective also supports most major plotters, printers, mice, and
digitizing tablets. The package sells for $1,850. Reader Service
No. 30.
Dynaware 

1309  114th  SE,  Ste.  303 
Bellevue,  WA  98004 
(206)  451-0200 

Windows  Draw  from  Micrografx  is 
a  free-form  graphics  program  that 
runs  under  Microsoft  Windows. 
Windows  Draw  images  are  object- 
based  rather  than  pixel-based  and 
thus  achieve  device  independence, 
which  allows  the  images  to  be  print¬ 
ed  with  the  maximum  resolution  of 
the  printer  rather  than  that  of  the 
computer.  Windows  Draw  includes 
Windows  ClipArt  (a  collection  of 
Windows-compatible  artistic  images) 
and  sells  for  $299.  Reader  Service  No. 
31. 

Micrografx  Inc. 

1820  N.  Greenville  Ave. 

Richardson,  TX  75081 

(214)  234-1789 

Another  graphics  package  designed 
for  use  with  Microsoft  Windows  is 
Cricket  Graph,  from  Cricket  Soft¬ 
ware.  Designed  to  run  on  the  Macin¬ 
tosh,  the  package  has  page-layout  ca¬ 
pabilities  and  supports  a  variety  of 
printers,  plotters,  and  film  recorders. 
It  also  offers  a  variety  of  editing  and 
data  manipulation  capabilities:  data 
can  be  sorted;  grouped  by  ranges  of 
values;  smoothed;  and  transformed 
by  logarithmic,  trigonometric,  expo¬ 
nential,  and  statistical  functions. 
Cricket  Graph  sells  for  $295.  Reader 
Service  No.  32. 

Cricket  Software 
3508  Market  St.,  Ste.  206 
Philadelphia,  PA  19104 
(800)  345-8112 

(215)  387-7955 

Surf3-D/Surf87 from dogStar Software is a 3-D surface plotting
program written in Turbo Pascal and using TurboHALO graphics routines.
The program can work with most MS-DOS display screens and printers.
The program calculates and plots the surface of selected x,y functions
or a function you supply and permits you to rotate the surface about
any axis. You can select scaling factors, surface hatching, and other
options. The 3-D plotting routines include optional 8087 math
coprocessor support that is optimized for rotation. Source code is
included. Surf3-D/Surf87 is a shareware product; a donation of $10 is
suggested. Reader Service No. 33.
dogStar  Software 
P.O.  Box  302 
Bloomington,  IN  47402 
(812)  333-5616 

The Hot Shot graphics printer interface for Commodore computers from
Omnitronix supports graphics printing on most popular dot-matrix
printers. It has a standard internal 1K × 4 graphics buffer that can
be expanded to 8K to help speed up printing. Screen dumps can be set
for reverse or inverse printing, and on many printers you can select
enhanced, double-density printing of graphics screens and graphics
characters. Hot Shot sells for $59.95. Reader Service No. 34.

Omnitronix  Inc. 

760  Harrison  St. 

Seattle,  WA  98109 
(206)  624-4985 


For  the  Mac 

The MacBus/RTI-800 series software, from National Instruments, allows
control of the Analog Devices RTI-800 series analog and digital I/O
boards with the Macintosh Plus using MacBus. The software enables
users to program the boards with Microsoft BASIC, Megamax C, and
LabVIEW (LabVIEW and C application examples are included and fully
explained). MacBus/RTI-800 series software sells for $195. Reader
Service No. 35.

National  Instruments 
12109  Technology  Blvd. 

Austin,  TX  78727-6204 
(800)  531-4742 
In  TX  (800)  IEEE-488 
(512)  250-9119 

MacroMind has released Maze Wars+, a real-time multiplayer game for
the AppleTalk network that is a direct descendant of the classic Maze
Wars game from MIT and Xerox PARC in the early 70s. A terminal program
is built into the game to allow connection with another player via
1,200-2,400-baud modem or by direct null modem. Messages can also be
passed back and forth. Maze Wars+ is not copy-protected and costs
$49.95; site licenses are available for $15 per node. Reader Service
No. 36.

MacroMind  Inc. 

1028  W.  Wolfram  St. 

Chicago,  IL  60657 
(312)  871-0987 

OWL  International  has  released  a 
hypertext  system  for  the  Mac  called 
Guide.  Guide  incorporates  many  fea¬ 
tures  of  standard  word  processors 
and  outline  processors  as  well  as  ad¬ 
ditional  facilities  for  information 
management,  such  as  annotation  and 
cross-referencing.  Guide  requires  a 
512K  Mac,  Mac  Plus,  or  Mac  XL  and 
can  work  with  any  graphics  pro¬ 
gram  that  supports  the  clipboard.  It 
can  be  used  with  MacWrite,  Micro¬ 
soft  Word,  Aldus  Pagemaker,  or  any 
program  that  can  read  MacWrite 
files.  Guide  sells  for  $134.95.  Reader 
Service  No.  37. 

OWL  International  Inc. 

14218  N.E.  21st  St. 

Bellevue,  WA  98007 
(206)  747-3203 

Bering Industries has developed a line of 5¼-inch removable cartridge
systems for the Mac. The new line of Bernoulli drives, called Totem,
includes three models: a single 20-megabyte removable cartridge for
$1,495; a dual 20-megabyte removable cartridge for $2,295; and a
combination 20-megabyte removable cartridge plus a 20-megabyte fixed
hard disk for $2,295. Bering also sells a 20-megabyte 5¼-inch fixed
drive for the Mac for $795. Reader Service No. 38.
Bering  Industries 
250  Technology  Circle 
Scotts  Valley,  CA  95066 
(408)  438-8779 


For  the  PC 

IOTools, from Rhoads Software, provides terminal-independent I/O
mapping with a constant programming interface. The package gives you
control over console, asynchronous, and parallel I/O as well as
several useful library modules. It makes for easy management of the
characters from the keyboard and includes more than 15 modules that
export more than 200 procedures. Formats are available for
MS-DOS/Logitech and Pecan, and the price is $79.50, or $950 for source
code. Reader Service No. 39.

Rhoades  Software 
504  Meeting  House  Ln. 

Kennett  Square,  PA  19348 
(215)  388-2626 

Command Plus, from ESP Software Systems, is a command processor for
MS-DOS machines that simplifies and enhances MS-DOS' Command features
as well as offering many additional features such as a batch
programming language with a Pascal-like syntax called SCRIPT. Other
new features include command macros, command recall, file browsing, a
log facility, the ability to access environment variables from the
command line, and the ability to select files using ESP's extended
file-name pattern-matching facility. Command Plus sells for $79.95.
Reader Service No. 40.
ESP  Software  Systems  Inc. 

11965  Venice  Blvd.,  Ste.  309 
Los  Angeles,  CA  90066 
(213)  390-7408 

Sapiens  V8,  from  Sapiens  Software 
Corp.,  is  a  virtual  memory  manager 
for  C  programmers  on  the  PC.  Fea¬ 
tures  include  8  megabytes  of  virtual 
memory  workspace,  the  ability  to 
link  V8  libraries  to  C  compilers,  and 
software  emulation  of  64-bit  archi¬ 
tectures.  Sapiens  V8  sells  for  $300. 
Reader  Service  No.  41. 

Sapiens  Software  Corp. 

P.O.  Box  7720 

Santa  Cruz,  CA  95061-7720 

(408)  458-1990 

DDJ 




FORUM 


SWAINE'S  FLAMES 


Steve  Jasik  could  reasonably  call 
his  MacNosy  V2  Documentation 
V2.50  the  most  eagerly  awaited  man¬ 
ual  of  the  year.  It's  not  that  drooling 
hordes  of  programmers  have  been 
demanding  this  Macintosh  disassem¬ 
bler — it’s  still  an  underappreciated 
marvel — but  those  of  us  who  grap¬ 
pled  with  the  original  documenta¬ 
tion  were  wet-chinned  with 
anticipation. 

The  new  manual  fairly  describes 
Nosy  as  an  information-recovery  tool 
for  the  Mac,  a  description  justified  by 
the  on-line  Inside  Mac  feature  alone. 
I  was  so  eager  to  use  Nosy  that  I  rash¬ 
ly  agreed  to  help  revise  the  original 
manual,  bailing  out  only  when  I  real¬ 
ized  that  I  would  have  to  understand 
Nosy  intimately  to  improve  the  docu¬ 
mentation  and  that  there  was  no 
way  I  was  going  to  get  Nosy  literate 
from  Steve’s  manual. 

Others  were  more  successful,  and 
the  resulting  documentation  is  not 
only  comprehensible  but  also  strewn 
with  uncloneable  Jasikisms  from  the 
Head  Nose. 

Nosy  would  be  a  handy  tool  for, 
say,  an  independent  programmer 
developing  desk  accessories  that 
worked  intimately  with  Microsoft 
Excel.  Tools  for  programmers  work¬ 
ing  in  teams  have  been  slower  in 
coming  than  such  individual  tools, 
but  computer-aided  software  engi¬ 
neering  (CASE)  has  taken  a  step  for¬ 
ward  with  the  arrival  of  commer¬ 
cially  priced  32-bit  machines  at  a 
time  when  successes  in  CAD/CAM  are 
ripe  for  translation  to  general  soft¬ 
ware  development. 

Rich Carpenter of Index Technologies, a Cambridge, Massachusetts,
CASE company, argued at the Personal Computer Forum in Phoenix this
spring that current software-development approaches have been borrowed
from engineering situations in which, for example, it was
prohibitively expensive to move elevators after putting up inside
walls. Software development needs its own models for, and tools in
support of, design, specification, prototyping, version management,
testing, debugging, and documentation.

Software  development  also  needs 
better  tools.  One  of  the  advantages  of 
C  is  that  you  can  get  fairly  efficient 
code  out  of  primitive  compilers, 
which  could  be  a  fair  assessment  of 
1986-vintage  microcomputer  compil¬ 
er  technology.  Not-so-primitive  opti¬ 
mizing  compilers  are  now  arriving:  I 
know  of  five  optimizing  C  compilers 
released  or  in  the  works,  and  Micro¬ 
soft’s  FORTRAN  compiler  is  one  of  the 
most  sophisticated  optimizing  com¬ 
pilers  on  personal  computers.  It  will, 
for  example,  turn 

y=sin(x)**2 

into  the  faster 

temp=sin(x) 

y=(temp*temp) 
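
The rewrite above replaces a general exponentiation with a single multiplication, a transformation usually called strength reduction. A minimal Python sketch of the same idea follows; the function names are hypothetical, chosen only for illustration, and the code shows the equivalence of the two forms rather than anything about how the FORTRAN compiler itself is implemented.

```python
import math


def square_via_power(x):
    # Direct form, as written in the source expression: sin(x)**2.
    return math.sin(x) ** 2


def square_via_multiply(x):
    # Hand-applied strength reduction: evaluate sin(x) once,
    # then square it with one multiplication instead of a power.
    temp = math.sin(x)
    return temp * temp


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 2.5):
        a = square_via_power(x)
        b = square_via_multiply(x)
        print(x, a, b, math.isclose(a, b, rel_tol=1e-12, abs_tol=1e-15))
```

Both functions compute the same value to within floating-point rounding; the point of the compiler's version is that sin(x) is evaluated only once and the squaring costs a single multiply.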

Scientific  and  engineering  users  of 
personal  computers  can  expect  more 
than  just  a  fast  FORTRAN  from  soft¬ 
ware  companies,  though.  Both  Lotus 
and  Borland  now  have  engineering 
and  scientific  divisions  dedicated  to 
developing  products  for  this  group, 
which  Lotus  says  accounts  for  17  per¬ 
cent  of  its  existing  user  base.  Lotus  is 
even  considering  starting  a  magazine 
for  scientists  and  engineers. 

Legal issues: ADAPSO’s efforts to get software vendors to voluntarily
provide substantive warranties rather than “as is” disclaimers on
products have led a member of the California Assembly, Gloria Molina,
to recommend against legislation to force warranties. The Copyright
Office has judged Lotus’ 1-2-3 user interface uncopyrightable from a
look-and-feel standpoint because it is basically text. And several
senators are reintroducing a bill to create an Information Age
Commission, whose purpose would be to study the impacts of high
technology.

Memorable moments at the Personal Computer Forum: hearing that the
personal computer industry looks willing to give Bill Gates a
billion-dollar company in exchange for a little stability; Mitch Kapor
deftly fending off an embarrassing question about the Lotus
look-and-feel suit by attacking David Bunnell’s editorial on the
Georgia sodomy law; and the following story:

“The editor kept calling our product a database and I said, ‘It’s
really a programming language.’ But he said, ‘We don’t cover
languages, Bruce.’ So I said, ‘Well, really it’s more of a database.’
Then everything was fine.”

This story was related by a software company president during a
dinner at which several software company CEOs told M&T executives
horror stories about the technical incompetence of the computer press,
including the story of the product that got rave reviews for a
“feature” that was really a bug the company had been trying to
eliminate for months. The hard work that goes into developing a good
program deserves competent reporting.


Michael  Swaine 
editor-in-chief 




Dr. Dobb's Journal, May 1987


TSR  Serial  Driver 


Analog-to-Digital 

Conversion 


Unix  Shell  Scripts 


Languages: 

Dynamic  Overlays  in  Turbo  Pascal 
Handling  Queues  in  C 
AI Programming in SCOOPS
User-Defined Functions in BASIC


JUNE  1987 


CONTENTS 


VOLUME  12,  ISSUE  6 


ARTICLES 


Efficient queue control

Mathematical basis for a space-saving trick

TSR serial driver

Handling large Turbo Pascal programs

Unix telecommunications software

Queue control in C

For the programmer's bookshelf

Object-oriented programming

Time to hock the Volvo

What's right with high-level languages?


ALGORITHMS: An Efficient Algorithm for Large Priority Queues

by Robert Jay Brown

By deferring decisions about lower-level priorities, this
algorithm is able to assign the top priority more efficiently.

COMMUNICATIONS: Two-Bit Analog-to-Digital Conversion

by John Musselman

How to use hardware interrupts and a demonstration of
techniques common in real-time programming

ALGORITHMS: The XOR Chain

by David E. Cortesi

The Swapped-Out Intern returns with a technique that
exploits a curious property of doubly linked lists.

COMMUNICATIONS: An Extended COM Port Driver

by Thomas A. Zimniewicz

Excom is a terminate-and-stay-resident program that gives
the IBM PC interrupt-driven buffered input with flow-control
selection and support for higher baud rates.

LANGUAGES: Dynamic Memory Overlays for Turbo Pascal

by Steve McMahon

Steve shows how to get around the 64K limit on executable
code imposed by Turbo Pascal without resorting to slow
disk overlays.

COMMUNICATIONS: A Unix BBS Using Shell Scripts

by Jan L. Harrington

Jan shows just how she wrote a bulletin board system
using Unix V Bourne shell scripts and XMODEM.


COLUMNS 


C CHEST

by Allen Holub

Allen delves into the subject of priority queues and, in
Flotsam and Jetsam, discusses standard #include files.

16-BIT SOFTWARE TOOLBOX

by Ray Duncan

Ray discusses programming books. He also offers a poor
man's MAKE utility and an MS-DOS programming tip.

ARTIFICIAL INTELLIGENCE

by Ernest R. Tello

Ernie presents some examples of object-oriented
programming using SCOOPS, an extension of PC Scheme,
“the Turbo Pascal of the PC LISP family.”


FORUM 


EDITORIAL

by Michael Swaine

RUNNING LIGHT

by Levi Thomas

ARCHIVES

LETTERS

by you

VIEWPOINT

by Brian R. Anderson

SWAINE'S FLAMES

by Michael Swaine


PROGRAMMER'S SERVICES

THE STATE OF BASIC: A comparison of user-defined functions in four
BASICs

OF INTEREST: Products for programmers

ADVERTISER INDEX: Where to find those ads


About  the  Cover 

Handling  the  traffic  flow  of  multi¬ 
tasking  requires  an  efficient 
queueing  algorithm  such  as  the 
one  that  has  just  given  the  green 
light  to  a  task  in  this  month’s  cov¬ 
er  illustration. 

This  Issue 

If  this  issue  looks  packed,  it  is. 
Three  articles  deal  with  commu¬ 
nications  techniques.  Robert  Jay 
Brown  and  Allen  Holub  discuss 
algorithms  for  priority  queues. 
Dave  Cortesi,  a  longtime  DDJ  col¬ 
umnist,  contributes  a  rare  bit  on 
doubly  linked  lists.  Steve  McMa¬ 
hon  gives  aid  and  comfort  to  se¬ 
rious  Turbo  Pascal  program¬ 
mers;  Ernie  Tello  and  Allen 
Holub  give  tips  to  new  PC 
Scheme  and  C  programmers,  re¬ 
spectively;  and  Forth,  assembly- 
language,  Modula-2,  and  BASIC 
programmers  will  all  find  some¬ 
thing  of  interest  in  this  issue. 

Next  Issue 

So  you  really  did  it?  You  sold  the 
Volvo  and  bought  a  386  machine 
and  now  you're  going  to  develop 
software  for  it?  Only  you  don’t 
want  to  wait  for  Microsoft  to  de¬ 
liver  OS/2  sometime  next  year? 
We  hope  you  saved  $2.95  for  our 
July  issue,  in  which  we  will  dis¬ 
cuss  development  tools  for  the 
386  that  are  available  today. 


Dr.  Dobb's  Journal,  June  1987 



FORUM 


EDITORIAL 


The  summer  of 
1987.  Midyear. 

Time  to  take  stock;  time 
to  assess  what  we’ve 
seen  of  1987  and  mull 
over  what  it  all  means. 

What  can  we  say?  That 
it  was  the  spring  of 
hype,  the  summer  of 
virtuality?  I  think  it’s  a 
watershed  summer,  a 
good  time  to  be  a  soft¬ 
ware  developer. 

Microsoft,  IBM,  and  Apple  all 
claimed  that  we  were  entering  a 
new  generation  of  personal  comput¬ 
ing.  Both  Apple  and  IBM  announced 
significant  new  machines.  Both  Ap¬ 
ple  and  IBM  committed  to  fast,  smart 
new  buses  and  more  open  architec¬ 
tures,  which  will  create  new  chal¬ 
lenges  and  opportunities  for  hard¬ 
ware  and  software  developers. 

Microsoft  revealed  plans  for  get¬ 
ting  parts  of  its  OS/2  for  80286  and 
80386  processors  into  developers' 
hands  this  year.  Ever  since  Gary  Kil- 
dall  put  an  operating  system  on  an 
8080,  the  only  arguably  revolution¬ 
ary  advance  in  personal  computer 
operating  systems  has  been  the  Mac¬ 
intosh  system  software.  It  might  be  a 
stretch  to  suggest  that  OS/2  could  be 
the  next,  but  when  it  really  arrives  it 
should  prove  fertile  ground  for  soft¬ 
ware  development. 

Both  Unix  and  C  took  steps  toward 
standardization  in  the  first  half  of 
this  year,  and  the  crowded  C  compil¬ 
er  market  ripened  toward  some  sort 
of  mitosis.  Both  Unix  and  C  presented 
new  tools  and  new  environments  for 
development.  Version  management 
and  CASE  became  more  real  for  per¬ 
sonal  computer-based  development. 

If you’re reading this in Tokyo, you already know that several
Japanese computer companies have taken a big step toward a viable
Japanese-user-oriented personal computer industry by standardizing on
the TRON real-time operating nucleus. TRON should have a much greater
impact on Japanese computer users than the Fifth Generation Project,
and it’s also going to have a significant impact on software
developers by providing an entirely new, coherent,
Japanese-script-based programming platform.

These  new  environ¬ 
ments  and  new  tools 
speak  of  new  opportu¬ 
nities  for  the  software 
developer.  But  they  speak  in  the  am¬ 
bient  din,  and  it’s  easy  to  miss  their 
message. 

If  you’re  currently  evaluating  the 
software  development  opportunities 
presented  by  these  recent  advances, 
could  I  make  some  modest 
suggestions? 

Don’t  listen  too  closely  to  the  mar¬ 
keters.  Marketing  experts  are  magi¬ 
cians;  when  it’s  difficult  to  create  val¬ 
ue,  marketers  can  create  need.  They 
can  convince  people  that  there  is 
something  terribly  wrong  with  them 
that  only  your  product  can  cure.  But 
it’s  so  easy  to  create  real  value  with  a 
piece  of  software  that  I’m  just  not 
sure  that  we  need  the  digital  deodor¬ 
ant.  If  the  product  is  good,  marketing 
it  becomes  a  simpler  matter. 

And  I  wouldn’t  listen  too  closely  to 
users,  heretical  as  that  statement 
may  be.  Listening  to  users  doesn’t 
generally  contribute  a  lot  to  creating 
something  new.  User  feedback  is  use¬ 
ful  when  you’re  tweaking  existing 
products.  Users  may  not  know  any¬ 
thing  about  software,  but  they  know 
what  they  don’t  like. 

Ultimately,  I  argue,  the  real  break¬ 
throughs  come  from  listening  to  the 
technology.  That  should  be  particu¬ 
larly  true  at  a  technological  water¬ 
shed  like  this  summer. 

Hock the Volvo, buy that 68020 or 80386 machine, and start playing
around.

Michael  Swaine 
editor-in-chief 


Dr. Dobb's Journal of Software Tools

FOR  THE  PROFESSIONAL  PROGRAMMER 

Editorial 

Editor-in-Chief  Michael  Swaine 
Editor Tyler Sperry
Managing  Editor  Vince  Leone 
Assistant  Editors  Sara  Noah  Ruddy 
Levi  Thomas 

Technical  Editor  Allen  Holub 
Consulting  Editor  Nick  Turner 
Contributing  Editors  Ray  Duncan 
Michael  Ham 
Bela  Lubkin 
Namir  Shammas 
Ernest  R.  Tello 

Copy  Editor  Rhoda  Simmons 

Production 

Production  Manager  Bob  Wynne 

Art  Director  Michael  Hollister 
Assoc.  Art  Director  Joe  Sikoryak 
Technical  Illustrator  Frank  Pollifrone 
Cover  Artist  Barron  Storey 
Circulation 

Circulation  Director  Maureen  Kaminski 
Newsstand  Sales  Mgr.  Stephanie  Ericson 
Book  Marketing  Mgr.  Jane  Sharninghouse 
Circulation  Coordinator  Kathleen  Shay 
Administration 
Finance  Director  Kate  Wheat 
Business  Manager  Betty  Trickett 
Accounts  Payable  Supv.  Mayda  Lopez-Quintana 
Accts.  Receivable  Supv.  Laura  Di  Lazzaro 
Advertising  Director 
Ferris  Ferdon  (415)  366-3600 
Account  Managers 
Lisa Boudreau (415) 366-3600
Martha Brandt (415) 366-3600
Gary George (404) 897-1923
Michael Wiener (415) 366-3600
Cynthia Zuck (718) 499-9333
Promotions/Srvcs.  Mgr.  Anna  Kittleson 
Advertising  Coordinator  Charles  Shively 

M&T  Publishing  Inc. 

Chairman  of  the  Board  Otmar  Weber 
Director C. F. von Quadt
President and Publisher Laird Foshay
Associate  Publisher  Michael  Swaine 


Dr. Dobb's Journal of Software Tools (USPS 307690) is published
monthly by M&T Publishing Inc., 501 Galveston Dr., Redwood City, CA
94063; (415) 366-3600. Second-class postage paid at Redwood City and
at additional entry points.

Article Submissions: Send manuscripts and disk (with article and
listings) to the Editor.

DDJ on CompuServe: Type GO DDJ

Address Correction Requested: Postmaster: Send Form 3579 to
Dr. Dobb's Journal, P.O. Box 27809, San Diego, CA 92128.
ISSN 0888-3076

Customer Service: For subscription problems call: outside CA
(800) 321-3333; in CA (619) 485-9623 or 566-6947. For book/software
order problems call (415) 366-3600.

Subscriptions: $29.97 per 1 year; $56.97 for 2 years. Canada and
Mexico add $27 per year airmail or $10 per year surface. All other
countries add $27 per year airmail. Foreign subscriptions must be
prepaid in U.S. funds drawn on a U.S. bank. For foreign subscriptions,
TELEX: 752-351.

Foreign Newsstand Distributor: Worldwide Media Service Inc., 386 Park
Ave. South, New York, NY 10016; (212) 686-1520 TELEX: 620430 (WUI).

Entire contents copyright © 1987 by M&T Publishing Inc. unless
otherwise noted on specific articles. All rights reserved.

Dr. Dobb's Journal of Software Tools is published by M&T Publishing
Inc. under license from People's Computer Company, 2682 Bishop Dr.,
Suite 107, San Ramon, CA 94583, a nonprofit corporation.




FORUM 


RUNNING  LIGHT 


Greetings,  pro¬ 
grammers.  I'm 
Levi  Thomas,  assistant 
editor  and  host  of 
DDJFORUM,  our  SIG  on 
CompuServe.  If  my 
writing  style  seems 
conversational,  it’s  be¬ 
cause  I'm  accustomed  to  the  SIG,  on 
which  my  readers  can  talk  back. 

It's  an  interesting  phenomenon, 
the  on-line  life.  When  I  first  began 
working  for  DDJ  as  electronic  editor 
I  felt  like  a  DJ  on  the  graveyard  shift. 
Posting  messages  on  the  message 
board  of  the  SIG,  I  got  the  eerie  feel¬ 
ing  that  there  was  really  nobody  out 
there,  that  just  maybe  I  was  talking  to 
myself.  Luckily  the  FORUM  members 
are  a  verbal  bunch,  and  even  the 
lurkers  can  be  coaxed  into  respond¬ 
ing  if  I  ask  the  right  questions  or  start 
a  topic  that  gets  their  dander  up. 

Now  I  want  to  talk  to  you  hard¬ 
copy  readers.  I  wanna  pull  on  yer 
coats  about  something.  I  hope  you'll 
talk  back  to  me,  either  on  line  or  via 
the  U.S.  Snail. 

<hopping onto soapbox>
Microcomputers have grown up. We all know the story of the
microcomputer's Wonder Bread Years. Born with features only a mother
could love (front panel toggle switches, 2K of RAM, no permanent
storage, and so on), they have matured into smart, fast machines.

The  microcomputer  community 
has  also  grown  up.  What  started  a 
few  years  ago  as  a  hobby  has  become 
a  business.  The  community  has  be¬ 
come  an  industry.  For  many  of  you, 
your  fascination  with  computers  has 
turned  into  a  shot  at  The  Dream  Job: 
A  chance  to  make  a  living  at  some¬ 
thing  you  love  to  do.  A  chance  to 
make  work  into  play. 

However, like many new adults, this industry seems determined to
disown all of its childlike qualities as it seeks a grown-up identity.
It’s been giving up many of the values that fueled its growth. The
Suits have moved in, and things are tightening up.

Examples:  It’s  becoming  next  to 
impossible  to  get  a  job  on  pure  abili¬ 
ty;  college  degrees  are  the  new  pre¬ 
requisite.  The  sense  of  community 
we  once  got  from  the  computer  mag¬ 
azines  is  disappearing.  I  see  the  term 
hacker  changing — not  evolving  into 
the  essence  of  what  it  originally 
meant  but  devolving  into  a  deroga¬ 
tory  term,  a  term  feared  by  this  new, 
"grown-up"  industry.  (Last  year  I 
was  involved  with  the  Hacker’s  Con¬ 
ference.  Several  computer  business¬ 
es  were  nice  enough  to  help  fund  the 
affair,  but  some  of  them  asked  that 
their  names  not  be  disclosed.  They 
were  nervous  about  having  their 
company  names  associated  with  the 
word hacker.)

<returning  soapbox  to  closet> 

DDJ is still a reader-responsive magazine. If you would like to write
for our October (Forth), November (graphics), or December (operating
systems) issues, send us an outline. You can address your article
proposals to our new editor, Tyler Sperry (you’ll meet him on this
page next month). If you just can’t stand the thought of using the
U.S. Snail, you can e-mail me on CompuServe (76703,4060), Usenet
(well!levi), or arpanet (well!levi@lll-crg.arpa). We look forward to
hearing from you.


ARCHIVES 


Inaugural  Columns 

"Welcome  to  the  first  edition  of  Pro¬ 
gramming  Pastimes  and  Pleasures.  Togeth¬ 
er  we  will  explore  the  prosaic  and  fantas¬ 
tic  possibilities  of  programming. 
Sometimes  I  will  pose  a  problem  to  solve; 
sometimes  I  will  discuss  a  programming 
technique  or  idea  you  may  not  have  heard 
of;  at  other  times  I  will  tell  you  about  a 
game  you  can  play  with  your  computer. 
At  times  I  will  be  serious,  and  at  other 
times  I  will  be  off-the-wall.  Along  the  way, 
you  will  learn  things  you  didn’t  know  be¬ 
fore,  and  you  will  find  wonderful  new 
ways  to  waste  time  with  your  computer. 
Most  of  all,  I  hope  you  will  have  fun.” — 
Charles Wetherell, DDJ, January 1979.

"Doctor Dobb's Clinic is a new venture. It
is a place for the display of techniques and
discoveries. We want to cover any sort of
method or trick you've found in your exploration
of (for example, but not limited
to): CP/M or OASIS or FLEX or Apple DOS or
TRSDOS or NEWDOS or RSTS or UNIX or UCSD
Pascal or ...

"During  our  rounds  at  the  clinic  we  may 
examine  a  compiler  or  an  interpreter,  re¬ 
vealing  its  bugs  or  showing  how  it  can  be 
made  to  perform  better.  We'd  like  to  tell  of 
unobvious  uses  for  standard  utilities.  We 
want to uncover errors in published documentation,
to warn people away from pitfalls,
and to show off those 'eureka!' moments
that make systems work
rewarding." -- Dave Cortesi (Resident Intern),
DDJ, May 1981.

"Just after watching Doc Dobb's nationally
televised speech from the NCC in Texas,
I was surprised to receive a collect call
from the Old Man himself. 'The lack of
public domain software for the 16-bit microcomputers
is appalling!' remarked our
Fearless Leader. 'No one wants to shell out
that kind of money just so they can stare at
the operating system's sign-on message. I
want you to institute a regular column in
DDJ that will address the needs of 16-bit
system users and promote the discussion
and interchange of software.' Luckily, after
laying down that sweeping and rather
alarming mandate, the good Doctor had to
hang up to go out on a house call (someone
had choked on an Apple) and I was left to
my own devices." -- Ray Duncan, DDJ, September 1982.


Dr. Dobb's Journal of Computer Calisthenics & Orthodontia: Running Light Without Overbyte


Dr.  Dobb's  Journal,  June  1987 


FORUM

LETTERS 


What’s  Right  with  High- 
Level  Languages? 

Dear  DDJ, 

Mike  Suman's  criticisms  of  Modula-2 
in  the  February  1987  Viewpoint  call 
for  some  comments. 

First,  although  many  Modula-2  im¬ 
plementations  are  now  available, 
based  on  Wirth's  Programming  in 
Modula-2  (either  the  second  or  third 
edition),  both  the  language  specifica¬ 
tions  and  the  preferred  standard  li¬ 
braries  are  still  being  fine-tuned  by 
such  bodies  as  M2WG,  the  Modula-2 
Working  Group  of  the  British  Stan¬ 
dards  Institution.  As  I  understand  the 
situation,  draft  proposals  from  BSI 
now  await  reaction  and  fur¬ 
ther  input  via  the  ISO.  MODUS 
Quarterly  (published  by  the 
Modula-2  Users’  Association) 
has  been  covering  this  debate 
and provides a "democratic"
forum  for  suggestions,  includ¬ 
ing  Wirth’s  own  reactions. 

Mike  should  throw  in  his  pen¬ 
n’orth  before  the  stonecutters 
start  chiseling.  Contact 
George  Symons,  MODUS,  P.O. 

Box  51778,  Palo  Alto,  CA 
94303;  (415)  322-0547. 

Second,  I  must  defend  Brian 
Anderson’s  use  of  the  identifi¬ 
ers  FIRST  and  LAST  in  X68000. 
Although  in  the  final,  pub¬ 
lished  version  these  turn  out 
to  be  nonce  constants,  during 
the  development  stage  (and 
later  when  the  MC68030  is  in¬ 
corporated!)  who  knows 
what  values  they  might  as¬ 
sume  or  where  else  they 
might  occur? 

Finally,  you  must  really 
sympathize  with  Niklaus.  You 


want to keep a language "simple,"
"portable," and "small-core,"
yet  hardly  is  the  ink  dry  before  com¬ 
mittees  emerge  wanting  to  add  their 
pet  piece  of  FORTRAN  or  Ada.  Where 
do  you  draw  the  line?  Certainly  ar¬ 
ray  and  record  constants  would  be 
useful  (as  in  Turbo  but  not  standard 
Pascal)  and  may  find  their  way  into 
the  canon.  Modula-2  does  have  the 
useful,  non-Pascal  Module  Body  con¬ 
struct  (a  piece  of  code  invoked  just 
once  when  the  module  is  first 
executed). 

Mike’s  preferred  tabular  layout  for 
setting  up  Brian  Anderson's  Ta- 
ble68K  will  be  possible  under  my 
own  forthcoming  language  called 
Modula  1-2-3. 

Stan  Kelly-Bootle 

25  Parkwood  Ave. 

Mill  Valley,  CA  94063 
Stan  Kelly-Bootle  is  the  author  of  The 
Modula-2  Primer.  — eds. 

Dear  DDJ, 

In  his  February  1987  Viewpoint, 
Mike  Suman  raises  a  couple  of  objec¬ 
tions  that  he  suggests  show  some¬ 
thing  wrong  with  Modula-2  and 
other  high-level  languages.  I  don’t 


think  his  objections  are  cogent,  and  I 
don’t  think  they  have  anything  to  do 
with  high-level  languages. 

His  first  complaint  concerns  the 
use  of  constants  to  replace  Arabic  in¬ 
tegers.  This  is  not  peculiar  to  high- 
level  languages  but  is  a  feature  of 
most  assembly  languages,  too,  so 
whatever  is  wrong,  it  can’t  be  a  prob¬ 
lem  of  Modula-2  or  of  high-level  lan¬ 
guages  in  general. 

His  particular  complaint  concerns 
the  use  of  FIRST  and  LAST  to  replace  1 
and  118,  respectively;  in  the  program 
on  which  he  is  commenting,  they  ap¬ 
pear  in  the  declaration: 

Table68K : ARRAY [FIRST..LAST] OF TableRecord;

This does look odd, but the usual use
of constants in this situation would
omit the constant FIRST and replace
the constant LAST by something
more informative, such as NumberOfOpcodes,
so the declaration would
look like this:

Table68K : ARRAY [1..NumberOfOpcodes] OF TableRecord;

I  can’t  see  why  he  would  ob¬ 
ject  to  that.  He  does  say,  "no 
one  can  maintain  that  it  is 
really  easier,  or  safer,  to 
change  values  in  an  early  def¬ 
inition  than  it  is  to  change 
them  in  the  only  place  in 
which  they  are  used  later, 
when  it  is  obvious  what  the 
effects  of  the  change  are  go¬ 
ing  to  be.”  But  six  months  and 
five  revisions  later,  how  will 
you  know  that  the  number  of 
opcodes  is  used  only  once  in 
the  program,  and  how  easy 
will  it  be  to  find  that  one 
place? 

His  larger  complaint  is  that 
Modula-2  contains  no  com¬ 
pact  and  readable  way  to  ini¬ 
tialize  a  complex  table.  But 
then,  what  language  does? 

There  are  really  two  points 
mixed  up  here.  One  is  that 
Modula-2  provides  no  way  to 
initialize  variables  other  than 
by  assignment  statements; 
there are no facilities for defining
initial values at the point of
declaration  as  in  some  other  lan¬ 
guages  (although  the  compiler  I  use 
provides  this  as  an  extension).  Unless 
the  compiler  is  very  clever,  this  may 
mean  some  unnecessary  work  at  ex¬ 
ecution  time,  but  this  does  not  seem 
like  a  major  problem  and  certainly 
not  one  inherent  in  high-level 
languages. 

The  second  point  is  that  it  is  tedious 
to  initialize  a  large  table  (in  the  exam¬ 
ple  Suman  discusses,  it  consists  of  118 
records  of  4  fields  each)  with  assign¬ 
ment  statements.  But  there  are  other 
ways.  For  instance,  I  like  to  write  a 
table  in  a  readable  form  in  an  ASCII 
disk  file  and  read  it  at  run  time  to  ini¬ 
tialize  the  table.  It’s  hard  to  beat  this 
for  readability,  the  table  can  be  edit¬ 
ed  without  recompiling  the  pro¬ 
gram,  and  the  code  to  read  the  table 
is  usually  not  difficult  to  write.  There 
is  a  problem  with  reading  in  enu¬ 
merated  types  from  an  ASCII  file:  be¬ 
cause  it  can  be  a  nuisance  to  translate 
the  ASCII  version  of  the  identifiers  for 
the  values,  you  can  compromise  by 
using  Arabic  numbers  for  their  ordi¬ 
nal  values  instead.  This  can  be  en¬ 
dured  because  the  table  file  can  easi¬ 
ly  contain  a  key  for  the  values. 
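
The approach described above can be sketched as follows. This is only an illustration in Python rather than Modula-2; the file layout, the sample rows, and the Mode enumeration are hypothetical and are not taken from the program under discussion. Comment lines in the table text carry the key for the ordinal values.

```python
import io
from enum import IntEnum


class Mode(IntEnum):
    """Hypothetical enumerated type stored in the file by ordinal value."""
    DATA_REG = 0
    ADDR_REG = 1
    IMMEDIATE = 2


# Stand-in for the ASCII disk file. Lines starting with '#' are comments
# and hold the key that makes the ordinal numbers bearable to read.
TABLE_TEXT = """\
# mnemonic  opcode(hex)  mode   (key: 0=DATA_REG 1=ADDR_REG 2=IMMEDIATE)
ADD         D000         0
ADDA        D0C0         1
ADDI        0600         2
"""


def read_table(stream):
    """Parse one record per line; skip blank lines and '#' comments."""
    table = []
    for line in stream:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        mnemonic, opcode, mode = line.split()
        table.append((mnemonic, int(opcode, 16), Mode(int(mode))))
    return table


# In a real program the stream would come from open("table.txt").
table = read_table(io.StringIO(TABLE_TEXT))
```

Editing a row or adding one then needs no recompilation, which is the main attraction of the scheme.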

Suman  offers  an  assembly-lan¬ 
guage  version  of  the  table  in  which 
sets  are  initialized  by  16-digit  binary 
numbers.  I  suppose  it’s  a  matter  of 
taste,  but  I  don't  find  118  entries  like 
that  very  readable,  and  I’m  sure  I’d 
make  plenty  of  errors  typing  them 
into  my  program.  And  what  would 
he  do  if  he  had  a  table  with  real  num¬ 
bers  in  it? 

I  suppose  it  would  be  nice  if  we 
could  write  our  tables  in  readable 
tabular  form  in  our  code  and  have 
the  compiler  do  the  work,  but  there 
are  lots  of  tasks  we  might  like  our 
compilers  to  perform,  and  we  can't 
build  them  all  in.  I  would  guess  that 
initializing  complex  tables  is  one  of 
those  tasks  that  is  just  as  well  left  out 
of  general-purpose  languages. 

John  G.  Bennett 

301  Roslyn  St. 

Rochester,  NY  14619-1813 

Dear  DDJ, 

Though  the  Viewpoint  by  Mike  Su¬ 
man  is  interesting  and  raises  some 
valid  points,  I  find  it  to  be  misleading 


in  three  specific  ways. 

First,  the  point  about  the  constants 
FIRST  and  LAST  being  used  only  once 
is  not  quite  right.  Consider  the  index 
variable  i.  The  initialization  and  ma¬ 
nipulation  of  i  should  always  be  in 
the range FIRST..LAST, and further,
it  should  exactly  cover  this  range. 
The  program  is  poorly  written  in 
that  the  index  should  have  been  ini¬ 
tialized  to  FIRST,  not  1  explicitly. 
Next,  the  value  of  the  index  at  the 
end  of  the  initialization  should  have 
been  checked  to  see  that  it  was  equal 
to  LAST,  so  that  you  know  all  the  ele¬ 
ments  of  the  table  have  been  initial¬ 
ized.  Far  from  pointing  out  a  place  in 
which  Modula-2  is  too  strict,  this 
points  out  a  place  in  which  the  pro¬ 
grammer  just  didn't  use  the  language 
properly. 
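
A short sketch of the discipline just described, written in Python for brevity. The function name fill_table and the use of a dictionary for the table are assumptions made for illustration; only the bounds 1 and 118 come from the program being discussed.

```python
FIRST, LAST = 1, 118   # the same bounds that declare the table


def fill_table(entries):
    """Fill slots FIRST..LAST and insist that the range is covered exactly."""
    table = {}
    i = FIRST                      # start from the declared bound, not a literal 1
    for entry in entries:
        if i > LAST:
            raise ValueError("more entries than table slots")
        table[i] = entry
        i += 1
    if i - 1 != LAST:              # the index must finish on the upper bound
        raise ValueError("table only partly initialized")
    return table
```

If the bounds ever change, the initialization code either still covers the table exactly or fails loudly.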

Second,  the  fact  that  Modula-2 
doesn’t  allow  the  specification  of 
constant  arrays  is  not  an  indication 
that  Modula-2's  type  checking  is  too 
strict  or  that  its  model  of  program¬ 
ming  is  inadequate.  It  means  that 
Modula-2  doesn't  have  this  feature.  It 
is  clear  that  allowing  constant  aggre¬ 
gates  to  be  uttered  in  the  language  is 
an  independent  issue  from  whether 
the  language  is  inconveniently  strict 
or  not.  Strictness  and  features  pro¬ 
vided  are  two  completely  different 
issues. 

Third,  asking  if  imagined  (or  even 
real)  problems  with  the  Modula-2 
language mean there is "something
wrong  with  the  direction  in  which 
we  are  being  led”  is  to  miss  the  point 
altogether.  The  point  of  "higher-lev¬ 
el”  languages  that  have  a  strict  type 
structure  is  not  to  prevent  program¬ 
mers  from  full  expression  or  to  make 
it  harder  to  write  legal  programs. 
The  point  is  to  help  programmers  by 
making  it  easier  for  them  to  tell 
when  they  have  written  a  meaning¬ 
less  or  illegal  program.  Here  we  see 
that  a  programmer  misused  the  con¬ 
stant  definition  facility  of  the  lan¬ 
guage  and  that  the  language  lacks  the 
ability  to  utter  constant-valued  ag¬ 
gregates.  But  neither  of  these  faults  is 
a  fault  of  strict  type  checking,  nor  of 
modularity,  nor  of  the  high-level  na¬ 
ture  of  the  language. 

So,  if  we  are  being  led  in  the  direc¬ 
tion  of  modularity  and  static  type 
safety,  Mike  hasn't  even  begun  to 


present  evidence  that  the  direction  is 
wrong.  He  has  shown  that  small  lan¬ 
guages  often  lack  features  that  are 
awkward  to  do  without,  and  he  has 
shown  that  people  often  forget  that 
index  variables  ought  to  be  initial¬ 
ized  and  manipulated  using  the  same 
bounds  that  were  used  to  declare  the 
array  they  index  into.  These  are  in¬ 
deed  things  to  be  wary  of,  and  they 
show  that  Modula-2  is  neither  om¬ 
nipowerful  nor  errorproof.  But  this 
is  no  reason  to  throw  away  the  baby 
of  type  checking  and  modularity 
with  the  bathwater  of  a  possibly  too 
small  language  and  human  error. 
Wayne  Throop 
86  Fearrington 
Pittsboro,  NC  27312 


Happy  Ducks 

Dear  DDJ, 

Thank  you  for  the  very  funny  Febru¬ 
ary  cover  revealing  the  true  nature 
of  us  WordStar-imprinted  program¬ 
mers.  But  no  thanks  for  the  text  edi¬ 
tors  article.  For  a  DDJ  feature  article, 
it  was  amazingly  uninformative — 
simply  an  excuse  for  a  two-barrel 
discharge  against  us  happy  ducks. 

Especially  unfair  was  the  state¬ 
ment  that  WordStar-style  editors 
"usually  use  'weird'  file  formats  that 
can't  be  read  by  any  other  editor 
without  some  sort  of  conversion.” 
WordStar  itself  works  in  straight  AS¬ 
CII  when  in  its  program-editing  (non¬ 
document)  mode.  But  that  is  indeed 
not  the  most  usual  WordStar-style 
editor.  The  MUWSSE  is,  of  course,  the 
Turbo  Pascal  editor,  which  many 
waterfowl  use  with  other  compilers 
as  well.  And  that  is  nothing  but 
straight  ASCII.  So,  just  what  were  you 
talking  about? 

Another  gross  misstatement  was 
the  wish  list.  Every  programmer  I 
know,  even  strictly  dry  land,  will  tell 
you  that  wish  number  1  is  speed,  2  is 
speed,  and  3  is  speed.  Other  wishes 
appear  only  after  pausing  for  breath. 
No  mention  of  that  in  your  list.  Of 
course,  again,  the  MUWSSE — the  Tur¬ 
bo  editor — is  the  fastest  thing  under 
ten  fingers.  And  for  the  price  of  just 
about  any  dedicated  programming 
editor,  you  can  get  Turbo  Pascal  and 
SuperKey,  and  then  you  have  the 
(continued  on  page  122) 




FORUM 


VIEWPOINT 


What's Right with High-
Level Languages?

by Brian R. Anderson

In "What's Wrong with High-Level
Languages" (Viewpoint, February
1987), Mike Suman brings up some interesting
points about the limitations
inherent in any computer language
but fails to mention any of the
strengths of Modula-2 or of modern
high-level languages in general.

As a college instructor who has
spent several years developing and
teaching a course in assembly-language
programming, I can well appreciate
the advantages of low-level
programming. Assembly language
has some shortcomings that can be
much more serious than the limitations
of high-level languages,
however.

Assembly language is much harder
to learn than are most high-level
languages, and you must substantially
relearn it if you move from one
computer to another. I grant that the
second assembly language is much
easier to learn than the first. It
would, however, take a considerable
effort to adapt from the 68000 to the
8086, for instance, especially when
you consider the peculiarities of Microsoft's
MASM.

Assembly language is hard to write
and even harder to debug (compared
to Modula-2), partially because often
the only control structures available
are the conditional and unconditional
jump and the call to subroutine. To
test several conditions for a loop, you
must write a cascade of comparisons
and jumps. To produce a simple repetition,
you must split the control instructions
between the outside of the
loop, the top of the loop, and the bottom
of the loop. Consider Examples 1
and 2, below, in which some elements
of a character array are
cleared using 68000 assembly language
and Modula-2, respectively. In
both examples, the range of items to
be cleared has been calculated and
left in variables named TOP and BOT.
In the assembly-language example,
the loop constraints are developed in
five different lines of code, whereas
in the high-level language example,
all loop constraints are developed on
a single line.

Brian R. Anderson, 5105 Lorraine
Ave., Burnaby, B.C. V5G 2S3 Canada.
Brian is an instructor in the Electronics
Dept. at the Vancouver Vocational
Institute.

Modern  high-level  languages  al¬ 
low  programmers  to  express  data  in 
terms that naturally match a wide
range  of  typical  problems.  Modula-2 
handles  the  mathematical  concepts 
of  real  and  integral  numeric  types 
and  sets,  the  organizational  concepts 
of  records  and  files,  and  the  univer¬ 
sal  concepts  of  arrays  and  charac¬ 
ters.  In  assembly  language,  the  only 
data  type  is  the  computer  word.  Pro¬ 
grammers  must  impose  a  structure 
and  then  provide  algorithms  to  per¬ 
form  even  the  simplest  task.  Consid¬ 
er  how  much  easier  it  is  (for  exam- 
(continued on page 124)


***********************************************
*  FOR loop in 68000
*  First Array Element to Process:  TOP
*  Last Array Element to Process:   BOT
*  The array:                       DATA
*  Used to Index into array:        D0
*  Used as Pointer to array:        A0
***********************************************
000000  00              DATA   DS      500        ;array of bytes
0001F4  0000            TOP    DC      0          ;first to process
0001F6  0000            BOT    DC      0          ;last to process
*
*                                                 ;other data
*                                                 ;and code
*
*  Assumes TOP & BOT previously calculated
0001F8  41F900000000           LEA     DATA,A0    ;set pointer
0001FE  3039000001F4           MOVE    TOP,D0     ;init index
000204  B079000001F6    LOOP   CMP     BOT,D0     ;check bounds
00020A  6206                   BHI.S   LABEL      ;end loop
00020C  42300000               CLR.B   0(A0,D0)   ;loop body
000210  60F2                   BRA.S   LOOP       ;repeat
000212                  LABEL  END

Example 1: Clearing elements of a character array in 68000 assembly language


MODULE TRIAL;

VAR
    DATA : ARRAY [1..500] OF CHAR;
    TOP, BOT, i : CARDINAL;

BEGIN
    (* Assumes TOP & BOT have been previously calculated *)
    FOR i := TOP TO BOT DO
        DATA[i] := 0C;
    END;
END TRIAL.


Example  2:  Modula-2  version  of  the  code  in  Example  1 




ARTICLES 


An  Efficient 
Algorithm  for  Large 
Priority  Queues 


by  Robert  Jay  Brown 


In  many  real-time  pro¬ 
grams,  it  is  necessary  to 
share  resources  that  are 
in  short  supply.  To  help 
manage  these  situations,  a 
priority  system  is  often  es¬ 
tablished.  When  several  con¬ 
tenders  are  competing  for 
the  same  resource,  the  one 
with  the  highest  priority  gets  the  resource.  When  the  re¬ 
source  again  becomes  available,  the  next-priority  con¬ 
tender  gets  it.  A  simple  first-in/first-out  queue  can  be 
thought  of  as  a  priority  queue  in  which  the  time  a  con¬ 
tender  is  enqueued  is  the  priority  and  lower  numbers 
have  higher  priority. 

In  practice,  the  resource  being  waited  on  may  be  a 
physical  device,  such  as  a  printer;  a  logical  device,  such  as 
a  file  or  a  record  of  a  file;  or  the  CPU  itself.  Alternatively,  a 
timer  scheduler  may  be  implemented  by  using  the  de¬ 
sired  dispatch  time  as  the  priority  and  having  the  de¬ 
queueing  operation  wait  until  the  time  of  day  equals  the 
dispatch  time  at  the  head  of  the  queue. 
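
The timer-scheduler idea can be sketched in a few lines of Python. This is not the implementation discussed in this article: it swaps in Python's standard heapq module (an array-based binary heap) for the queue, and the names schedule and dispatch_due are hypothetical. It also dispatches every entry whose time has arrived or passed, rather than waiting for exact equality.

```python
import heapq
import itertools

_seq = itertools.count()   # tie-breaker: equal dispatch times leave in FIFO order


def schedule(queue, when, action, data):
    """Enqueue an action with its dispatch time as the priority."""
    heapq.heappush(queue, (when, next(_seq), action, data))


def dispatch_due(queue, now):
    """Run every entry whose dispatch time is not later than now."""
    results = []
    while queue and queue[0][0] <= now:
        _when, _order, action, data = heapq.heappop(queue)
        results.append(action(data))
    return results
```

With the sequence number as a secondary key, the same structure also behaves as a plain first-in/first-out queue when every entry is given the same dispatch time.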

Basic  Queue  Operations 

A priority queueing scheme in a real-time system must be
able to perform the following operations on the elements,
or nodes, of the queue: determine the highest-priority
node, add a new node, and remove a node. Removing the
highest-priority node is a special case of the more general
operation of removing a node anywhere in the queue, which
is called preempting.

Robert Jay Brown, 301 W. High St., P.O. Box 833, Warsaw,
KY 41095. Robert is a graduate student at Florida Atlantic
University. He is a consultant involved in designing electronic
surveillance intercept and cryptography systems.

The  most  simple  implementation  for  priority  queues 
results  in  the  time  to  perform  at  least  one  of  the  above 
operations  being  directly  proportional  to  the  size  of  the 
queue. For extremely large queues, this becomes unworkable.
The problem is similar to sorting, and sorting
can be done in a time per element proportional to the
logarithm of the number of elements, so you should be able
to do as well for a priority queue.

In  fact,  you  can  do  a  bit  better  than  this.  Although  the 
priority  queue  algorithm  I  have  implemented  (see  Exam¬ 
ple  1,  page  18)  performs  in  logarithmic  time,  it  does  not 
keep  the  queue  completely  sorted.  It  defers  determining 
the  second-priority  element  until  after  the  top-priority 
element  is  removed.  This  does  not  change  the  logarithmic 
component,  but  it  makes  each  iteration  go  faster. 

A  Binary  Tree 

The representation for the queue is a binary tree with
left, rite, and father connections. Each element also contains
a sort key, which can be interpreted as the priority.
In  addition,  a  dist  cell  is  used  to  indicate  the  minimum 
distance  from  that  node  to  a  leaf,  which  is  a  connection  to 
no  element,  or  a  terminator.  The  father  connection  is  not 
used  to  maintain  the  ordering  of  the  queue  but  is  used  to 
allow  rapid  removal  of  any  node  in  the  queue  (see  the 
example,  lines  28-37).  (Note:  my  structure  declaring 
words  [lines  3-26],  together  with  examples  of  their  use, 
are  available  on  the  East  Coast  Forth  Board  as  the  file 
STRUC.ARC,  and  so  I  will  not  describe  them  further  here.) 

The  word  and  data  cells  (lines  36-37)  are  used  to  contain 
dispatching  information.  Word  is  the  address  of  a  Forth 
word  to  perform  when  the  node  is  dispatched,  and  data  is 
a  word  of  data,  typically  a  pointer,  that  is  pushed  on  the 
stack  before  performing  the  word. 

The  following  priority  queueing  algorithm  was  first 
described  by  Clark  Allen  Crane  in  1971.  My  implementa¬ 
tion  is  an  extension  of  a  revised  version  of  Crane’s  algo¬ 
rithm  that  is  described  by  Donald  Knuth.  You  can  refer  to 
Knuth’s  The  Art  of  Computer  Programming:  Volume  3, 
Sorting and Searching (Reading, Mass.: Addison-Wesley,
1973)  for  a  complete  description  of  the  algorithm  and  an 
analysis  of  its  performance. 

Merging  Queues 

The  central  concept  behind  Crane’s  algorithm  is  that  the 
operations  of  inserting  and  removing  an  element  from 
the  queue  can  be  viewed  as  merging  two  separate 
queues.  Crane’s  algorithm  keeps  the  highest-priority 
node  at  the  root  of  the  tree,  and  the  subtrees  follow  the 


same  pattern.  Thus,  to  dispatch  the  head  element  of  the 
queue  (lines  103-106),  the  left  and  right  subtrees  are 
pruned  from  the  root  and  merged  back  together,  leaving 
the  root  out  of  the  result.  To  insert  an  element  on  the 
queue  (lines  98-101),  that  element  is  viewed  as  a  little  pri¬ 
ority  queue,  in  its  own  right,  of  one  element.  This  queue  is 
merged with the original queue, and the element is thereby
inserted into the original queue. To remove a node
from somewhere in the middle of the queue, that is, to
preempt it (lines 108-111), the node's left and rite subtrees
are  cut  off  and  merged,  forming  an  intermediate  result. 
Next,  the  preempted  node  is  removed  by  using  its  father 
pointer  to  cut  it  off  from  the  element  that  points  to  it. 
Finally,  the  original  queue,  less  the  preempted  node  and 
its  two  subtrees,  is  merged  with  the  intermediate  tree 
described  above. 
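
The merge-based operations described above can be sketched compactly in Python. This is an illustrative sketch, not a translation of the Forth listing: the class names are hypothetical, the merge is written recursively rather than in the two-pass form given by Knuth, and father here points at the parent node rather than at the parent's pointer cell. Lower keys have higher priority. After a preempt, the dist values of the removed node's ancestors are not repaired; the heap order is still preserved.

```python
class Node:
    def __init__(self, key, data=None):
        self.key, self.data = key, data
        self.left = self.rite = self.father = None
        self.dist = 1                      # distance to the nearest leaf


def dist(node):
    return node.dist if node is not None else 0


def merge(p, q):
    """Merge two trees and return the root (the smaller key wins)."""
    if p is None:
        return q
    if q is None:
        return p
    if q.key < p.key:
        p, q = q, p
    p.rite = merge(p.rite, q)              # always descend the right side
    p.rite.father = p
    if dist(p.left) < dist(p.rite):        # keep the shorter path on the right
        p.left, p.rite = p.rite, p.left
    p.dist = dist(p.rite) + 1
    return p


class TimerQueue:
    def __init__(self):
        self.root = None

    def _set_root(self, node):
        self.root = node
        if node is not None:
            node.father = None

    def enque(self, node):
        """Treat the node as a one-element queue and merge it in."""
        node.left = node.rite = node.father = None
        node.dist = 1
        self._set_root(merge(self.root, node))

    def deque(self):
        """Remove the head by merging its two subtrees."""
        head = self.root
        if head is None:
            return None
        self._set_root(merge(head.left, head.rite))
        head.left = head.rite = None
        return head

    def unque(self, node):
        """Preempt a node anywhere in the queue."""
        if node is self.root:
            return self.deque()
        father = node.father
        if father is None:                 # node is not waiting in the queue
            return node
        if father.left is node:            # cut the node off from its father
            father.left = None
        else:
            father.rite = None
        orphans = merge(node.left, node.rite)
        node.left = node.rite = node.father = None
        self._set_root(merge(self.root, orphans))
        return node
```

Every operation reduces to one or two calls to merge, which is why insertion, removal of the head, and preemption all share the same cost profile.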

The  implementation  in  the  example  has  been  tested  on 
a  10-MHz  80286  with  one  wait  state  on  memory  access  and 
3.2  percent  DRAM  refresh  interference  running  under 
LMI PC/FORTH+ 3.1. For 10,000 iterations of an unque on a
randomly  selected  element  of  the  queue,  followed  by  an 
enque  using  a  randomly  chosen  priority,  it  produced  the 
results  shown  in  Table  1,  page  18. 

Availability 

The complete source code, as a standard Forth screen file,
including the benchmark test, is available on CompuServe
(GO DDJ) and on the East Coast Forth Board ([703] 442-8695).
Also, all the source code for articles in this issue is
available on a single disk. To order, send $14.95 to Dr.
Dobb's Journal, 501 Galveston Dr.,
Redwood City, CA 94063 or call (415)
366-3600 ext. 216. Please specify the
issue number and format (MS-DOS,
Macintosh, Kaypro).




#nodes    msecs
    10      7.4
    20      9.3
    50     11.9
   100     13.8
   200     15.5
   500     17.9
  1000     19.4
  2000     20.7
  5000     21.9
 10000     23.1

Table 1: Benchmark test results
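
One way to read Table 1 is to ask how much time each doubling of the queue adds. The short Python calculation below does this; the figures are copied from the table, and the function name ms_per_doubling is hypothetical.

```python
import math

# (#nodes, msecs) pairs copied from Table 1
RESULTS = [(10, 7.4), (20, 9.3), (50, 11.9), (100, 13.8), (200, 15.5),
           (500, 17.9), (1000, 19.4), (2000, 20.7), (5000, 21.9), (10000, 23.1)]


def ms_per_doubling(results):
    """Added milliseconds per doubling of queue size, between adjacent rows."""
    steps = []
    for (n0, t0), (n1, t1) in zip(results, results[1:]):
        steps.append((n1, round((t1 - t0) / math.log2(n1 / n0), 2)))
    return steps
```

Every step comes out below 2 msecs per doubling, and the queue grows a thousandfold while the time roughly triples, which is the behavior expected of a logarithmic-time algorithm rather than a linear one.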

  1  ( Timer queue implementation               rjb 16:06 07/21/86 )
  2
  3  ( structure declaring words - )
  4
  5  : <struc> CREATE , DOES> @ + ;  ( 2nd generation defining word )
  6
  7  ( 'struc' is used to create the defining word for a structure )
  8  : struc CREATE ,  ( org struc <struc> ; creates a structure )
  9      DOES>         ( size <struc> <element> ; creates an element )
 10      DUP @ DUP >R ROT + SWAP !  ( update location counter )
 11      R> <struc> ;               ( make word for element )
 12
 13  ( 'array' is used to create an array of structures )
 14  : array CREATE  ( #slots slot-size array <array> ; makes array )
 15      , * ALLOT   ( save slot-size and allocate array )
 16      DOES>       ( subscript <array> -- &<array>element )
 17      DUP @ ROT * + WSIZE + ;  ( return pointer to the slot )
 18
 19  ( 'org' is useful for doing the 4th version of a C 'union' )
 20  : org  ( n org <strucname> ; re-initializes location counter )
 21      BL WORD FIND NOT ABORT" Undefined!" >BODY ! ;
 22
 23  ( 'sizeof' is used for declaring structures of structures )
 24  : sizeof  ( sizeof <strucname> -- n ; gets the size of a struc )
 25      BL WORD FIND NOT ABORT" Undefined" >BODY @
 26      STATE @ IF [COMPILE] LITERAL THEN ; IMMEDIATE
 27


 28  ( timer queue entry )
 29
 30  tq           ( a node in the timer queue )
 31  tq key       ( the sort key: dispatch time )
 32  tq dist      ( distance to nearest leaf )
 33  tq left      ( pointer to left sub-tree )
 34  tq rite      ( pointer to right sub-tree )
 35  tq father    ( pointer to father of node )
 36  tq word      ( word to execute at dispatch )
 37  tq data      ( pointer to data for word )
 38
 39  ( timer queue implementation )
 40
 41  ( left! & rite! set the left and rite subtrees, respectively
 42    of a node, called the father. The father pointer of the son
 43    node is updated to point to the pointer, left or rite, that
 44    points to the son, so that it may be cleared on an unque. )
 45
 46  : left! DUP 0<> IF OVER left OVER father !  ( Father Son -- )


 47      THEN SWAP left ! ;
 48
 49  : rite! DUP 0<> IF OVER rite OVER father !  ( Father Son -- )
 50      THEN SWAP rite ! ;
 51
 52  : go-rt DUP rite @ >R DUP ROT rite! R> ;  ( go dn rite side )
 53
 54  : dist@ DUP IF dist @ THEN ;  ( P -- P=nil ? 0 : dist @ )
 55
 56  ( This is step M2: from Knuth p 619, sol'n to prob. 32, p 159 )
 57  : list-merge  ( P Q R -- P Q R D ; merge priques P & Q, )
 58    BEGIN       ( R is Roving pointer, D is Distance to near leaf )
 59      OVER 0= IF                      ( Q = nil ? )
 60        2 PICK dist@ EXIT THEN        ( yes, D = P->dist; done! )
 61      2 PICK 0= IF                    ( P = nil ? )
 62        ROT DROP OVER SWAP            ( yes, P = Q )
 63        2 PICK dist@ EXIT THEN        ( yes, D = P->dist; done! )
 64      2 PICK key @ 2 PICK key @       ( P->key < Q->key ? )
 65      < IF ROT go-rt ROT ROT ELSE     ( yes, P moves right )
 66        SWAP go-rt SWAP THEN          ( no, Q moves right )
 67    AGAIN ;  ( loop until one of the trees is eliminated )
 68
 69  ( This is steps M3: and M4: from Knuth p 619 )
 70  : fix-dist  ( P Q R D -- P ; fixes up distance to nearest leaf )
 71    BEGIN
 72      OVER 0= IF 3DROP EXIT THEN      ( R = nil ? yes, done! )
 73      ROT DROP OVER rite @ ROT ROT    ( Q = R->rite )
 74      OVER left @ dist@ OVER < IF     ( R->left->dist < D ? )
 75        DROP DUP left @ dist@ 1+      ( D = R->left->dist + 1 )
 76        OVER DUP left @ rite!         ( R->rite = R->left )
 77        OVER 4 PICK left! ELSE        ( R->left = P )
 78        1+ OVER 4 PICK rite! THEN     ( D++; R->rite = P )
 79      DUP 2 PICK dist !               ( R->dist = D )
 80      >R ROT DROP SWAP DUP R>         ( P = R; R = Q )
 81    AGAIN ;  ( spin down the right sub-tree )
 82
 83  ( tq-merge is Knuth's Algorithm M from p 619 )
 84  : tq-merge 0 list-merge fix-dist ;  ( P Q -- P ; merge 2 tq's )
 85
 86  ( the timer queue root pointer ) VARIABLE TQ 0 TQ !
 87
 88  ( TQ! updates the father pointer in the first node, & sets TQ )
 89  : TQ!  ( tq -- ; sets TQ and father of tq )
 90      DUP TQ !                        ( set the timer queue head )
 91      DUP 0= IF DROP EXIT THEN        ( empty? yes, done! )
 92      father TQ SWAP ! ;              ( no, set father of top node )
 93
 94  : ?waiting DUP father @ @ = ;  ( tq -- flag )
 95
 96  ( - timer queue package entry points - )
 97
 98  : tq-enque  ( &tq -- ; enques to timer )
 99      0 OVER left ! 0 OVER rite !     ( nullify both sub-trees )
100      1 OVER dist !                   ( distance to a leaf is 1 )
101      TQ @ tq-merge TQ! ;             ( merge new node with old queue )
102
103  : tq-deque  ( -- &tq ; deques from timer )
104      TQ @ DUP 0= IF EXIT THEN        ( returns nil on empty queue )
105      DUP                             ( save head for answer )
106      DUP left @ SWAP rite @ tq-merge TQ! ;  ( merge remains )
107
108  : tq-unque  ( &tq -- &tq ; removes from timer )
109      DUP ?waiting NOT IF EXIT THEN 0 OVER father @ !  ( cut )
110      DUP left @ OVER rite @ TQ @     ( both subtrees of tq & TQ )
111      tq-merge tq-merge TQ! ;         ( paste it back together )

Example 1: Timer queue implementation


Dr.  Dobb's  Journal,  June  1987 


ARTICLES 


Two-Bit  Analog-to-Digital 

Conversion 


by John Musselman

An example of putting interrupts to work, and how software can
directly interact with hardware

This article describes a simple technique for measuring an analog
voltage that is not only useful but also demonstrates some programming
techniques commonly used in real-time applications. The key to such
programming is the use of hardware interrupts. A lot of programmers
might be wary of hardware interrupts, but once you use them you will
find they are a powerful tool. I work with embedded controllers a lot,
and so I use them all the time.

Using Hardware Interrupts

The simplest interrupt scheme utilizes a hardware timer to generate an
interrupt at regular intervals. How this is done depends on the
hardware in the system, so it is not possible to go into much detail.
In general, the idea is to set a timer to its free-running mode. In
this mode, the timer is set to some value that decrements on each
system clock pulse. When the timer reaches zero, it generates an
interrupt, is automatically reloaded, and starts to decrement again.
Thus, the interrupt routine is entered at regular intervals. This
creates a time base for any routines running in the interrupt and
guarantees that each routine in the interrupt will be executed within
a certain interval. I almost always have one such interrupt running in
any system I design. Many routines that might otherwise qualify for
their own interrupt can often run in this "master" interrupt, which
helps simplify things.

John Musselman, Route 3, Box 344, Escondido, CA 92025. John is VP of
engineering for PMC Industries Inc. and is a consulting product design
engineer. He has designed numerous microcomputer-based instruments and
controllers.

The driver for the analog-to-digital (A-to-D) circuit appears in this
kind of interrupt. The frequency of the interrupt is not critical,
although the faster it is, the more quickly or more accurately the
conversion can be made. Once every millisecond (1,000 times a second)
is a convenient figure.

Figure 1, below, is a simplified schematic of the A-to-D converter
circuit. Just two bits, one output and one input, interface the circuit
to the computer. To understand the circuit, first consider the part of
it represented by the output bit, the resistor, and the capacitor. This
is actually a simple D-to-A converter. The output bit is set high or
low during each interrupt. The resistor and capacitor values are
relatively large, so the voltage on the capacitor is an average of the
output voltage over time. More 1s and fewer 0s produce a higher
voltage; fewer 1s and more 0s give a lower voltage. If the ratio of 1s
to 0s is held constant over time, the voltage will be essentially
constant and proportional to the percentage of 1 bits being output.

A simple way, then, to program a D-to-A with one output bit, a
resistor, and a capacitor would be to repeatedly output m 0s followed
by n 1s, where m+n is a constant. This could be programmed with a
software counter that decrements on every interrupt. Comparing the
counter against some variable determines whether to output a high or a
low during that particular interrupt. When the counter reaches 0, it is
reset to m+n. The variable thus would control the output voltage as the
variable ranges from 0 to m+n.
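The counter scheme just described can be sketched in C. This is only an illustration, not code from the article: PWM_PERIOD (standing for m+n), pwm_level (the control variable), and both function names are hypothetical, and writing the returned bit to a real output port is left out.

```c
#include <assert.h>

#define PWM_PERIOD 10                 /* m + n: interrupts per output period (assumed value) */

static int pwm_counter = PWM_PERIOD;  /* software counter, decremented on every interrupt */
static int pwm_level   = 0;           /* control variable: number of 1s per period */

/* Call once per timer interrupt; returns the bit to put on the output pin. */
int pwm_tick(void)
{
    int bit;
    assert(pwm_level >= 0 && pwm_level <= PWM_PERIOD);
    /* The counter runs PWM_PERIOD..1, so the 0s come first and the 1s last. */
    bit = (pwm_counter <= pwm_level) ? 1 : 0;
    if (--pwm_counter == 0)
        pwm_counter = PWM_PERIOD;     /* start the next period */
    return bit;
}

/* Helper for experiments: count the 1s produced in one full period. */
int pwm_ones_per_period(int level)
{
    int i, ones = 0;
    pwm_level   = level;
    pwm_counter = PWM_PERIOD;
    for (i = 0; i < PWM_PERIOD; i++)
        ones += pwm_tick();
    return ones;
}
```

With these assumed names, a level of 3 yields seven 0s followed by three 1s in each period, so the averaged voltage would be about 30 percent of the output swing.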

The A-to-D converter does not use exactly this method to create its
output voltage, but the concept is the same. A counter is used to
control the conversion cycle, and the ratio of 1 bits to 0 bits output
during each cycle determines the output voltage. The difference is that
the output is not all 0 bits followed by all 1 bits; the 1s and 0s can
appear in any order.

Figure 1: Simplified schematic of analog-to-digital converter circuit

In Figure 1, notice that the unknown analog voltage is fed into one
input of the comparator. The other input of the comparator is the
voltage generated by the computer, as explained earlier. The output of
the comparator is a digital signal that indicates whether the unknown
voltage or the generated voltage is higher.

The  Software  Driver 

The A-to-D software driver samples the comparator output at each
interrupt. If the generated voltage is lower than the unknown voltage,
a 1 is output, charging the capacitor to a slightly higher voltage by
the next interrupt. If the generated voltage is higher than the unknown
voltage, a 0 is output, discharging the capacitor to a slightly lower
voltage. In this way, the generated voltage seeks the level of the
unknown voltage and then hovers about it. With large resistor and
capacitor values, the searching effect is small and the two voltages
are essentially equal. You can determine what voltage you are
outputting by counting the percentage of 1 bits; this value is the
answer sought.
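The feedback loop can be tried out without any hardware. The following C sketch is a simulation under stated assumptions, not the author's driver: the capacitor is modeled as a simple first-order RC step, and SWING, ALPHA, atod_tick, and atod_convert are hypothetical names and values chosen for illustration.

```c
#include <assert.h>

#define PERIOD 1000                 /* interrupts per conversion cycle */

/* Simulated hardware state; every value here is an assumption of the sketch. */
static double cap_volts = 0.0;      /* voltage on the capacitor */
static const double SWING = 5.0;    /* assumed voltage swing of the output bit */
static const double ALPHA = 0.01;   /* assumed fraction of the gap closed per tick */

/* One interrupt: read the comparator, drive the output bit, return that bit. */
static int atod_tick(double unknown)
{
    int bit = (cap_volts < unknown) ? 1 : 0;    /* comparator decision */
    double drive = bit ? SWING : 0.0;           /* output pin voltage */
    cap_volts += (drive - cap_volts) * ALPHA;   /* first-order RC step */
    return bit;
}

/* Let the loop settle for one cycle, then count 1 bits over the next cycle. */
int atod_convert(double unknown)
{
    int i, high = 0;
    assert(unknown >= 0.0 && unknown <= SWING);
    cap_volts = 0.0;
    for (i = 0; i < PERIOD; i++)    /* generated voltage seeks the unknown level */
        atod_tick(unknown);
    for (i = 0; i < PERIOD; i++)    /* now it hovers; count the 1s */
        high += atod_tick(unknown);
    return high;                    /* roughly PERIOD * unknown / SWING */
}
```

In this model the count comes out close to PERIOD times the ratio of the unknown voltage to the swing, with a few counts of error caused by the hovering; a larger RC product (a smaller ALPHA) reduces that error at the cost of slower settling.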

Example 1 shows how you do this. The code is for a TMS7000 series
processor. It is fairly easy to read, even if you are not familiar with
this device, but let me point out a few things. First, the MOVD
instruction is a 16-bit move. The second byte of the variable is
specified as the operand for this instruction, as in MOVD X+1,Y+1.
Because the processor does not have a 16-bit increment instruction, I
have used a 16-bit decrement to count in one's complement. The count is
complemented before being saved as the result.
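The one's complement counting trick is easy to check in C. The sketch below is illustrative only; the function name is hypothetical, and the loop simply stands in for the decrements performed over many interrupts.

```c
#include <assert.h>
#include <stdint.h>

/* Count events using nothing but a 16-bit decrement. */
uint16_t count_by_decrement(unsigned events)
{
    uint16_t high = 0xFFFF;        /* one's complement of zero */
    unsigned i;
    assert(events <= 0xFFFFu);
    for (i = 0; i < events; i++)
        high--;                    /* the only arithmetic needed per event */
    return (uint16_t)~high;        /* complementing recovers the true count */
}
```

Starting from all 1 bits, each decrement lowers the stored value by one, so the complement of the stored value always equals the number of decrements so far.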

The conversion cycle takes 1,000 1-millisecond interrupts, or 1 second.
A 16-bit counter, COUNT, keeps track of this time period. Another
16-bit counter, HIGH, keeps track of the number of 1 bits output during
the cycle. At the end of the cycle, HIGH is dumped to RESULT, and both
counters are reinitialized. RESULT can be read by the main routine at
any time it is desired to know the input analog voltage. It updates
once a second. All this is transparent to the main routine.

INBIT:  EQU   1           ;INPUT BIT POSITION
OUTBIT: EQU   2           ;OUTPUT BIT POSITION
PERIOD: EQU   1000        ;# OF INTERRUPTS IN A CONVERSION
HIGH:   DS    2           ;COUNT OF HIGH BITS
COUNT:  DS    2           ;CONVERSION CYCLE COUNTER
RESULT: DS    2           ;RESULT OF CONVERSION

;INITIALIZATION...
;
;MUST BE INCLUDED IN THE SYSTEM
;INITIALIZATION BEFORE INTERRUPTS
;ENABLED

        MOVD  #PERIOD-1,COUNT+1

;INTERRUPT ROUTINES...
;
;THE FOLLOWING ROUTINES MUST APPEAR
;IN AN INTERRUPT WHICH OCCURS
;AT REGULAR INTERVALS

;I/O...
;
;SEE IF INPUT BIT IS HIGH OR LOW...

ATOD:   MOVP  APORT,B
        BTJO  #INBIT,B,HI

;IF IT'S LOW, GENERATED VOLTAGE IS
;BELOW THE UNKNOWN VOLTAGE. OUTPUT A
;HIGH. COUNT ONE MORE HIGH BIT...

LO:     ORP   #OUTBIT,APORT
        DECD  HIGH+1      ;NOTE ONES COMPLEMENT COUNT
        JMP   IODONE

;IF IT'S HIGH, GENERATED VOLTAGE IS
;ABOVE THE UNKNOWN VOLTAGE. OUTPUT A
;LOW...

HI:     ANDP  #255-OUTBIT,APORT

IODONE:

;CONVERSION CYCLE...
;
;SEE IF CONVERSION CYCLE DONE...

        DECD  COUNT+1
        JC    ATODDN

;IF CONVERSION DONE, SAVE RESULT
;AND RESET COUNTERS...

        MOVD  HIGH+1,B    ;RESULT IS ONES COMPLEMENT
        COM   A           ;OF COUNT
        COM   B
        MOVD  B,RESULT+1
        MOVD  #-1,HIGH+1  ;ONES COMPLEMENT OF ZERO
        MOVD  #PERIOD-1,COUNT+1

ATODDN:

Example 1: Analog-to-digital converter driver

The number generated by the A-to-D routine, RESULT, can take any value
from 0 to 1,000. This means that there are 1,001 possible values, 1
more than the number of interrupts in the conversion cycle. The actual
voltage that RESULT represents is (RESULT/1,000) times the voltage
swing of the output bit. If you actually build this circuit, be careful
to measure this voltage. A TTL output may not provide the full 5 volts
that an NMOS or CMOS output will.
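As a worked example of that scaling, here is a small C helper. It is a sketch only: the function name and the choice of millivolts as the unit are assumptions, and swing_mv is meant to be the output swing you actually measured on your own circuit.

```c
#include <assert.h>

#define PERIOD 1000   /* interrupts per conversion cycle */

/* Convert RESULT (0..PERIOD) to millivolts, rounded to the nearest millivolt. */
long result_to_millivolts(unsigned result, long swing_mv)
{
    assert(result <= PERIOD);
    assert(swing_mv >= 0);
    return ((long)result * swing_mv + PERIOD / 2) / PERIOD;
}
```

For instance, with a measured swing of 3,400 millivolts, a RESULT of 500 corresponds to 1,700 millivolts.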

Also, there is nothing magic about the number 1,000. The conversion
cycle may consist of any number of interrupts. The more interrupts used
in a conversion, the greater the resolution but the longer it takes to
get the result. You can choose the number of interrupts to suit the
application.

This  circuit  is  a  good  example  of 
how  to  put  interrupts  to  work  and 
how  software  can  interact  with 
hardware  in  a  direct  manner.  Using 
software  in  a  feedback  loop,  as  in  this 
example,  is  one  of  the  best  ways  to 
convert  from  the  analog  to  the  digital 
world  and  vice  versa. 

Availability 

All the source code for articles in this issue is available on a single
disk. To order, send $14.95 to Dr. Dobb's Journal, 501 Galveston Dr.,
Redwood City, CA 94063 or call (415) 366-3600 ext. 216. Please specify
the issue number and format (MS-DOS, Macintosh, Kaypro).

DDJ 



ARTICLES 


The  XOR  Chain 


by David E. Cortesi

You can construct a doubly linked list with both links in the same
word.

In a system in which unsigned binary words and storage addresses are
compatible types, a peculiarity of the exclusive-OR function allows
both links of a doubly linked list item to be stored in a single word.
This space-saving gimmick has been part of the folklore of computing
for some time. In this article I explore the mathematical basis of the
trick and develop a package of C functions to implement it in a modular
style.

The Exclusive-OR Function

Consider the exclusive-OR function (XOR). You can think of XOR as an
arithmetic function similar to addition and subtraction; it obeys
similar metarules. Like them, for instance, it possesses 0 as an
identity element. The identity element of a function is the value whose
use preserves an identity over the function. For addition:

A + 0 = A

That is, adding 0 is an identity operation; it leaves A unchanged. In
subtraction:

A - 0 = A
A - A = 0

Here, not only does subtraction of 0 leave an identity unchanged, but
subtracting identical values yields the identity element. Just so:

A XOR 0 = A    (1)
A XOR A = 0    (2)

Relation (2) is the basis of the common assembly-language trick of
clearing a register by XORing it with itself.

David E. Cortesi, 415 Cambridge St., #18, Palo Alto, CA 94306. Dave is
a former DDJ columnist.

The addition operation is commutative; that is:

A + B = B + A

This is not true of subtraction, but it is true of XOR:

A XOR B = B XOR A    (3)

Addition is also associative; that is:

(A + B) + C = A + (B + C)

Again, this is not true of subtraction but is true of XOR:

A XOR (B XOR C) = (A XOR B) XOR C    (4)

The XOR operation differs from addition and subtraction in that all
the relations (1) to (4) hold for it, whereas only (1) and (2) are
true of subtraction and only (1), (3), and (4) hold for addition.
Because all four relations hold for XOR, so does this peculiar rule:

(A  XOR  B  XOR  C)  XOR  B  XOR  C  =  A  (5) 

This may not be immediately obvious. Come at it in steps: apply
relations (3) and (4) several times to rearrange the formula so it
reads:


A  XOR  (B  XOR  B)  XOR  (C  XOR  C)  =  A 

By  relation  (2),  this  is  equivalent  to: 

A  XOR  (0)  XOR  (0)  =  A 

but  by  (1),  0  XOR  0  is  just  0,  leaving  you 
with: 

A  XOR  0  =  A 

which  merely  restates  (1). 
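If you would rather check the relations than take them on faith, a few lines of C will do it for any three words. This is a sketch for experimentation; the function name is hypothetical, and C's ^ operator plays the role of XOR.

```c
#include <assert.h>

/* Returns 1 when relations (1) through (5) all hold for a, b, and c. */
int xor_relations_hold(unsigned a, unsigned b, unsigned c)
{
    int r1 = ((a ^ 0u) == a);                    /* (1) 0 is the identity element */
    int r2 = ((a ^ a) == 0u);                    /* (2) a value cancels itself */
    int r3 = ((a ^ b) == (b ^ a));               /* (3) commutative */
    int r4 = ((a ^ (b ^ c)) == ((a ^ b) ^ c));   /* (4) associative */
    int r5 = (((a ^ b ^ c) ^ b ^ c) == a);       /* (5) the recovery rule */
    return r1 && r2 && r3 && r4 && r5;
}
```

A call such as assert(xor_relations_hold(0x1234u, 0xBEEFu, 0x0F0Fu)) should pass for any choice of values, since each relation holds bit by bit.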

Three  Links  in  a  Word 

Mathematically,  (5)  is  a  tautology,  a 
gassy  expansion  of  (1),  but  it  has  a 
practical  use  for  data  storage.  From  it 
follows  this:  if  you  have  a  binary 
word  W  that  contains  the  result  of: 

W  =  (A  XOR  B  XOR  C) 

and if you know any two of these values independently, you can recover
the third value from W. By relation (5):

W XOR B XOR C = A

Similarly, you can recover B knowing A and C, or C knowing A and B.
You can use this fact to construct a doubly linked list using but a
single binary word to store both links in each list item.

How  do  you  do  this?  In  a  doubly 
linked  list,  each  list  item  contains  the 
addresses  of  its  predecessor  and  its 
successor.  Call  these  A  and  C,  and  use 
the  item’s  own  address  as  B.  Set  its 
single  link  word  W  to: 

W  =  (A  XOR  B  XOR  C) 

Now, if a program is examining a list item, it obviously must know that
item's address. Provided it arrived at the item by tracing the list, it
must have come down to it from its predecessor or back from its
successor, and hence it must know one of the addresses A or C.
Therefore, by (5), it can recover the unknown address from word W and
use it to proceed onward in the list.
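The recovery step can be sketched in present-day C. The assumptions here are not the article's: uintptr_t from <stdint.h> stands in for the "unsigned binary word compatible with a pointer," a null pointer is assumed to convert to 0, and make_link and other_neighbor are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Form the single link word W = A xor B xor C from three addresses. */
uintptr_t make_link(const void *pred, const void *self, const void *succ)
{
    return (uintptr_t)pred ^ (uintptr_t)self ^ (uintptr_t)succ;
}

/* Given W, the item's own address, and one neighbor, recover the other. */
void *other_neighbor(uintptr_t link, const void *self, const void *known)
{
    return (void *)(link ^ (uintptr_t)self ^ (uintptr_t)known);
}
```

Arriving at an item from its predecessor, other_neighbor yields the successor; arriving from the successor, the same call yields the predecessor.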

That's the basic idea. A complete implementation, however, must handle
the cases of the empty list and lists of one and two items, should
allow for inserting and deleting elements, and ought to be packaged as
functional subroutines for simplicity.

Here I'll develop such a package in C. In the following code, I assume
that the unsigned binary integer is exactly compatible with a C pointer
(as may not be the case in, for instance, the Intel 8086 with a
large-memory model), and I've left out the casts that the more picky C
compilers may want inserted to make that equivalence manifest. The code
favors clarity over absolute efficiency, so you may wish to insert
register declarations and other changes for speed.

Defining  an  Item 

In C, the XOR function is denoted by an up arrow (^) and inversion of a
Boolean value by comparison to 0. For readability, assume that the
following defines hold in the example code:

#define Xor ^
#define Not 0==
#define Nil 0

Now you can define the structure of a list item:

struct Item {
    unsigned link;
    /* other contents of item */
};

The link word contains the XORed linking addresses. What the "other
contents" might be depends on the application, and so I assume that
functions to create and destroy Items are defined elsewhere:

extern struct Item *MakeItem();
extern void DropItem();    /* called as DropItem(i), struct Item *i */

Defining  a  List 

A list consists of a chain of zero or more Items, but there has to be
an anchor for the chain: a single fixed place where the head and tail
of the list can be found. I define its form as Anchor:

struct Anchor {
    struct Item *head, *tail;
};

For any list there will be just one Anchor. Furthermore, I legislate
that if a list is empty:

(head == Nil) && (tail == Nil)

Otherwise, head points to the first item of the list and tail points to
the last item. If the list contains but one item:

(head == tail) && (head != Nil)

You can express these rules in Boolean functions that test an Anchor:

int ZeroItems(a) struct Anchor *a;
{ return(
    (Nil == a->head) &&
    (Nil == a->tail)
  ); }

int OneItem(a) struct Anchor *a;
{ return(
    (a->head == a->tail)
    && (a->head != Nil)
  ); }

Scanning  a  List 

The Anchor of a list gives you access only to the first and last items.
If you are to reach intermediate items, you must scan over the list,
either forward from the head or backward from the tail.

While scanning the list, your position is always maintained in a pair
of pointers that contain the addresses of two items that are logically
adjacent in the list. The item that is nearer the head of the list I
call the prior item; the one closer to the tail I call the next item.
When I have these items set up with valid addresses, I refer to such a
pair of pointers as a Scan, because I can use the pair to scan over the
list in either direction. Because a pair of pointers is always
required, I put them in a record. Because I am always scanning a
particular list, I include the address of the list's Anchor.

struct Scan {
    struct Item *prior, *next;
    struct Anchor *base;
};

Although a list may have only one Anchor, it may have any number of
Scans associated with it. No Scan is valid until it has been associated
with some list, however. Here's a procedure to do that:

void Associate(s, a)
struct Scan *s;
struct Anchor *a;
{ s->base = a; }

This procedure is very simple, but in some applications, it might have
to do more. It might be desirable, for example, to keep track of the
number of scans that are active on a list, and the Associate function
could do that by incrementing a count in Anchor.

A program may scan a list by moving a Scan structure from the current
position to an adjacent one. But a scan has to start somewhere. This
procedure sets a scan to the head of the list:

void ToHead(s) struct Scan *s;
{
    s->next = s->base->head;
    s->prior = Nil;
}

When a scan is at the head, the prior pointer contains Nil (nothing
precedes the head of a list) and the next item is the head item of the
list. A simple function can test a Scan for this condition:

int AtHead(s) struct Scan *s;
{ return( s->prior == Nil ); }



A procedure can set a scan to the tail of its list:

void ToTail(s) struct Scan *s;
{
    s->prior = s->base->tail;
    s->next = Nil;
}

When a scan is at the tail of a list, the next pointer contains Nil
(nothing follows the tail of a list) and the prior item is the tail
item of the list. A Boolean function can test for this condition:

int AtTail(s) struct Scan *s;
{ return( s->next == Nil ); }

Recall that a list is empty when the head and tail pointers in its
Anchor are Nil. Verify for yourself that, for a Scan associated with an
empty list, applying either ToHead or ToTail makes both AtHead and
AtTail report true. Earlier I defined a function ZeroItems, which tests
an Anchor to see if it's empty. Now I can define a function that tests
not an Anchor but a Scan:

int EmptyList(s) struct Scan *s;
{ return( AtHead(s) &&
          AtTail(s) ); }

Consider the implications for loop control. Either the sequence:

ToHead(z);
while (Not AtTail(z))
    { step forward in list }

or the sequence:

ToTail(z);
while (Not AtHead(z))
    { step backward in list }

can work safely (that is, do nothing) on an empty list.

Stepping  a  Scan 

Having set a Scan to the head of a list, you want to be able to step it
forward in the list. You do that using the identities of XOR. Consider
an Item that is neither at the head nor the tail of a list. If it is
located at address B and if its predecessor is at address A and its
successor at address C, then its link word contains A XOR B XOR C. The
link word plus the contents of the Scan let you step it forward:

void GoFwd(s) struct Scan *s;
{   struct Item *i;
    i = s->prior Xor
        s->next Xor
        s->next->link;
    s->prior = s->next;
    s->next = i;
}

Variable i receives the link word of the next Item minus the
contributions made by its own and its predecessor's address: in short,
its successor's address.

In this way you can process all the items of a list:

Associate(z, a);
ToHead(z);
while( Not AtTail(z) )
{
    Process(z->next);
    GoFwd(z);
}

If you are accustomed to C, you may prefer to see this written using a
for loop:

Associate(z, a);
for(ToHead(z);
    Not AtTail(z);
    GoFwd(z))
        Process(z->next);

which amounts to the same thing.

When you process a list from head to tail, the Items to be processed
are those that turn up in the next position. When you go backward, from
tail to head, you take them from the prior slot. Here's a
going-backward procedure:

void GoBak(s) struct Scan *s;
{   struct Item *i;
    i = s->prior Xor
        s->next Xor
        s->prior->link;
    s->next = s->prior;
    s->prior = i;
}

Here you take the link word of the predecessor Item minus the
contributions of its own address and of its successor: in short, the
address of its predecessor. With this procedure and ToTail, you can
process a whole list:

Associate(z, a);
ToTail(z);
while( Not AtHead(z) )
{
    Process(z->prior);
    GoBak(z);
}

It may not be clear that these procedures work for all items of all
lists. Before you can be sure they do, you need to define what the link
words of the head and tail items contain. The simplest rule works: the
link word contains zero for irrelevant addresses. Table 1, below, shows
a complete four-item list whose anchor is Test. Presume that the
following loop is to be executed using the list in this table:

Associate(z, Test);
ToHead(z);
while(Not AtTail(z)) GoFwd(z);

Trace the contents of prior and next at each step and assure yourself
that all the pointers work out correctly. (Surely you don't take
articles such as this one on faith, without at least desk-checking the
code?) Then trace them through this loop to verify the rest of the code
given so far:

for(ToTail(z);
    Not AtHead(z);
    GoBak(z));
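For readers who want the machine to do the desk-check, here is a self-contained sketch in present-day C that builds the four-item list of Table 1 and walks it in both directions. It is not the article's code: it uses ANSI prototypes, uintptr_t as the link word, and explicit casts (the ones the author chose to leave out); the lowercase function names, the tag field, and the assumption that a null pointer converts to 0 are all mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct Item   { uintptr_t link; char tag; };
struct Anchor { struct Item *head, *tail; };
struct Scan   { struct Item *prior, *next; struct Anchor *base; };

/* An address as an unsigned word; a null pointer is assumed to give 0. */
static uintptr_t addr(struct Item *p) { return (uintptr_t)p; }

/* Link four items as in Table 1: a missing neighbor contributes 0. */
void build_table1(struct Item it[4], struct Anchor *test)
{
    int k;
    for (k = 0; k < 4; k++) {
        struct Item *pred = (k > 0) ? &it[k - 1] : NULL;
        struct Item *succ = (k < 3) ? &it[k + 1] : NULL;
        it[k].link = addr(pred) ^ addr(&it[k]) ^ addr(succ);
        it[k].tag  = (char)('A' + k);
    }
    test->head = &it[0];
    test->tail = &it[3];
}

void go_fwd(struct Scan *s)
{
    struct Item *i;
    assert(s->next != NULL);              /* caller tests AtTail first */
    i = (struct Item *)(addr(s->prior) ^ addr(s->next) ^ s->next->link);
    s->prior = s->next;
    s->next  = i;
}

void go_bak(struct Scan *s)
{
    struct Item *i;
    assert(s->prior != NULL);             /* caller tests AtHead first */
    i = (struct Item *)(addr(s->prior) ^ addr(s->next) ^ s->prior->link);
    s->next  = s->prior;
    s->prior = i;
}

/* Head-to-tail walk; writes each tag to out[] and returns the item count. */
int trace_fwd(struct Anchor *a, char *out)
{
    struct Scan z;
    int n = 0;
    z.base = a; z.next = a->head; z.prior = NULL;   /* Associate, ToHead */
    while (z.next != NULL) {                         /* Not AtTail */
        out[n++] = z.next->tag;
        go_fwd(&z);
    }
    return n;
}

/* Tail-to-head walk; same contract as trace_fwd. */
int trace_bak(struct Anchor *a, char *out)
{
    struct Scan z;
    int n = 0;
    z.base = a; z.prior = a->tail; z.next = NULL;   /* Associate, ToTail */
    while (z.prior != NULL) {                        /* Not AtHead */
        out[n++] = z.prior->tag;
        go_bak(&z);
    }
    return n;
}
```

Given a four-element array and an Anchor, trace_fwd should visit the tags in the order A, B, C, D and trace_bak in the order D, C, B, A, with the last step of each walk producing Nil because the end items carry 0 for the missing neighbor.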

Address      Contents
Test.head    A
A->link      A xor B
B->link      A xor B xor C
C->link      B xor C xor D
D->link      C xor D
Test.tail    D

Table 1: A four-item list whose anchor is Test

What would happen if a program applied GoFwd just once too often? Or
GoBak? Clearly you ought to account for these end effects. What shall
be done with the head item (which has no predecessor) and the tail item
(no successor)? You can either wrap around or stick. Here is a safe
StepFwd procedure:

void StepFwd(s) struct Scan *s;
{ if (Not AtTail(s)) GoFwd(s); }

that simply sticks when it reaches the tail. Here is a StepBak
procedure that wraps around:

void StepBak(s) struct Scan *s;
{
    if (AtHead(s)) ToTail(s);
    else GoBak(s);
}

Inserting  an  Item 

You construct a list by starting with ZeroItems true and inserting new
items. Example 1, below, shows one way to insert a new item between the
prior and next items of a scan. In the statement:

s->prior->link Xor=
    (i Xor s->next)

you take the predecessor's link, remove from it the contribution of the
successor's address (by XORing with s->next), and install the value of
its new successor, the item with address i.

A similar statement fixes the link word in the next item, except that
it's the predecessor's address that is removed in favor of i. Finally,
the new item's link is formed from its own address and that of its new
predecessor and successor. It becomes the prior item, so a series of
insertions will build up in head-to-tail order.

void Insert(i, s)
struct Item *i;
struct Scan *s;
{
    if (AtHead(s))
        s->base->head = i;
    else
        s->prior->link Xor=
            (i Xor s->next);
    if (AtTail(s))
        s->base->tail = i;
    else
        s->next->link Xor=
            (i Xor s->prior);
    i->link =
        s->prior Xor s->next Xor i;
    s->prior = i;
}

Example 1: A way to insert a new item between the prior and next items
of a scan.

Address      Contents
Test.head    Nil
Test.tail    Nil
A.link       ?
B.link       ?
C.link       ?
D.link       ?

Table 2: An empty list and four prepared items

To verify this, commence with an empty list and four prepared items, as
shown in Table 2, and execute the following sequence of operations by
hand:

Associate(z, Test);
Insert(A,z);
Insert(D,z);
GoBak(z);
Insert(B,z);
Insert(C,z);

keeping track of the new contents of Anchor, the Item links, and prior
and next.
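The hand exercise can also be run as a program. The sketch below makes several assumptions that are not the article's: it is written in present-day C with uintptr_t links and explicit casts, the names are hypothetical, the insertion order is taken to be A, D, step back, B, C, and it calls the equivalent of ToHead right after associating the scan so that prior and next both start out as Nil.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct Item   { uintptr_t link; char tag; };
struct Anchor { struct Item *head, *tail; };
struct Scan   { struct Item *prior, *next; struct Anchor *base; };

static uintptr_t addr(struct Item *p) { return (uintptr_t)p; }

static void to_head(struct Scan *s) { s->next = s->base->head; s->prior = NULL; }

static void go_fwd(struct Scan *s)
{
    struct Item *i = (struct Item *)(addr(s->prior) ^ addr(s->next) ^ s->next->link);
    s->prior = s->next;
    s->next  = i;
}

static void go_bak(struct Scan *s)
{
    struct Item *i;
    assert(s->prior != NULL);
    i = (struct Item *)(addr(s->prior) ^ addr(s->next) ^ s->prior->link);
    s->next  = s->prior;
    s->prior = i;
}

/* Insert i between prior and next; i then becomes the prior item. */
void insert(struct Item *i, struct Scan *s)
{
    if (s->prior == NULL) s->base->head = i;                /* AtHead */
    else s->prior->link ^= addr(i) ^ addr(s->next);
    if (s->next == NULL) s->base->tail = i;                 /* AtTail */
    else s->next->link ^= addr(i) ^ addr(s->prior);
    i->link  = addr(s->prior) ^ addr(s->next) ^ addr(i);
    s->prior = i;
}

/* Run the assumed sequence, then list the tags head to tail into out[]. */
int build_and_trace(struct Item it[4], struct Anchor *test, char *out)
{
    struct Scan z;
    int k, n = 0;
    for (k = 0; k < 4; k++) it[k].tag = (char)('A' + k);
    test->head = test->tail = NULL;      /* the empty list of Table 2 */
    z.base = test;                       /* Associate */
    to_head(&z);                         /* assumption: start with both pointers Nil */
    insert(&it[0], &z);                  /* A */
    insert(&it[3], &z);                  /* D */
    go_bak(&z);                          /* step back between A and D */
    insert(&it[1], &z);                  /* B */
    insert(&it[2], &z);                  /* C */
    to_head(&z);
    while (z.next != NULL) { out[n++] = z.next->tag; go_fwd(&z); }
    return n;
}
```

Under these assumptions the links end up exactly as in Table 1, and the final walk lists the tags in the order A, B, C, D.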

Deleting  an  Item 

You need a way to delete an item of a list. Deleting comes down to
fixing the things that point to an item so they point to other items
instead. Once nothing points to it anymore, an item can be discarded.
The deletion procedure in Example 2, at the end of this article,
deletes the next Item of the scan. If there is no next Item, it does
nothing (ergo, it correctly does nothing to an empty list).

The item to be deleted is s->next. The first step is to get the address
of its successor, which will take its place. It is obtained from its
link field in the manner of GoFwd.

Next you repair the predecessor of the deleted item. If the deleted
item is the head of the list, the anchor must be repaired; otherwise,
the predecessor's link field is adjusted to point to the deleted item's
successor.

The successor of the deleted item is then repaired. If there isn't one,
the deleted item was the tail of the list, and its predecessor becomes
the new tail.

You could empty a list with this loop:

Associate(z, Test);
ToHead(z);
while (Not AtTail(z)) Delete(z);

Although this procedure won't delete when the scan is AtTail, the tail
item can nevertheless be deleted using:

ToTail(z); StepBak(z); Delete(z);

Earlier I said that any number of Scan structures could be associated
with a list. Now that I've defined Insert and Delete, I should perhaps
modify that claim. What might the effect be on one Scan if Insert were
performed via a different Scan of the same list? (Answer: not much; the
new item might be missed by the other Scan.)

What about Delete? (Answer: catastrophe; the other Scan might step
using the link word of a deleted, now-invalid Item.)
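The repair steps of deletion can be exercised with a short program as well. As before, this is a sketch in present-day C under my own assumptions: uintptr_t links with explicit casts, hypothetical lowercase names, and a cleared link word standing in for DropItem, whose real behavior depends on the application. Only a single scan is used, so the hazard described above does not arise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct Item   { uintptr_t link; char tag; };
struct Anchor { struct Item *head, *tail; };
struct Scan   { struct Item *prior, *next; struct Anchor *base; };

static uintptr_t addr(struct Item *p) { return (uintptr_t)p; }

static void go_fwd(struct Scan *s)
{
    struct Item *i = (struct Item *)(addr(s->prior) ^ addr(s->next) ^ s->next->link);
    s->prior = s->next;
    s->next  = i;
}

/* Delete the next item of the scan: find successor, repair both neighbors. */
void delete_next(struct Scan *s)
{
    struct Item *gone = s->next, *i;
    if (gone == NULL) return;                            /* AtTail: nothing to do */
    i = (struct Item *)(gone->link ^ addr(s->prior) ^ addr(gone));   /* successor */
    if (s->prior == NULL) s->base->head = i;             /* repair the anchor */
    else s->prior->link ^= addr(gone) ^ addr(i);         /* or the predecessor */
    if (i == NULL) s->base->tail = s->prior;             /* deleted item was the tail */
    else i->link ^= addr(gone) ^ addr(s->prior);         /* repair the successor */
    gone->link = 0;                 /* stand-in for DropItem in this sketch */
    s->next = i;
}

/* Build A-B-C-D, delete the item at position index, list what remains. */
int delete_at(struct Item it[4], struct Anchor *test, int index, char *out)
{
    struct Scan z;
    int k, n = 0;
    assert(index >= 0 && index < 4);
    for (k = 0; k < 4; k++) {
        it[k].tag  = (char)('A' + k);
        it[k].link = addr(k > 0 ? &it[k - 1] : NULL) ^ addr(&it[k])
                   ^ addr(k < 3 ? &it[k + 1] : NULL);
    }
    test->head = &it[0];
    test->tail = &it[3];
    z.base = test; z.prior = NULL; z.next = test->head;  /* Associate, ToHead */
    for (k = 0; k < index; k++) go_fwd(&z);
    delete_next(&z);
    z.prior = NULL; z.next = test->head;                 /* ToHead again */
    while (z.next != NULL) { out[n++] = z.next->tag; go_fwd(&z); }
    return n;
}
```

Deleting at position 0 exercises the anchor repair at the head, position 3 exercises the repair at the tail, and the middle positions exercise both link adjustments.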


Conclusion 

The C functions shown here form the basis of a package for handling
doubly linked lists. These lists have several advantages: because they
are doubly linked, they can be traversed in either direction with equal
ease, and insertion and deletion are simple and speedy operations. By
packaging a list position as a single Scan structure, you make it
convenient to track several list positions at once, as long as no
deletions are allowed.

There are disadvantages, of course. The greatest is that the only
practical access to the list is from its ends; it isn't possible to
enter it at the middle and then traverse forward or backward.
(Traversal requires the addresses of two logically adjacent items,
which can't be maintained outside the list itself in the presence of
Insert or Delete.) Thus this list form can't be indexed for quick entry
at intermediate points.

Readers of an imaginative turn of mind might explore some other
possibilities. For instance, the structure treated here as a linear
list could be wrapped around to make an endless ring. In a
double-linked ring, Anchor might be eliminated in favor of a mandatory
single Item.

DDJ 



void Delete(s) struct Scan *s;
{   struct Item *i;
    if (AtTail(s)) return;
    i = s->next->link
        Xor s->prior Xor s->next;
    if (AtHead(s))
        s->base->head = i;
    else
        s->prior->link Xor= (s->next Xor i);
    if (i == Nil)
        s->base->tail = s->prior;
    else
        i->link Xor= (s->next Xor s->prior);
    DropItem(s->next);
    s->next = i;
}

Example 2: A procedure to delete an item in a list


ARTICLES 


An  Extended 
IBM  PC 

COM  Port  Driver 

by  Thomas  A.  Zimniewicz 


The IBM PC and its compatibles support two serial ports, COM1 and COM2,
in a wide variety of configurations. Four basic functions are provided:
data in, data out, set configuration, and read status. What the IBM PC
does not do is walk and chew gum at the same time. If your disk is
seeking and more than one character arrives, it is lost. This article
describes what I think a COM port driver should do, discusses some
design rules and implementation pitfalls of device drivers, and
presents a device driver modification with the following features:

•  buffered input: no lost data
•  input flow control: XON/XOFF or control lines
•  output flow control: control lines
•  works with existing software
•  removal without a reboot
•  19,200- and 38,400-baud operation
•  type-ahead with CRTs
•  upward compatible

What’s  Wrong? 

During the first part of my career as a software engineer, I was
spoiled by minicomputer operating systems that were good at capturing
serial data. More recently, I had the pleasure of working with Unix: a
pleasure, that is, in all aspects except capturing serial data. My
introduction to Unix was on a PDP-11/44 that could miss characters
from serial data input at 300 baud on a busy day. Every time my job
required monitoring serial data, it was a bad day. Then one day I
noticed an idle IBM PC that had nothing to do (no other users to get
in my way) except perhaps collect my data. It would be simple: just
getchar from COM1 and putchar to disk, and with no other work, the IBM
PC could certainly handle at least 9,600 baud. A while later it was
working, but (and there always is one) it seemed that there were gaps
in the data. (I discovered later that the gaps coincided with the
flashing of the disk light.) I gave up. Much later I found myself
using an IBM PC a little. I didn't like the keyboard and monitor, so I
hooked up my CRT to a COM port and then found there was no type-ahead.
I got mad enough to fix it.

Thomas A. Zimniewicz, 269S Pond Rd., Lima, NY 14485. Thomas is a
software consultant.

So, what were my goals? The main thing was buffered interrupt-driven
input to allow type-ahead and high-speed serial data capture. I wanted
to use these features with existing programs, too. This meant the
solution had to be fully compatible and somehow get between programs
and the hardware. I had to modify PC-DOS and/or the BIOS in a way that
was transparent to an application-level program. In this sea of bad
news, there was one very bright light: everything that PC-DOS and the
BIOS lacked in capability, they made up for in flexibility. The hooks
were there, device drivers could be installed, interrupts could be
stolen. If the new COM port software were installed on an IBM PC, then
all well-behaved programs (those that used PC-DOS or the BIOS instead
of directly accessing the hardware) would automatically gain the new
features. Several methods of input/output flow control were possible
and the hardware could handle higher baud rates ... I sure do get
carried away easily.

Can  It  Be  Fixed? 

The IBM PC actually has two levels of software to handle the COM
ports. The high-level code is part of PC-DOS and is loaded into RAM
during the boot-up of PC-DOS. This code handles those functions that
do not depend on the details of the hardware and provides an interface
to the COM port. This level is called a device driver and is accessed
by high-level PC-DOS operations such as the open and read system
calls. The low-level code is contained in ROM and is specific to the
operation of the COM port hardware. This level is called the BIOS and
is called by the device driver to perform operations specific to a COM
port, such as setting the baud rate or getting a character from the
port hardware.

When I started to investigate the job at hand, I suspected that a
whole new device driver would be required. The good news was that
PC-DOS faithfully used the BIOS for all COM port operations. This
meant that all I needed to do was to write a replacement for the parts
of the BIOS that handle the COM port. For a while the job looked
really easy. I considered stealing the software interrupt (int 14h)
that is used to access the BIOS COM port code, going to my code for
operations that needed changing, and otherwise just passing the job
off to the BIOS. Unfortunately, I had to abandon this easy way out
because of negative side effects caused by almost every BIOS function.
For example, the BIOS "send a character" function sets the control
signals in ways that are incompatible with some of the required flow
control modes.

To illustrate some of the technical issues of handling a COM port,
let's take a simplified look at the job of reading serial data. When
the BIOS call is made to read a character, it loops until a character
arrives at the port or a counter expires. Either the character or a
time-out status is returned. Most programs and PC-DOS ignore the
time-out and try again. The COM port hardware has only a
single-character buffer. If another character arrives while the
processor is busy doing anything else, that buffer overflows and a
character is lost. The solution to these lost characters is to have
the COM port generate an interrupt when a character arrives and have
the interrupt handler save the character in a buffer until the program
gets around to needing it. The code might look like this:

    COM port interrupt handler
        read character
        if buffer count < buffer size
            put character in the buffer
            increment buffer count
        else
            set overrun error

    read character routine
        loop
            if buffer count > 0
                decrement buffer count
                take character and return it
            if timer expired
                return time-out status
        endloop

If this algorithm were implemented, it would probably pass all its
initial tests. Then, when you least expected it, your input would be
garbled and no errors would be reported. What you have here is your
classic race condition. Imagine the buffer is exactly full when the
read routine is called and the count is decremented to indicate room
in the buffer. Immediately after the count is decremented but before
the character is removed, an interrupt occurs. The handler puts
another character in the buffer that is really still full. Things go
downhill from here. This type of error is an easy trap for beginners
(and sometimes a big embarrassment for pros). One possible fix is to
disable interrupts during critical portions of the read character
routine. (Note that COM port interrupts are automatically disabled by
the hardware during the processing of a COM port interrupt; there is
no need to worry about reentering the handler itself.) This works, but
it has serious drawbacks. If interrupts are off for too long, serial
data (and other real-time events) can be lost. The best solution is to
order the events in the read character routine so that an interrupt
cannot hurt it. If interrupts must be disabled, try to limit the
duration. To fix the bug above, just take the character from the
buffer before decrementing the buffer count.
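The corrected ordering can be sketched in C. This is a small
simulation under stated assumptions, not excom's assembler code: the
names rxbuf, isr_receive, and read_char and the tiny buffer size are
invented for illustration, and the handler is modeled as an ordinary
function call. On real hardware the final decrement would also have to
be a single uninterruptible instruction (or be done with interrupts
briefly disabled).

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 4              /* tiny on purpose, so overrun is easy to show */

struct rxbuf {
    unsigned char data[BUF_SIZE];
    size_t head;                /* next free slot; written only by the handler */
    size_t tail;                /* oldest character; written only by the reader */
    volatile size_t count;      /* characters held; shared by both sides */
    int overrun;                /* set when a character had to be dropped */
};

/* Stand-in for the COM port interrupt handler: called once per character. */
void isr_receive(struct rxbuf *b, unsigned char c)
{
    assert(b != NULL);
    if (b->count < BUF_SIZE) {
        b->data[b->head] = c;
        b->head = (b->head + 1) % BUF_SIZE;
        b->count++;
    } else {
        b->overrun = 1;
    }
}

/* Reader side: returns the next character, or -1 if the buffer is empty. */
int read_char(struct rxbuf *b)
{
    unsigned char c;

    assert(b != NULL);
    if (b->count == 0)
        return -1;
    c = b->data[b->tail];               /* 1. take the character first */
    b->tail = (b->tail + 1) % BUF_SIZE;
    b->count--;                         /* 2. only now give the slot back */
    return c;
}
```

Because the slot is released only after the character has been copied
out, a handler that runs between the two steps still sees a full
buffer and reports an overrun instead of silently overwriting unread
data.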

Code  Walk-Through 

Excom is functionally equivalent to the BIOS int 14h COM port handler
with three significant enhancements: interrupt-driven buffered input
and an extended set of configuration parameters that adds flow-control
selection and higher baud rates. Excom is made up of three major
sections: the COM port data input interrupt handler, the replacement
for the BIOS int 14h handler, and the code to install and initialize
excom. Excom is awkward to describe. Unlike a well-written
application, in which functions can be isolated to a single routine or
group of routines, its structure dictates that many of its
capabilities are distributed throughout the code. I will first take
you on a quick trip through the listing (Listing One, page 64) to show
its structure and then later describe several of excom's more
interesting functions.

Excom is a .COM program that is run once before the COM ports are
needed and that uses the terminate-and-stay-resident feature of PC-DOS
to remain active as long as is needed. If required it can be removed,
restoring the interrupt vectors to their previous values and freeing
memory. Two requirements for a .COM program are to have all segment
registers pointing to the start of the program's memory space and to
have an executable origin of 100h to make room for the program header.
Execution starts at location 100h, which is a jump to the
initialization code. A few constants are then defined, the first three
of which have to do with the buffer size and can be modified to tune
the excom buffers to your needs. The Microsoft assembler supports a
concept similar to structures in C. A structure called pcb (port
control block) is defined to describe the dynamic state of a port. It
is defined as a structure so that one piece of code can handle two
ports just by setting a register to point to the appropriate
structure. Storage for two port control blocks is then allocated.
Space is reserved to store the old interrupt vectors required for the
program to remove itself. Next, the baud rate table contains "magic"
numbers required to set the UART's baud rate clocks. Note that the
last two entries are used for the excom extended baud rates.
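The "magic" numbers are divisors for the UART's baud rate generator.
As a hedged illustration (not a copy of excom's table): the PC's
serial adapter clocks the 8250 at 1.8432 MHz, and the UART uses 16
clocks per bit, so the divisor for a given rate is 115,200 divided by
the baud rate. The function name baud_divisor is hypothetical.

```c
#include <assert.h>

/* 1.8432 MHz UART clock / 16 clocks per bit = 115,200 */
#define UART_BASE_RATE 115200L

/* Divisor to load into the 8250's divisor latch for a given baud rate. */
unsigned baud_divisor(long baud)
{
    assert(baud > 0 && baud <= UART_BASE_RATE);
    return (unsigned)(UART_BASE_RATE / baud);
}
```

Under these assumptions 9,600 baud needs a divisor of 12, 19,200 needs
6, and 38,400 needs 3, which is why the two extended rates are within
reach of the standard hardware.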

The first executable code is the COM port hardware interrupt handler.
The label int0B is the entry point for int 0Bh, COM2, whereas int0C is
for int 0Ch, COM1. Each entry sets ds:si to point to the pcb
associated with its port and enters common code. The rest of the
routine reads the character, puts it in the buffer, takes any action
required for flow control, and exits. There is no analogy to this code
in the BIOS. The second code segment is the int 14h handler, which
performs five distinct functions corresponding to the value in the
register ah. The functions are initialize port, transmit a character,
receive a character, get port status, and extended initialization.

At the end of excom is the initialization. This is the code that runs
only when excom is being installed. Initialization starts by releasing
the memory for excom's copy of the environment, which is not used. The
body of the initialization uses PC-DOS calls to get the old interrupt
vectors and then install the three new interrupt handlers. The
interrupts that are thus stolen are 0Bh (COM2 hardware), 0Ch (COM1
hardware), and 14h (BIOS COM port calls). Then the excom COM port data
structures are initialized. This initialization code is at the end so
that it is not kept as part of the resident code. Initialization ends
with the PC-DOS call to make everything before the initialization code
stay resident.

Key  Functions 


The BIOS does COM port input by waiting for a character to arrive.
Even though it is not used, the IBM PC has all the hardware required
to generate an interrupt when a character arrives. This capability is
activated by changing the interrupt vector (a RAM location) to point
to the appropriate handler (see lines 555-592 of Listing One). Now the
vectors are set up, but because the BIOS doesn't use COM port
interrupts, the hardware has not been initialized to generate
interrupts. Two separate pieces of hardware have to be initialized:
first the COM port hardware (an 8250 UART) has to be programmed to
generate interrupts and then the interrupt controller (an 8259) has to
be set to allow the interrupt to be sent on to the CPU. I expected
this to be a simple flip through the data sheets and a dozen lines of
code. It didn't work; no interrupts were generated. After much pain, I
discovered that a spare output called OUT2 on the UART was used to
gate the interrupt signal from the UART. I have no idea why. I then
asserted OUT2 on the UART and voila: interrupts (see lines 290-295).
When an interrupt does occur, the CPU stops what it's doing and jumps
through the appropriate vector, in this case to int0B or int0C (lines
73-158). The interrupt handler then reads the character, which resets
the interrupt status in the UART, and sends an interrupt complete
command to the interrupt controller before returning.

The BIOS port status request simply reads the two UART status
registers and returns them to the caller. One register is called the
line status and contains bits for receive errors, receive data ready,
time-out, and transmit ready. The other is called modem status and has
the UART's control lines. When characters are transmitted or received,
the ah register returns the line status. When a port is initialized or
status is requested, ah is returned as above and al returns the modem
status. In excom, input characters are stored in a circular buffer as
they are read from the UART and are then removed from the buffer when
a character read takes place. This presents a dilemma for excom
because some status bits are associated with the received character
whereas others are really UART status. The solution is to have the
interrupt handler read the status associated with received characters
and buffer both character and status together. Then when a status is
needed and there are characters in the buffer, the status associated
with input characters is taken from the next character in the buffer.
The rest of the status is taken directly from the UART (see lines
429-457). This approach is not always perfect, but excom has been used
successfully with many port handling packages, including COMMAND.COM.

One of the major goals for excom was to provide configurable flow
control for input data. The objective of flow control is to signal the
source of the data to stop sending when the buffer is getting full
(see lines 96-120). Then, when the buffer is emptied a bit, the sender
is told to resume sending (see lines 394-409). The exact number of
characters in the buffer when input is stopped and started is
configurable (lines 16-18). Having the stop and restart values be
different by a few characters tends to minimize the number of times
the data is stopped and restarted. The actual mechanisms used to tell
the sender to stop are any combination of DTR (data terminal ready),
RTS (request to send), and XON/XOFF (Control-Q/Control-S). DTR and RTS
are turned off as the buffer fills and turned back on when room
becomes available. Control-S is transmitted as the buffer fills, and
Control-Q is transmitted when room becomes available. I always thought
that RTS was used as described by its name, but I've seen others use
it for input flow control. It was easy to add and it didn't seem to do
any harm, so I used it for input flow control, too. Add to this
arsenal the ability to cross a few wires in a cable, and almost any
situation is covered, except, of course, the sender that ignores your
attempts. This is where excom shines: interrupts and big buffers
provide great power. Data can be copied from a COM port to a floppy on
a lowly old 4.77-MHz IBM PC at 9,600 baud without missing a character.
To complete the picture, I also added output flow control. The BIOS
requires that both DSR and CTS be set before a character can be sent.
Excom can be configured to ignore either or both of these signals.
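The separate stop and restart levels used for input flow control form
a simple hysteresis. A minimal C sketch follows, with hypothetical
names and threshold values; excom's actual constants are the ones at
lines 16-18 of Listing One and are not reproduced here.

```c
#include <assert.h>

#define STOP_LEVEL    12   /* hypothetical: hold off the sender at this count */
#define RESTART_LEVEL  8   /* hypothetical: release the sender at this count */

struct flow {
    int count;        /* characters currently buffered */
    int stopped;      /* nonzero while the sender is being held off */
    int transitions;  /* how many times the sender was stopped or restarted */
};

/* Called after the interrupt handler stores a character. */
void on_char_buffered(struct flow *f)
{
    f->count++;
    if (!f->stopped && f->count >= STOP_LEVEL) {
        f->stopped = 1;           /* here: drop DTR/RTS and/or send Control-S */
        f->transitions++;
    }
}

/* Called after the read routine removes a character. */
void on_char_removed(struct flow *f)
{
    assert(f->count > 0);
    f->count--;
    if (f->stopped && f->count <= RESTART_LEVEL) {
        f->stopped = 0;           /* here: raise DTR/RTS and/or send Control-Q */
        f->transitions++;
    }
}
```

With a single threshold, a buffer hovering around that level would
toggle the sender on every character; the gap between the two levels
is what keeps the number of transitions low.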

In order to be useful, excom had to be upward compatible with the BIOS
int 14h. The area of extended initialization options presented some
special problems. There are no unused bits in the BIOS set
configuration command. Therefore I had to add a new command. Int 14h
calls with ah values from 0 to 3 were already used, so ah = 4 was the
obvious choice (lines 460-499). The problem was that the old
initialization (ah = 0) included the baud rate and the new
initialization could also specify baud rates not included in the BIOS.
The obvious solution was to require that the BIOS initialization occur
before the extended initialization. If excom was to be used with
existing software, though, this restriction was impractical. The
actual solution was to have excom ignore the baud rate in the BIOS
initialization if the extended initialization had been used to set a
nonstandard baud rate (lines 304-306). Thus, you set the baud rate to
19,200, start your existing software, and a BIOS initialization call
is made, but excom ignores the baud rate. Another concession to upward
compatibility is that the extended options are relative to the BIOS
defaults. Excom allows DTR input flow control to be enabled, whereas
output CTS flow control can be disabled.
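The baud rate compatibility rule can be sketched as follows. The
position of the baud rate in the top three bits of al is the standard
BIOS int 14h convention; the structure, the function names, and the
way a standard rate clears the override are assumptions made for
illustration, not excom's actual data layout.

```c
#include <assert.h>

/* Baud rates selected by bits 7-5 of al in the BIOS initialize call (ah = 0). */
static const long bios_baud[8] = { 110, 150, 300, 600, 1200, 2400, 4800, 9600 };

struct port_state {
    long baud;        /* rate currently in effect */
    int nonstandard;  /* nonzero if extended init chose a non-BIOS rate */
};

static int is_bios_rate(long baud)
{
    int i;

    for (i = 0; i < 8; i++)
        if (bios_baud[i] == baud)
            return 1;
    return 0;
}

/* Extended initialization (ah = 4): may select a rate the BIOS lacks. */
void ext_init(struct port_state *p, long baud)
{
    assert(baud > 0);
    p->baud = baud;
    p->nonstandard = !is_bios_rate(baud);
}

/* BIOS-style initialization (ah = 0): the baud bits are honored only if
   no nonstandard rate was set through the extended call. */
void bios_init(struct port_state *p, unsigned char al)
{
    if (!p->nonstandard)
        p->baud = bios_baud[(al >> 5) & 7];
    /* parity, stop bits, and word length (bits 4-0) would be applied here */
}
```

An existing program that asks the BIOS for 9,600 baud therefore leaves
a previously selected 19,200 untouched, while its other line settings
can still be applied.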



Not All Is Gold

Along the way from the problem to the solution, there were many little
struggles, one of which may be of interest. The rest are too
embarrassing to put into print.

While first using excom, I ran into a strange problem. I have a
homemade shell that reads the keyboard (CRT hooked to COM port) by
calling int 14h directly. I ran a program that did CRT output via
PC-DOS but did not read any keyboard characters. While it was running,
I typed a command ahead. When the program stopped, my type-ahead was
echoed minus the first character. This was early in my use of excom,
so I assumed a bug in excom. Ignoring the problem, I continued what I
was doing. Some time later I ran an interactive program that used
PC-DOS for keyboard inputs, and my lost character from 20 minutes ago
appeared on the screen as the first character echoed. Wow! At first I
passed it off as a bug in excom's input buffer and a coincidence. It
happened again! I considered a career as a crossing guard.

You may have noticed that I described the exact way in which the above
programs read their input; that was the key to understanding. PC-DOS
makes a feeble attempt to do some input buffering. When PC-DOS calls
int 14h to transmit a character to the port, a status is returned. If
a data ready is indicated, PC-DOS does a data read and saves the
character for later. PC-DOS only stores one character in this way as
far as I can tell. So my type-ahead was eaten by PC-DOS, to be
provided to the next call to PC-DOS for a character. Beware of mixing
PC-DOS and BIOS character reads!

Anything  Left  to  Do? 

Of course there is! The two questions I've heard most often as a
software engineer are "When will the code be complete?" and "How many
more bugs are there?" The answers have always been the same: "About a
week or two" and "Three." The answers are the same here.

If you put a file on a floppy and enter the command type file, you
will notice a flip-flop of data scrolling by and disk activity. This
is because the BIOS waits for both the disk and the display. Excom has
input buffering to prevent loss of data. Output buffering won't fix
any problems, but it will speed up data output in the case when the
source of the data is a slow device. If excom had output buffering, it
would take the characters faster than they were actually being
transmitted. Thus PC-DOS would think a block of data had been sent and
go to the source for more while excom was still sending data out of
its buffer. The flow would be limited by making PC-DOS wait when
excom's output buffers were full.

Earlier I mentioned that excom can work with any well-behaved program.
What about the naughty ones? At work, I use an old version of
CrossTalk that steals the COM port interrupt and doesn't put it back.
I don't know if this has been fixed. Are you listening, CrossTalk
folks? Upon return from CrossTalk, excom has lost the COM port
interrupt. One possible solution would be to run CrossTalk from a
batch file that removed excom, ran CrossTalk, and then reinstalled
excom. The problem with this is that you must identify and fix all
offenders. It may be possible for excom to examine its environment for
damage and perform repairs. I'm sure this would not cover all such
problems, but I think all of them that I am currently aware of could
be fixed. Self-repairing programs: maybe I should do an article on
"Core Wars."

When I use my shell on a COM port with or without excom installed,
XON/XOFF output flow control seems to work. There must be problems. I
have heard of requests to provide XON/XOFF in relation to printer
drivers and so on. In any case, because PC-DOS may not see characters
until some time after excom does, any PC-DOS attempt at XON/XOFF
output flow control is hampered by excom. This implies that XON/XOFF
output flow control should be added to excom.

Because the IBM PC does not have memory management hardware, having a
memory-resident program occupy the wrong location can cause problems.
When a shell runs a program, it is loaded just after the shell. If the
program remains resident upon its termination, the shell is in a
squeeze. The shell cannot get more memory to hold shell variables,
environment, temporary buffers, and so on. A desirable extension to
excom would be to have it relocate itself to the end of RAM and remain
resident there.

Excom has a bug! Software is no fun if it's perfect. Control-S and
Control-Q are sent without any regard to the UART being ready or the
control lines being correct. This sounds pretty bad, but in reality
it's only a problem when there is a heavy flow of data in both
directions, which is pretty rare. The reason I allow this bug to exist
is that the proper solution requires interrupt-driven output. DDJ has
only so many pages for my giant listings.

The version of excom on the listings disk is an updated version with
interrupt-driven output, corrected sending of Control-S/Control-Q, the
ability to repair the damage done by ill-behaved programs, the ability
to do output XON/XOFF flow control, and self-relocation to the end of
RAM. (See the "Availability" section at the end of this article.)

Use Excom on Your IBM PC

To make excom.com, use the following commands:

    masm excom.asm;
    link excom.obj;
    exe2bin excom.exe excom.com
    del excom.exe

Normally, you should install excom early in the boot process (before a
print /d) because the print spooler steals int 14h. The real-time
clock initialization program on my clone messes up the int 14h vector
for an unknown reason; therefore I install excom after running the
clock initialization. A bit of experimentation may be required. Excom
is self-installing; just run excom to install it. If all you need is
buffered input, you've finished. Excom will be properly set up by
programs using the COM ports, and the existing mode command will
configure excom. To tap the power of excom, you may need to access
some of the extended features. If you use excom with your own
programs, you can just make an int 14h call with ah = 4 to set any
extended options you may need. If you use excom with existing
programs, you need a way to set the extended options from your
keyboard or a batch file.

Listing Two, page 75, is a simple little C program called exmode that
sets extended configuration parameters. Exmode works with the
Microsoft 4.0 C compiler. If you have a different one, you may need to
rewrite the function int14; it just sets registers and performs a
software int 14h. You can also use exmode to install and remove excom,
usually useful only while testing.

Exmode is self-documenting; just run exmode and it will tell you what
it can do. Note that running exmode does not add to existing extended
options but sets the complete set. That is, exmode com1 nodsr followed
by exmode com1 nocts is not the same as exmode com1 nodsr nocts. In
the first case, only nocts is left set; in the second, both nocts and
nodsr are set. COM1's settings can be cleared by running exmode com1.
The previous examples will not change any of COM2's settings.
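The "complete set" behavior can be illustrated with a small sketch.
The option names nodsr and nocts come from the text; the bit values
and the function name parse_options are hypothetical, since the real
encoding is defined in the listings.

```c
#include <assert.h>
#include <string.h>

#define OPT_NODSR 0x01u   /* hypothetical bit: ignore DSR on output */
#define OPT_NOCTS 0x02u   /* hypothetical bit: ignore CTS on output */

/* Build the full option mask from one command line's worth of keywords.
   The result replaces the port's previous options; it is never OR-ed in. */
unsigned parse_options(int nwords, const char *const words[])
{
    unsigned mask = 0;
    int i;

    assert(nwords >= 0);
    for (i = 0; i < nwords; i++) {
        if (strcmp(words[i], "nodsr") == 0)
            mask |= OPT_NODSR;
        else if (strcmp(words[i], "nocts") == 0)
            mask |= OPT_NOCTS;
    }
    return mask;
}
```

Because each run starts from an empty mask, a run with no keywords
yields zero, which matches clearing a port's settings by naming the
port alone.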

An interesting possibility with excom is nesting several invocations
of excom. Say you have excom set up the way you want it for a shell
running on your CRT and want to temporarily run a communications
program. Just install excom again, set it up for communications, run
the communication program, and then use exmode to remove excom. The
most recent copy of excom is removed, restoring your old excom
untouched.
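Nesting works because each copy remembers the vector it replaced. A
toy model in C follows, with an array standing in for the interrupt
vector table and integers standing in for handler addresses. All names
here are illustrative assumptions, and the real program keeps its
saved vectors inside each resident copy rather than in one shared
array.

```c
#include <assert.h>

#define NVECTORS 256
#define MAX_NEST 8

static int vector_table[NVECTORS];  /* stand-in for the real vector table */

struct resident {
    int saved;    /* the vector this copy displaced when it was installed */
};

static struct resident copies[MAX_NEST];
static int depth;                   /* number of copies currently installed */

/* Install a new copy: save whatever is in the vector, then take it over. */
void install(int vec, int handler)
{
    assert(vec >= 0 && vec < NVECTORS);
    assert(depth < MAX_NEST);
    copies[depth].saved = vector_table[vec];
    vector_table[vec] = handler;
    depth++;
}

/* Remove the most recent copy, restoring the vector it displaced. */
void remove_latest(int vec)
{
    assert(vec >= 0 && vec < NVECTORS);
    assert(depth > 0);
    depth--;
    vector_table[vec] = copies[depth].saved;
}
```

Removing the newest copy puts back the older copy's handler, so the
earlier configuration comes back exactly as it was left.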


Availability 

All the source code for articles in this issue is available on a
single disk. To order, send $14.95 to Dr. Dobb's Journal, 501
Galveston Dr., Redwood City, CA 94063 or call (415) 366-3600 ext. 216.
Please specify the issue number and format (MS-DOS, Macintosh,
Kaypro).


DDJ 

(Listings  begin  on  page  64.) 





ARTICLES 


Dynamic Memory Overlays for Turbo Pascal

by Steve McMahon

A fast and versatile way to handle large programs in Turbo Pascal

Borland International's Turbo Pascal has a significant limitation that
hinders its usefulness for developing large programs. Because the
compiler can generate code for only a single code segment, it imposes
a 64K limit on executable code space. Turbo Pascal's overlay system,
designed to circumvent this restriction, potentially allows for much
larger amounts of code but at the cost of dramatically slowing program
execution as overlay code is loaded from disk files.

This article describes a way to use Turbo Pascal's overlay facility in
conjunction with dynamic memory to build memory overlays that can be
loaded and executed far more rapidly than can disk overlays. These
memory overlays are easy to set up and can optionally be left on disk
at run time if sufficient memory is lacking. This technique lets you
have the best of both worlds: a program that can execute in limited
memory with disk overlays but that can take advantage of extra memory
to load overlays into RAM.

Steve McMahon, P.O. Box 3262, Berkeley, CA 94703. Steve has been
working with Turbo Pascal since Version 1.0. His company, SunType
Publishing Systems, develops software for the newspaper industry.

Turbo Pascal's Overlay Scheme

Borland's solution to the large-program problem is an overlay system.
Programmers confronted with a program that would otherwise overrun the
64K code segment limitation can specify that the code of a group of
procedures and functions should share the same execution space in the
code segment. You make this declaration by grouping the routines
contiguously in the source code and preceding the declaration of each
overlay routine with the Turbo reserved word overlay. On compilation
of the program, all the routines of an overlay group are compiled to a
separate overlay file, where the executing program can find them as
necessary. You can specify as many overlay groups as you wish within a
program. The only coding limitation is that routines within an overlay
group cannot call themselves (they cannot be directly recursive) or
other routines within the same overlay group. (The recursion
limitation can be circumvented by calling the overlay routine from a
recursive, nonoverlay function or procedure.)

The advantages of this scheme are that it is easy to use and that it
allows you to construct programs with code not only larger than 64K
but also potentially larger than available RAM. The disadvantage is a
significant performance degradation: disk activity and DOS file
handling overhead are now required every time an overlay routine is
executed. Careful selection and grouping of overlay routines can
minimize such performance degradation, but it is still possible for a
program to spend far more time reading overlays from disk than doing
anything else.

Overlay performance can be significantly enhanced by storing the
overlay file on a virtual memory, or RAM, disk. Turbo Pascal's OvrPath
procedure makes this possible by allowing run-time specification of the
location of overlay files. This solution to the performance problem,
though, has requirements (that a RAM disk be available, overlay files
be located there, and the program be correctly informed of that
location) that are unacceptable for application programs intended for
use in normal DOS environments by unsophisticated operators.

A solution to the overlay performance problem that is faster, more
versatile, and completely invisible to an application program's users
is presented here. But first, let's take a look at how Turbo Pascal's
overlay system works.

How Overlays Work

When Turbo Pascal compiles a program that has overlay procedures, it
reserves an overlay area in the code segment (and in the .COM file)
large enough to accommodate the largest of the procedures or functions
in a given overlay group. All the procedures and functions in that
overlay set are then compiled for execution at the same position within
the reserved overlay area. Those procedures and functions are stored
all together in a separate overlay file that bears the same name as the
main program but has the number of the overlay group (.000, .001, and
so on) as a file-name extension.

Dr. Dobb's Journal, June 1987

When some other portion of the program makes a call to an overlayed
function or procedure, the calling code passes in registers the offset
of the required overlay function or procedure inside the overlay file
and the length of the code fragment to be loaded. The call is made to a
fragment of code positioned at the top of the reserved overlay area.
This fragment passes control to the overlay handler in Turbo Pascal's
run-time package, having placed the address of the reserved overlay
area on the stack. The overlay handler responds by checking to see if
the code required is already in place in the overlay area (the overlay
file offset of whatever code is currently in the overlay area is stored
in a data area at the top of the reserved overlay space along with the
name of the overlay file). If the code required is already in place,
the overlay handler executes it. If not, the code is read from the
overlay file: the overlay file is opened, a seek is made to the
required offset, and the specified quantity of code is read into the
overlay area. Finally, the file is closed, the overlay file offset of
the now-current code fragment is saved in memory for future reference,
and the overlay function or procedure is executed.
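The handler's decision logic can be modeled outside Turbo Pascal. The following is only a rough sketch in Bourne shell, not Borland's code: the names OVR_FILE, OVR_AREA, and load_overlay are hypothetical, two ordinary files stand in for the overlay file and the reserved overlay area, and dd stands in for the DOS open, seek, read, and close calls.

```sh
#!/bin/sh
# Toy model of the disk overlay handler described above.
# OVR_FILE, OVR_AREA, and load_overlay are hypothetical names.
OVR_FILE=${OVR_FILE:-${TMPDIR:-/tmp}/demo.000}   # stands in for the overlay file
OVR_AREA=${OVR_AREA:-${TMPDIR:-/tmp}/demo.area}  # stands in for the overlay area
current_offset=""     # file offset of the fragment now in the overlay area
ovr_status=""         # set to "hit" or "miss" by load_overlay

# load_overlay OFFSET LENGTH
# Copies LENGTH bytes found at OFFSET in the overlay file into the
# overlay area, unless that fragment is already in place.
load_overlay() {
    if [ "$current_offset" = "$1" ]; then
        ovr_status=hit            # already loaded: no disk activity
        return 0
    fi
    # open the file, seek to OFFSET, read LENGTH bytes, close the file
    dd if="$OVR_FILE" of="$OVR_AREA" bs=1 skip="$1" count="$2" \
        2>/dev/null || return 1
    current_offset=$1             # remember which fragment is loaded
    ovr_status=miss
}
```

Calling load_overlay twice in a row with the same offset touches the file only the first time, which is the one optimization the native handler offers. Every other call pays for a full open, seek, read, and close.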

This overlay handling method is ripe for alteration. If the contents of
the overlay file could be positioned in some otherwise unused portion
of memory, then a replacement overlay handler could snatch code
fragments from that area of memory rather than from a disk file. The
time savings could be impressive: overlay calls would require no disk
activity. Such a scheme should even provide significant speed
advantages over using Borland's overlay handler in conjunction with a
RAM disk (to contain the overlay file) because DOS file handling
overhead would be eliminated. What a memory overlay scheme couldn't do,
though, is allow for programs that are truly larger than memory because
all the overlay code would need to fit in memory available to the
program.

Such a scheme can be implemented with no changes to the compiler and
with surprisingly little extra code. Best of all, it can be done with
little modification of existing programs that use overlays. All that's
necessary is to include the procedures contained in MemOvrly.Inc
(Listing One, page 78) in a program that uses overlays and to add some
initialization and deinitialization code to the program. (Some
limitations on the scheme are discussed later.)

Using  MemOvrly.Inc 

You can include the three procedures that make up the memory overlay
handler in an existing program by adding the line:

{$I MEMOVRLY.INC}

at a point in the procedure and function declaration part of a program
that uses overlays (do not nest it inside a procedure or function).
Then, you must add a procedure call, probably inside the main body of
the program, to initialize the substitute overlay handler for each
overlay group that you wish to use as a memory overlay group. You can
add code to dispose of the dynamic memory used by the memory overlay
group or groups at the end of the program. This cleanup code is
particularly important if the program chains to another Turbo Pascal
program because Turbo Pascal preserves the dynamic memory heap on
chaining.

    PROGRAM OverlayTest;

    (* Memory Overlay Demonstration Program. *)

    {$I MEMOVRLY.INC}

    VAR
      c : Char;

    OVERLAY PROCEDURE One;
    BEGIN
      WriteLn('This is Overlay Procedure One.');
    END;

    OVERLAY PROCEDURE Two;
    BEGIN
      WriteLn('This is Overlay Procedure Two.');
    END;

    BEGIN
      { Install the new overlay handler by passing it the address
        offset of ONE procedure or function from the overlay group.
        Multiple invocations for multiple overlay groups should be
        no problem. }
      InitOverlay(Ofs(One));

      REPEAT
        Write('Hit any key to run the overlays (^Z to stop): ');
        Read(Kbd, c);
        WriteLn;
        IF c <> ^Z THEN
        BEGIN
          One;
          Two;
        END;
        WriteLn;
      UNTIL c = ^Z;

      { Free up the heap space used by the replacement overlay
        handler by passing the same offset as above to the
        DisposeOverlayStorage routine -- VITAL if you're chaining
        to another program. The heap is preserved in a chain
        operation. }
      DisposeOverlayStorage(Ofs(One));
    END.

Example 1: Short program demonstrating memory overlays

To set up a particular overlay group as a memory overlay group, all
that's necessary is to run the InitOverlay procedure, passing it the
address of some procedure or function in the overlay group. If, for
example, an overlay group contained, among other routines, the function
or procedure One, the entire overlay group could be initialized as a
memory overlay group with the statement:

InitOverlay( Ofs( One ) );

Ofs is a built-in Turbo Pascal function that returns the offset within
a segment of a variable or procedure. Because you know that the
reserved overlay area associated with the routine One is in the code
segment, passing the offset of the routine is sufficient to establish
where the reserved overlay area for this group is located. This
instruction allocates dynamic memory sufficient to contain all the
overlay code, reads that code into memory, and installs the procedure
NewOverlayHandler as overlay handler for that group. Any overlay groups
not set up in this fashion are handled by Turbo Pascal's normal overlay
handler. Also, if not enough dynamic storage is available to accept the
overlay code, the group is left as a normal, nonmemory overlay.

Likewise, the memory used to contain all the overlay code for a given
overlay group can be reclaimed for other uses with the instruction:

DisposeOverlayStorage( Ofs( One ) );

It's crucial that you do not try to call any routine in the overlay
group after issuing this instruction. The DisposeOverlayStorage
procedure does not restore the old, nonmemory overlay handler for the
specified overlay group. It merely frees up the memory previously
occupied by overlay code for other uses. So, the DisposeOverlayStorage
command is usually used only (if at all) at the end of a program. If
the program does not use Turbo Pascal's Chain facility to chain to
another Turbo Pascal program, you probably don't even need to use the
DisposeOverlayStorage procedure. If you do need it, it should be
executed once for each memory overlay group in use.

How  It  Works 

When asked to initialize a memory overlay group, the InitOverlay
procedure pulls the name of the overlay code file from the reserved
overlay area and opens the file as a Turbo Pascal untyped file. The
file is sized and a check is made to make sure that enough dynamic
memory is available to hold the entire file's contents with sufficient
heap space left over to satisfy the program's other needs. (The amount
of dynamic memory otherwise required by the program is determined by
the constant RequiredHeap, which can be changed to reflect the needs of
a particular program.)

If not enough dynamic memory is available, things go no further and the
overlay group is left as a normal, nonmemory overlay group. Otherwise,
heap space is allocated and the overlay file is read into memory. Then,
the initialization procedure replaces the code that would normally call
Turbo Pascal's native overlay handler with a call instruction for the
memory overlay handler. The data area that formerly contained the
overlay file name is then filled with information more pertinent to the
memory overlay handling routine: the location and size of the heap
space containing all the overlay code.

When a routine in the memory overlay group is needed, the call is
directed to the new overlay handler. This handler uses the overlay
offset and size it receives to find the required code on the heap. The
code is then moved into the reserved overlay area in the code segment
with a string move and is executed.

The procedure that disposes of overlay storage on the heap is pretty
straightforward. The only special thing it does is check to make sure
that the overlay group it has been pointed to is actually a memory
overlay group. If there wasn't adequate dynamic memory when memory
overlay initialization was attempted, then the group might have been
left as a normal overlay group and there would be no dynamic memory to
free.

Limitations  and  Cautions 

The memory overlay scheme described here does not work with overlay
groups for which the overlay file is 64K or larger.

Nested overlays, created by declaring overlay groups within already
overlayed functions or procedures, cannot be made into memory overlays.
This would require modification of code inside overlay files or on the
heap and seems beyond the bounds of what could be reliably and simply
implemented.

Turbo Pascal's OvrPath procedure won't work in conjunction with memory
overlay procedures. This is one limitation that could be overcome
without too much difficulty by anyone who understands the code in
MemOvrly.Inc, but it's probably not necessary because the primary
utility of OvrPath is the placement of normal overlays on RAM disks.

MemOvrly.Inc is highly version dependent. It definitely won't work with
any version of Turbo Pascal prior to 3.0, and if Borland changes its
overlay handling technique in later versions, MemOvrly.Inc may have to
be modified to take account of the changes.

Finally, as with overlays in general, memory overlays should be used
only for functions and procedures that have been thoroughly debugged.
Debugging functions and procedures placed in overlay groups can be
exceptionally difficult.

Customizing  MemOvrly.Inc 

You may wish to customize the memory overlay handling routines in two
ways. First, the initialization routine contains no input/output error
checking code, so you may want to add some.

Second, when a program needs significant amounts of dynamic memory, the
RequiredHeap constant at the top of MemOvrly.Inc should be customized
to reflect that fact (as should the "minimum free dynamic memory"
setting on the Turbo Pascal memory usage menu).

Availability 

All the source code for articles in this issue is available on a single
disk. To order, send $14.95 to Dr. Dobb's Journal, 501 Galveston Dr.,
Redwood City, CA 94063 or call (415) 366-3600 ext. 216. Please specify
the issue number and format (MS-DOS, Macintosh, Kaypro).

DDJ 

(Listing  begins  on  page  78.) 



ARTICLES 


A  Unix  BBS 
Using  Shell  Scripts 

by  Jan  L.  Harrington 


Two years ago I assumed the responsibility of developing a BBS (now
known as Scholastech Telecommunications) for Scholastech, a nonprofit
organization made up of college professors and other industry
professionals whose goal is to facilitate and enhance academic
computing. The BBS is designed to provide a distribution mechanism for
Scholastech's public-domain/shareware software collection and to
further the exchange of information between educators at the college
and university level.

Scholastech Telecommunications' software was originally developed and
installed on an AT&T 7300 with 1-megabyte RAM and a 20-megabyte hard
disk (by the time this article is published, the system will have been
transferred to an AT&T 3B2 with a 72-megabyte drive). It has been in
operation since October 1985, and to the best of AT&T's knowledge, it
is the only BBS that has ever run on a 7300. Though the decision to use
that particular machine was dictated only by circumstances (no other
Scholastech hardware had the necessary configuration), the Unix
operating system has provided an ideal environment in which to develop
and operate a bulletin-board system. Not only is Unix multiuser and
multitasking but it also contains significant support for
telecommunications.

This article briefly discusses the Unix telecommunications environment
and then looks in depth at the software written to implement
Scholastech Telecommunications. Although it may seem like exposing the
trade secret of the century, it is nonetheless true that the Unix
operating system has made the development remarkably simple; the entire
BBS software consists of a set of Unix V Bourne shell scripts, some
supporting text files, and a public-domain XMODEM file transfer
program.

Jan L. Harrington, 4002 Stearns Hill Rd., Waltham, MA 02154. Jan is an
assistant professor in the Computer Science Department at Bentley
College. She is the author of Macintosh Assembly Language: An
Introduction.

The Unix Telecommunications Environment

Most implementations of Unix support both electronic mail (mail) and
intersystem file transfers (uucp, Unix to Unix copy). When these
functions are present, the system is able to answer and place phone
calls. Unix also supports cu (call up) operations, which turn the
system issuing the call into a dumb terminal. If the system called
happens to be another Unix machine, cu supports file transfer as well.
In terms of developing a BBS, cu is of little use because it is
unreasonable to assume that all incoming calls will be from another
Unix system. Cu's file transfer functions therefore cannot be used as
part of BBS software.

On the other hand, the presence of facilities to manage electronic mail
and uucp transfers means that the system can handle incoming calls from
any type of system (it is nonetheless true that the full power of Unix
telecommunications can only be realized when the communication is with
another Unix system). The steps for preparing a Unix system for
telecommunications do not vary greatly from one implementation to
another, though the details of how and where the activities occur may
be different. Generally, setting up a Unix system to receive incoming
calls involves the following two actions:

1. Configure communications port(s): The AT&T 7300 has two internal
modems; port configurations are handled by an applications shell known
as The Office (because The Office performs some undocumented actions
when it configures ports, users of the 7300 are wise not to attempt to
do the configuration from the Bourne shell). More commonly, however,
the configuration is handled from the shell by making appropriate
entries in device configuration files, though the actual names of the
files will vary between implementations.

2. Verify that a getty process is available for the port(s): Getty is a
Unix system process that polls devices for input; it is also the
process that actually logs a user onto the system. Instructions as to
how the system should respond to device activity are contained in the
file inittab. For example, the following inittab is used by the 7300:

    is:2:initdefault
    rc::bootwait:/etc/rc > /dev/window < /dev/w1 2>&1
    vid:2:respawn:/etc/getty window 9600
    :ph0:2:respawn:/etc/getty ph0 1200
    ph1:2:respawn:/etc/getty ph1 1200
    000:2:respawn:/etc/getty tty000 9600



The first line in inittab puts the system in multiuser mode when the
system is started; the second invokes rc, the shell script that handles
system boot. The remaining lines are for actual devices. The colon in
front of ph0 indicates that the first modem line is inactive and should
not be polled; the line in use is ph1. The word respawn, seen in the
four device lines, indicates that if the getty process is not active
when the device is polled, it should be started.
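To make the field layout concrete, the sketch below reads entries in the same colon-separated format and reports which devices would get a getty. It is an illustration only: the function name active_gettys is hypothetical, it relies on the convention described above that a leading colon marks an inactive entry, and a real init process does far more than this.

```sh
#!/bin/sh
# active_gettys FILE: print "device speed" for each live respawn entry.
# Fields are id:run-level:action:process. A line that starts with a
# colon has an empty id and is skipped as inactive.
active_gettys() {
    file=$1
    while IFS=: read -r id rstate action process; do
        [ -z "$id" ] && continue              # leading colon: disabled
        [ "$action" = respawn ] || continue   # ignore boot-time entries
        set -- $process                       # "/etc/getty ph1 1200"
        echo "$2 $3"                          # device name and line speed
    done < "$file"
}
```

Run against a file holding the six entries modeled on the 7300 table, this would list window, ph1, and tty000 and skip the disabled first modem line.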

For some implementations of Unix, the same port cannot be used for both
incoming and outgoing calls; the getty process must be inactive for
outgoing lines and active for incoming lines. Configuring such systems
for outgoing calls means that a colon must be placed in inittab at the
beginning of the entry for each port that will be used to dial out, as
was done above for the inactive ph0 line. The 7300, however, kills the
getty on a port automatically whenever an outgoing call is placed,
making it possible to use the same line for both incoming and outgoing
traffic.

Once one or more ports have been configured (assuming a modem is
attached) and a getty is polling those ports, Unix answers incoming
calls and takes users through the log-in process without intervention
from an application program. Unix also intercepts the system log-out
command, Ctrl-D, and logs users off. It is important to realize that
this is not enough to configure the system for outgoing calls; outgoing
calls require at least an entry in the file L-devices, which describes
the type of modem in use. Uucp and mail also require entries in the
file L.sys, which identifies the names and phone numbers of other Unix
systems that the host can call. Additional configurations may be
necessary under specific Unix implementations.

What does this mean for a BBS that is to run on a Unix machine? The BBS
software does not have to manage telecommunications: it does not have
to poll a serial port to detect incoming calls, it does not have to log
users onto the system, and it does not have to log users off the
system. A simple Unix BBS needs to be concerned only with the actual
functions remote users will perform after they are actually logged onto
the system.

Overview  of  the  System 

The  Scholastech  BBS  software  pro¬ 
vides  the  following  functions: 

•  file  upload  and  download 

•  public  message  exchange 

•  private  mail 

•  user  help 

•  change  of  password 

Each user is given a separate Unix account under what is known as the
"restricted shell." The restricted shell prevents users from changing
directories, from accessing directories not in their default PATH (set
in a .profile, an executable shell script that is run automatically
when a user logs in), and from using any commands not contained in
their rbin directory. The rbin directory is also made part of the
user's default path. The restricted shell therefore effectively locks
BBS users into a small segment of the system, preventing them from even
listing the contents of directories that are not part of the BBS.

BBS commands are each implemented as a separate shell script. There are
at least four advantages to this approach:

1. The short shell scripts are easier to debug than a long, single
program, regardless of whether the program is a shell script or
compiled C.

2. New commands can be tested and added at any time without disturbing
BBS operation.

3. Individual shell scripts simplify the logging of users' activities
and the monitoring of their activities while they are on the system.

4. Shell script programming doesn't require the nearly 10 megabytes of
disk space needed by the C development system on the 7300.

There are, of course, some major disadvantages to working with separate
shell scripts:

1. Each user must have his or her own account (as long as the number of
users is small, this is not an enormous problem, but as the number of
users rises, account management begins to consume significant amounts
of time).

2. Because they are interpreted rather than compiled, shell scripts run
much more slowly than would, for example, programs in compiled C.

3. The Bourne shell itself, although containing powerful flow of
control statements, is weak in terms of arithmetic operations. Even
simple addition must be prefaced with the command expr and enclosed in
backquotes (grave accents) before the sum can be assigned to a
variable.
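As a concrete illustration of the arithmetic point, here is how a counter might be incremented in the Bourne shell. The variable name count is arbitrary and chosen only for this sketch.

```sh
#!/bin/sh
# The Bourne shell has no built-in arithmetic, so the external expr
# command computes the sum and backquotes capture its output.
count=4
count=`expr $count + 1`    # the spaces around + are required by expr
echo "$count"              # prints 5
```

Each such addition starts a separate expr process, which is part of the reason shell scripts that do much counting run slowly.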

Scholastech Telecommunications has two special accounts without
passwords, new and info. The new account is designed only for on-line
account requests; the info account displays information about
Scholastech activities and can accept on-line sign-up for Scholastech
workshops. Code for these functions is contained in their .profiles.

BBS  Directory  Structure 

Scholastech Telecommunications files are maintained within the
directories /u/bbs and /u/bbs/rbin and their subdirectories. The
contents of /u/bbs are:

•  rbin directory (contains all commands available to restricted shell
users)

•  MS-files directory (contains MS-DOS software)

•  Unix-files directory (contains Unix software)

•  Mac-files directory (contains Macintosh software)

•  Uploads directory (destination for all uploaded software)

•  new (log-in for on-line sign-up for new accounts)

•  info (log-in for Scholastech information and on-line workshop
sign-up)

•  msg directory (contains support files for the public message
subsystem)

•  log.file (a text file that records BBS command usage)

•  msgl (a text file whose contents are displayed by a user's .profile)

•  directories for each BBS user

The  directory  /u/bbs/rbin  contains: 

•  dwnld (shell script for downloading software)

•  help (shell script for displaying BBS instructions)

•  list (shell script for listing the contents of a data library)

•  pmail (shell script for sending and reading private mail)

•  readmsg (shell script for reading public messages)

•  scan (shell script for viewing the headers of public messages)

•  send (shell script for sending public messages)

•  xmodem (public-domain file transfer program; compiled C)

•  supporting text files:

    a. MS.list (listing of MS-DOS data library)

    b. Mac.list (listing of Macintosh data library)

    c. Unix.list (listing of Unix data library)

    d. help.file (text file with BBS instructions)

    e. user.file (listing of system users)

•  all Unix commands used by the BBS shell scripts (either copied or
linked into this directory)
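Populating an rbin directory with the handful of system commands the scripts need could be done along these lines. This is a sketch under assumptions: the function name populate_rbin is hypothetical, and the command list in the usage note is inferred from the commands that appear in the scripts quoted in this article, not taken from the actual Scholastech installation.

```sh
#!/bin/sh
# populate_rbin SRC DEST CMD...
# Link each named command from SRC into DEST, falling back to a copy
# when a hard link is not possible (for example, across file systems).
populate_rbin() {
    src=$1
    dest=$2
    shift 2
    for cmd in "$@"; do
        ln "$src/$cmd" "$dest/$cmd" 2>/dev/null ||
            cp "$src/$cmd" "$dest/$cmd"
    done
}
```

A call such as `populate_rbin /bin /u/bbs/rbin cat cut date echo more who` would then make just those commands visible to restricted-shell users. Anything left out of the list is simply unavailable to them.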

The  Shell  Scripts 

BBS  User’s  .profile 

The environment for each BBS user is configured by a short .profile:

    PATH=/u/bbs/rbin:/u/bbs/MS-files:/u/bbs/Mac-files:/u/bbs/Unix-files:/u/bbs/Uploads
    export PATH
    PS1='> '
    cat /u/bbs/msgl
    echo

The .profile establishes the rbin directory and the program library
directories as the user's default PATH (although technically there is
only one restricted shell, /bin/rsh, there can be many rbin directories
within a single Unix file system, each of which has a different path
name and contains a different set of commands). The default prompt is
also changed from the standard Bourne shell $ to a >. Finally, the
.profile displays the contents of a short welcome message (stored in
msgl) and then returns control to the operating system.

Users logging in for the first time or after a long absence usually use
the help command to display a screen full of instructions. The help
command shell script is only two lines long:

    echo `who am i | cut -f1 -d" "` `date | cut -c1-16` "help" >>/u/bbs/log.file
    cat /u/bbs/rbin/help.file

The first line writes a record to the command use log file,
/u/bbs/log.file; the second simply displays the contents of a text file
containing instructions.
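The log-writing step generalizes easily to any command. The sketch below wraps it in a function. The names log_use and LOGFILE, and the fall-back to $LOGNAME when who am i prints nothing (as it can when no terminal is attached), are assumptions added for illustration and are not part of the Scholastech scripts.

```sh
#!/bin/sh
LOGFILE=${LOGFILE:-${TMPDIR:-/tmp}/log.file}   # assumed log location

# log_use COMMAND: append a "user date-and-time command" record.
log_use() {
    # first space-delimited field of "who am i" is the login name
    user=`who am i 2>/dev/null | cut -f1 -d" "`
    [ -n "$user" ] || user=${LOGNAME:-unknown}
    # first 16 characters of the traditional date output,
    # e.g. "Mon Jun  1 09:30"
    stamp=`date | cut -c1-16`
    echo "$user $stamp $1" >> "$LOGFILE"
}
```

With this in place, a command script would begin with a line such as `log_use help`, and the log grows by one record per command run.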

Users who wish to change their passwords can issue the command passwd.
This Unix command is not contained in a shell script but is made
available to BBS users at their > prompt. Passwd prompts for the new
password and then asks that it be entered again for verification. On
the 7300, passwd also enforces format constraints: passwords must be
two words separated by a special character (only passwords established
by the superuser can violate those constraints).

The  Program  Libraries 

Three commands support user interaction with the program libraries:
list, which displays the contents of a program library; dwnld, which
downloads a file; and upld, which uploads a file.

The command list (Listing One, page 80) uses the Unix command more to
display the contents of a text file that contains a listing of files
available in a specific program library (lines 6, 11, and 16). Although
the more widely used cat command also displays the contents of a text
file, more displays the text one screen at a time (users advance by
pressing the space bar or Return key) and is therefore better suited
for screen displays. The Unix command to display a file directory, ls,
could also be used to show the contents of the program libraries
directly; this would eliminate the need to maintain the separate text
files. In terms of security, however, ls is a "dangerous" command. If
ls is present in the rbin directory, users who know Unix can use it
from the > prompt to view the contents of that directory. They can then
see that potentially more dangerous commands such as mv, which moves a
file from one name or place to another, are available under the
restricted shell. The best situation is to avoid the use of dangerous
commands entirely, but they are essential to more than one of the BBS
shell scripts.

The listings for each of the three program libraries are kept in
separate files. Users must specify which program library as an argument
to the command. For example, list MS displays the contents of
/u/bbs/rbin/MS.list. The shell script traps the argument at lines 2, 7,
and 12. If no argument is present (line 17), the shell script prints an
error message (line 18) and exits. List, like all the other BBS
commands, also writes a record to the command use log file (lines 4, 9,
and 14).
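The behavior just described can be approximated as follows. This is a hedged reconstruction from the prose, not the author's Listing One: the function name list_cmd, the BBS and PAGER variables, the use of $LOGNAME in the log record, and the wording of the error message are all assumptions, and the line numbers cited above do not apply to it.

```sh
#!/bin/sh
BBS=${BBS:-/u/bbs}            # root of the BBS tree (assumed variable)

# list_cmd LIBRARY: page through the listing for MS, Mac, or Unix.
list_cmd() {
    case "$1" in
    MS|Mac|Unix)
        # record the command in the usage log, then show the listing
        echo "${LOGNAME:-unknown} `date | cut -c1-16` list $1" \
            >> "$BBS/log.file"
        ${PAGER:-more} "$BBS/rbin/$1.list"
        ;;
    *)
        # missing or unrecognized library designator
        echo "usage: list MS | Mac | Unix"
        return 1
        ;;
    esac
}
```

A single case statement replaces the three separate tests the article describes, which is a design choice of this sketch. It also rejects unknown library names, not just a missing argument.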

Dwnld  (Listing  Two,  page  80)  trans¬ 
fers  files  from  the  program  libraries 
to  a  remote  user.  The  command  re¬ 
quires  two  arguments — a  designator 
for  the  program  library  and  the 
name of the file. For example, dwnld
MS PCFILE.ARC initiates transfer of
the file /u/bbs/MS-files/PCFILE.ARC.
The  first  12  lines  of  the  dwnld  script 
identify  the  program  library.  If  no 
program  library  is  included,  the 
script  prints  an  error  message  (line 
11)  and  does  not  execute  a  transfer. 
Assuming  that  a  valid  program  li¬ 
brary  has  been  entered,  dwnld  then 
verifies  that  a  file  name  is  present 
(line  13);  the  error  message  for  a  miss¬ 
ing  file  name  appears  in  lines  37  and 
38.  The  presence  of  a  file  name  on 
the  command  line,  however,  does 
not  guarantee  that  the  file  exists.  A 


check  for  a  valid  file  occurs  on  line  15 
(the  error  message  for  an  invalid  file 
name  appears  in  lines  32-34).  After 
verifying  that  the  requested  file  ex¬ 
ists,  dwnld  writes  a  log  record  (line 
17)  and  then  prompts  the  user  for  the 
file  transfer  method  (line  18).  Files 
can  be  transferred  as  ASCII  text  (lines 
24-29);  the  script  simply  cats  the  file 
to  the  user,  who  must  have  instruct¬ 
ed  his  or  her  terminal  emulator  pro¬ 
gram  to  capture  incoming  data  to 
disk. 

Binary  file  transfers  are  per¬ 
formed  using  the  XMODEM  file  trans¬ 
fer  protocol  (line  22).  Xmodem,  the 
program  used  to  implement  the 
transfers,  is  a  stand-alone  file  upload 
and  download  program  that  can  be 
executed  from  the  Unix  command 
level  as  well  as  from  within  a  shell 
script.  Two  versions  exist  in  the  pub¬ 
lic  domain,  umodem  and  uc,  both  of 
which  are  available  as  C  source  code. 
Xmodem  was  obtained  by  download¬ 
ing  the  umodem  source  code  from  an 
existing  Unix  BBS  and  then  compil¬ 
ing  it  on  the  7300  (this  was  done  prior 
to  loading  the  program  libraries, 


Dr.  Dobb's  Journal,  June  1987 



when there was still enough space on
the disk for the entire development
system). The xmodem command line
is  straightforward:  the  argument  -sb 
indicates  that  a  file  is  to  be  sent  and 
that  its  format  is  binary;  the  second 
argument  is  the  path  name  of  the  file 
to  be  transmitted. 
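
The order of checks in dwnld (library, then file name, then file existence, then transfer method) might look like the sketch below. It is not Listing Two. The function name dwnld_file, the BBS_ROOT and XMODEM variables, and the third argument (which stands in for the interactive prompt) are assumptions for illustration; only the -sb argument and the MS-files directory layout come from the text.

```shell
#!/bin/sh
# Sketch of the dwnld checks (hypothetical names throughout).
BBS_ROOT=${BBS_ROOT:-/u/bbs}    # assumed root of the BBS files
XMODEM=${XMODEM:-xmodem}        # assumed name of the transfer program

dwnld_file() {
    # $1 = library designator, $2 = file name,
    # $3 = transfer method: a for ASCII, x for XMODEM
    if [ -z "$1" ] || [ ! -d "$BBS_ROOT/$1-files" ]; then
        echo "No such program library"
        return 1
    fi
    if [ -z "$2" ]; then
        echo "No file name given"
        return 1
    fi
    path="$BBS_ROOT/$1-files/$2"
    if [ ! -f "$path" ]; then
        echo "File not found: $2"
        return 1
    fi
    case "$3" in
        a) cat "$path" ;;               # ASCII: just cat the file to the user
        x) $XMODEM -sb "$path" ;;       # -sb: send, binary format
        *) echo "Unknown transfer method"
           return 1 ;;
    esac
}
```

Checking for the file before asking how to send it means the user is never prompted about a transfer that cannot happen.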

Upld  (Listing  Three,  page  81)  han¬ 
dles  file  transfers  in  precisely  the  op¬ 
posite  manner  to  dwnld.  ASCII 
uploads  echo  incoming  characters  to 
a  text  file  (lines  14-19);  to  upload  bina¬ 
ry  files,  xmodem  is  invoked  with  the 
arguments  -rb,  receive  binary  (line 
12).  Upld  also  contains  code  to  verify 
that  a  name  for  the  file  to  be  upload¬ 
ed  has  been  entered  as  an  argument 
to  the  command  (lines  1  and  26-27); 
verifies  that  the  file  name  for  the  in¬ 
coming  file  doesn't  duplicate  an  ex¬ 
isting  file  name  in  the  Uploads  direc¬ 
tory  (lines  1-5);  and  collects  a 
description of the file (appended to
the text file /u/bbs/Uploads/Doc.file),
which the sysop can use to
quickly  figure  out  the  nature  of  up¬ 
loaded  files. 

The  Public  Message  Subsystem 

Three  shell  scripts  support  the  ex¬ 
change  of  public  messages — scan  dis¬ 
plays  the  headers  of  all  current  mes¬ 
sages,  readmsg  displays  individual 
messages  by  number,  and  send  sends 
a  public  message. 

Scan  is  a  short  script  that  does 
nothing  more  than  display  the  con¬ 
tents  of  a  special  text  file,  .index,  to 
which  an  entry  is  added  whenever  a 
message  is  sent: 

echo `who am i | cut -f1 -d" "` `date | cut -c1-16` "scan" \
    >>/u/bbs/log.file
echo
echo
more /u/bbs/msg/.index
echo

Scan’s  purpose  is  to  allow  users  to 
identify  the  numbers  of  messages 
that  might  be  of  interest. 

Messages  are  displayed  on  the 
screen  by  the  command  readmsg 
(Listing Four, page 81). Readmsg uses
two simple files: /u/bbs/.first,
which contains the number of the
first message available, and
/u/bbs/.last, which contains the
number of the last message available.
With them it reminds users who might
not have scanned the message index
which message numbers are valid
(lines 1-9).
The  file  names  of  messages  stored  in 
/u/bbs/msg  are  the  same  as  their 
message  numbers.  For  example, 
message  number  64  has  the  file  name 
/u/bbs/msg/64.  Therefore,  once  the 
user  enters  a  message  number  (line 
12),  readmsg  can  simply  cat  a  file  by 
that  name  (line  18).  The  script  also 
traps  nonexisting  message  numbers 
(lines  19-21). 
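
Because a message's file name is its number, the lookup reduces to a file test. The following is a minimal sketch under that assumption, not Listing Four; the function name read_message and the BBS_ROOT variable are hypothetical.

```shell
#!/bin/sh
# Sketch of the readmsg lookup (hypothetical names throughout).
BBS_ROOT=${BBS_ROOT:-/u/bbs}    # assumed root of the BBS files

read_message() {
    # $1 = message number typed by the user
    first=`cat "$BBS_ROOT/.first"`      # number of the first message available
    last=`cat "$BBS_ROOT/.last"`        # number of the last message available
    if [ -f "$BBS_ROOT/msg/$1" ]; then
        cat "$BBS_ROOT/msg/$1"
    else
        # Trap nonexisting message numbers.
        echo "No message $1; valid numbers run from $first to $last"
        return 1
    fi
}
```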

The  shell  script  send  (Listing  Five, 
page  82)  stores  messages  for  the  pub¬ 
lic  message  system.  The  script  incre¬ 
ments  the  last  message  number  (line 
8)  and  then  conducts  a  dialogue  with 
the  user  to  collect  the  message  head¬ 
er  (lines  11-25)  and  the  body  of  the 
message (lines 27-34). In addition, the
message header must be added to the
top of the file /u/bbs/msg/.index
(scan displays message headers in
descending order by message
number). To maintain the descending
order, the headers are first written
to a temporary file (lines 35-38).
Then,  the  existing  .index  file  is  con¬ 
catenated  onto  the  end  of  the  tempo¬ 
rary  file  (line  39).  Finally,  the  tempo¬ 
rary  file  is  moved  on  top  of  the  old 
.index,  effectively  erasing  the  exist¬ 
ing  file  (line  40).  Because  public  mes¬ 
sages  can  be  addressed  to  groups  of 
people  (for  example,  ALL  or  All  3B2 
Users),  send  does  not  bother  to  verify 
that  the  addressee  is  a  valid  system 
user  ID. 

The  Private  Mail  Subsystem 

Private  mail  is  built  around  the  Unix 
mail  command.  The  shell  script 
pmail  (Listing  Six,  page  82)  builds  an 
interface  to  aid  users  in  sending  and 
reading  mail.  The  script  first  displays 
the  four  available  commands  (lines 
4-11):  s  to  send  mail,  r  to  read  mail,  l 
to  see  system  users,  and  y  to  exit  from 
pmail. 

The l command is not a command
normally  associated  with  Unix  mail, 
but  sending  Unix  mail  requires  an  ar¬ 
gument  of  either  a  valid  user  ID  or 
the  electronic-mail  name  of  a  remote 
system  known  to  the  system  from 
which  the  mail  is  sent.  Unfortunate¬ 
ly,  user  IDs  are  often  unrelated  to  us¬ 
ers’  real  names.  It  is  therefore  impor¬ 
tant  for  the  BBS  to  provide  some  way 


for  users  to  associate  system  user  IDs 
with  human  names.  Pmail  displays 
the  contents  of  the  text  file  /u/bbs/ 
rbin/user.file,  which  contains  a  list¬ 
ing  of  user  names  and  user  IDs  (lines 
61-63).  The  same  information  is  also 
available  from  the  system  file  /etc/ 
passwd.  In  other  words,  it  might  be 
possible  to  avoid  having  to  maintain 
user.file  by  using  cut  to  obtain  specif¬ 
ic  fields  from  /etc/passwd.  The  ma¬ 
chine  that  supports  Scholastech  Tele¬ 
communications,  however,  also 
contains  several  accounts  that  are 
unrelated  to  the  BBS.  The  code  re¬ 
quired  to  grep  out  the  accounts  that 
should  not  be  displayed  to  BBS  users 
far  exceeds  the  effort  to  maintain 
user.file. 
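
For comparison, here is roughly what the /etc/passwd approach involves. The sketch assumes that the accounts to hide can be named in one pattern and that field 5 of each entry holds the user's real name, as it conventionally does; the function name list_users is hypothetical, and grep -E is the modern spelling of egrep.

```shell
#!/bin/sh
# Sketch: build a user listing from a passwd-format file.
list_users() {
    # $1 = passwd-format file
    # $2 = extended regular expression matching the login names to hide
    grep -E -v "^($2):" "$1" |      # drop the accounts unrelated to the BBS
        cut -d: -f1,5 |             # keep the user ID and the real name
        tr ':' '\t'                 # show the two fields as columns
}
```

A call such as list_users /etc/passwd 'root|bin|daemon' prints one line per remaining account. The pattern has to be kept current by hand, which is the same kind of upkeep that user.file already requires.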

Sending  Unix  electronic  mail  is 
straightforward.  The  mail  command 
is  used  with  a  single  argument — the 
user  ID  of  the  recipient  (although 
Unix  mail  can  be  directed  to  remote 
Unix  systems,  Scholastech  Telecom¬ 
munications  does  not  make  that  ca¬ 
pability  available  to  BBS  users).  For 
example,  mail  sysop  sends  mail  to 
the  sysop  account  on  the  same  sys¬ 
tem.  To  send  private  mail,  the  pmail 
shell  script: 

• displays a set of instructions that
help  ensure  that  the  mail  will  be  sent 
successfully  (lines  19-29) 

•  collects  the  user  ID  of  the  account  to 
which  the  mail  is  being  sent  (lines  31- 
32) 

• verifies that the user ID is valid
(lines  34-39) 

•  sends  the  mail  by  issuing  the  Unix 
mail  command  (lines  40-45) 
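
The text does not say how pmail validates the user ID. One plausible check, shown purely as an assumption, is an exact match against the first field of /etc/passwd. The function names and the PASSWD and MAILER variables below are hypothetical.

```shell
#!/bin/sh
# Sketch: validate a recipient before handing it to mail.
PASSWD=${PASSWD:-/etc/passwd}   # file that lists valid user IDs
MAILER=${MAILER:-mail}          # program that actually sends the mail

valid_user() {
    # Succeeds only if $1 exactly matches a login name in $PASSWD.
    [ -n "$1" ] && cut -d: -f1 "$PASSWD" | grep -F -x -- "$1" > /dev/null
}

send_private() {
    # $1 = user ID collected from the sender
    if valid_user "$1"; then
        $MAILER "$1"
    else
        echo "$1 is not a valid user ID"
        return 1
    fi
}
```

Matching the whole field (grep -x) matters here: a plain substring search would accept sys as a match for sysop.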

Users  working  at  the  Unix  com¬ 
mand  level  can  read  mail  by  issuing 
the  command  mail  without  any  argu¬ 
ments  (there  is  no  reason  why  BBS  us¬ 
ers  who  are  knowledgeable  about 
Unix  cannot  do  so  from  their  restrict¬ 
ed  shell  >  prompt).  Pmail  reads  mail 
in  exactly  that  way  (line  55).  Several 
options  are  available  once  a  piece  of 
mail  has  been  displayed  on  the 
screen,  including  forwarding  the 
mail  to  other  users  and  saving  the 
mail  under  a  different  file  name  in 
the  user's  account.  Pmail,  however, 
only  alerts  the  user  of  the  options  of 
either  deleting  the  message  or  leav¬ 
ing  it  in  the  user's  mailbox  (lines  51- 
54).  Though  knowledgeable  Unix  us¬ 
ers  may  attempt  to  save  the  mail,  BBS 



users  do  not  have  the  right  to  write  to 
their  own  accounts  (this  restriction 
was  included  to  conserve  disk  space); 
any  attempt  to  save  mail  in  a  user’s 
account  will  therefore  fail.  Knowl¬ 
edgeable  users  may,  however,  suc¬ 
ceed  in  forwarding  mail  to  other  BBS 
users. 

Signing  Up  for  a  New  Account 

The  special  account  new  exists  on 
Scholastech  Telecommunications 
without  a  password.  Its  sole  purpose 
is  to  permit  prospective  users  to  re¬ 
quest  accounts  on-line.  The  code  that 
manages  the  sign-up  process  is  con¬ 
tained  in  the  account's  .profile  (List¬ 
ing  Seven,  page  86).  Ideally,  this  .pro¬ 
file  should  automatically  log  users 
off  the  system  once  they  have  fin¬ 
ished.  The  only  way  to  do  so  that 
works on the 7300 (stty 0 does not
work)  is  to  make  the  .profile  the  ac¬ 
count's  default  shell.  There  is,  how¬ 
ever,  an  important  reason  why  the 
account  wasn’t  set  up  in  that  way; 
any  shells  other  than  /bin/sh  (the 
standard  Bourne  shell)  and  /bin/rsh 
(the  standard  restricted  shell)  are  as¬ 
sumed  to  be  extremely  restricted.  In 
particular,  they  do  not  permit  any 
redirection  of  output.  That  means 
that  data  cannot  be  sent  to  a  disk  file, 
something  that  is  essential  if  user  ac¬ 
count  request  data  are  to  be  stored. 
Instead,  the  new  account  is  estab¬ 
lished  as  a  standard  restricted  shell 
account.  Its  default  path  is  set  to  an 
rbin  directory  that  contains  only  the 
two  harmless  commands  echo  and 
cat  (line  1);  its  level  1  prompt  is  set  to 
the  phrase  "Type  CNTRL-D  now  ..." 
(line  2).  When  the  .profile  ends  or  if  a 
user  breaks  out  of  the  .profile  with 
the  delete  key,  even  those  who  know 
Unix will be powerless because the
commands  in  the  rbin  directory  per¬ 
mit  nothing  but  the  display  of  text  to 
the  screen.  The  prompt  continually 
reminds  them  of  the  key  sequence  to 
exit  from  the  system. 
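
The first two lines of such a .profile might look like the fragment below. It is a sketch rather than a quotation from Listing Seven, and the directory /u/new/rbin is a hypothetical location for the two-command rbin directory.

```shell
# Sketch of the opening lines of a restricted account's .profile.
PATH=/u/new/rbin                # hypothetical rbin holding only echo and cat
PS1="Type CNTRL-D now ... "     # level 1 prompt shown if the script ends
export PATH PS1
```

The restricted shell applies its restrictions only after .profile has been read, so PATH can be set here but cannot be changed by the user afterward.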

Though  somewhat  lengthy  for  a 
.profile,  the  account  request  shell 
script  is  quite  simple.  It  first  verifies 
that  the  user  wants  to  sign  up  for  an 
account  (line  5).  Assuming  the  user 
does  want  an  account,  the  script  then 
enters  a  loop  to  collect  name  and  ad¬ 
dress  data  (lines  13-43).  After  data 


have  been  entered,  the  user  is  given  a 
chance  to  view  the  data  (lines  33-38) 
and  to  correct  it  if  necessary  (lines 
42-43).  This  process  is  repeated  for 
the  user  ID  and  initial  password  (lines 
47-60).  Note  that  the  system's  pass¬ 
word  format  is  not  enforced  during 
the  account  request  process.  This  is 
because  initial  passwords  will  be  es¬ 
tablished  by  the  superuser,  bypass¬ 
ing  the  format  restrictions.  Finally, 
the  script  writes  the  data  to  a  text  file 
(lines  63-71). 

The  new  account's  .profile  does 
not  actually  establish  new  accounts. 




This  must  be  done  manually  by  com¬ 
pleting  the  following  steps: 

1.  Make  an  entry  in  /etc/passwd  for 
the  new  user. 

2.  Enter  a  password  for  the  new  user. 

3.  Create  a  directory  for  the  new 
user. 

4.  Copy  the  BBS  .profile  into  the  new 
user's  directory. 

5. Make an entry in
/u/bbs/rbin/user.file for the new user.

Steps  1  and  2  must  be  performed  by 
the  superuser.  The  remaining  steps 
are  performed  by  the  system  opera¬ 
tor  (the  account  sysop).  The  sysop  ac¬ 
count  owns  all  BBS  accounts  and  their 
files.  This  prevents  BBS  users  from 
viewing  and/or  modifying  their 
own  .profiles,  an  additional  security 
measure  often  used  along  with  the 
restricted  shell. 
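
Steps 3 through 5, the ones the sysop performs, could be collected into a small helper. The sketch below is an assumption about how that might be done, not something the article provides: the function name setup_account, the HOME_ROOT variable, the location of the master .profile, and the name-then-ID layout of user.file are all hypothetical.

```shell
#!/bin/sh
# Sketch of the sysop's part of account creation (steps 3 to 5).
BBS_ROOT=${BBS_ROOT:-/u/bbs}    # assumed root of the BBS files
HOME_ROOT=${HOME_ROOT:-/u}      # hypothetical parent of user directories

setup_account() {
    # $1 = new user ID, $2 = the user's real name
    mkdir "$HOME_ROOT/$1" || return 1                               # step 3
    cp "$BBS_ROOT/.profile" "$HOME_ROOT/$1/.profile" || return 1    # step 4
    printf '%s\t%s\n' "$2" "$1" >> "$BBS_ROOT/rbin/user.file"       # step 5
}
```

Run from the sysop account, the directory and the copied .profile are owned by sysop, which matches the ownership arrangement described above.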

Providing  Organizational 
Information 

The  shell  scripts  that  you  have  seen 
up  to  this  point  are  generic;  they 
might  be  used  to  support  just  about 
any BBS. The info account, however,
provides a function needed by
Scholastech that isn't typically found
on most bulletin boards. Info's
.profile  (Listing  Eight,  page  88)  works 
much  like  the  new  account's  .profile 
to  display  information  about  upcom¬ 
ing  Scholastech  events  and  to  allow 
users  to  register  for  workshops  on¬ 
line. 

Like new's .profile, the script first
establishes a special rbin directory as
the account's default path (line 1);
info's .profile contains nothing but
three text display commands: cat,
echo, and more. It also changes the
level  1  prompt  to  "Type  CNTRL-D 
now  . . .”  (line  2).  It  then  prints  a  wel¬ 
come  heading  (lines  4-5)  and  displays 
the  contents  of  a  text  file  with  the 
welcome  message  (line  7).  The  actual 
information  about  upcoming  events 
is stored in the more extensive
info.file, which is displayed page by page
using  the  Unix  command  more  (line 
11). 

Workshop  sign-ups  are  straightfor¬ 
ward.  Users  are  prompted  to  enter 
registration  data  (lines  20-37).  The 
.profile  then  appends  that  data  to  a 
text  file  (lines  38-43).  The  registration 
file  is  later  printed  and  turned  over 
to  the  Scholastech  personnel  in 
charge  of  the  workshops. 

System  Monitoring 

It  is  important  for  any  BBS  sysop  to  be 
aware  of  everything  that  is  happen¬ 
ing  on  the  system.  While  users  are 
logged in, their activities can be
monitored with the Unix ps command,
which  displays  the  status  of  all  active 
processes.  There  is,  however,  a  fur¬ 
ther  need  to  keep  records  of  system 
and  command  use.  Unfortunately, 
the 7300's implementation of Unix V
does  not  include  the  standard  Unix 
accounting  functions.  I  therefore 
wrote  two  routines  that  display  and 
reformat  data  from  a  system  log  file 
and  from  the  log  file  written  by  each 
BBS  shell  script  in  order  to  obtain  ar¬ 
chival  information. 

The  two  shell  scripts  are  logs, 
which  displays  data  to  the  screen, 
and  usage,  which  creates  a  printed 
report.  Both  rely  on  process  begin¬ 
ning  and  ending  data  kept  by  Unix  in 
the  system  file  /etc/wtmp.  Normal¬ 
ly,  /etc/wtmp  is  cleaned  out  every 
time  the  system  is  rebooted.  This  can, 
however,  result  in  a  loss  of  data 
whenever  an  unexpected  power 




failure  occurs.  I  therefore  made  a 
modification  to  the  system  routine 
.cleanup,  which  is  executed  by  the 
system  boot  routine  rc,  and  turned 
the  line  >  /etc/wtmp  into  a  com¬ 
ment  to  prevent  its  execution  (I  could 
have  just  as  easily  deleted  it). 

Logs  (Listing  Nine,  page  90)  first 
creates  a  temporary  file  (/u/user.log) 
that  contains  the  user’s  name  and  the 
time  logged  on  for  each  remote  user 
with  the  Unix  command  who.  A  pipe 
to  the  Unix  command  grep  extracts 
all entries that contain ph1 (line 1).
Because the device name for the internal
modem being used by the BBS is
ph1, the grep will include log-ins on
that line and leave out all other devices
(the console is usually either w1
or w2, for example). If the logs
command is issued with an argument a
(line  2),  logs  displays  the  entire  con¬ 
tents  of  both  the  temporary  file  (line 
8)  and  the  BBS  command  log  file  (line 
15).  If  no  argument  is  included,  logs 
displays  only  the  last  ten  lines  of  the 
two  files  (lines  21  and  26).  Finally, 
logs  removes  the  temporary  file  it 
created  (line  29). 
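
The two ideas in logs, filtering log-in records by device name and showing either everything or only the tail, are sketched below. This is not Listing Nine. The function names are hypothetical, and the who records are taken from standard input so that the fragment does not depend on a real /etc/wtmp.

```shell
#!/bin/sh
# Sketch of the filtering and display logic in logs.
remote_logins() {
    # Read who-style lines on stdin; keep those that mention device $1.
    grep "$1"
}

show_logs() {
    # $1 = "a" to show whole files, anything else for the last ten lines
    # remaining arguments = the files to display
    mode=$1
    shift
    for f in "$@"; do
        if [ "$mode" = "a" ]; then
            cat "$f"
        else
            tail -n 10 "$f"
        fi
    done
}
```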

Usage  (Listing  Ten,  page  90)  creates 
two  printed  reports.  The  first  is  a  co¬ 
lumnar  listing  of  the  user’s  name,  the 
date,  the  time  on,  and  the  time  off  for 
each  log-in  contained  in  /etc/wtmp. 
The  second  is  simply  a  hard  copy  of 
the  BBS  command  use  file.  Usage  is  a 
"dangerous”  shell  script;  one  of  its 
functions  is  to  reinitialize  /etc/wtmp 
(line  17).  It  is  therefore  extremely  im¬ 
portant  that  no  user  other  than  the 
system  operator  should  have  any 
rights to the program (that is, its
permission line should appear as
-rwx------). Why shouldn't other users be
allowed to read usage? Read rights
allow another user to make a copy
of a program to put in their own
account. Once the copy is made, the
user  becomes  the  owner  and  can  ex¬ 
ecute  the  script,  wiping  out  /etc/ 
wtmp  and  destroying  system  use 
data. 
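
Setting that permission line takes a single chmod. The wrapper name restrict_to_owner below is hypothetical; mode 700 is what ls -l shows as -rwx------.

```shell
#!/bin/sh
# Sketch: leave a script readable, writable, and executable by its owner only.
restrict_to_owner() {
    # $1 = path of the script to protect
    chmod 700 "$1"
}
```

After restrict_to_owner usage, ls -l usage begins with -rwx------, so no other account can read, and therefore copy, the script.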

The  logic  behind  usage  is  tied  to  the 
way  in  which  data  are  stored  in  /etc/ 
wtmp.  There  is  one  record  for  every 
process  that  starts  and  every  process 
that  ends  (the  two  exceptions  are  the 
processes  LOGIN  and  rc).  The  trick  to 
capturing  log-on  and  log-off  times  is 


therefore  to  match  up  the  entries  for 
any  given  user.  Because  the  user  run¬ 
ning  the  program  will  not  have 
logged  off,  the  program  must  be 
careful  to  exclude  that  user  from  the 
report.  The  particular  version  of  us¬ 
age  that  appears  in  Listing  Ten  han¬ 
dles  only  remote  users  on  the  device 
ph1. In this case, the code doesn't
need  to  worry  about  excluding  the 
system  operator  who  is  running  the 
program  from  the  console.  If,  how¬ 
ever,  users  on  other  devices  are  to  be 
included,  two  things  must  happen: 




1.  The  code  for  assembling  the  report 
must  be  duplicated  for  each  device 
(records  for  concurrent  users  are  in¬ 
terleaved  in  /etc/wtmp). 

2.  The  name  of  the  user  running  the 
program  must  be  excluded  from  the 
report  because  there  will  be  no 
matching  process  termination  re¬ 
cord  for  that  user. 

The  usage  script  first  creates  the 
system  use  report.  It  writes  report 
headings  to  a  temporary  output  file 
(lines  1-3)  and  then  begins  to  assem¬ 
ble  the  body.  The  body  is  assembled 
in  the  following  manner: 

1. A log-on record for each user
entering the system on ph1 is extracted
and stored in the file temp1 (line 4).

2. The first field of the record, the
user name, is cut from temp1 and
stored in temp2 (line 5). This forms
the leftmost column of the report.

3. The log-on time is cut from temp1
and stored in temp3 (line 6). This
forms the middle column of the
report.

4. A log-off record for each user
entering the system on ph1 is extracted.
The log-off time is cut from the record
and stored in the file temp4 (line
7). This forms the rightmost column
of the report.

5.  The  three  temporary  files  contain¬ 
ing  the  contents  of  the  columns  of 
the  report  are  put  together  side  by 
side  with  paste  (line  8). 

6.  To  complete  the  process,  the  re¬ 
port  is  formatted  for  output  with  the 
Unix  pr  command  (line  9).  The  tem¬ 
porary  files  are  removed  (lines  10- 
14). 
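
The cut-and-paste assembly in the steps above can be sketched as follows. This is not Listing Ten. It assumes, for simplicity, that the log-on and log-off records have already been extracted into two files of space-separated lines in the form "name device date time"; real who output is column aligned, so the published script cuts by position instead. The function name build_report and the scratch directory are hypothetical.

```shell
#!/bin/sh
# Sketch of assembling report columns with cut and paste.
build_report() {
    # $1 = file of log-on records, $2 = file of log-off records
    work=${TMPDIR:-/tmp}/usage.$$           # hypothetical scratch directory
    mkdir "$work" || return 1
    cut -d' ' -f1 "$1" > "$work/temp2"      # user names, leftmost column
    cut -d' ' -f3,4 "$1" > "$work/temp3"    # date and time on, middle column
    cut -d' ' -f4 "$2" > "$work/temp4"      # time off, rightmost column
    paste "$work/temp2" "$work/temp3" "$work/temp4"
    rm -r "$work"                           # remove the temporary files
}
```

The output can then be piped through pr to add page headings before it goes to the printer. Because paste pairs lines purely by position, the n-th log-on record must correspond to the n-th log-off record, which is why interleaved records from several devices need separate passes.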

Records  in  the  BBS  command  use 
log  file  (/u/bbs/log.file)  are  already 
in  columnar  format  (they  were  writ¬ 
ten  in  such  a  manner  by  each  BBS 
shell  script).  Therefore,  they  are  sim¬ 
ply  formatted  for  output  with  pr 
(line  15).  The  log  file  is  also  reinitia¬ 
lized  (line  16).  Both  reports  are  then 
sent  to  the  printer  (lines  18  and  19). 
Note  that  usage  does  not  delete  the 
two  final  output  files  because  doing 
so  before  the  files  have  actually  been 
printed  can  disrupt  the  printing 
process. 

Acknowledgment 

I  received  both  the  XMODEM  file 
transfer  program  and  a  great  deal  of 
help  from  David  Watson,  who  oper¬ 
ates  a  Unix/Xenix  BBS  in  my  area. 
Though  I  thanked  him  profusely  at 
the  time,  I  think  he  deserves  a  public 
acknowledgment  for  his  generosity 
and  patience  in  putting  up  with 
some  of  the  elementary  (dumb)  ques¬ 
tions  I  asked  when  I  was  first  work¬ 
ing  on  my  BBS  software. 

Availability 

All  the  source  code  for  articles  in  this 
issue  is  available  on  a  single  disk.  To 
order,  send  $14.95  to  Dr.  Dobb's  Jour¬ 
nal,  501  Galveston  Dr.,  Redwood  City, 
CA  94063  or  call  (415)  366-3600  ext. 
216.  Please  specify  the  issue  number 
and  format  (MS-DOS,  Macintosh, 
Kaypro). 

DDJ 

(Listings  begin  on  page  80.) 





COM PORT DRIVER

Listing One

(Text begins on page 42.)

        title   extended int 14 handler
        name    excom
;
; (C) 1987, Crystal Computer Consulting Inc.
; This software may be used freely, at your own risk,
; as long as this notice is not removed.
;
_text   segment word public 'code'
        assume  cs:_text, ss:_text, ds:_text, es:_text
        org     0100h                   ; com-able
start:  jmp     excom                   ; must be first
        even
;
; misc constants
BUFSZ   equ     0100h                   ; buffer size
XOFFSZ  equ     00F0h                   ; input flow control turn off point
XONSZ   equ     00E0h                   ; input flow control turn on point
DTR     equ     001h                    ; select DTR input flow control
RTS     equ     002h                    ; select RTS input flow control
XNXF    equ     004h                    ; select XON/XOFF input flow control
OUT2    equ     008h                    ; bit to set OUT2
NOCTS   equ     010h                    ; CTS not required to transmit
NODSR   equ     020h                    ; DSR not required to transmit
B19200  equ     040h                    ; set baud rate to 19200
B38400  equ     080h                    ; set baud rate to 38400
C1MASK  equ     010h                    ; com1 mask for 8259
C2MASK  equ     008h                    ; com2 mask for 8259

; structure for port control blocks
pcbs    struc
putptr  dw      ?                       ; points to last char put
getptr  dw      ?                       ; points to next char to get
bufcnt  dw      ?                       ; number of characters in buffer
bufend  dw      ?                       ; address past buffer end
pbase   dw      ?                       ; port address
timeout db      ?                       ; timeout outer loop
mask    db      ?                       ; mask to enable port interrupt
rxoff   db      ?                       ; set if recv has been stopped
opts    db      ?                       ; holds extended options
lchar   db      ?                       ; holds character attributes
oldier  db      ?                       ; old interrupt enable register
        db      ?                       ; old line control register
        db      ?                       ; old modem control register
        db      ?                       ; old baud low divisor latch
olddlm  db      ?                       ; old baud high divisor latch
buf     dw      BUFSZ dup (?)           ; the buffer
pcbs    ends

; allocate storage for port control blocks
pcb     pcbs    2 dup (<>)              ; array of port control blocks
old0B   dd      ?                       ; old vector for int0B
old0C   dd      ?                       ; old vector for int0C
old14   dd      ?                       ; old vector for int14
oldmsk  db      ?                       ; old mask for 8259

; baud rate table for UART initialization
bauds   dw      0417h                   ; divisor for 110 baud
        dw      0300h                   ; divisor for 150 baud
        dw      0180h                   ; divisor for 300 baud
        dw      00C0h                   ; divisor for 600 baud
        dw      0060h                   ; divisor for 1200 baud
        dw      0030h                   ; divisor for 2400 baud
        dw      0018h                   ; divisor for 4800 baud
        dw      000Ch                   ; divisor for 9600 baud
        dw      0006h                   ; divisor for 19200 baud
baud38  dw      0003h                   ; divisor for 38400 baud

; 0Bh & 0Ch (com2 & com1) interrupt handler
; put characters into the appropriate buffer
; input flow control - stop input when buffer nears full
int0B:
; point to com2 port control block with ds:si
        push    ds
        push    si
        mov     si, cs
        mov     ds, si
        lea     si, pcb + size pcbs
        jmp     short a10
int0C:
; point to com1 port control block with ds:si
        push    ds
        push    si
        mov     si, cs
        mov     ds, si
        lea     si, pcb
a10:    push    di
        push    dx
        push    bx
        push    ax
; character, if this fills buffer to XOFFSZ, then send
; XOFF or drop DTR and/or RTS as indicated by extended options



        mov     di, [si].putptr         ; buffer put pointer
        mov     dx, [si].pbase
        in      al, dx                  ; read character and clear interrupt
        mov     ah, al
        cmp     [si].bufcnt, XOFFSZ
        jl      a40                     ; jump if buffer below turnoff point
        cmp     [si].rxoff, 0
        jne     a30                     ; jump if receive already disabled
        mov     [si].rxoff, 1           ; indicate receiver now disabled
        test    [si].opts, XNXF
        jz      a20
        mov     al, 'S' - '@'
        out     dx, al
a20:    mov     bl, [si].opts
        and     bl, DTR or RTS
        jz      a30
        add     dx, 4
        in      al, dx
        not     bl
        and     al, bl                  ; drop DTR and/or RTS
        out     dx, al
        sub     dx, 4


; if the buffer is full, set overflow status and exit
a30:    cmp     [si].bufcnt, BUFSZ
        jl      a40
        or      word ptr [di], 0200h
        jmp     short a99
; read the port status
a40:    add     dx, 5
        in      al, dx
; get new buffer pointer, adjust for wrap around if required
        inc     di
        inc     di
        cmp     di, [si].bufend
        jne     a50
        lea     di, [si].buf
; store new buffer pointer and put status:data into the buffer
a50:    mov     [si].putptr, di
        xchg    ah, al
        mov     [di], ax
        inc     [si].bufcnt
; send interrupt complete command to interrupt controller
a99:    mov     al, 020h
        out     020h, al


; 14h interrupt handler
; emulate pc bios functions plus extensions
;
; dx is used to select the port for all calls
;       dx = 0 for com1, dx = 1 for com2
;
; ah = 0 initialize the port, parameters in al
;       al bits 7-5 set the baud rate
;               000 - 110, 001 - 150, 010 - 300, 011 - 600
;               100 - 1200, 101 - 2400, 110 - 4800, 111 - 9600
;       al bits 4-3 set the parity
;               00 - none, 01 - odd, 11 - even
;       al bit 2 is number of stop bits
;               0 - 1 stop bit, 1 - 2 stop bits
;       al bits 1-0 set the word length
;               00 - 5, 01 - 6, 10 - 7, 11 - 8
;       returns ah & al as per comm status (ah=3) below
;       example: al = 10101110 is 2400 baud,
;               odd parity, 2 stop bits, 7 bit characters
;
; ah = 1 transmit the character in al, al preserved
;       returns ah as per comm status (ah=3) below
;
; ah = 2 return receive character in al
;       returns ah as per comm status (ah=3) below
;       only bits 1-2-3-4-7 can be set
;       ah having any bits set is a receive error or timeout
;
; ah = 3 return comm port status in ah & al
;       ah bits are:
;               b0 - data ready                 b1 - overrun
;               b2 - parity error               b3 - framing error
;               b4 - break detect               b5 - xmit holding reg empty
;               b6 - xmit shift reg empty       b7 - timeout
;       al bits are:
;               b0 - delta CTS                  b1 - delta DSR
;               b2 - trail edge ring            b3 - delta recv detect
;               b4 - CTS                        b5 - DSR
;               b6 - ring indicator             b7 - recv signal detect
;
; ah = 4 extended options
;       returns 05A5A in ax, used to identify excom
;       al contains options as follows:



;  bit  0  enables  DTR  input  flow  control 

;  bit  1  enables  RTS  input  flow  control 

;  bit  2  enables  XON/XOFF  input  flow  control 

;  bit  4  don't  require  CTS  to  transmit 

;  bit  5  don't  require  DSR  to  transmit 

;  bit  6  sets  baud  rate  to  19200 

;  bit  7  sets  baud  rate  to  38400 


; ah = 5 remove excom, restore old vectors, release memory


int14:  sti
        push    ds
        push    si
        push    dx
        push    cx
        push    bx
; point to selected com port control block with ds:si
        mov     bx, cs
        mov     ds, bx
        lea     si, pcb
        or      dx, dx
        jz      b00
        add     si, size pcbs
; get port address in dx, return if port not installed
b00:    mov     dx, [si].pbase
        or      dx, dx
        jz      b40
; ah has request type, jump to selected routine
        or      ah, ah
        jz      b000                    ; initialize
        dec     ah
        jz      b100                    ; transmit a char
        dec     ah
        jnz     b10
        jmp     b200                    ; receive a char
b10:    dec     ah
        jnz     b20
        jmp     b300                    ; get status
b20:    dec     ah
        jnz     b30
        jmp     b400                    ; extended initialize
b30:    dec     ah
        jnz     b40
        jmp     b500                    ; remove excom
b40:    pop     bx
        pop     cx
        pop     dx
        pop     si
        pop     ds
        iret


; ah = 0
; initialize the port
b000:   push    ax
; initialize the pcb, this empties the input buffers
        lea     ax, [si].buf
        mov     [si].putptr, ax
        inc     ax
        inc     ax
        mov     [si].getptr, ax
        add     ax, BUFSZ * 2
        mov     [si].bufend, ax
        sub     ax, ax
        mov     [si].bufcnt, ax
        mov     [si].rxoff, al
; assert flow control signals and
        add     dx, 4
        mov     al, DTR or RTS or OUT2
        out     dx, al
        cli
        in      al, 021h                ; unmask com port on int controller
        and     al, [si].mask
        out     021h, al
        sti
        sub     dx, 3
        mov     al, 1                   ; enable receive int on uart
        out     dx, al
; set up baud rates and character
        add     dx, 2
        pop     ax
        mov     bl, al
        and     al, 01Fh
        mov     [si].lchar, al
        mov     al, [si].opts
        and     al, B19200 or B38400
        jz      b010
        call    b440                    ; init only character attributes
        jmp     short b020
b010:   mov     cl, 4
        rol     bl, cl                  ; isolate baud rate selection
        and     bx, 0Eh




        call    b430                    ; init baud and character attributes
b020:   sub     dx, 3
        jmp     b300                    ; return status

; ah = 1
; transmit the character in al
b100:   push    ax
        add     dx, 6
        mov     bh, [si].opts
        not     bh
        and     bh, NODSR or NOCTS
; optionally wait for DSR and/or CTS
        call    b130
        jnz     b110
        dec     dx
        mov     bh, 020h
; wait for the transmitter to be ready
        call    b130
        jnz     b110
        sub     dx, 5
        pop     ax
        push    ax
; actually send the character
        out     dx, al
        call    b310
        jmp     short b120
b110:
; restore al and im
        call    b310
        or      ah, 080h
b120:   pop     cx
        mov     al, cl
        jmp     b40
; wait for selected
b130:   mov     bl, [si].
b140:   sub     cx, cx                  ; delay counter
b150:   in      al, dx
        and     al, bh
        cmp     al, bh
        jz      b160                    ; status match
        loop    b150
        dec     bl
        jnz     b140
        or      bh, bh                  ; timeout, set non-zero
b160:   ret


; ah = 2
; receive a character, put it in al
b200:   cmp     [si].bufcnt, 0
        jne     b230                    ; chars in buffer, return one
; wait for a character to appear in the buffer, or return timeout
        mov     al, [si].timeout
        add     al, al                  ; shorter timeout loop, so loop more
b210:   sub     cx, cx
b220:   cmp     [si].bufcnt, 0
        jne     b230
        loop    b220
        dec     al
        jnz     b210
        mov     ax, 08000h              ; timeout return value
        jmp     b40

; if receiver is disabled and room now exists, enable receiver
b230:   cmp     [si].rxoff, 0
        je      b250                    ; receive is already enabled
        cmp     [si].bufcnt, XONSZ
        jg      b250                    ; buffer fuller than turn on point
        test    [si].opts, XNXF
        jz      b240
        mov     [si].rxoff, 0           ; indicate receiver enabled
        mov     al, 'Q' - '@'           ; send ^Q
        out     dx, al
; it's easier to just assert DTR & RTS rather than see if needed
b240:   add     dx, 4
        in      al, dx
        or      al, DTR or RTS          ; assert DTR and RTS
        out     dx, al

; get the char from the buffer and update buffer get pointer
b250:   call    b310
        and     ah, 01Eh                ; status reported with a receive
        inc     bx
        inc     bx
        cmp     bx, [si].bufend
        jne     b260
        lea     bx, [si].buf
b260:   mov     [si].getptr,            ; updated pointer
        dec     [si].bufcnt
        jmp     b40
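
The receive path above reads one word from the buffer, advances the get pointer by two bytes, and wraps back to the start of the buffer when it reaches `bufend`. A minimal C sketch of that wrap-around logic follows. The structure `rxbuf`, the function `rx_get`, and the buffer size are hypothetical names and values chosen for illustration; the sketch uses an index where the driver uses a pointer.

```c
#include <assert.h>

#define BUFSZ 8                     /* assumed size, in entries */

/* Hypothetical C model of the receive buffer kept in a pcb. */
struct rxbuf {
    unsigned short buf[BUFSZ];      /* status in high byte, data in low byte */
    unsigned get;                   /* index of the next entry to read */
    unsigned count;                 /* entries currently buffered */
};

/* Remove one entry; returns -1 if the buffer is empty. */
static long rx_get(struct rxbuf *b)
{
    unsigned short entry;

    if (b->count == 0)
        return -1;
    entry = b->buf[b->get];
    if (++b->get == BUFSZ)          /* wrap at the end of the buffer */
        b->get = 0;
    b->count--;
    return (long)entry;
}
```

Keeping a separate count, as the driver does with `bufcnt`, means a full buffer and an empty buffer never need to be told apart by comparing the two pointers.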


; ah = 3
; get port status
b300:   call    b310
        inc     dx
        in      al, dx                  ; modem status
        jmp     b40

; get line status from input buffer and/or hardware into ah
b310:   mov     dx, [si].pbase
        add     dx, 5
        in      al, dx
        mov     ah, al                  ; line status
        cmp     [si].bufcnt, 0
        je      b320
; when chars in buffer, fi
; return as above plus bx = getptr, al = character
        mov     cl, al
        mov     bx, [si].getptr
        mov     ax, [bx]                ; status:data (ah:al)
        mov     ch, 01Eh
        and     ah, ch                  ; these status bits from buffer
        not     ch
        and     cl, ch
        or      ah, cl
        or      ah, 1                   ; char in buffer, show data ready
b320:   ret
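
The status routine above combines two sources: the error bits saved alongside the buffered character (mask 01Eh) and the remaining bits read live from the line status register, with the data-ready bit forced on. A small C sketch of that bit arithmetic, assuming the usual 8250 meaning of bits 1 to 4 (overrun, parity, framing, break); `merge_status` is a hypothetical name, not part of the listing.

```c
#include <assert.h>

#define BUFFERED_BITS 0x1E      /* error bits taken from the buffered entry */
#define DATA_READY    0x01      /* bit 0 of the line status register */

/* Hypothetical helper: merge live hardware status with buffered status. */
static unsigned char merge_status(unsigned char hw, unsigned char buffered)
{
    return (unsigned char)((buffered & BUFFERED_BITS)
                         | (hw & ~BUFFERED_BITS)
                         | DATA_READY);
}
```

The effect is that a caller sees the error flags that applied when the character arrived, not whatever the UART reports at the moment of the call.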


; ah = 4
; extended initialization
; save the options in the pcb
        mov     [si].opts, al
; if an extended baud rate, set up the uart
        add     dx, 3
        test    al, B19200
        jz      b410
        mov     bx, baud19 - bauds
        call    b430
        jmp     short b420
b410:   test    al, B38400
        jz      b420
        mov     bx, baud38 - bauds
        call    b430
b420:   mov     ax, 05A5Ah              ; magic value to identify excom
        jmp     b40

; init the baud rate and character attributes
b430:   mov     al, 080h
        out     dx, al
        mov     ax, [bx+bauds]
        sub     dx, 3
        out     dx, al                  ; set low order divisor on uart
        inc     dx
        mov     al, ah
        out     dx, al                  ; set high order divisor on uart
        inc     dx
        inc     dx
b440:   mov     al, [si].lchar
        out     dx, al                  ; set character attributes
        ret
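
The routine above writes a 16-bit divisor from the `bauds` table into the UART's divisor latch, low byte first. As a rough sketch, assuming the standard PC serial clock of 1.8432 MHz (which gives a base rate of 115200 baud after the UART's divide-by-16), the table entries can be derived as shown below. The function names are hypothetical and the actual table contents are not visible in this part of the listing.

```c
#include <assert.h>

/* Hypothetical helper, not part of excom: divisor latch value for an
   8250 UART clocked at 1.8432 MHz (an assumption about the hardware). */
static unsigned divisor_for_baud(unsigned long baud)
{
    return (unsigned)(115200UL / baud);
}

/* Low and high bytes, in the order they are written to the UART. */
static unsigned char divisor_low(unsigned d)
{
    return (unsigned char)(d & 0xFF);
}

static unsigned char divisor_high(unsigned d)
{
    return (unsigned char)(d >> 8);
}
```

Under that assumption, 19200 baud corresponds to a divisor of 6 and 38400 baud to a divisor of 3.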


; ah = 5
; remove excom
        push    ds
        push    ds
; restore status of interrupt controller and uarts
        cli
        mov     ah, oldmsk
        and     ah, C1MASK or C2MASK
        in      al, 021h
        and     al, not (C1MASK or C2MASK)
        or      al, ah
        out     021h, al
        sti
        lea     si, pcb                 ; com1 pcb
        call    b510                    ; restore com1 uart status
        lea     si, pcb + size pcbs     ; com2 pcb
        call    b510                    ; restore com2 uart status
        jmp     short b530

; subroutine to restore uart status
b510:   mov     dx, [si].pbase
        or      dx, dx
        jz      b520                    ; no port
        add     dx, 3
        mov     al, 080h
        out     dx, al                  ; set access to baud divisors
        sub     dx, 3
        mov     al, [si].olddll
        out     dx, al                  ; baud divisor low
        inc     dx
        mov     al, [si].olddlm
        out     dx, al                  ; baud divisor high
        add     dx, 2
        mov     al, [si].oldlcr
        out     dx, al                  ; line control register
        inc     dx
        mov     al, [si].oldmcr
        out     dx, al                  ; modem control register
        sub     dx, 3
        mov     al, [si].oldier
        out     dx, al                  ; interrupt enable register
b520:   ret

b530:
; restore old int 0Bh vector
        lds     dx, old0B
        mov     ah, 025h
        mov     al, 0Bh
        int     021h
; restore old int 0Ch vector
        pop     ds
        lds     dx, old0C
        mov     ah, 025h
        mov     al, 0Ch
        int     021h
; restore old int 14h vector
        pop     ds
        lds     dx, old14
        mov     ah, 025h
        mov     al, 14h
        int     021h
; free excom's memory, this memory
        push    es
        mov     ax, cs
        mov     es, ax
        mov     ah, 049h
        int     021h
        pop     es
        jmp     b40

; installation entry point, must be at file end;
; initialize and stay resident

excom:
; initialize constant parts of each pcb
        mov     ax, 040h
        mov     es, ax                  ; bios data segment
        sub     di, di                  ; offset to com1 base address
        mov     bx, 07Ch                ; offset to com1 timeout
        mov     cl, not C1MASK          ; com1 interrupt mask
        lea     si, pcb                 ; com1 pcb address
        call    c00                     ; init com1
        inc     di
        inc     di                      ; offset to com2 base address
        inc     bx                      ; offset to com2 timeout
        mov     cl, not C2MASK          ; com2 interrupt mask
        add     si, size pcbs           ; com2 pcb address
        call    c00                     ; init com2
        jmp     short c10

; subroutine to initialize parts of a pcb
c00:    mov     dx, es:[di]             ; port base from bios
        mov     [si].pbase, dx
        mov     al, es:[bx]
        mov     [si].timeout, al        ; copy timeout from bios
        mov     [si].mask, cl           ; save interrupt mask
        ret

c10:
; save old status of interrupt controller and uarts
        in      al, 021h
        mov     oldmsk, al
        lea     si, pcb                 ; com1 pcb
        call    c20                     ; save old com1 uart status
        lea     si, pcb + size pcbs     ; com2 pcb
        call    c20                     ; save old com2 uart status
        jmp     short c40

; subroutine to save uart status
c20:    mov     dx, [si].pbase
        or      dx, dx
        jz      c30                     ; no port
        add     dx, 3
        in      al, dx
        and     al, 07Fh
        mov     [si].oldlcr, al         ; line control register
        mov     al, 080h
        out     dx, al                  ; set access to baud divisors
        sub     dx, 3
        in      al, dx
        mov     [si].olddll, al         ; baud divisor low
        inc     dx
        in      al, dx
        mov     [si].olddlm, al         ; baud divisor high
        add     dx, 2
        mov     al, [si].oldlcr
        out     dx, al                  ; restore register access
        inc     dx
        in      al, dx
        mov     [si].oldmcr, al         ; modem control register
        sub     dx, 3
        in      al, dx
        mov     [si].oldier, al         ; interrupt enable register
c30:    ret

c40:
; release environment memory, not used
        mov     bx, 02Ch
        mov     ax, [bx]
        mov     es, ax                  ; address of env
        mov     ah, 049h
        int     021h                    ; free memory
; save old int 0Bh vector, install new handler
        push    ds
        mov     ax, cs
        mov     ds, ax
        mov     ah, 035h
        mov     al, 0Bh
        int     021h
        mov     word ptr old0B, bx
        mov     ax, es
        mov     word ptr old0B + 2, ax
        lea     dx, int0B
        mov     ah, 025h
        mov     al, 0Bh
        int     021h
; save old int 0Ch vector, install new handler
        mov     ah, 035h
        mov     al, 0Ch
        int     021h
        mov     word ptr old0C, bx
        mov     ax, es
        mov     word ptr old0C + 2, ax
        lea     dx, int0C
        mov     ah, 025h
        mov     al, 0Ch
        int     021h
; save old int 14h vector, install new handler
        mov     ah, 035h
        mov     al, 014h
        int     021h
        mov     word ptr old14, bx
        mov     ax, es
        mov     word ptr old14 + 2, ax
        lea     dx, int14
        mov     ah, 025h
        mov     al, 014h
        int     021h
        pop     ds
; exit and keep everything above the entry point 'excom'
        lea     dx, excom
        mov     cl, 4
        shr     dx, cl
        inc     dx
        mov     ah, 031h
        mov     al, 0
        int     021h                    ; terminate-and-stay-resident

text 

ends 

endra 

End  Listing  One 
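
The last step of Listing One converts the offset of the label `excom` into a paragraph count for DOS function 31h, which takes the resident size in DX in 16-byte paragraphs. The listing shifts the offset right by four bits and adds one so that a partial paragraph is not cut off. A minimal C sketch of that arithmetic, where `resident_paragraphs` is a hypothetical name used only for illustration:

```c
#include <assert.h>

/* Hypothetical helper mirroring "shr dx, 4 / inc dx" in the listing:
   number of 16-byte paragraphs needed to keep everything below
   end_offset resident. */
static unsigned resident_paragraphs(unsigned end_offset)
{
    return (end_offset >> 4) + 1;
}
```

Adding one unconditionally keeps an extra paragraph when the offset is already a multiple of 16, which wastes at most 16 bytes and avoids a conditional.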

Listing  Two 

#include <stdio.h>
#include <dos.h>

/*
    (C) 1987, Crystal Computer Consulting Inc.
    This software may be used freely, at your own risk,
    as long as this notice is not removed.
*/

#define EXINIT  4       /* ah value for extended init */
#define EXCOM   0x5A5A  /* magic value to identify excom */
#define DTR     0x01    /* DTR input flow control */
#define RTS     0x02    /* RTS input flow control */
#define XNXFIN  0x04    /* XON/XOFF input flow control */
#define NOCTS   0x10    /* CTS not required for transmit */
#define NODSR   0x20    /* DSR not required for transmit */
#define B19200  0x40    /* set baud rate to 19200 */
#define B38400  0x80    /* set baud rate to 38400 */
#define COMINIT 0x100   /* port was selected */

typedef struct {
    char    *str;       /* command string */
    void    (*fnct)();  /* command to execute */
    int     arg;        /* argument to pass */
} CMDS;

/* exmode command table */
char    *remmsg = "excom not installed",
        *helpmsg;



int     init1 = 0,      /* extended init for com1 */
        init2 = 0,      /* extended init for com2 */
        portnum = -1;   /* port to initialize */
void    exit(), install(), remove(), exinit(), setport();
CMDS    cmds[] = {
    "install",  install,    0,
    "remove",   remove,     0,
    "com1",     setport,    0,
    "com2",     setport,    1,
    "dtr",      exinit,     DTR,
    "rts",      exinit,     RTS,
    "xnxfin",   exinit,     XNXFIN,
    "nocts",    exinit,     NOCTS,
    "nodsr",    exinit,     NODSR,
    "19200",    exinit,     B19200,
    "38400",    exinit,     B38400,
    NULL
};

main(argc, argv)
int             argc;
register char   **argv;
{
    register CMDS   *cmdp;

    /* is excom installed */
    if (int14(EXINIT, 0, 0) == EXCOM && int14(EXINIT, 0, 1) == EXCOM)
        helpmsg = insmsg;
    else
        helpmsg = remmsg;
    if (argc == 1)
        help(helpmsg);

    while (*++argv != NULL) {
        /* look up the argument in the command table */
        for (cmdp = cmds; cmdp->str != NULL; ++cmdp)
            if (strcmp(*argv, cmdp->str) == 0)
                break;

        if (cmdp->str == NULL)
            help("bad command");
        else {
            /* excom must be installed to set options */
            if (helpmsg == remmsg && cmdp->fnct != install)
                help(helpmsg);
            /* execute the command */
            (*cmdp->fnct)(cmdp->arg);
        }
    }

    if ((init1 & (B19200 | B38400)) == (B19200 | B38400)
     || (init2 & (B19200 | B38400)) == (B19200 | B38400))
        help("ambiguous baud rate setting");

    /* actually send the init bits to excom */
    if ((init1 & COMINIT) != 0)
        int14(EXINIT, init1, 0);
    if ((init2 & COMINIT) != 0)
        int14(EXINIT, init2, 1);
    exit(0);
}

/* install excom */
void
install()
{
    system("excom");
    helpmsg = insmsg;
}

/* remove excom */
void
remove()
{
    int14(5, 0, 0);
    helpmsg = remmsg;
}

/* collect up the init bits for each port */
void
exinit(thebit)
int     thebit;
{
    if (portnum == -1)
        help("no port selected");
    else if (portnum == 0)
        init1 |= thebit;
    else
        init2 |= thebit;
}

void
setport(newport)
int     newport;
{
    portnum = newport;
    if (portnum == 0)
        init1 |= COMINIT;
    else
        init2 |= COMINIT;
}

/* perform int 14h with ah, al & dx as passed, return ax value */
int14(cmd, val, port)
{
    union REGS  ir,     /* registers sent to bios */
                or;     /* registers returned from bios */

    ir.h.ah = cmd;
    ir.h.al = val;
    ir.x.dx = port;
    int86(0x14, &ir, &or);
    return or.x.ax;
}


/* provide a bit of assistance */
help(msg)
char    *msg;
{
    printf("\n%s\n\n", msg);
    fputs("install\t\tinstall excom\n", stdout);
    fputs("remove\t\tremove excom\n", stdout);
    fputs("com1\t\tsubsequent commands for com1\n", stdout);
    fputs("com2\t\tsubsequent commands for com2\n", stdout);
    fputs("dtr\t\tuse DTR for input flow control\n", stdout);
    fputs("rts\t\tuse RTS for input flow control\n", stdout);
    fputs("xnxfin\t\tuse XON/XOFF (^S ^Q) input flow control\n", stdout);
    fputs("nocts\t\tdon't require CTS to transmit\n", stdout);
    fputs("nodsr\t\tdon't require DSR to transmit\n", stdout);
    fputs("19200\t\tset baud rate\n", stdout);
    fputs("38400\t\tset baud rate\n", stdout);


End  Listings 




TURBO  PASCAL  OVERLAYS 


Listing One (Text begins on page 50.)

Listing 1: MemOvrly.Inc

(*
 *  Turbo Pascal Memory Overlay Routines
 *  Copyright (C) 1986 by Steve McMahon
 *  All Rights Reserved.
 *
 *  Limitations:
 *
 *  These routines have been tested only for Turbo 3.01A (both
 *  PC-DOS and generic MS-DOS).  They may not work under 3.0
 *  (the celebrated FileSize bug may cause trouble) and will
 *  certainly not work under 2.0XX.
 *
 *  Memory overlay files must be < 64k in size!
 *
 *  NORMAL overlays nested inside memory overlays should work, but
 *  trying to nest memory overlays inside memory overlays would
 *  be disastrous!
 *
 *  OvrPath will not work in conjunction with memory overlays!
 *  (Writing a replacement routine would be simple if the code
 *  below makes sense to you.)
 *
 *  I/O testing in InitOverlay is just Turbo's native.  Anyone
 *  really needing memory overlays will probably wish to install
 *  their own I/O error checking.
 *)

CONST
  RequiredHeap
    {Paragraphs of Heap Required by Program
     for other purposes than memory overlays.
     Change this to suit your needs for
     dynamic storage.}

TYPE
  {Type used in both InitOverlay and DisposeOverlayStorage}
  OverlayProcedure = RECORD
    CASE Boolean OF
      True :
        ( OldCall            : ARRAY[1..3] OF Byte;
          OldOffset          : Integer;
          FileName           : ARRAY[1..13] OF Char;
        );
      False :
        ( NewCallInstruction : ARRAY[1..3] OF Byte;
          NewCallAddress     : Integer;
          CurrentOffset      : Integer;
          OverlayCodeLoc     : ^Byte;
          NewRoutineLoc      : Integer;
          OverlaySize        : Integer;
        );
  END;

PROCEDURE NewOverlayHandler;
BEGIN
  INLINE(
    {When this routine receives control, AX contains the
     number of bytes in the desired overlay & DX contains the
     offset (in pages) of the desired overlay within the
     overlay file (now on the heap).}

    {First, check to see if the desired overlay is already in
     place by comparing DX with the offset recorded in memory
     immediately after the call instruction.  If they match,
     no load is necessary}
    $5E/                {POP  SI}
    $2E/$3B/$14/        {CMP  DX,CS:[SI]}
    $74/$1B/            {JZ   RUN_OVERLAY}
    {Save vital registers}
    $56/                {PUSH SI}
    $1E/                {PUSH DS}
    {Load ES:DI with destination address (the point the
     code will run at).  Displace to account for header.}
    $0E/                {PUSH CS}
    $07/                {POP  ES}
    $8B/$FE/            {MOV  DI,SI}
    $83/$C7/$0D/        {ADD  DI,0DH}
    {Fetch heap address of source overlay code from memory
     position two bytes after first byte after call to this
     routine.  Store it in DS:SI}
    $46/                {INC  SI}
    $46/                {INC  SI}
    $2E/$C5/$34/        {LDS  SI,CS:[SI]}
    {Multiply overlay page by 100H to get number of bytes code
     is displaced from start of overlay code area (on heap).
     Add to source offset in SI.}
    $8A/$F2/            {MOV  DH,DL}
    $32/$D2/            {XOR  DL,DL}
    $03/$F2/            {ADD  SI,DX}
    {Put number of bytes to move in CX}
    $8B/$C8/            {MOV  CX,AX}
    {Copy CX bytes from DS:SI to ES:DI}
    $FC/                {CLD}
    $F3/$A4/            {REPZ MOVSB}
    {Recover mauled registers}
    $1F/                {POP  DS}
    $5E/                {POP  SI}
    {RUN_OVERLAY:}
    $83/$C6/$0D/        {ADD  SI,0DH}
    $FF/$E6             {JMP  SI}
  );
END;

PROCEDURE InitOverlay(OverlayCallOffset : Integer);
VAR
  OverlayCallPtr : ^OverlayProcedure;
  TestSize, i    : Integer;
  s              : STRING[13];
  f              : FILE;
BEGIN
  OverlayCallPtr := Ptr(CSeg, OverlayCallOffset);
  WITH OverlayCallPtr^ DO
  BEGIN
    {Obtain overlay file name}
    i := 1;
    s := '';
    WHILE FileName[i] <> #0 DO
    BEGIN
      s := s + FileName[i];
      i := i + 1;
    END;
    {Open overlay file as untyped file}
    Assign(f, s);
    Reset(f);
    {determine file size in $80-byte sectors}
    TestSize := FileSize(f);
    {Check to see if there's enough space on the heap.}
    {If there isn't, leave the overlay on disk}
    IF (MemAvail > (RequiredHeap + TestSize * 8)) AND
       (MaxAvail >= TestSize * 8) THEN {there's enough space}
    BEGIN {install overlay}
      OverlaySize := TestSize;
      GetMem(OverlayCodeLoc, OverlaySize * $80);
      BlockRead(f, OverlayCodeLoc^, OverlaySize, i);
      NewCallInstruction[1] := $2E; {CS:}
      NewCallInstruction[2] := $FF;
      NewCallInstruction[3] := $16; {indirect near call}
      NewCallAddress := Ofs(NewRoutineLoc);
      NewRoutineLoc := Ofs(NewOverlayHandler) + 7;
      {extra 7 bytes skips turbo's procedure overhead}
      CurrentOffset := $FFFF; {force load on first call}
    END;
    Close(f);
  END;
END;

PROCEDURE DisposeOverlayStorage(OverlayCallOffset : Integer);
VAR
  OverlayCallPtr : ^OverlayProcedure;
BEGIN
  OverlayCallPtr := Ptr(CSeg, OverlayCallOffset);
  WITH OverlayCallPtr^ DO
    IF NewCallInstruction[3] = $16 THEN {Overlay is in memory}
      FreeMem(OverlayCodeLoc, OverlaySize * $80);
END;


End  Listings 
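
The overlay routines mix three units: 128-byte file records (what `FileSize` returns for an untyped file), 16-byte paragraphs (what `MemAvail` and `MaxAvail` report, per the listing's own comment on `RequiredHeap`), and 256-byte overlay pages. The following C sketch restates that arithmetic under those assumptions. All function names are hypothetical and exist only to make the unit conversions explicit.

```c
#include <assert.h>

/* One 128-byte file record occupies eight 16-byte paragraphs. */
static unsigned long records_to_paragraphs(unsigned records)
{
    return (unsigned long)records * 8UL;
}

/* Size in bytes of the heap block requested with GetMem. */
static unsigned long records_to_bytes(unsigned records)
{
    return (unsigned long)records * 0x80UL;
}

/* Byte displacement of an overlay that starts at the given 256-byte page
   (the inline code does this with MOV DH,DL / XOR DL,DL). */
static unsigned page_to_byte_offset(unsigned char page)
{
    return (unsigned)page << 8;
}

/* The test InitOverlay applies before loading the overlay file:
   all quantities except records are in paragraphs. */
static int overlay_fits(unsigned long mem_avail, unsigned long max_avail,
                        unsigned long required_heap, unsigned records)
{
    unsigned long need = records_to_paragraphs(records);

    return mem_avail > required_heap + need && max_avail >= need;
}
```

Checking both totals matters because the overlay needs one contiguous block (`MaxAvail`) while the program's other allocations only need enough free heap overall (`MemAvail`).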




UNIX BBS

Listing One (Text begins on page 54.)

list

TPATH=/u/bbs/rbin
# check the command argument - $1
if [ "$1" = MS ]
then
    # argument is for MS-DOS files
    echo `who am i | cut -f1 -d" "` `date | cut -c1-16` "MS.list" >>/u/bbs/log.file
    echo
    more $TPATH/MS.list
elif [ "$1" = Mac ]
then
    # argument is for Macintosh files
    echo `who am i | cut -f1 -d" "` `date | cut -c1-16` "Mac.list" >>/u/bbs/log.file
    echo
    more $TPATH/Mac.list
elif [ "$1" = Unix ]
then
    # argument is for Unix files
    echo `who am i | cut -f1 -d" "` `date | cut -c1-16` "Unix.list" >>/u/bbs/log.file
    echo
    more $TPATH/Unix.list
else
    # no argument was entered
    echo
    echo "List which directory? (MS, Mac, or Unix)"
fi

End Listing One

Listing Two

dwnld

# identify which file directory (contained in the first argument - $1)
if [ "$1" = MS ]
then
    # an MS-DOS file
    filedir=/u/bbs/MS-files
elif [ "$1" = Mac ]
then
    # a Macintosh file
    filedir=/u/bbs/Mac-files
elif [ "$1" = Unix ]
then
    # a Unix file
    filedir=/u/bbs/Unix-files
else
    # no valid file directory was entered
    echo "Follow the dwnld command with a file directory - MS, Mac, or Unix"
fi
# verify that a file name has been entered (contained in the second argument - $2)
if [ -n "$2" ]
then
    # verify that the file name exists within the selected directory
    if [ -f "$filedir/$2" ]
    then
        echo `who am i | cut -f1 -d" "` `date | cut -c1-16` "dwnld" $1 $2 >>/u/bbs/log.file
        # select a file transfer protocol
        echo "Transfer protocol (X = Xmodem; A = ASCII): \c"
        read method
        if [ "$method" = X -o "$method" = x ]
        then
            # send a binary file using the XModem protocol
            xmodem -sb $filedir/$2
        else
            # send an ASCII file
            echo "Prepare to transfer file.  Press return to start."
            read dummy
            cat $filedir/$2
            sleep 5
            echo "Download complete.  Press return to continue."
            read dummy
        fi
    else
        # a valid file name wasn't entered
        echo "Sorry, that file name does not exist.  Be sure to type the"
        echo "file name exactly as it appears in the directory listing."
        echo "Upper and lower case letters are different."
    fi
else
    # no file name was entered
    echo "Please enter a file name after the directory on the command line."
    echo "Use the list command to see available file names."
fi

End Listing Two


Listing Three

upld

# verify that a file name has been entered (in argument 1 - $1)
if [ -n "$1" ]
then
    # verify that a file with the same name isn't already stored
    # in the Uploads directory
    if [ -f /u/bbs/Uploads/$1 ]
    then
        # the file name already exists
        echo "Please give your file another name.  $1 already exists."
    else
        # collect the file transfer protocol
        echo `who am i | cut -f1 -d" "` `date | cut -c1-16` "upld" $1 >>/u/bbs/log.file
        echo "Transfer protocol (X = Xmodem; A = ASCII): \c"
        read method
        if [ "$method" = X -o "$method" = x ]
        then
            # receive a binary file using the XModem protocol
            xmodem -rb /u/bbs/Uploads/$1
        else
            # receive an ASCII file
            echo
            echo "Begin sending ASCII file."
            echo "Type a CNTRL-d when finished transmitting."
            echo
            cat > /u/bbs/Uploads/temp
            cp /u/bbs/Uploads/temp /u/bbs/Uploads/$1
        fi
        # collect the file description
        echo "Please enter a one line description of your file."
        read desc
        echo $1 $desc >>/u/bbs/Uploads/Doc.file
    fi
else
    # no file name was entered
    echo "Please enter the name of the file you wish to upload on"
    echo "the command line."
fi

End Listing Three

Listing Four

readmsg

FIRST=/u/bbs/.first
LAST=/u/bbs/.last
MSG=/u/bbs/msg
echo `who am i | cut -f1 -d" "` `date | cut -c1-16` "readmsg" >>/u/bbs/log.file
echo
# display the number of the first and last messages available
echo "Messages available:"
cat $FIRST
echo "through"
cat $LAST
echo
# accept the number of the first message to be displayed
echo "Message number to read ('q' to quit): \c"
read message
# loop until the user wants to quit
while [ "$message" != q ]
do
    echo
    # verify that the message number entered actually exists
    if [ -f $MSG/$message ]
    then
        # display the message
        cat $MSG/$message
    elif [ "$message" != q ]
    then
        # the entered message number doesn't exist
        echo "Sorry.  That message doesn't exist."
    fi
    echo
    # accept the next message number to read
    echo "Message number to read: ('q' to quit): \c"
    read message
done
echo

End Listing Four


Listing Five

send

echo `who am i | cut -f1 -d" "` `date | cut -c1-16` "send" >>/u/bbs/log.file
SCAN=/u/bbs/msg/.index
LAST=/u/bbs/.last
MSG=/u/bbs/msg
L=`cat $LAST`
D=`date`
# increment the number of the last message entered
L=`expr $L + 1`
# save the new "last message" value
echo $L >$LAST
echo
# collect message header information
echo "To: \c"
read to
echo "Subject: \c"
read subject
echo "From: \c"
read who
trap 'rm -f $MSG/$L; continue' 2 3
echo
echo "Start typing your message.  You have 20 lines available."
echo "Type a period on a new line to end the message."
echo
# store the message header in the message file
echo "* $L From: $who $D" >$MSG/$L
echo "  To: $to" >>$MSG/$L
echo "  Subject: $subject" >>$MSG/$L
echo >>$MSG/$L
# collect the body of the message
NL=0
read newline
while [ "$newline" != "." -a "$NL" -lt 20 ]
do
    echo $newline >>$MSG/$L
    NL=`expr $NL + 1`
    read newline
done
echo
# update the message index for use by the scan command
echo "Message being posted, please wait... \c"
echo "* $L From: $who $D" >>$MSG/temp
echo "  To: $to" >>$MSG/temp
echo "  Subject: $subject" >>$MSG/temp
echo " " >>$MSG/temp
cat $SCAN >>$MSG/temp
mv $MSG/temp $SCAN
echo

End Listing Five


Listing Six

pmail

USERS=/u/bbs/rbin/user.file
echo `who am i | cut -f1 -d" "` `date | cut -c1-16` "pmail" >>/u/bbs/log.file
# display pmail instructions
echo
echo "************** Private Mail *****************"
echo
echo Commands:
echo s - send mail to another user
echo r - read mail sent to you
echo l - list the users by user id and name
echo x - exit the private mail system
echo
# accept the first pmail command
echo "Command: \c"
read choice
# loop until the user wants to quit
while [ "$choice" != x -a "$choice" != X ]
do
    echo
    case $choice in
    [Ss])
        # send private mail
        # print instructions
        echo "WARNING: To successfully send mail observe these rules:"
        echo
        echo "  1.  If you wish to erase a character, use the backspace key."
        echo "DO NOT, DO NOT use the delete key."
        echo "  2.  Type a carriage return at the end of each line on the"
        echo "screen.  Lines should be no more than 80 characters."
        echo "  3.  Do not use the following characters in your message (they"
        echo "have special meaning to the system): @ and #"
        echo
        echo "Press RETURN to begin sending your message: \c"
        read dummy
        echo
        # collect user id of recipient
        echo "To whom (user id): \c"
        read to
        # verify that recipient entered is a valid user id
        TPERS=`grep $to $USERS | wc -c`
        if [ "$TPERS" -eq 0 ]
        then
            # user id is invalid
            echo
            echo "Sorry, that user id doesn't exist."
            echo "List them with the l command."
        else
            # send mail
            echo
            echo "Start typing your message.  Type a period on a new"
            echo "line to send the message."
            echo
            mail $to
        fi
        # collect next option
        echo
        echo "Command (x to exit): \c"
        read choice
        continue;;
    [Rr])
        # read private mail
        # print instructions
        echo "After each message you will receive a '?' prompt."
        echo "Type 'd' to delete the message, <cr> to leave it in"
        echo "your mailbox, or 'q' to stop reading mail."
        echo
        # read the mail
        mail
        echo
        # collect the next command
        echo "Command (x to exit): \c"
        read choice
        continue;;
    [Ll])
        # display a list of user names and associated user id's
        echo
        more $USERS
        echo
        # collect the next command
        echo "Command (x to exit): \c"
        read choice
        continue;;
    *)
        # if an unrecognized command was entered, prompt for another one
        echo
        echo "Command (x to exit): \c"
        read choice
        continue;;
    esac
done

End Listing Six

Listing Seven


.profile from the Account new

# set default path to an empty rbin directory
PATH=/u/bbs/new/rbin
# set level one prompt to remind users how to log off
PS1='Type CNTRL-D now... '
# display the welcome message
cat msg1
echo
# make sure the user really wants an account
echo "Do you wish to request a login? (Y/N) \c"
read choice
# continue only if the user answers yes
if [ "$choice" = y -o "$choice" = Y ]
then
    echo
    OK="N"
    # loop until user is satisfied with data
    while [ "$OK" != Y -a "$OK" != y ]
    do
        # collect user name, address, etc.
        echo "Enter your real name: \c"
        read realname
        echo
        echo "Enter the first line of your mailing address: (4 lines are available)"
        read address1
        echo
        echo "Enter the second line of your mailing address: <cr> if none"
        read address2
        echo
        echo "Enter the third line of your mailing address: <cr> if none"
        read address3
        echo
        echo "Enter the fourth line of your mailing address: <cr> if none"
        read address4
        echo
        echo "Enter your voice telephone number: \c"
        read phone
        echo
        # display the data entered for the user
        echo "This is what you have entered:"
        echo
        echo $realname
        echo
        echo $address1
        echo $address2
        echo $address3
        echo $address4
        echo
        echo $phone
        echo
        # ask the user to verify the data
        echo "Is this correct? (Y/N) \c"
        read OK
    done
    OK="N"
    # loop until user is happy
    while [ "$OK" != Y -a "$OK" != y ]
    do
        # collect the account data
        echo
        echo "Enter a one word login name: \c"
        read logname
        echo
        echo "Enter an initial password: \c"
        read initpasswd
        echo
        # display the entered data for the user
        echo "You have entered:"
        echo
        echo $logname
        echo $initpasswd
        echo
        # ask the user to verify the data
        echo "Is this correct? (Y/N) \c"
        read OK
    done
    echo
    # write the new account data to a holding file
    echo $realname >>signups
    echo $address1 >>signups
    echo $address2 >>signups
    echo $address3 >>signups
    echo $address4 >>signups
    echo $phone >>signups
    echo $logname >>signups
    echo $initpasswd >>signups
    echo >>signups
    # tell the user that he or she is finished
    echo "This completes the login request process.  You'll receive an account"
    echo "confirmation through the mail within a week."
    echo
fi
# tell the user to log off
echo
echo "Enter CNTRL-D to log off the system."
echo

End Listing Seven


Listing  Eight 

.profile  for  the  info  Account 



# set the default path to an empty rbin directory
PATH=/u/bbs/info/rbin

# set the level one prompt to the logoff message
PS1='Type CNTRL-D now...'

# display the welcome message
echo
echo "  Welcome to Scholastech Telecommunications"
echo "  Scholastech Info"
echo
cat msg
echo
echo "Press the carriage return to begin: \c"
read dummy

# display the contents of the information file
more info.file
echo
echo "Press the carriage return when done: \c"
read dummy
echo

# give the user the choice of whether or not to sign up for workshops
echo "Do you wish to sign up for a workshop? (Y/N) \c"
read dummy

# loop until user doesn't want to register for any more workshops
while [ "$dummy" = y -o "$dummy" = Y ]
do
echo
echo "Enter your name: \c"
read regname
echo
echo "For which workshop are you registering? \c"
read workshop
echo
echo
echo "Enter your school or company name: \c"
read school
echo
echo "For how many people are you registering? \c"
read people
echo
echo "Enter a voice phone number where you can be reached in case"
echo "there are any questions about your registration: \c"
read phone
echo

# write the registration data to disk
echo $regname >>signups
echo $workshop >>signups
echo $school >>signups
echo $people >>signups
echo $phone >>signups
echo >>signups

# allow the user to register for more than one workshop
echo "Do you wish to register for another workshop? (Y/N) \c"
read dummy
done

# tell the user how to log off
echo
echo "Type CNTRL-D to log off the system."
echo


End  Listing  Eight 



Listing Nine (Listing continued, text begins on page 54.)

logs

# create a temporary file with a log on record for each remote user
1 who /etc/wtmp | grep ph >/u/user.log

# if an "a" was entered as an argument to the command, display the entire log files
2 if [ "$1" = a ]
3 then
4 echo
5 echo
6 echo "System Logins"
7 echo

# display all system use since /etc/wtmp was last purged
8 more /u/user.log
9 echo
10 echo "Next?"
11 read dummy
12 echo

# display all BBS use since /u/bbs/log.file was last purged
13 echo "BBS User Log"
14 echo
15 more /u/bbs/log.file
16 echo

# otherwise, show only the last ten entries in each log file
17 else
18 echo
19 echo
20 echo "Last System Logins"
21 echo
22 tail /u/user.log
23 echo
24 echo
25 echo "Last BBS Activity"
26 echo
27 tail /u/bbs/log.file
28 echo
29 fi

# remove the temporary file
30 rm /u/user.log

End Listing Nine

Listing Ten

usage

# print headings to report file
1 echo "User Date Login Logout" >usage.temp
2 echo " - " >>usage.temp
3 echo " " >>usage.temp

# get a log in record from /etc/wtmp for remote users on device ph1
4 who /etc/wtmp | grep ph1 >temp1

# get the user name
5 cut -f1 -d" " temp1 >temp2

# get the log in date and time
6 cut -c25-36 temp1 >temp3

# get log out record from /etc/wtmp and extract the log out time
7 who -d /etc/wtmp | grep ph1 | grep -v LOGIN | grep -v rc | cut -c32-36 >temp4

# paste together for the report
8 paste temp2 temp3 temp4 >>usage.temp

# print the report
9 pr usage.temp >usage.summary

# remove temporary files
10 rm temp1
11 rm temp2
12 rm temp3
13 rm temp4
14 rm usage.temp

# format the BBS command use log for output and purge the log file
15 pr /u/bbs/log.file >log.summary
16 > /u/bbs/log.file

# clean out /etc/wtmp
17 > /etc/wtmp

# send both reports to the printer
18 lp usage.summary
19 lp log.summary

End Listings




These  softstrips  by  Cauzin 
Systems  contain  the  listings 
for  C  Chest.  The  listings 
begin  on  page  94. 


C  CHEST 


Listing One (Text begins on page 102.)

#include <stdio.h>

/*  PQ.C    General-purpose priority-queue routines.
 *
 *  (C) 1987 Allen I. Holub. All Rights Reserved.
 *
 *  typedef char *PQ;       Dummy typedef for priority queue.
 *
 *  PQ *pq_create( numele, elesize, cmp, swap, initheap )
 *  int  numele;            Max # of elements in the queue
 *  int  elesize;           Size of one element in bytes
 *  int  (*cmp)();          Pointer to comparison function
 *  int  (*swap)();         Pointer to swap function
 *  char *initheap;         Initial heap or NULL to allocate
 *
 *  pq_ins( p, item )       Insert item into queue
 *  PQ   *p;                Pointer to priority queue
 *  char *item;             Pointer to item to insert
 *                          Return number of empty slots
 *                          before insertion (0 if none).
 *
 *  int pq_del( p, target ) Delete item from queue
 *  PQ   *p;                Pointer to priority queue
 *  char *target;           Pointer to place to put deleted
 *                          item.
 *                          Return # of items in queue before
 *                          delete (0 if nothing deleted).
 *
 *  char *pq_look( queue )  Look at (don't delete) top ele
 *  PQ *queue;              Pointer to queue.
 *
 *  int pq_numele( queue )  Return # of elements in queue.
 *  PQ *queue;              Pointer to queue.
 */

typedef struct
{
    int   (*cmp)();     /* pointer to comparison function */
    int   (*swap)();    /* pointer to swap function       */
    int   itemsize;     /* size of one element in heap    */
    int   nitems;       /* number of items now in heap    */
    int   maxitem;      /* maximum number of items        */
    char  *bottom;      /* pointer to last item in heap   */
    char  *heap;        /* pointer to the heap itself     */
}
PQ;

static void reheap_down( p )
PQ  *p;
{
    /* Reheap the Heap, starting at the top and working
     * down.
     */

    int   parent;       /* index of parent            */
    int   child;        /* index of child             */
    char  *pparent;     /* pointer to parent          */
    char  *pchild;      /* pointer to child           */
    char  *psibling;    /* pointer to child's sibling */
    char  *heap;        /* pointer to heap            */

    heap = p->heap;

    for( parent = 0, child = 1; child < p->nitems ;)
    {
        pparent = heap + (parent * p->itemsize);
        pchild  = heap + (child  * p->itemsize);

        if( child+1 < p->nitems )
        {
            psibling = pchild + p->itemsize ;

            if( (*p->cmp)(pchild, psibling) < 0 )
            {
                pchild = psibling;
                ++child;
            }
        }

        if( (*p->cmp)( pparent, pchild ) >= 0 )
            break;

        (*p->swap)( pparent, pchild );

        parent = child;
        child  = (parent * 2) + 1;
    }
}


static void reheap_up( p )
PQ  *p;
{
    /* Reheap the Heap, starting at the bottom and working up.
     * Note that we must use a divide-by-2 rather than a
     * shift-right in the while loop because -1/2 == 0 but
     * -1 >> 1 == -1.
     */

    int   parent;       /* index of parent   */
    int   child;        /* index of child    */
    char  *pparent;     /* pointer to parent */
    char  *pchild;      /* pointer to child  */
    char  *heap;        /* pointer to heap   */

    child = p->nitems - 1;
    heap  = p->heap;

    while( (parent = (child-1) / 2) >= 0 )
    {
        pchild  = heap + (child  * p->itemsize);
        pparent = heap + (parent * p->itemsize);

        if( (*p->cmp)( pparent, pchild ) >= 0 )
            break;

        (*p->swap)( pparent, pchild );
        child = parent;
    }
}


/*----------------------------------------------------------*/

PQ    *pq_create( numele, elesize, cmp, swap, initheap )
int   numele;       /* max # of elements in the queue  */
int   elesize;      /* size of one element in bytes    */
int   (*cmp)();     /* pointer to comparison function  */
int   (*swap)();    /* pointer to swap function        */
char  *initheap;    /* initial heap, NULL to allocate  */
{
    /* Create a priority queue that can hold at most
     * "numele" elements, each of size "elesize". The
     * cmp function is passed two pointers to queue
     * elements and it should behave as follows:
     *
     *      (*cmp)(p1, p2)
     *
     * For descending priority queues [pq_get() returns the
     * largest item]:
     *
     *      *p1 <  *p2      return <  0
     *      *p1 == *p2      return == 0
     *      *p1 >  *p2      return >  0
     *
     * For ascending priority queues [pq_get() returns the
     * smallest item]:
     *
     *      *p1 <  *p2      return >  0
     *      *p1 == *p2      return == 0
     *      *p1 >  *p2      return <  0
     *
     * If the initheap argument is NULL, an empty heap is
     * created automatically, otherwise initheap must point
     * at an initialized numele-element-long heap.
     */

    PQ   *p, *malloc();
    int  heapsize;

    heapsize = numele * elesize ;   /* heap size in bytes */

    if( initheap )
    {
        if( !(p = malloc(sizeof(PQ))) )
            return 0;

        p->heap   = initheap;
        p->bottom = initheap + ((numele - 1) * elesize);
        p->nitems = numele;
    }
    else
    {
        if( !(p = malloc(sizeof(PQ) + heapsize)) )
            return 0;

        p->heap   = (char *) (p + 1);
        p->nitems = 0;
        p->bottom = p->heap - elesize ;
    }

    p->cmp      = cmp;
    p->swap     = swap;
    p->itemsize = elesize;
    p->maxitem  = numele ;
    return p;
}

/*----------------------------------------------------------*/

pq_ins( p, item )
PQ    *p;       /* Pointer to priority queue */
char  *item;    /* Pointer to item to insert */
{
    /* Insert a new item into the priority queue (provided
     * that space is available).
     *
     * Return the number of empty slots in the queue before
     * the insertion. This number is 0 if the queue is
     * full and nothing is inserted. Algorithm is:
     *
     *  if( space is available in the queue )
     *      increase queue size
     *      copy new item into bottom of queue
     *      reheap from the bottom up.
     */

    int space_avail = p->maxitem - p->nitems;

    if( space_avail > 0 )
    {
        ++( p->nitems );
        memcpy( p->bottom += p->itemsize, &item, p->itemsize );
        reheap_up( p );
    }

    return space_avail ;
}


/*----------------------------------------------------------*/

int   pq_del( p, target )
PQ    *p;           /* pointer to priority queue          */
char  *target;      /* place to copy current largest item */
{
    /* Copy the largest item in the priority queue to
     * the address held in target, then delete the item.
     *
     * Return the number of items in the queue before the
     * delete. If this number is 0, then nothing was
     * in the queue and *target will not have been
     * modified. Algorithm is:
     *
     *  if( there's something in the queue )            [1]
     *      remember pointer to former first item       [2]
     *      replace the first item with the last one    [3]
     *      shrink the heap by one element              [4]
     *      reheap from the top down                    [5]
     */

    int slots_in_use;

    if( slots_in_use = p->nitems )                      /* 1 */
    {
        memcpy( target,  p->heap,   p->itemsize );      /* 2 */
        memcpy( p->heap, p->bottom, p->itemsize );      /* 3 */

        --(p->nitems) ;                                 /* 4 */
        p->bottom -= p->itemsize;

        reheap_down( p );                               /* 5 */
    }

    return slots_in_use ;
}

/*----------------------------------------------------------*/

char  *pq_look( queue )
PQ    *queue;
{
    /* Return a pointer to the largest/smallest element
     * in the priority queue but don't dequeue it.
     */

    return queue->heap;
}

/*----------------------------------------------------------*/

int   pq_numele( queue )
PQ    *queue;
{
    /* Return number of items in queue */

    return queue->nitems;
}




#ifdef MAIN

int Descending = 1;     /* Change these in CodeView */
int Makequeue  = 1;     /* to change the tests.     */

cmp( s1, s2 )
char **s1, **s2;
{
    int rval;

    rval = strcmp( *s1, *s2 );

    if( Descending )
        return rval;
    else
        return ( rval < 0 ) ?  1 :
               ( rval > 0 ) ? -1 :
               /* rval == 0 */  0 ;
}

swap( s1, s2 )
char **s1, **s2;
{
    char *tmp;

    tmp = *s1;
    *s1 = *s2;
    *s2 = tmp;
}

printq( p )
PQ *p;
{
    int i;

    printf("Queue is:\n");
    printf( "\tcmp........0x%04x\t", p->cmp      );
    printf( "\tswap.......0x%04x\n", p->swap     );
    printf( "\titemsize...%d\t",     p->itemsize );
    printf( "\tnitems.....%d\n",     p->nitems   );
    printf( "\tmaxitem....%d\t",     p->maxitem  );
    printf( "\tbottom.....0x%04x\n", p->bottom   );
    printf( "\theap.......0x%04x\t", p->heap     );
    printf( "\tbottom - heap... %d\n\n",
                    (char **)p->bottom -
                    (char **)p->heap );

    if( p->nitems <= 0 )
        printf("\tqueue is empty\n");

    else for( i = 0 ; i < p->nitems ; i++ )
        printf("\t%-2d: %10.10s (0x%04x) (children: %2d, %2d)\n",
                i, ((char **)p->heap)[i], ((char **)p->heap)[i],
                (2*i)+1, (2*i)+2 );
}

main()
{
    PQ   *queue;
    char buf[80];
    int  i;
    char *p;

    static char *testq[] =
    {
        "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"
    };

    if( Makequeue )
        queue = pq_create( 10, sizeof(char*), cmp, swap, 0 );
    else
        queue = pq_create( 10, sizeof(char*), cmp, swap, testq );

    if( !queue )
    {
        printf("pq_create failed\n");
        exit( 1 );
    }

    printf("+-------------------------------------------+\n");
    printf("| Enter i<string><CR> to add string to      |\n");
    printf("| queue, d<CR> to dequeue top element, q to |\n");
    printf("| exit the program.                         |\n");
    printf("+-------------------------------------------+\n");

    while( 1 )
    {
        printq( queue );
        printf( "\n[i|d|q][string]: " );
        gets( buf );

        if( *buf == 'q' )
            exit( 1 );

        else if( *buf == 'i' )
        {
            i = pq_ins( queue, strsave(buf+1) );
            if( i )
                printf("%d slots avail. before insert\n", i);
            else
                printf("Queue was full, did nothing\n\n");
        }
        else if( *buf == 'd' )
        {
            i = pq_del( queue, (char*) &p );
            printf("%d slots used before delete, got <%s>\n",
                                                        i, p );
            if( i )
            {
                free( p );
                p = "nothing";
            }
        }
    }
}

#endif

End Listing One


Listing  Two 


Listing 2 -- strsave.c

char  *strsave( str )
char  *str;
{
    /* Save the indicated string in a malloc()ed section
     * of static memory. Return a pointer to the copy or
     * 0 if malloc failed.
     */

    register char *rptr;
    extern   char *malloc();

    if( rptr = malloc( strlen(str) + 1 ) )
    {
        strcpy( rptr, str );
        return rptr;
    }

    return (char *)0;
}

End Listing Two

Listing  Three 

Listing  3  —  freq.c 
#include <stdio.h>

/*  FREQ.C  Print a list of the frequency
 *          of occurrence of all bytes
 *  in a list of files given on the command line.
 *  Frequencies are printed as a probability x 100.
 *  For example, if we read a total of 20 characters,
 *  5 of which are 'e', the probability of an 'e'
 *  occurring in the input is 5/20 (.25) and freq will
 *  output (.25 * 100) or 25.
 */

typedef struct
{
    int     val;
    double  count;
}
ITEM;

#define TABSIZE 256

ITEM Tab[ TABSIZE ];    /* I'm counting on this being */
                        /* initialized to zero.       */

/*----------------------------------------------------------*/

cmp( item1, item2 )
ITEM  *item1, *item2;
{
    /* Comparison function used by ssort(), below.
     * Count is the primary sort field and val is
     * the secondary field.
     */

    return ( item1->count <  item2->count ) ? -1 :
           ( item1->count >  item2->count ) ?  1 :
           /* item1->count == item2->count */
             item1->val - item2->val ;
}

/*----------------------------------------------------------*/

main( argc, argv )
char **argv;
{
    char    *bin_to_ascii();        /* in pchar.c */
    FILE    *fp;
    int     i;
    double  smallest;
    double  largest;
    double  numchars    = 0.0 ;
    double  sum         = 0.0 ;
    double  probability = 0.0 ;

    reargv( &argc, &argv );         /* Needed only for     */
                                    /* On Command! shell   */

    for( --argc, ++argv ; --argc >= 0 ; argv++ )
    {
        if( !(fp = fopen( *argv, "rb")) )
            perror( *argv );
        else
        {
            fprintf( stderr, "%s\n", *argv );

            while( (i = getc(fp)) != EOF )
            {
                ++numchars;
                ++Tab[ i & 0xff ].count ;
            }

            fclose( fp );
        }
    }

    /* Find largest and smallest elements and at the same
     * time initialize the val fields of the Tab entries.
     */

    largest  = 0.0;
    smallest = numchars;

    for( i = 0 ; i <= 0xff ; i++ )
    {
        Tab[i].val = i;

        if( Tab[i].count > 0.00000001
                        && Tab[i].count < smallest )
            smallest = Tab[i].count;

        if( Tab[i].count > largest )
            largest = Tab[i].count;
    }

    /* Sort the list by probability. A shell sort is used.
     * You can replace the ssort call below with a call to
     * the qsort() subroutine, available with many compilers.
     * Qsort() takes the same arguments as ssort().
     * A version of qsort appeared in the C Chest,
     * DDJ #102 (April, 1985; also Bound Volume 10, p.316).
     * The ssort() subroutine appeared in DDJ #113, C Chest
     * (March, 1986) p. 70.
     */

    ssort( Tab, TABSIZE, sizeof(ITEM), cmp );

    /* Print the list. Each element is printed as three
     * numbers: the value of the character (in hex), the
     * probability of that character appearing in the input,
     * and the probability normalized so that the least-
     * frequently occurring probability has the value 1.
     */

    for( i = 0 ; i < TABSIZE ; i++ )
    {
        probability = Tab[i].count / numchars ;
        sum += probability;

        printf( "0x%02x\t%7.6f\t%1.0f\n",
                        Tab[i].val,
                        probability,
                        Tab[i].count / smallest );
    }

    fprintf( stderr, "Total = %8.6f (N = %8.2f)\n",
                                        sum, numchars );
}


End  Listings 




ARTIFICIAL INTELLIGENCE


Listing One (Text begins on page 116.)

Listing 1. Inheritance in SCOOPS
; (C) Copyright 1987 Ernest R. Tello

(define-class artifact
  (instvars material weight purpose cost)
  (options
    (gettable-variables material weight purpose cost)
    settable-variables
    inittable-variables))

(define-class transport-means
  (instvars medium time-range power-source)
  (mixins artifact)
  (options
    (gettable-variables medium time-range power-source)
    settable-variables
    inittable-variables))

(define-class transport-vehicle
  (instvars load-capacity length max-speed)
  (mixins artifact transport-means)
  (options
    (gettable-variables load-capacity length max-speed)
    settable-variables
    inittable-variables))

(define-class passenger-vehicle
  (instvars capacity safety dining facilities)
  (mixins artifact transport-means transport-vehicle)
  (options
    (gettable-variables capacity safety dining facilities)
    settable-variables
    inittable-variables))

(define-class water-transport-vehicle
  (classvars (body-name 'hull) (dof 2) (dangers 'sink) (advantages 'relaxing))
  (mixins artifact transport-means transport-vehicle passenger-vehicle)
  (options
    (gettable-variables dof dangers)
    settable-variables
    inittable-variables))

(define-class surface-vessel
  (instvars #-decks #-masts #-engines)
  (mixins artifact transport-means transport-vehicle passenger-vehicle
          water-transport-vehicle)
  (options
    (gettable-variables #-decks #-masts #-engines)
    settable-variables
    inittable-variables))

(define-class ship
  (instvars x-position y-position x-velocity y-velocity mass)
  (mixins surface-vessel)
  (options
    (gettable-variables x-position y-position x-velocity y-velocity mass)
    settable-variables
    inittable-variables))

(define-method (ship speed) ()
  (sqrt (+ (expt x-velocity 2)
           (expt y-velocity 2))))

(define-method (ship direction) ()
  (atan y-velocity x-velocity))

(define-class ocean-liner
  (instvars company launched homeport tons)
  (mixins ship)
  (options
    (gettable-variables company launched homeport tons)
    settable-variables
    inittable-variables))

(define ship1
  (make-instance ship
    'x-position 100
    'y-position 150
    'x-velocity 30
    'y-velocity 40
    'mass 100))

(compile-class artifact)
(compile-class transport-means)
(compile-class transport-vehicle)
(compile-class passenger-vehicle)
(compile-class water-transport-vehicle)
(compile-class surface-vessel)
(compile-class ship)
(compile-class ocean-liner)

End Listing One

Listing  Two 

Listing 2. Multiple Inheritance in SCOOPS
; (C) Copyright 1987 Ernest R. Tello

(define-class business
  (instvars name location industry business-type size
            year-founded ownership-type gross-sales costs
            market-share)
  (options
    (gettable-variables name location industry
                        business-type size year-founded
                        ownership-type
                        gross-sales costs market-share)
    settable-variables
    inittable-variables))

(define-method (business calc-net-gain) (gross-sales costs)
  (- gross-sales costs))

(define-class adversary
  (instvars aggressiveness allies goals common-goals
            strengths weaknesses)
  (options
    (gettable-variables aggressiveness
                        allies goals common-goals
                        strengths weaknesses)
    settable-variables
    inittable-variables))

(define-class competitor
  (mixins business adversary))

(compile-class business)
(compile-class adversary)
(compile-class competitor)

(define your-business (make-instance business))
(define competitor-1 (make-instance competitor))

End Listings




COLUMNS


C CHEST

Priority Queues

by Allen Holub


This month is the first of what has unintentionally become a
two-part article. I had intended to implement an interesting
variant of the Huffman encoding data-compression algorithm that is
used by the Unix COMPACT program and is described in Robert
Gallager's article "Variations on a Theme of Huffman" (IEEE
Transactions on Information Theory, vol. IT-24, no. 6 [November
1978]: 668-674). Gallager describes an adaptive one-pass Huffman
code in which the code changes as the input file is processed. This
way the code tree stays optimal over the life of the file. As
usual, the problem proved more intractable than I had at first
anticipated. So, this month I'll describe some of the stuff I
developed along the way to a solution; I'll discuss the actual
compression algorithm in a future column.

Priority Queues

A queue, in the normal data-structure sense of the word, works like
a line in a bank does. New entries are added to the back of the
queue, and items are removed from the front. That is, the items in
the queue are ordered by insertion time: those items that were
inserted first are removed first. Another kind of queue is a
priority queue or heap, a queuelike data structure in which
something other than time is used to order the elements. For
example, the largest or smallest element, rather than the
least-recently inserted element, could be the first to be dequeued.
Heaps have several uses (other than Huffman codes). For example, a
heap is used to do the merge phase of the external-sorting program
described in the June 1986 C Chest. A clever sorting algorithm
(heapsort) uses a priority queue to do the sorting. Elements are
put into the queue in random order, and the smallest element is
extracted repeatedly until the array is sorted.
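The heapsort idea just described can be sketched in a few lines of modern C. This is only an illustration, not code from PQ.C: the names heap_sort() and sift_down() are made up for the example, the elements are plain ints, and a scratch copy of the array serves as the queue so that the "extract the smallest repeatedly" step is easy to see.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Restore the min-heap property below index i in heap[0..n-1]. */
static void sift_down(int *heap, int n, int i)
{
    int child, tmp;

    for (;;) {
        child = 2 * i + 1;                  /* left child            */
        if (child >= n)
            break;                          /* i is a leaf           */
        if (child + 1 < n && heap[child + 1] < heap[child])
            ++child;                        /* pick smaller child    */
        if (heap[i] <= heap[child])
            break;                          /* order already correct */
        tmp = heap[i]; heap[i] = heap[child]; heap[child] = tmp;
        i = child;
    }
}

/* Sort a[0..n-1] into ascending order.
 * Returns 0 on success, -1 if the scratch heap can't be allocated.
 */
int heap_sort(int *a, int n)
{
    int *heap, i, used;

    assert(a != NULL || n <= 0);
    if (n <= 1)
        return 0;
    if (!(heap = malloc(n * sizeof *heap)))
        return -1;

    memcpy(heap, a, n * sizeof *heap);      /* enqueue in any order  */
    for (i = n / 2 - 1; i >= 0; --i)        /* turn copy into a heap */
        sift_down(heap, n, i);

    for (used = n, i = 0; i < n; ++i) {     /* extract the smallest  */
        a[i] = heap[0];
        heap[0] = heap[--used];
        sift_down(heap, used, 0);
    }

    free(heap);
    return 0;
}
```

A production heapsort usually works in place with a max-heap instead of a scratch copy; the copy is used here only because it mirrors the description above most directly.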

The file PQ.C (Listing One, page 94) contains a set of
general-purpose priority queue routines. Queues can be constructed
of any sort of object (numbers, pointers, structures, and so forth)
and can be ordered in any sort of way. You could, for example, keep
a queue of structures, adding structures to the queue at random but
always extracting the one with the smallest key. You could also
create a queue of pointers to structures. You could even use these
routines to maintain a normal (but inefficient) queue by adding a
time-entered field to the structure and then extracting by smallest
time entered. Similarly, a stack can be represented by a priority
queue in which the item with the largest time entered is removed
first.

PQ.C contains five externally accessible routines. The first of
these creates a new queue:

    typedef char QUEUE;

    QUEUE *pq_create( numele, elesize, cmp, swap, initheap )
    int   numele;
    int   elesize;
    int   (*cmp)();
    int   (*swap)();
    void  *initheap;

The QUEUE type is a dummy pointer-size type that is used in the
same way as is the FILE pointer returned from fopen(). Pq_create()
is passed five arguments. Numele is the maximum number of elements
in the queue. Elesize is the size of one element. Cmp is a pointer
to a comparison function. It is passed pointers to two enqueued
objects and should return a value as indicated in Table 1, page
105. Swap is a pointer to a swap function. It is passed pointers to
two objects, and it should swap the objects (not the pointers).
Finally, initheap can be used in two ways: if it is NULL, an empty
priority queue is created; otherwise, initheap is assumed to be a
pointer to an already-initialized numele-long array that will be
used as the heap. I'll discuss the utility of this in a moment.
I've used the ANSI syntax here to declare initheap. A void pointer
is one that doesn't point at any explicit object; you have to cast
it to a pointer to a real type to use it. If your compiler doesn't
support void pointers, use a pointer to char.

A queue that's created by pq_create() can be deleted by free();
just pass it the pointer returned from pq_create(). Note, however,
that this will only free memory allocated by pq_create() itself.
If you create your own initial queue and pass it to pq_create()
via the initheap argument, you'll have to free up the memory that
you allocated. By the same token, if you create a queue of pointers
to strings, free() will free the memory used by the queue but not
by the strings.

I'll illustrate how to set up a queue with a moderately complex
example. Say you want to keep a queue of pointers to the following
structures:

    typedef struct
    {
        int   weight;
        char  *stuff;
        long  more_stuff;
    }
    ITEM;

The ITEMs can be inserted in any order, but they will be extracted
in order of decreasing weight. The comparison function used for
this purpose looks like this:

    cmp( p1, p2 )
    ITEM **p1, **p2;
    {
        return (*p1)->weight - (*p2)->weight;
    }

and the swap function looks like this:

    swap( p1, p2 )
    ITEM **p1, **p2;
    {
        ITEM *tmp;

        tmp = *p1;
        *p1 = *p2;
        *p2 = tmp;
    }

If you had wanted to order the queue by increasing, rather than
decreasing, weight, you would have reversed p1 and p2 in the
comparison function's return statement.
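The effect of reversing the arguments can be sketched as a pair of comparison functions. The names cmp_decreasing() and cmp_increasing() are invented for this example, and ANSI prototypes are used instead of the old-style declarations in the listing. The explicit comparisons are a deliberate departure from the subtraction shown above: they give the same sign for ordinary weights but cannot overflow when the two weights are very large and of opposite sign.

```c
#include <assert.h>
#include <stddef.h>

typedef struct
{
    int   weight;
    char  *stuff;
    long  more_stuff;
}
ITEM;

/* Decreasing weight: a heavier item compares as "greater", so it
 * reaches the head of the queue first.
 */
int cmp_decreasing(ITEM **p1, ITEM **p2)
{
    int a, b;

    assert(p1 != NULL && p2 != NULL);
    a = (*p1)->weight;
    b = (*p2)->weight;
    return (a > b) - (a < b);       /* -1, 0 or 1 */
}

/* Increasing weight: the same test with p1 and p2 reversed. */
int cmp_increasing(ITEM **p1, ITEM **p2)
{
    return cmp_decreasing(p2, p1);
}
```

Either function can be handed to a queue-creation routine as the comparison argument; only the direction of extraction changes.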

An empty ten-element queue can now be created with the following:

    QUEUE *qp;

    qp = pq_create( 10, sizeof(ITEM *), cmp, swap, 0 );

Items are added to the queue using:

    int   pq_ins( qp, item )
    QUEUE *qp;
    char  *item;

where qp is a pointer returned from a previous pq_create() call and
item is a pointer to the object to insert. Pq_ins() returns the
number of empty slots that were in the queue before the insert. If
the return value is 0, the queue was full and pq_ins() will have
done nothing. A new item is inserted in the queue that I created
earlier with the following code:

    ITEM *p;

    p = (ITEM *) malloc( sizeof(ITEM) );
    p->stuff  = "A string";
    p->weight = 5;
    if( !pq_ins( qp, &p ) )
        printf("queue is full");

Note that you have to pass in the address of the object to enqueue,
in this case the address of the pointer p. Items are extracted from
the queue with:

int  pq_del(  qp,  item  ) 

QUEUE  *qp; 
char  *item; 

where, again, qp is a QUEUE pointer returned from pq_create() and
item is a pointer to a place into which the dequeued object will be
copied. The number of items in the queue before the delete is
returned. If this number is 0, the queue was empty and the contents
of *item are undefined. For example, the largest element of the
queue can be dequeued with the following code:

ITEM  *p; 

if(  !pq_del(  qp,  &p  ) ) 
printf("queue  is  empty"); 

Two additional support routines are provided:

    char  *pq_look( p )
    int   pq_numele( p )
    QUEUE *p;

Again, p is a pointer to a QUEUE returned from a previous
pq_create() call. Pq_look() returns a pointer to the object at the
head of the queue. The object is not actually dequeued, however.
Pq_numele() returns the number of elements currently in the queue.

Implementation 

A  priority  queue  can  be  represented 
by  a  binary  tree  that  has  the  follow¬ 
ing  properties: 

1. All children in the tree have a value
less than their parent.




2.  The  tree  is  as  perfectly  balanced  as 
possible — that  is,  the  difference  in 
height  of  all  leaves  is  at  most  1. 

3.  Leaves  are  inserted  into  the  tree 
from  left  to  right  until  an  entire  rank 
is  full,  then  the  next  rank  is  started. 

An example tree is shown in Figure 1, below. Note that this tree is not
strictly ordered in the normal way. That is, rule 1 doesn't require that
the tree be sorted, only that the children are smaller than the parent.
You could exchange the subtrees rooted at 1 and 2 without violating
rule 1. This ordering guarantees that the root node of the entire tree
always holds the largest element, however. Notice that an insert
operation is harder than normal because of rules 2 and 3. An 11th node
must be inserted as the right child of node 4 and a 12th node as the
left child of node 5. If you do this, however, rule 1 may be violated by
the newly inserted node. To avoid this last problem, you have to adjust
the contents of the nodes at every insertion (a process called
reheaping). For example, if you insert a new node having the value k as
the right child of node 4, you'll have to shuffle things around because
of rule 1.

The reheaping process is illustrated in Figure 2, page 106. You reheap
from the bottom up with an insertion. Because k is greater than h, the
contents of nodes 10 and 4 must be swapped. You then go up one level to
node 4 and compare its key to its parent's. Here k is still greater
than i, so you swap again, moving the k to node 1. Finally, nodes 0 and
1 must be swapped as well, moving k to the root position.

For an ascending queue          For a descending queue
(the smallest object is         (the largest object is
dequeued first):                dequeued first):

(*cmp)(p1, p2)                  (*cmp)(p1, p2)

if:          returns:           if:          returns:
*p1 >  *p2   negative number    *p1 >  *p2   positive number
*p1 == *p2   0                  *p1 == *p2   0
*p1 <  *p2   positive number    *p1 <  *p2   negative number

Figure 1: A priority queue represented as a binary tree

Because of rules 2 and 3, it's convenient to represent the tree as an
array in which the element at node N has children at nodes 2N+1 and
2N+2. For example, the node at array[0] has children at array[1] and
array[2], the node at array[2] has children at array[5] and array[6],
and so forth. This array representation, usually called a heap, has
both advantages and disadvantages. The main problems are that the
maximum number of elements must be known at create time, and it's
difficult to delete an interior node at random or to merge two queues.
Nonetheless, the heap representation is the most efficient for the
overwhelming majority of applications, in which the only operations are
insert and delete.

The tree from Figure 1 is shown as an array in Figure 3, right. To
insert a node in the tree, you make the array one element larger, put
the new node in the rightmost element, and then reheap from the bottom
up (right to left). The largest node is always at array[0], just as it
was always at the root position in the tree. Delete the largest element
by copying the rightmost node in the tree (array[9] in Figure 3) into
array[0], making the array one element smaller, and then reheaping from
the top down (left to right).

The heap representation is a reasonably efficient one. Unlike a normal
binary tree, you never have to search for the insertion point in the
tree (it's always at the far right). Similarly, the largest element is
always in a fixed place (at the far left). Though there's a certain
amount of copying that has to be done during a reheap, no more than
log2 N copies need ever be done. Nonetheless, it's worthwhile to make
the queue elements themselves small. Use an array of pointers to
structures rather than an array of structures. This way you have to
swap only two pointers, rather than two complete structures, when you
reheap.

Note that a sorted array is a valid heap (though a heap is not
necessarily a sorted array). This property explains the initheap
argument to pq_create( ). If your input is already a sorted list,
there's no point in doing a series of insert operations to create the
heap; you can just pass the array directly to pq_create( ).

Figure 2: Inserting a new node (panel b: after reheaping)

Figure 3: A priority queue represented as an array

The priority queue is represented internally by the following
structure:

typedef struct
{
    int   (*cmp)( );
    int   (*swap)( );
    int   itemsize;
    int   nitems;
    int   maxitem;
    char  *bottom;
    char  *heap;
}
PQ;

The comparison and swap function pointers are remembered in cmp and
swap. Itemsize is the size in bytes of one element of the heap, nitems
is the number of elements currently in the heap, and maxitem is the
maximum number of items that the heap can hold. The heap field points
at the array itself, and bottom points at the most recently added
element. It will point at heap[-1] if the heap is empty. Note that you
could derive nitems from the distance between heap and bottom, but it's
more convenient to keep it as a separate number.

The pq_create( ) function on lines 128-188 of Listing One creates a PQ
structure and initializes the various fields. If initheap is not NULL,
space for the PQ structure alone is allocated (on line 167) and heap is
made to point to the specified array. The various other fields are
adjusted so that the heap is full. If initheap is NULL, space for both
the PQ structure and the heap itself is allocated (on line 176), and an
empty heap is initialized. Finally, a pointer to the PQ structure is
returned on line 188.

The insert and delete operations are performed in pq_ins( ) (lines
193-219) and pq_del( ) (lines 224-258) using the method described
earlier. Because the compiler doesn't know the size of a heap element
at compile time, items must be copied with memcpy( ) calls at run time.
The reheaping is deferred to two static workhorse functions,
reheap_down( ) and reheap_up( ), declared at the top of the file.
Reheap_down( ) (lines 51-89) starts at the root node (heap[0]) and
works down the tree. It selects the larger of the two children on lines
72-81 and then swaps the root element and the larger child (and moves
down the tree) if necessary. The reheap process stops as soon as the
root is larger than both
Flotsam and Jetsam

Standard #include Files

At present I'm using the Microsoft C compiler, Version 4.0, for two
reasons: the first is the CodeView debugger, and the second is the
degree of Unix compatibility. (The MS-DOS compiler is the Xenix
compiler; there's literally no difference.) I spend a lot of time
bouncing around between the Unix system at Berkeley and my own PC, so
having compatible compilers is essential. Using the Microsoft compiler,
I've never had to modify a program written under one system and ported
to the other unless that program did some sort of very low-level
communication with the operating system (such as talking directly to
the BIOS).

There's another advantage to Unix compatibility, and that's
standardization. Although the emerging ANSI standard finally
incorporates the I/O library into the language (at least in terms of
regularizing the function names and argument-passing conventions),
there aren't nearly as many ANSI functions as there are Unix functions,
so Unix must continue to provide a de facto standard. You don't really
know C unless you know the Unix library. Your code just won't be
portable because you don't know what's standard and what isn't.
Learning your own compiler's library is a necessary but not sufficient
condition for knowing C. As a consequence, it's very much to your
advantage to pick up a copy of the Unix System V documentation and read
through it. Hitherto, these manuals have been hard to come by. All that
you could get were the now-out-of-date Version 7 manuals (the big blue
and green paperbacks). AT&T, however, has just published the manuals
for Release 2.0 of Unix System V. The complete five-volume set is
overkill unless you're really using Unix. Nonetheless, Volume 2, System
Calls and Library Routines, should be in every C programmer's library.
It's available by mail order (for $29.95) from the Computer Literacy
bookshop in San Jose, California, and is well worth the money.
(Computer Literacy's phone number is [408] 435-1118. The five-volume
set is Steven V. Earhart's Unix Programmer's Manual [New York: Holt,
Rinehart & Winston, 1986].)
Because I develop all the programs that appear in C Chest in the Unix
environment (the Microsoft environment is the Unix environment for all
intents and purposes), it seems worthwhile to list the various #include
files that I use in C Chest. These are usually presented without
comment because they're both standard and well documented elsewhere. If
your compiler doesn't have one of these files, then you don't need to
#include it in your program because none of the library functions will
require any of the things included within the file. By the same token,
if your compiler doesn't have one of these files, then it's not Unix
compatible, in spite of what the advertisements may say.

It's impossible for me to describe how to port programs to every
nonstandard compiler on the market. You'll have to learn how to do this
yourself. You'll need to know your own compiler's library pretty well
and have at least a working knowledge of the Unix equivalents to
various library functions. Do your homework. I will say, while on the
subject, that the Lattice compiler is one of those that falsely claims
Unix compatibility: for example, it doesn't support a Unix-compatible
stat( ) function, it uses the name fork( ) in incorrect ways, and it
doesn't have a Unix-compatible #include file system. These
inconsistencies are surprising because Lattice seems to have gone to a
lot of trouble to add Unix-compatible functions to its library in the
last release. Unfortunately, the degree of Unix compatibility that is
indeed present lulls you into a false sense of security.

The various #include files that
you're likely to see in a C Chest pro-



children. Note that an ascending
queue (one where the root holds the
smallest rather than the largest element)
can be created by modifying
the comparison function, without
touching the reheap code at all.

Reheap_up( ) (lines 95-124) reheaps in the other direction. It starts
with the most recently entered item (which is at heap[nitems-1]) and
works up the tree to the root. The routine is simpler than
reheap_down( ) because you don't have to find the larger sibling: a
child has only one parent. Again, the reheaping process stops as soon
as a parent node that is larger than the current child node is
detected.

The remainder of the file is compiled only if MAIN is #defined. In this
case, a stand-alone test program is compiled. This program creates a
ten-element-long heap of character pointers and then adds or deletes
strings from the heap according to commands entered from the keyboard.
Type i<string> to insert <string> into the heap, d to delete an item,
and q to exit from the program. The priority queue is created on either
line 359 or line 361, depending on the value of Makequeue. If Makequeue
is set, an empty queue is created; otherwise, a preinitialized queue
(using the array declared on lines 353-356) is created. If Ascending is
true, the queue is an ascending priority queue (items will be removed
in ascending order); otherwise, it's descending. There's no explicit
code to modify Makequeue and Ascending (I just modified them with
CodeView, the Microsoft debugger, as I was debugging).

New items are inserted into the queue on line 386 and deleted on line
395. The comparison and swap functions are declared on lines 289-313.
Finally, the strsave( ) function, used on line 389, is shown in Listing
Two, page 96. Because this is just a small test program, I'm ignoring
the error return from strsave( ); you shouldn't do this in a real
application, of course.

FREQ.C

The priority queue routines are of
general utility. You need a few special-purpose
utilities to make Huff-


gram are listed below. Most of these
are both Unix and Microsoft compatible,
but some are used just in the DOS
compiler. Note that this isn't a list of
all the Unix/Microsoft #include
files, just the ones I'm likely to use.

ctype.h: contains various text processing macros: isalpha, isupper,
islower, isdigit, isxdigit, isspace, ispunct, isalnum, isprint,
isgraph, iscntrl, isascii, toupper, tolower, and toascii.

dos.h: not Unix compatible. Contains #defines for the MS-DOS interface
functions: bdos, dosexterr, int86, int86x, intdos, and segread.

errno.h: #defines for the various error condition codes returned from
the I/O library: EZERO, EPERM, ENOENT, ESRCH, EINTR, EIO, ENXIO, E2BIG,
ENOEXEC, EBADF, ECHILD, EAGAIN, ENOMEM, EACCES, EFAULT, ENOTBLK, EBUSY,
EEXIST, EXDEV, ENODEV, ENOTDIR, EISDIR, EINVAL, ENFILE, EMFILE, ENOTTY,
ETXTBSY, EFBIG, ENOSPC, ESPIPE, EROFS, EMLINK, EPIPE, EDOM, ERANGE,
EUCLEAN, and EDEADLOCK.

fcntl.h: contains definitions needed to use the unbuffered I/O function
open( ): O_RDONLY, O_WRONLY, O_RDWR, O_APPEND, O_CREAT, O_TRUNC,
O_EXCL, O_TEXT, and O_BINARY.

io.h: contains function declarations for access, chmod, chsize, close,
creat, dup, dup2, eof, filelength, isatty, locking, lseek, mktemp,
open, read, rename, setmode, sopen, tell, umask, unlink, and write.

math.h: contains definitions for abs, acos, asin, atan, atan2, atof,
bessel, cabs, ceil, cos, cosh, exp, fabs, floor, fmod, frexp, hypot,
labs, ldexp, log, log10, matherr, modf, pow, sin, sinh, sqrt, tan, and
tanh.

process.h: contains definitions for various process-control functions:
abort, execl, execle, execlp, execlpe, execv, execve, execvp, execvpe,
exit, _exit, getpid, spawnl, spawnle, spawnlp, spawnlpe, spawnv,
spawnve, spawnvp, spawnvpe, and system.

signal.h: contains definitions needed by the signal( ) subroutine. Some
common definitions are: SIGINT, SIGFPE, SIG_DFL, and SIG_IGN.

stdarg.h: contains definitions needed to write an ANSI-compatible
subroutine with a variable number of arguments (see varargs.h). The
following are defined: va_list, va_start, va_arg, and va_end.

stdio.h: contains #defines and so on for all the buffered I/O functions
(fopen, fprintf, and so forth). Stdin, stdout, stderr, FILE, EOF, and
NULL are all defined here. Note that several pseudofunctions that
you're used to thinking of as subroutines (getchar, putchar, getc,
putc, feof, ferror, and fileno) are actually macros that are #defined
in stdio.h. If you forget to #include this file, you'll get error
messages from the linker (such as "Unresolved external: _putchar").
Putchar is a macro that ultimately evaluates to a call to a
system-level subroutine usually called either _flsbuf or _flushbuf.

sys/stat.h: contains definitions for the system-status subroutines
stat( ) and fstat( ).

sys/types.h: also required by stat( ) and fstat( ). Various time
functions (such as utime( )) use the information declared here as well.

varargs.h: definitions for the variable-number-of-argument mechanism
used by Unix System V (see stdarg.h). The following are defined here:
va_list, va_dcl, va_alist, va_start, va_arg, and va_end.



man trees, however. One of these is the FREQ.C program shown in Listing
Three, page 96. FREQ.EXE is a stand-alone program that takes as input a
list of files and outputs a table showing the frequency of occurrence
of every 8-bit pattern in these files (I'll call these patterns
"characters" from here on). The output table is sorted in ascending
order of frequency (least frequently occurring characters toward the
top). The output format uses one character per line with three numbers
on each line: the leftmost number is the value of the character itself
(in hex); the next two numbers are character frequencies, output in two
forms. The first form is the probability of occurrence of every
character: the number of times that that character occurred in the
input divided by the total number of input characters. The second form
is a normalized character count in which the least frequently occurring
nonzero pattern has the value of 1; it's the character count divided by
the count associated with the least frequently occurring character.

The table itself is declared on lines 13-21. It is a 256-element array
of ITEMs, indexed by character value. One field of the ITEM structure
holds a count that is incremented every time the associated character
is encountered in the input. The other field holds the character value
itself. You can derive this from the index, of course, but putting it
into an ITEM lets you sort the array by frequency of occurrence without
losing the value.

The array is loaded with the for loop on lines 58-74. The count
associated with every byte is incremented on line 69. The counts are
all initialized to 0 by default because global-level objects are always
initialized to 0 unless there's an explicit initializer present as part
of the declaration. The next for loop, on lines 83-93, initializes the
value fields of the array and at the same time finds the element having
the smallest nonzero value. (I'm getting the largest value, too, but am
not using it for anything at present.) The array is sorted with the
ssort( ) call on line 105. This subroutine works just like the
Unix-compatible qsort( ) does, but it does a shell sort. See the
comment on lines 95-102 for more information. The comparison function
used by ssort( ) is declared on lines 25-39. Finally, the table is
printed (and the probabilities and so forth are computed) in the loop
on lines 115-126. The sum of the probabilities is printed to stderr on
line 126 just to make sure that everything worked correctly. It should
always be 1.00.

Availability 

All the source code for articles in this issue is available on a single
disk. To order, send $14.95 to Dr. Dobb's Journal, 501 Galveston Dr.,
Redwood City, CA 94063 or call (415) 366-3600 ext. 216. Please specify
the issue number and format (MS-DOS, Macintosh, Kaypro).

Bibliography 

Fredman, Sedgewick, Sleator, and Tarjan. "The Pairing Heap: A New Form
of Self-Adjusting Heap." Algorithmica 1:1. New York: Springer-Verlag,
1986: 111-129. This describes a priority queue represented as a tree
(in order to make the merge and random-delete operations more
efficient).

Holub, Allen. "C Chest." DDJ (June 1986): 26-40. This article describes
a set of general-purpose queue-manipulation routines. The same article
appears in DDJ Bound Volume 10: 436-334.

Sedgewick, Robert. Algorithms. Reading, Mass.: Addison-Wesley, 1983:
127-141. This book contains a description of priority queues (or heaps)
and heapsort.

DDJ 

(Softstrips  begin  on  page  92.) 

(Listings  begin  on  page  94.) 



COLUMNS 


16-BIT SOFTWARE TOOLBOX

by Ray Duncan

Resources

Ward, Robert. Debugging C. Indianapolis, Ind.: Que Corp., 1986. 350
pages with index. ISBN 0-88022-261-1.

Finding this book among the flotsam and jetsam of computer publishing
is like finding a $100 bill in the street. The art, craft, and/or skill
of effective debugging is a topic that has rarely been discussed in any
useful way in the computer literature. The debugging chapters in
manuals or textbooks are usually focused on defensive strategies and
the design of test data, rather than on the stabilization, isolation,
localization, and correction of bugs once detected. To new programmers,
who have not yet developed a methodical approach to debugging, the
process seems a little magical and their own approach is frequently
haphazard and based largely on luck. Although native talent is an
element of effective debugging, as it is of effective programming, a
far larger component is simply the accumulated experience and the
strategies learned by finding and fixing many different types of bugs
in the past.

In his introduction, Mr. Ward writes: "Until I started teaching, I
assumed that good debugging skills were a natural outgrowth of good
design skills. Not so. A bright student may intuitively decompose a
problem into beautifully coherent, cohesive, functional modules. That
same student may not be able to find the most trivial syntax errors,
let alone discover subtle runtime bugs. Equally bright students turn in
working designs that literally defy analysis. While I don't believe
that we learn debugging by studying design, I do believe that we can
learn efficient debugging.

"We can develop a methodological model that directs our efforts toward
more productive searches. We can acquire heuristic knowledge (a kind of
folk wisdom) about where to look first. We can be deliberately
sensitive to the different variables and observable phenomena in
different environments. We can become expert at selecting and using
appropriate tools. And, through critical analysis of our attempts to
find 'worthy' bugs, we can learn from our own mistakes."

The first chapters of Debugging C concentrate on the debugging process
itself, with emphasis on recognition of bugs (lexical, syntactic,
execution, and intent errors), their localization (using the principles
of lexical, temporal, or referential proximity), a methodical approach,
and good record keeping. Later chapters discuss the localization of
compile-time errors and tracing methods. The last chapters of the book
are particularly C-oriented, with excellent discussions of the special
problems of C programmers: bugs due to data type mismatches, operator
precedence or misuse, and uninitialized or out-of-range pointers. Here
is where the author gently presents many useful tips that are common
sense to veteran C coders, usually acquired through painful experience.

Example: "C, unlike Pascal and Basic, doesn't give an initial value to
local variables when it creates them .... Most stack frames are
relatively small (less than 20 bytes) but every stack frame has a frame
pointer (an address higher in the stack) and a return address (an
address in the code area). Because uninitialized pointers will use
these obsolete addresses which occur frequently in reused areas of the
stack, the programmer can expect 10 to 30 percent of the uninitialized
pointers to reference code or areas of the stack." Similarly, Mr. Ward
explains, with stack traces and memory dumps, why common errors such as
initializing too many bytes of an automatic character array result in
bizarre transfers of control or crashes "between lines" of source code.

The last few chapters describe the use of some common machine-level and
symbolic debuggers, interpreters, and integrated development
environments. The book's appendices include the C source code for a
debugging subsystem that can be linked to a C application. The debugger
provides tracing at several layers of granularity, stack frame
displays, memory dumps, and watch points (periodic checks to see if a
certain variable has been altered or has taken on a specified value).

The author couples a direct, lucid, and informal style of writing with
a depth of understanding and highly structured presentation that are
unusually effective. In spite of the title, most of the book has broad
applicability and would be perfectly comprehensible to anyone with even
a passing acquaintance with the C language.

Hyman, Michael I. Memory Resident Utilities, Interrupts, and Disk
Management with MS and PC DOS. Portland, Oreg.: Management Information
Source, 1986. 373 pages with index. ISBN 0-943518-73-3.

This book, with the short catchy title, is anything but self-effacing.
The introduction reads: "You will find this book to be the ultimate
reference guide to getting the most out of your machine. As you read
it, you'll learn how to use and enhance DOS to explore your computer
and make mighty programs. Powerful examples will lead you along the way
and provide models for later reference...." Such hype is commonplace
from publishers' PR departments, but it's a little more unusual coming
straight from the author's pen. Several paragraphs farther on, you find
the admonition "The letter 'l' and the number '1' are represented in
the program code by the same character. In no case is the letter 'l'
used as a variable name. This should eliminate any confusion." If
there's anything I'm already confused about at this point, it's why the
publisher of an "ultimate" reference book would allow it out of the
door containing program listings wherein the numeral 1 and the letter l
cannot be distinguished. But no matter, let's see what the rest of the
book has to offer.

Chapters 1 through 8 discuss the boot sector, file allocation table,
directory structure, and file area of MS-DOS disks. Incredibly tortuous
Turbo Pascal source code for a disk peeker/patcher named Explorer is
developed as part of the exposition. Each chapter ends with a summary
of sorts entitled "Key Programming Points." Here are the Key
Programming Points listed at the end of Chapter 4: "Sectors are the
smallest organizational unit. They contain 512 bytes. You can find
interesting messages and modify programs by editing sectors. Explorer
uses arrow keys to move the cursor."

The following few chapters discuss the partition table, "un-erasing"
files, and a program that patches COMMAND.COM to change the names of
MS-DOS internal commands (real useful!). Next the author covers input
and output with the mouse, keyboard, and screen using IBM PC ROM BIOS
drivers; an overview of file and record I/O; directory searching; and
memory management. Nothing new here.

Finally, in Chapters 28 through 38, you get to the apparent reason for
the book's existence: the problems and pitfalls of programming
Terminate and Stay Resident (TSR) utilities. This part of the book
contains much useful information, poorly organized and presented, about
chaining onto interrupts, putting up and taking down pop-up displays,
monitoring the keyboard for hot keys, and the like. If this part of the
book were properly structured and edited, it would make a decent
magazine article, but the book as a whole is a poor investment.

Norton,  Peter;  and  Socha,  John.  Peter 
Norton's  Assembly  Language  Book 
for  the  IBM  PC.  New  York:  Brady/ 
Prentice-Hall,  1986.  413  pages  with 
index.  ISBN  0-13-661901-0. 

This book is Yet Another Assembly-Language Tutorial of average quality.
It is built around the design and stepwise enhancement of a simple disk
sector modification utility called DSKPATCH, discussing in passing some
issues of processing keyboard input and updating screen displays using
the IBM PC ROM BIOS video driver.

This  book  is  mainly  notable  as  a 
demonstration  of  the  recent  trend  to¬ 
ward  commodity  marketing  of  com¬ 
puters  and  related  products.  As  the 
competition  in  the  wonderland  of  sil¬ 
icon  has  become  more  intense,  we 
have  seen  the  adoption  of  marketing 
tactics  that  were  previously  the 
province  of  car,  appliance,  and  ap¬ 
parel  manufacturers,  such  as 
scratch-off  sweepstake  tickets  (Bor¬ 
land);  cash  rebates  (Apple);  "free  gifts 
with  purchase”  (with  every  dBASE  III 
Plus,  get  a  Cross  writing  instrument 
free!);  mystical  mumbo  jumbo  such 
as  Robert  Carr,  Wayne  Ratliff,  and 
Jonathan  Sachs  being  touted  as 
“Chief  Scientists”  of  their  respective 
companies;  and  last  but  not  least,  ce¬ 
lebrity  endorsements. 

Back  in  1983-1984,  while  John  So¬ 
cha  was  a  contributing  editor  for  Sof- 
talk/PC,  he  wrote  a  book  called  As¬ 
sembly  Language  Safari  for  the  IBM 
PC:  First  Explorations  (Bowie,  Md.: 
Brady,  1984.  ISBN  0-89303-321-9).  For 
various  reasons,  including  rather 
poor  production  values  in  the  book 
itself  and  massive  financial  problems 
at  the  Brady  corporate  level,  the 
book  received  little  attention  and 
went  out  of  print  shortly  thereafter. 
Nowadays,  John  Socha  works  for  Pe¬ 
ter  Norton  Inc.  in  Santa  Monica,  Cali¬ 
fornia,  and  is  the  author  of  the  pro¬ 
gram  sold  as  the  Norton  Commander. 
When Peter Norton's Assembly Language
Book appeared, with Peter
Norton  and  John  Socha  listed  as  coau¬ 
thors,  I  thought  it  would  be  instruc¬ 
tive  to  make  a  page-by-page  compari¬ 
son  of  John  Socha’s  old  book  with  the 
new  one. 

A  fairly  generous  assessment  of  the 
two  books  is  that  there  are  approxi¬ 
mately  17  pages  of  new  material  in 
the  "new”  Norton/Socha  version, 
scattered  among  the  following  top¬ 
ics:  the  Proceed  command  of  DEBUG, 
a  MASM  program  skeleton,  MAKE 
files,  SYMDEB,  linker  maps,  .COM  vs. 
.EXE  programs,  the  ASSUME  directive, 
segment  overrides,  and  phase  errors. 
The  remainders  of  the  two  books  are 
identical,  except  for  some  redrawn 
figures,  the  addition  of  some  titles 
and  divider  pages,  and  some  minor 
changes  in  wording  that  would  have 
been  introduced  by  any  competent 
editor.  There  are  also  three  new  ap¬ 
pendices  that  are  mostly  filler  from 
other  sources,  such  as  MASM  error 
messages  and  character  tables.  In 
other  words,  the  substance  of  Mr. 
Norton's  contribution  to  this  book 
seems  to  be  his  picture  on  the  cover, 
his  billing  as  a  coauthor,  and  the  run¬ 
ning  head  "Peter  Norton’s  Assembly 
Language  Book”  on  the  top  of  each 
right-hand  page. 

Some  might  argue  that  the  kind  of 
misrepresentation  involved  here 
hurts  no  one  and  is  therefore  of  no 
consequence.  To  be  sure,  the  book’s 
purchasers,  although  they  might 
have  been  misled  about  the  creative 
origins  of  the  book,  have  still  bought 
a  reasonably  useful  introduction  to 
assembly  language.  The  three  princi¬ 
pals — John  Socha,  Peter  Norton,  and 
Brady — are  undoubtedly  happy  be¬ 
cause  their  ploy  has  caused  the  book 
to  vault  onto  the  computer  best-seller 
lists — which  means  everyone  in¬ 
volved  has  made  some  money.  And 
last  but  not  least,  Peter  Norton’s  rec¬ 
ognition  as  an  expert  on  all  matters 
concerning  the  IBM  PC  has  been 
magnified. 

A  Poor  Man’s  MAKE 

J.  F.  Philippe  Marchand,  of  Webster, 
New  York,  sent  in  the  programming 
goody  of  the  month — a  short  pro¬ 
gram  called  CHKDATE.C.  This  pro¬ 
gram  compares  the  date  of  two  files 
and  returns  an  “error  level”  code 
that  can  be  tested  within  a  batch  file. 
For those people who are not fortunate
enough to have one of the commercial
MAKE utilities (which are
bundled with many of the compilers
and assemblers being sold today),
CHKDATE can help to automate the
process of compiling and linking the
various modules of an application
program. Example 1, below, contains
the C source code for the CHKDATE
program, and Example 2, below,
contains an example batch file that
demonstrates the use of CHKDATE.

Dr. Dobb's Journal, June 1987


MS-DOS Programming Tips

George Smith, of Lilburn, Georgia,
writes: "I have run across one puzzling
problem [in MS-DOS 3.0]. DOS
function 0Eh, which makes a specified
drive the current drive, has always
reported a reliable count of the
number of drives on the system in
register al. Under DOS 3.0 and later,
though, the value returned in al is always
at least 5, even if you happen to
be running short-handed with fewer
drives.

"I have no explanation, though I do
have a solution in the form of function
DriveCnt [Example 3, below].
The logic behind DriveCnt is simple.
It takes the value al reports from
function 0Eh and passes it as a drive
code to function 47h (Get Current Directory),
which will set the carry flag
if the drive code is invalid. It repeats
calls to function 47h with successively
lower drive codes until DOS
finds one it likes."

Although George has been kind
enough to send in the routine DriveCnt
for the edification of DDJ readers,
he sells a package called Boosters for
Turbo Pascal programmers that includes
this subroutine and 77 others,
a screen generator, some 40 example
programs, and a 93-page manual.
The Boosters package costs $40 and
can be ordered direct from George
Smith at 609 Candlewick Lane, Lilburn,
GA 30247; (404) 923-6879.


#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    struct stat buf1, buf2;
    int k;

    if (argc < 3) {
        printf("chkdate: usage chkdate f1 f2 .. fn\n");
        printf("         will return error level 1\n");
        printf("         if f1 older than f2 .. fn\n");
        printf("         or if f1 .. fn don't exist.\n");
        exit(1);
    }

    /* the target file (f1) must exist */
    if (stat(argv[1], &buf1) != 0) exit(1);

    for (k = 2; k < argc; k++) {
        /* each dependency must exist ... */
        if (stat(argv[k], &buf2) != 0) exit(1);
        /* ... and must not be newer than the target.  The printed
           listing tested st_atime; the modification time is the
           field a MAKE-style comparison needs. */
        if (buf1.st_mtime < buf2.st_mtime) exit(1);
    }
    exit(0);
}


Example  1:  Phil  Marchand's  CHKDATE.C  program 

chkdate main.obj main.c
IF ERRORLEVEL 1 goto :compile1
goto :next1

:compile1
msc main,main;

:next1
chkdate func.obj func.c
IF ERRORLEVEL 1 goto :compile2
goto :next2

:compile2
msc func,func;

:next2
chkdate main.exe main.obj func.obj
IF ERRORLEVEL 1 goto :link
goto :exit

:link
link main.obj+func.obj,main.exe;

:exit




Example  2:  TEST.BAT  file,  a  demonstration  of  the  use  of 
the  CHKDATE  program  in  a  batch  file  to  automate  the 
compilation  and  linking  of  an  application 


; FUNCTION DriveCnt : Integer;
; (C) 1986 George F. Smith & Company
;
; Purpose:
;    Returns number of logical drives on host
;    computer - DOS 2.0 and above.
;
; Sample Usage (Turbo Pascal):
;    Writeln('Number of drives is ', DriveCnt);
;
; Suggested processing sequence:
;    MASM DriveCnt,,,
;    Link DriveCnt
;    Exe2Bin DriveCnt DriveCnt.com
;    C2I DriveCnt.com XDriveCnt.ini
;    (C2I utility converts .com files to Turbo Pascal
;    inline code. See DDJ, 10/86, pg. 90.)

code    segment
        assume  cs:code

DriveCnt:
        push    bp              ; standard
        mov     bp,sp           ; subroutine
        push    ds              ; overhead
        sub     sp,64           ; create scratch area,
        mov     si,sp           ; save in si for 47h call
        mov     ah,19h          ; get default drive
        int     21h             ; returns drive code in al
        mov     dl,al           ; move code to dl
        mov     ah,0eh          ; set default drive to itself
        int     21h             ; on return, al = # drives

; If DOS 3.0 or above, number of drives in al will
; be minimum of 5.  Will repeat calls to function 47h
; until a valid drive code is obtained.

        mov     dl,al           ; number of drives in dl
        push    ss              ; segment address
        pop     ds              ; of scratch buffer
search: mov     ah,47h          ; is drive code okay?
        int     21h             ; carry flag set if
                                ; drive code invalid
        jnc     okay            ; jump if code valid
        dec     dl              ; drive code too big;
                                ; decrement it and
        jmp     search          ; try again
okay:   xor     dh,dh           ; clear dh
        mov     [bp+4],dx       ; give results to caller
        add     sp,64           ; adjust stack ptr
        pop     ds              ; restore caller's regs
        mov     sp,bp
        pop     bp
        ret
code    ends
        end     DriveCnt

Example  3:  George  Smith's  DriveCnt  routine  to 
determine  the  number  of  disk  drives  present  in  an  MS-DOS 
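
For readers who do not follow assembly language, the search loop in Example 3 can be sketched in C. This is only an illustration of the logic George describes, not his code: drive_probe and fake_probe are hypothetical stand-ins for the call to DOS function 47h, and installed_drives is an invented test value.

```c
/* Hypothetical stand-in for INT 21h function 47h: returns nonzero when
   the drive code is valid (1 = A:, 2 = B:, and so on). */
typedef int (*drive_probe)(int drive_code);

/* Invented test fixture: pretend drives A:, B:, and C: exist. */
int installed_drives = 3;

int fake_probe(int drive_code)
{
    return drive_code >= 1 && drive_code <= installed_drives;
}

/* Start from the count that function 0Eh reported and step down
   until the probe accepts the drive code, as DriveCnt does. */
int drive_count(int reported, drive_probe probe)
{
    int code = reported;

    while (code > 0 && !probe(code))
        code--;             /* drive code too big; try the next lower one */
    return code;
}
```

With the fixture above, drive_count(5, fake_probe) steps past codes 5 and 4 and settles on 3. The extra code > 0 guard is an addition for the sketch; the assembly version relies on DOS eventually accepting a code.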




COLUMNS 


ARTIFICIAL  INTELLIGENCE 


Object-Oriented  Programming  in  SCOOPS 


by Ernest R. Tello

This month, I conclude my review of PC Scheme (which I
consider to be the Turbo Pascal of the PC LISP family) with
some examples of object-oriented programming using SCOOPS,
the object-oriented extension of the language. To do this,
I'll have to make some extensions to PC Scheme itself, but
first, some discussion about LISP programming in general.

LISP is the most organic and lifelike of all programming
languages. Most of its dynamic character comes from the
combination of complex, nested structures with dynamic
reassignment of structures of pointers and a simple syntax
that uses the same representation for data and programming
code.

The functional programming aspect of LISP involves a special
implementation of argument passing so that usually few
variables need to be stored permanently. The main thing in
pure functional programming is not modifying objects in
permanent storage but passing symbols as if they were values
being passed between mathematical functions. The main result
of such a program is the structure it returns rather than the
state it creates in the permanent storage of the machine.

But LISP is not simply a single-paradigm programming
language. It is a language that so far has been able to
absorb each new programming model as it appears and to
incorporate these models in a functioning whole. Over the
years it has absorbed other programming concepts and is
continuing to do so; for example, as I mentioned last month,
SCOOPS has assimilated some features of Smalltalk. The
original model on which the LISP language was based was that
of functional programming, using the lambda calculus.

In pure functional LISP programming, anything that does not
just return a structure but modifies the machine is generally
considered a side effect. But it is often important in an
object-oriented environment to modify the object hierarchy
dynamically in complex and carefully controlled ways. Take,
for example, the case of performing simple list processing
functions, such as updating and modifying list structures.
Here it is often the case that either or both of the
functions of returning the necessary structure and producing
the necessary structure in permanent storage are important
parts of the required tasks.

In the following example, I will show various versions of a
function add-to-end that are implemented so as to return
different values and produce different side effects. This
function extends the list processing functions of LISP to
include the ability to add an element to the end of an
already existing list structure.

The function give-n-take was written to demonstrate the side
effects of the add-to-end function. First, two lists are
created: nums, which contains the list of number words (one
two three four five); and morenums, which is composed of the
complementary number word list (six seven eight nine ten).
Here is what give-n-take looks like:


(define nums '(one two three four five))

(define morenums '(six seven eight nine ten))

(define (give-n-take)
  (add-to-end (car morenums) nums)
  (set! morenums (cdr morenums)))

As you can see from the short session
that follows, what give-n-take returns
is different from the side effects
it has on these lists. It simply
returns the shortened morenums list,
the value of its final expression.
When you ask LISP for
the contents of these lists by typing
their names at the interpreter
prompt, however, you see the effects
that give-n-take has each time it is
called. It takes numbers successively
from the beginning of the morenums
list and adds them to the end of the
nums list:

[2]  nums 

(ONE  TWO  THREE  FOUR  FIVE) 

[3]  morenums 

(SIX  SEVEN  EIGHT  NINE  TEN) 

[4]  (give-n-take) 

(SEVEN  EIGHT  NINE  TEN) 

[5]  nums 

(ONE  TWO  THREE  FOUR  FIVE  SIX) 

[6]  morenums 
(SEVEN  EIGHT  NINE  TEN) 

[7]  (give-n-take) 

(EIGHT  NINE  TEN) 

[8]  nums 

(ONE  TWO  THREE  FOUR  FIVE  SIX  SEVEN) 

[9]  morenums 
(EIGHT  NINE  TEN) 

Now  compare  this  to  another  ver¬ 
sion  of  the  function  called  add-to- 
end-2.  In  the  next  session  I  create  the 
list  of  integers  from  1  to  4.  My  origi¬ 
nal  add-to-end  returns  something  un¬ 
usable,  but  the  side  effects  are  the 
correct  result.  Add-to-end-2  does  just 
the  opposite — it  returns  the  list  with 
the  number  added  to  the  end,  but 
when  you  examine  the  list  of  inte¬ 
gers,  you  see  that  nothing  has 
changed. 

[6] (define integers '(1 2 3 4))

INTEGERS

[7] (add-to-end 5 integers)

(4 5)

[8] integers

(1 2 3 4 5)

[9] (add-to-end-2 6 integers)

(1 2 3 4 5 6)

[10] integers

(1 2 3 4 5)

This  hasn't  been  just  an  academic 
exercise.  The  side-effects  version  of 
add-to-end  is  useful  for  doing  neces¬ 
sary  housekeeping  in  an  object-ori¬ 
ented  LISP  environment.  It  is  valuable 
to  be  able  to  maintain  lists  of  all  the 
current  instances  of  various  classes 
that  are  alive  in  a  system  that  is 
changing  dynamically.  Without  the 
version  of  add-to-end  that  can  actual¬ 
ly  modify  such  lists,  you  would  not 
be  able  to  update  them  continually. 
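
The definitions of add-to-end and add-to-end-2 do not appear in this text, so the following C sketch is only a guess at the two behaviors, consistent with the sessions shown above. The cell type and the helper names cons and length are assumptions made for the illustration.

```c
#include <stdlib.h>

/* A cons cell holding an int; NULL plays the role of the empty list. */
typedef struct cell {
    int value;
    struct cell *next;
} cell;

cell *cons(int value, cell *next)
{
    cell *c = malloc(sizeof *c);
    if (c == NULL)
        abort();
    c->value = value;
    c->next = next;
    return c;
}

/* Side-effect version: splices a new cell onto the last pair of a
   non-empty list and returns that last pair.  The return value is
   only the tail, but the caller's list really grows. */
cell *add_to_end(int value, cell *list)
{
    cell *last = list;

    while (last->next != NULL)
        last = last->next;
    last->next = cons(value, NULL);
    return last;
}

/* Copying version: returns a fresh list with the value appended and
   leaves the original list untouched. */
cell *add_to_end_2(int value, const cell *list)
{
    if (list == NULL)
        return cons(value, NULL);
    return cons(list->value, add_to_end_2(value, list->next));
}

int length(const cell *list)
{
    int n = 0;

    for (; list != NULL; list = list->next)
        n++;
    return n;
}
```

Called on the list (1 2 3 4), add_to_end(5, list) returns the pair (4 5) while the list itself becomes five elements long; add_to_end_2(6, list) then returns a six-element copy and the original stays at five, mirroring the session.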

Another  function  that  could  be 
useful  is  delete-last!.  It  performs  the 
opposite  service — that  of  destructive¬ 
ly  removing  the  final  element  in  a 
list.  Its  definition  is: 

(define (delete-last! lst)
  (delete! (car (last-pair lst)) lst))

The last-pair function in PC
Scheme returns the last pair in a list,
which is a one-element list holding
the final element. Taking the car of
that pair yields the element itself,
which delete! then removes from the list.

The  following  quick  session  shows 
the  behavior  of  delete-last!.  As  you 
can  see,  in  this  case  both  what  the 
function  returns  and  its  side  effects 
are  identical. 

[2]  (define  numbers  ’(one  two  three 

four  five)) 

NUMBERS 

[3]  (delete-last!  numbers) 

(ONE  TWO  THREE  FOUR) 

[4]  numbers 

(ONE  TWO  THREE  FOUR) 

Programming  in  SCOOPS 

Sending  messages  is  a  rather  simple 
matter  of  using  the  send  function 
with  the  receiver  object  and  the  mes¬ 
sage  to  be  sent  plus  its  arguments.  So, 
to  send  a  message  to  the  my-body  ob¬ 
ject  (introduced  in  the  May  column), 
giving  it  a  new  body  part  called  toes, 
you  would  say: 


[1]  (send  my-body  put-cpart-name 

’toes) 


[2]  body-parts 

(HEAD  NECK  ARMS  HANDS  TRUNK  LEGS 

FEET  TOES) 


If  you  then  changed  your  mind 
and  decided  to  remove  this  new 
body  part,  you  could  do  so  by  global¬ 
ly  accessing  the  body-parts  list  using 
the  newly  defined  function  delete- 
last!  as  follows: 

[3]  (delete-last!  body-parts) 

[4]  body-parts 

(HEAD  NECK  ARMS  HANDS  TRUNK  LEGS 

FEET) 

The  code  in  Listing  One,  page  98, 
demonstrates  simple  inheritance  in 
PC  Scheme  through  several  levels  of 
a fairly linear hierarchy. First, the
root  class  artifact  is  defined  with  the 
instance  variables  material,  weight, 
purpose,  and  cost.  Then  transport- 
means  is  defined  as  a  subclass  of  arti¬ 
fact  with  the  additional  instance  vari¬ 
ables  medium,  time-range,  and 
power-source.  Naturally,  transport- 
means  inherits  all  the  variables  and 
methods  of  the  artifact  class.  Then 
transport-vehicle  is  defined  as  the 
next  subclass  and  passenger-vehicle 
as  a  subclass  of  it.  Descending  further 
in  the  same  linear  manner  of  adding 
more  and  more  specific  classes  that 
inherit  everything  from  the  previous 
class,  the  classes  water-transport-vehi¬ 
cle,  surface-vessel,  ship,  and  ocean- 
liner  are  defined.  The  instance  object 
ship1 is created as an instance of the
class  ship.  The  two  methods  speed 
and  direction  are  also  provided  for 
the  ship  class. 

Listing  Two,  page  99,  provides  an 
example  of  multiple  inheritance  in 
SCOOPS.  Here  I  have  implemented  the 
example  used  in  April's  column  in  PC 
Scheme.  First,  the  classes  business 
and  adversary  are  defined.  Then  the 
class  competitor  is  defined  and  uses 
the  multiple  inheritance  feature  to 
inherit  everything  from  both  of 
these  two  classes. 

Availability 

All  the  source  code  for  articles  in  this 
issue  is  available  on  a  single  disk.  To 
order,  send  $14.95  to  Dr.  Dobb's  Jour¬ 
nal,  501  Galveston  Dr.,  Redwood  City, 
CA  94063  or  call  (415)  366-3600  ext. 
216.  Please  specify  the  issue  number 
and  format  (MS-DOS,  Macintosh, 
Kaypro). 

DDJ 

(Listings  begin  on  page  98.) 



LETTERS 

(continued  from  page  12) 


fastest and just about the handiest
editor under ten fingers (if it isn't,
just redefine your macros the instant
you think of something better). Just
take the compiler and all-purpose
resident macro program as bonuses.

Now,  the  importance  of  speed  is 
simply  that  we  want  to  work  at  pro¬ 
gramming,  not  at  running  an  editor. 
Like a fine sound system, a good editor
should be transparent. When I
write while v <= maxv do, I don't
think  of  the  keys  I  press,  only  of  the 
statement  I’m  producing.  It  should  be 
the  same  when  I  want  to  move  that 
statement,  to  indent  it  and  what  fol¬ 
lows,  or  simply  to  erase  the  line — or 
to  find  it  in  a  2,000-line  file.  What  no 
one  wants  is  to  sit  and  look  at  the  edi¬ 
tor  editing. 

Which brings us to WordStar imprinting.
I got imprinted with WordStar
because my mother imprinted
me with ten fingers. For the first two
days,  it  may  be  easier  to  use  an  editor 
that  has  F3  for  "delete  word”  and 
shift-F3  for  "delete  line,”  but  after 
those  two  days,  only  a  WordStar- 
style  editor  gives  you  a  chance  to  get 
to  the  point  at  which  you  only  have 
to  think  “delete  line”  and  not  notice 
the  actual  fingerwork  needed  to  do 
it — because  it  doesn’t  demand  that 
your  hands  leave  the  typing  position. 

Hunt-and-peck  chickens  may  not 
understand  this,  but  there  is  a  world 
beyond  the  barnyard,  you  know. 
The  WordStar  commands  aren't 
meant  for  the  user’s  manual  but  for 
the  user’s  hands.  They're  the  natural 
extension  of  sheer  speed  because 
they  remove  another  nontranspar¬ 
ent  interface  between  your  mind 
and  the  program  text. 

So,  my  ultimate  editor  is  simply  the 
Turbo Pascal editor with a good macro
program plus several-document
capacity,  windowing,  wordwrap  for 
comments,  block-limited  search-and- 
replace  with  more  complex  specifi¬ 
cation  capacity,  and  files  greater 
than  64K.  But  never  at  the  price  of 
speed  or  the  WordStar  keyboard 
code.  Don’t  mistake  us  far-voyaging 
mallards  for  clay  pigeons. 

Philippe  Ranger 
6120  Hutchison 
Montreal,  Canada  H2V  4C2 


6502  Hacks 

Dear  DDJ, 

When  I  read  Mark  Ackerman's  "6502 
Hacks”  (February  1987),  I  was  both 
surprised  and  delighted  that,  in  this 
world  of  68000s  and  hypercubes, 
there  is  still  anyone  left  who  would 
spend  the  time  and  effort  to  write  a 
good  article  about  6502 
programming. 

But,  before  anyone's  programs 
start  crashing,  let  me  correct  Mr. 
Ackerman  on  two  related  points. 
First,  the  software  interrupt  instruc¬ 
tion  ( BRK )  uses  the  IRQ  (maskable  in¬ 
terrupt)  vector,  not  the  NMI  (nonmas¬ 
kable  interrupt)  vector.  But  to  make 
matters  even  worse,  the  BRK  instruc¬ 
tion  pushes  its  address +2  on  the  re¬ 
turn  stack.  In  order  to  return  to  the 
opcode  immediately  following  the 
BRK,  the  return  address  on  the  stack 
must  be  dug  up  and  decremented  be¬ 
fore  returning  (very  messy). 

Second, and more important, is
that the RTI (return from interrupt)
instruction is not functionally compatible
with the sequence PLP (pull
processor status), RTS (return from
subroutine). The RTI instruction pulls the
processor status byte and then continues
execution at the address
pulled from the stack. The RTS
instruction pulls the address from the
stack, increments
the address by 1, and then continues
execution at the adjusted address.
This compensates for the fact that
the JSR pushes the address of the
next opcode-1.
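
The address arithmetic in these two points can be written out as a small C model. This is a sketch for illustration only; the function names are invented here, and it models nothing but the return addresses (no status byte, no stack).

```c
#include <stdint.h>

/* JSR is three bytes long and pushes the address of its own last
   byte, which is one short of the next opcode. */
uint16_t jsr_pushed(uint16_t jsr_addr)
{
    return (uint16_t)(jsr_addr + 2);
}

/* RTS resumes at the pulled address plus 1. */
uint16_t rts_target(uint16_t pulled)
{
    return (uint16_t)(pulled + 1);
}

/* BRK pushes its own address plus 2. */
uint16_t brk_pushed(uint16_t brk_addr)
{
    return (uint16_t)(brk_addr + 2);
}

/* RTI resumes at the pulled address unchanged. */
uint16_t rti_target(uint16_t pulled)
{
    return pulled;
}
```

For a JSR at $1000, the pushed value is $1002 and RTS resumes at $1003, the next opcode. For a BRK at $2000, RTI resumes at $2002, skipping the byte at $2001, while PLP followed by RTS would resume at $2003; the two return paths are one byte apart.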

The reason for this lies in a quirk of
the 6502 opcode processing. To execute
the JSR instruction, the processor
first fetches the opcode and the
low byte of the address. It then
pushes the current program counter
(which is now pointing to the third
byte of the instruction) and, finally,
fetches the high byte of the address.
So, the address that gets pushed on
the stack is always one byte shy of
the next instruction. In contrast, the
interrupt acknowledge sequence
will only happen between instructions,
so the program counter is always
pointing to an opcode when
the interrupt return address is
pushed.

ENTRY     STX   SAVEX       ;Save [X]
          ;(Body of subroutine)
          LDX   #$FF        ;Restore [X]
SAVEX     EQU   *-1         ;Data field of preceding instruction
          RTS

Example 1: A fast method for saving and restoring registers

TXBLOCK   STX   _TXLENGTH   ;Set block length (0=256)
          STA   _TXADDR
          STY   _TXADDR+1   ;Set block address
          LDX   #-1         ;Init timeout
          LDY   #0          ;Init block index
          STY   _TXSUM      ;Reset TXSUM accum.
TXBLOOP1  LDA   $FFFF,Y     ;First, get byte to send
_TXADDR   EQU   *-2         ;Address portion of LDA instruction
TXBLOOP2  BIT   HWREADY     ;Check if Tx port is ready
          BPL   TXBLOOP3    ;(N)Check timeout
          STA   HWDATA      ;(Y)Tx the byte
          EOR   #$FF        ;Accum TXSUM
_TXSUM    EQU   *-1         ;Data portion of EOR instruction
          STA   _TXSUM      ;Update running XOR sum
          INY
          CPY   #$FF        ;End block?
_TXLENGTH EQU   *-1         ;Data portion of CPY instruction
          BNE   TXBLOOP1    ;(N)Continue sending
          LDA   _TXSUM      ;Send the checksum
TXBLOOP3  DEX               ;Timeout expired?
          BNE   TXBLOOP2    ;(N)Continue sending
                            ;(Y)Return timeout error

Example 2: An actual 6502 code fragment

In  his  coverage  of  self-modifying 
code,  I  think  Mr.  Ackerman  missed 
one  very  useful  trick.  I  often  save 
registers  (because  they  are  so  very 
scarce)  by  storing  their  contents  in 
the  data  portion  of  a  load  instruction 
located  at  the  end  of  the  routine  [see 
Example 1, page 122]. It takes 5 bytes
of  code  but  is  absolutely  the  fastest 
method  of  saving  and  restoring  a  reg¬ 
ister  (six  cycles  total)  and  both  the  in¬ 
struction  and  data  storage  are  local¬ 
ized  in  the  subroutine. 

To  demonstrate  variations  of  the 
trick,  I  have  included  a  program¬ 
ming  fragment  of  an  actual  routine 
[see Example 2, page 122]. The routine
transmits, over an extremely fast
synchronous  communication  port,  a 
block  of  data  (from  1  to  256  bytes)  fol¬ 
lowed  by  the  exclusive-ORed  sum  of 
the  block.  It  must  also  make  sure  that 
the  port  is  not  "hung”  by  keeping  a 
decaying  timer  in  the  X  register.  The 
routine  is  passed  the  address  of  the 
block  in  registers  A  and  Y  and  the 
length  of  the  block  in  X.  When  the 
loop  is  cooking,  it  can  transmit  a  byte 
every  27  cycles. 
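
The "exclusive-ORed sum" that the routine appends to each block can be sketched in C as follows. The function names are invented for the illustration, and frame_ok is a hypothetical receiver-side check that the letter does not describe.

```c
#include <stddef.h>
#include <stdint.h>

/* Exclusive-OR "sum" of a block: the XOR of every byte, starting
   from zero, as the running _TXSUM accumulator does. */
uint8_t xor_sum(const uint8_t *block, size_t length)
{
    uint8_t sum = 0;
    size_t i;

    for (i = 0; i < length; i++)
        sum ^= block[i];
    return sum;
}

/* Hypothetical receiver-side check: the XOR over the block plus its
   trailing checksum byte is zero when nothing was corrupted. */
int frame_ok(const uint8_t *frame, size_t length_with_sum)
{
    return xor_sum(frame, length_with_sum) == 0;
}
```

For the three bytes $12, $34, $56 the sum is $70, so a receiver that XORs all four transmitted bytes together ends up with zero.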

Of  course,  I  won’t  tell  you  what 
product  this  code  is  running  in  (the 
labels  have  been  changed  to  protect 
the  innocent),  for  fear  that  someone 
wouldn’t  buy  our  system  if  they 
knew  I  programmed  this  way  on  a 
regular  basis. 

James  Bucanek 
C-Si  Systems 
572  W.  Pima 
Coolidge,  AZ  85228 

DDJ 




FORUM

VIEWPOINT 

(continued  from  page  14) 


pie)  to  use  simple  trigonometry  in  a 
high-level  language  than  it  is  to  use  it 
in  assembly  language. 

High-level  languages  can  enhance 
the  value  of  a  programmer's  work 
by  allowing  code  to  run  with  very  lit¬ 
tle  alteration  on  many  different  com¬ 
puters — all  that  is  required  is  a  com¬ 
patible  compiler  for  each.  My  68000 
cross  assembler  is  now  operational 
on  computers  that  use  Z80,  8088,  and 
68000  processors.  I  would  like  to 
challenge  Suman — or  any  assembly- 
language  programmer — to  convert 
an  assembly-language  program  of 
similar  functionality  and  complexity 
to  run  in  two  environments.  I  suspect 
that  it  would  take  them  longer  than 
the  few  days  each  that  it  took  to  port 
the  68000  cross  assembler  I  presented 
in  the  April  and  May  1986  issues  of 
DDJ. 

With reference to Suman's specific
criticisms, the use of named constants
(for any language, including
assembly language) is almost universally
considered to improve maintainability
by localizing changes. Although
I used the constants FIRST and
LAST only twice (not once, as Suman
stated), I feel their use was justified.

As to the lack of initialized variables
in Modula-2, I agree that this is a
defect of the language. That should
not, however, be an indictment of all
high-level languages; C does allow
initialized variables, including initialized
structures and arrays. If Modula-2
allowed initialized variables,
there would have been no need for
InitOperationCodes (the module that
Suman disliked so much). The mnemonic
lookup table could have been
created at compile time, in the module
that needed the table, and with
much less overhead.
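
As a rough illustration of the compile-time table that C permits, here is a minimal sketch. The structure layout, the names op_entry, op_table, and find_op, and the choice of three entries are all assumptions made for the example; this is not the table used by X68000.

```c
#include <stddef.h>
#include <string.h>

/* One row of a hypothetical mnemonic table. */
struct op_entry {
    const char *mnemonic;
    unsigned short opcode;
};

/* Initialized at compile time; no run-time setup module is needed. */
static const struct op_entry op_table[] = {
    { "NOP", 0x4E71 },
    { "RTE", 0x4E73 },
    { "RTS", 0x4E75 },
};

#define OP_COUNT (sizeof op_table / sizeof op_table[0])

/* Linear search; returns NULL when the mnemonic is not in the table. */
const struct op_entry *find_op(const char *mnemonic)
{
    size_t i;

    for (i = 0; i < OP_COUNT; i++)
        if (strcmp(op_table[i].mnemonic, mnemonic) == 0)
            return &op_table[i];
    return NULL;
}
```

Because the table is a static initializer, the compiler lays it out in the program image, which is the saving over building the same table with assignment statements at startup.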

Certainly,  the  mnemonic  lookup 
table  of  X68000  could  have  been  writ¬ 
ten  more  efficiently  using  assembly 
language.  It  would  have  been  harder 
to  write  or  to  read,  however,  and  it 
would  not  have  been  portable  (some 
processors  invert  the  order  of  bytes 
within words). The sets that were
used for the allowable addressing
modes (ModeA and ModeB) represented
the 68000 addressing scheme
in a recognizable way. For example,
one of the set members was called
Size67 to indicate that the sixth and
seventh bits of the 68000 operation
code were used to indicate the size of
the operation. If that member were
to be converted to a computer word
(1s and 0s), a programmer reading
the listing might be at a loss to figure
out what the significance of a particular
bit (or combination of bits) was.

Finally,  Suman  seems  to  contradict 
his  own  point  about  turgidity  and  re¬ 
dundancy  when  he  suggests  that  I 
should  have  used  118  individual 
write  statements  instead  of  the  sim¬ 
ple  loop: 

FOR i := FIRST TO LAST DO
   WriteRec (f, Table68K[i]);
END;

Perhaps  an  ideal  solution  to  the  di¬ 
lemma  of  choosing  between  the  ex¬ 
pressiveness  of  high-level  languages 
and  the  efficiency  of  assembly  lan¬ 
guage  is  to  mix  the  two.  Most  decent 
compilers  allow  the  integration  of  as¬ 
sembly-language  modules.  A  pro¬ 
gram  can  be  written  and  debugged 
in  Modula-2  or  C,  then  profiled  to 
identify  the  bottlenecks.  Finally, 
these  sections  can  be  rewritten  in  as¬ 
sembly  language.  The  original  high- 
level  language  code  can  be  left  be¬ 
hind  as  a  comment  block  to  the 
assembly  language.  Although  this 
would  hamper  portability  some¬ 
what,  the  sections  of  a  program  that 
benefit  from  recasting  in  assembly 
language  are  usually  a  small  portion 
of the overall code. Projects that
would be Herculean tasks in assembly
language become quite comfortable
using modern high-level languages.
Get your project working,
then  if  you  need  more  speed  or  bet¬ 
ter  memory  utilization,  tune  it  up  us¬ 
ing  a  small  amount  of  hand-coded  as¬ 
sembly  language. 

DDJ 



PROGRAMMER'S  SERVICES 


THE  STATE  OF  BASIC 


BASIC  Functions 

In  this  issue  we’ll  look  at  user-de¬ 
fined  functions  as  implemented  by 
QuickBASIC,  Turbo  BASIC,  True  BASIC, 
and  Better  BASIC. 

QuickBASIC  2.0  permits  user-de¬ 
fined  nonrecursive  functions  to  ex¬ 
tend  over  multiple  lines.  All  function 
names  must  start  with  the  letters  FN. 
A  data  type  symbol  may  be  required, 
depending  on  the  data  type  returned 
by  the  function  and  any  global  de¬ 
fault  name  declarations  used.  A  func¬ 
tion  has  an  optional  list  of  scalar  ar¬ 
guments  (no  arrays  are  allowed)  with 
all  the  arguments  passed  by  value. 
Multiline  functions  end  with  an  END 
DEF  statement  and  can  use  EXIT  DEF 
to  exit  from  the  function. 

Most of the new BASIC dialects allow passing of all arguments by
value. A function with no arguments that returns a simple value is
one way of implementing Pascal-like constants; you cannot
accidentally alter the function's value, as may be the case with a
variable.
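
The constant-as-function idea can be sketched outside BASIC as well. The following Python fragment is an illustration only; max_items and fits are hypothetical names, and Python's rules differ from BASIC's (a Python function name can still be rebound), so treat it as an analogy rather than a translation. What carries over is that the value is read through a call, and a call cannot be the target of an assignment.

```python
def max_items():
    # Plays the role of a named constant: callers read it by calling it.
    return 100


def fits(count):
    # Uses the "constant" the same way a variable would be used.
    return count <= max_items()


def assignment_is_rejected():
    # Assigning to a call expression is a syntax error, so the value
    # cannot be altered by a stray assignment statement.
    try:
        compile("max_items() = 5", "<demo>", "exec")
    except SyntaxError:
        return True
    return False
```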

Nonparameter variables that are used within the functions are
global. To avoid undesirable side effects and to localize such
variables, include them in a STATIC declaration. This technique
ensures that the variables named in each STATIC declaration are
assigned their own addresses, separate from any global variables
of the same name.
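
The side effect that STATIC guards against can be shown with a small sketch. It is written in Python, not QuickBASIC, and the names (counter, fact_shared, fact_local, demo) are hypothetical. fact_shared uses a global loop variable, as a DEF FN function without a STATIC list would; fact_local keeps its loop variable private.

```python
counter = 99  # a variable that belongs to the "main program"


def fact_shared(n):
    # Loop variable is the global one: the caller's value is clobbered.
    global counter
    result = 1
    for counter in range(2, n + 1):
        result *= counter
    return result


def fact_local(n):
    # Loop variable is private to the function: no side effect.
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def demo():
    """Return both factorials and the value of counter after each call."""
    global counter
    counter = 99
    a = fact_local(5)
    after_local = counter
    b = fact_shared(5)
    after_shared = counter
    return a, after_local, b, after_shared
```

Both functions compute 5! correctly, but only the shared version changes the main program's counter.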

Turbo BASIC functions are similar to those of QuickBASIC, with the
following differences:

• Turbo BASIC functions can be recursive.

• Local variables are declared inside functions using the LOCAL
keyword.

• The SHARED keyword is used to explicitly declare global
variables.


Turbo BASIC offers LOCAL and STATIC declarations, which give
programmers the ability to explicitly declare local nonstatic
variables. In Turbo BASIC, local arrays are first declared in the
LOCAL list and then dynamically dimensioned using DIM DYNAMIC.


True BASIC implements user-defined functions in a slightly
different way from that of the previous two BASIC dialects. First,
function names do not have to start with the letters FN. The price
to pay, however, is the use of DECLARE DEF declarations to inform
True BASIC of the function


DEF FNFACT(N%)
  ' QuickBASIC nonrecursive factorial function
  STATIC I%, F    ' static local variables
  F = 1           ' Initialize
  ' loop to get factorial
  FOR I% = 2 TO N%
    F = F * I%
  NEXT I%
  FNFACT = F
END DEF


Example 1: QuickBASIC listing for a nonrecursive factorial function


DEF FNFACT(N%)
  ' Turbo BASIC recursive factorial function
  IF N% > 1 THEN
    FNFACT = N% * FNFACT(N%-1)
  ELSE
    FNFACT = 1
  END IF
END DEF


Example 2: Turbo BASIC listing for a recursive factorial function


DEF Fact(N)
  ! True BASIC recursive factorial function
  IF N > 1 THEN
    LET Fact = N * Fact(N-1)
  ELSE
    LET Fact = 1
  END IF
END DEF


Example 3: True BASIC listing for a recursive factorial function


REAL FUNCTION: Fact
  INTEGER ARG: N
  EXTERNAL: Fact

10 IF N > 1 THEN RESULT = N * Fact(N-1) ELSE RESULT = 1
END FUNCTION


Example 4: Better BASIC listing for a recursive factorial function


REAL FUNCTION: LOGN
  REAL ARG: X
  REAL ARG: BAISE/OPT=10

10 RESULT = LOG(X) / LOG(BAISE)
END FUNCTION


Example 5: Simple logarithm function in Better BASIC that demonstrates
the default parameter feature




names imported from external libraries and modules. True BASIC
supports recursive multiline functions that take scalar- and
array-type parameters. All the function parameters are passed by
value.

In essence, True BASIC supports two levels of functions: internal
and external. Internal functions are those located in the main
program, before the unique END statement. All the variables in an
internal function that are not in its argument list are global,
which enables internal functions to create and manipulate global
variables. External functions are located either after the END
statement of the main program or in an external library or module
file. External functions defined in modules can access information
through the argument list, SHARED variables, and PUBLIC variables.
External functions in libraries have a strict data interface
because they rely mainly on the argument lists. Argument lists in
True BASIC can contain a file I/O channel number that allows file
I/O, which provides another method of accessing large amounts of
data stored in intermediate files.

Better BASIC approaches the implementation of functions in a
Pascal-like fashion: the type of the function or its arguments is
explicitly declared using data-type keywords and not symbols. Much
of the discussion about Better BASIC procedures that appeared in
the May column applies to functions. In addition:

• Function names do not need to start with the letters FN.

• The function type and name are declared on a separate line. This
causes Better BASIC to respond interactively by creating a new
workspace for the function and by displaying the memory available
for it. Consequently, functions can use any range of valid line
numbers without conflicting with other functions, procedures, and
the main program.

• Parameters are declared by first listing their type, the keyword
ARG:, and the parameter name. As with Better BASIC procedures,
parameters can be assigned default values.

• The result of the function is returned using the standard
identifier RESULT.

• Recursive functions need to declare the function and any local
variables as external. This enables Better BASIC to allocate new
addresses instead of using the same ones as in the calling
function.

Examples 1-4 show versions of the factorial function written in
each of the BASIC implementations discussed here. Recursive
versions are used with all implementations except QuickBASIC.
Example 5 shows a simple Better BASIC function that returns the
logarithm to any base. The default is base 10. That is, using the
function LOGN(X) returns the base-10 logarithm, and LOGN(X,BAISE)
returns the logarithm of X to base BAISE.
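
For readers without Better BASIC at hand, the default-parameter behavior of LOGN can be approximated in Python. This is a sketch under the assumption that an omitted second argument should behave exactly as if 10 had been passed; the name logn and the parameter name base are chosen for the illustration and are not taken from the column.

```python
import math


def logn(x, base=10.0):
    # Change-of-base formula: log_base(x) = ln(x) / ln(base).
    # Omitting base gives the common (base-10) logarithm.
    return math.log(x) / math.log(base)


if __name__ == "__main__":
    print(logn(100.0))      # base defaults to 10
    print(logn(8.0, 2.0))   # explicit base 2
```

The formula is the same one used on line 10 of Example 5; only the way the default is declared differs.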

DDJ 



PROGRAMMER'S  SERVICES 

OF  INTEREST 


Languages 

Computer Crossware Labs has introduced Real BASIC, a BASIC
interpreter for the Atari 520 and 1040 ST that executes BASIC code
20 to 100 times faster than ST BASIC while maintaining full
compatibility. It contains an in-line Motorola-compatible
assembler that allows you to switch to assembly language without
leaving the interpreted BASIC environment and to have full access
to BASIC variables while in assembly mode. Real BASIC's features
include an integrated, full-screen editor that supports ST BASIC
commands and some Micro Emacs commands; extended graphics
instructions; and function calls that deal with the mouse and
joystick ports. Real BASIC sells for $69.95. Reader Service No. 16.
Computer  Crossware  Labs  Inc. 

516  Fifth  Ave.,  Ste.  507 
New  York,  NY  10036 
(212)  677-3686 

Digitalk has released two optional extension kits to accompany
Release 1.2 of Smalltalk/V for PC-DOS machines. The first kit
integrates full EGA color capabilities into the Smalltalk/V
environment, greatly enhancing the system's bit-mapped graphics.
The second kit, called Goodies, offers several new programming
kernels and lets you extend Smalltalk to handle applications that
require discrete event simulation, forward-chaining inference
operations, and connection to external sensors and
instrumentation. Release 1.2 of Smalltalk/V is priced at $99, and
updates to 1.1 can be downloaded from CompuServe and BIX or
obtained on disk from the company for $10. The optional extension
kits cost $49 each. Reader Service No. 17.


Digitalk  Inc. 

5200  W.  Century  Blvd. 

Los  Angeles,  CA  90045 
(213)  645-1082 

Rational Visions is shipping a full-featured PROLOG programming
system for the Atari ST. The system uses the Edinburgh standard
syntax, making it compatible with most popular tutorials on
PROLOG. Features include an Emacs-style text editor within the
system's interpreter that allows you to interact with external
disk files and the internal knowledge base or to invoke PROLOG
goals from within the editor; a built-in grammar-rule translator
that allows the development of natural-language interfaces;
support for floating-point values; and extended math functions.
Also, all of PROLOG's metaprogramming primitives have been
implemented, as have Atari-specific features such as GEMDOS
primitives and interfaces to VDI and AES. The system is not
copy-protected and supports user-written applications free of
licensing restrictions. The package sells for $39.95. Reader
Service No. 18.

Rational  Visions 

7111  W.  Indian  School  Rd.,  Ste.  131 
Phoenix,  AZ  85031 
(602)  846-0371 

Datalight has introduced Optimum-C, a full-featured global
optimizing C compiler for IBM PCs and compatibles that supports
the Unix System V C language along with several proposed ANSI
extensions. Features include 8087 and software floating-point
support, strong type checking, ROMable code generation, a MAKE
program, MS-DOS object file format, Lattice C compatibility, and
full library/start-up source code. Support is provided for
compact-, small-, medium-, and large-memory models. Source
licenses and support contracts are available for the compiler.
Optimum-C sells for $139. Reader Service No. 19.

Datalight 
P.O.  Box  82441 
Kenmore,  WA  98028 
(206)  367-1803 

Tools 

Devpac Amiga, from HiSoft, is a 68000 program development system
for the Amiga computer that features a combined editor and
assembler and a symbolic disassembler/debugger. The full-screen
editor can be used from either the mouse or the keyboard. The
assembler is a complete, fast macro assembler that supports
include files read from disk, conditional assembly, and
Motorola-standard macro handling and that can produce executable
or linkable code. The debugger includes all the expected commands,
such as breakpoints and single-stepping, and also allows you to
use your original symbols when debugging programs. It uses its own
screen for its display so as not to disturb that of other
programs. Devpac Amiga is available in the United States from Apex
Resources. It runs on any Amiga with at least 512K RAM and costs
$99.95. Reader Service No. 20.

Apex  Resources 
129  Sherman  St. 

Cambridge,  MA  02140 
(800)  343-7535 
In  MA  (617)  876-2505 

Numerical Algorithms Group (NAG) now offers 172 selected routines
from its extensive NAG Fortran Library for use on workstations and
personal computers. The NAG Fortran Workstation Library provides
the routines most frequently used by engineers and PC programmers.
Each routine includes an example program, the source, and data
results to illustrate the routine's usage. The NAG Fortran
Workstation Library is available for the DEC MicroVAX, IBM PC
line, and Sun workstations. License fees range from $1,296 for a
single workstation to $384 each for 11 or more. Reader Service
No. 21.
Numerical  Algorithms  Group  Inc. 
1101  31st  St.,  Ste.  100 
Downers  Grove,  IL  60515 
(312)  971-2337 

P-tral, from Woodchuck Industries, is a native code
BASIC-to-Pascal translation package that translates IBM BASICA/MS
BASIC to Turbo Pascal. The program is interactive and lets you
pick out or name subroutines as well as rename variables not
fitting Pascal criteria. The program works best with a hard disk
and requires MS-DOS or PC-DOS 2.1 or later with ANSI.SYS. P-tral
sells for $179. Reader Service No. 22.

Woodchuck  Industries  Inc. 

340  W.  17th  St.,  #2B 
New  York,  NY  10011 
(212)  924-0576 

RTC Plus, a FORTRAN- and RATFOR-to-C translator package from
Cobalt Blue, allows you to tap old FORTRAN code for new C
development. The translator is designed for translating non-I/O
FORTRAN libraries and code in which I/O is concentrated in a few
routines. Also, more than 95 percent of STUG's RATFOR statements
are supported by RTC Plus for complete translations of clean
RATFOR code. RTC Plus runs under MS-DOS, Version 2.2 or later and
costs $325. Reader Service No. 23.

Cobalt Blue
1683  Milroy,  Ste.  101 
San  Jose,  CA  95124 
(408)  723-0474 

TurboPower Software has released an upgrade to T-DebugPLUS, a
symbolic run-time debugger for Turbo Pascal. The new version
(1.04) allows Turbo Pascal programmers to debug code in overlays
and access CPU registers and memory. Other improvements include
easier customization of the debugger and a new MAKELST utility
program that creates a commented disassembly listing of Turbo
Pascal programs. T-DebugPLUS requires Turbo Pascal 3.0 and a
PC-DOS machine with 256K RAM. It sells for $60 ($10 for upgrade).
Reader Service No. 24.

TurboPower  Software 
3109  Scotts  Valley  Dr.,  Ste.  122 
Scotts  Valley,  CA  95066 
(408)  438-8608 

Laney Systems has released StruBAS 2.0, a structured BASIC toolkit
for application development with QuickBASIC and IBM BASIC 2.0
compilers in network and single-user environments. Integrated
tools include enhanced structured programming facilities, screen
handling, a Btrieve interface, StruBAS/ISAM for those not wishing
to buy Btrieve, an object library of utility subroutines, and
utility programs. A preprocessor expands BASIC with new commands
to produce a more readable and structured language while making
all BASIC's features and constructs available as well. StruBAS
programs are translated by the preprocessor to compilable BASIC,
which is then compiled and linked to produce executable programs.
StruBAS is not copy-protected, and no royalties are charged for
programs written using the package. It requires a PC-DOS machine
with 320K RAM and a hard disk and sells for $495. Reader Service
No. 25.

Laney  Systems  Inc. 

3  Office  Park  Dr.,  Ste.  105 
Little  Rock,  AR  72211 
(501)  225-7755 

BASIC Program Analyzer, from the Expert Systems Division of Expert
Information Systems, is a rule-based system that reads Microsoft's
GW-BASIC program files in ASCII format and performs analysis,
listing, and cross-reference functions upon them. The analyzer
prints individually tabulated cross-references of line numbers,
variables, keywords, quoted strings, and numerical constants found
in the programs. Error, warning, and comment messages are inserted
where applicable into tabulated cross-references, and an ASCII
file is produced of all portions of the printout, including the
cross-references. Optional parameters can be used to modify some
of the analyzer's operations, such as testing for compiler syntax
compatibility and redirecting output. BASIC Program Analyzer is
available for most MS-DOS computers and costs $99.95. Reader
Service No. 26.

Expert  Information  Systems  Inc. 
Expert  Systems  Division 
P.O.  Box  1310 
El  Campo,  TX  77437-1310 
(409)  543-9222 

Libraries 

United States Software Corp. has released a 68HC11 floating-point
library. This complete math package operates on the 68HC11, is
designed in a modular format, and is delivered in source assembly
language for maximum flexibility. The library conforms to the IEEE
754 Floating Point Standard. The 68HC11 FPAC/DPAC is available in
either single-precision or double-precision formats. The library
includes trigonometric, logarithmic, and exponentiation functions,
and data-conversion and floating-point utility procedures. It is
available on IBM PC-compatible media for a one-time fee of $950.
Reader Service No. 27.

United  States  Software  Corp. 

14215  N.W.  Science  Park  Dr. 

Portland,  OR  97229 
(503)  641-8446 

LINLIB, from Information and Graphic Systems, is a library of C
routines for developers of scientific and statistical software.
The package performs all basic operations on vectors and matrices,
with no limitation on size of vector or matrix. Of major
importance are the routines that factor matrices (LU, QR, and
Cholesky factors). These factorizations enable you to solve any
kind of system of equations: underdetermined, determined, and
overdetermined. Also included in LINLIB are routines based on the
B-spline for manipulating splines. LINLIB sells for $150. Reader
Service No. 28.

Information  and  Graphic  Systems 
15  Normandy  Ct. 

Atlanta,  GA  30324 
(404)  231-9582 


DDJ 




FORUM 


SWAINE'S  FLAMES 


I think I may be the first person who has ever actually seen a
computer bug.

It came to me at the end of a long and disillusioning night when
my body, no doubt in sympathy with the new punitive tariffs on
Japanese goods, began rejecting a dinner of overripe sushi. En
route from bathroom to loft, I heard an odd tapping sound from the
office and stopped to investigate.

In my collywobbled state I had neglected to turn off the
Macintosh, and what I saw in the glow from the screen was a bug
(it looked a lot like a cockroach) repeatedly climbing to the top
of the Mac and leaping to the keyboard to enter a character. The
exertion must have been tremendous, and he looked pretty beat when
he finally finished and crawled into the disk drive. I'm not
surprised he didn't bother with capitalization or punctuation (the
bottom row of keys was almost too long a jump for him).

He  left  this  behind. 

boss  i  m  afraid  that 

the  computer  curmudgeon  john 

dvorak  has  been  talking 

about  our  beloved  dr  dobb  s  journal 

of  whatever  again 

if  dr.  dobb  s  became 

outspoken  it  would  be 

dangerous  dvorak  said  in  one  of 

those  tasty  magazines  you 

leave  lying  around  this  office  i 

wish  you  would  leave  a 

crust  of  bread  i  m  getting  tired 

of  paper  and  glue 

i  wonder  does  he  mean 

like  that  1940s  science  fiction 

writer  was  dangerous 

when  he  explained  how 

an  a  bomb  would  work  or  like  the 

white  house  press  corps  would  be 

dangerous  if  it  decided  it 

wasn  t  the  white  house  s  press  corps 


or  like  sesame  street  would  be 
dangerous  if  it  taught  children  to 
question  their  teachers 
and  their  parents 
or  maybe  like  infoworld  was 
dangerous  when  it  was 
publishing  dvorak 

sometimes  i  think  that  john 
dvorak  is  a  modeless 
dialog  box  the  max 
headroom  of  the 
personal  computer  industry 

how  could  dr  dobb  s  become 
dangerous  is  what  i 
d  like  to  know 

maybe  by  examining  legislator  s 
voting  records  on  software 
issues  or  by  doing  a  study  of 
compiler  pricing  vs  value  or  by 
publishing  detailed  specs  on 
data  formats  of 

commercial  products  so  that  data 
held  hostage  by  obsolete 
applications  could  be  liberated 
or  by  showing  how  to  develop 
software  for  the  386 
without  waiting  for  you 
know  who  or  by  questioning 
the  need  for  an  operating  system 
on  a  personal  computer  at  all 

i  used  your  modem  to 

call  dvorak  s  number  and  talked 

with  his  dog 

sparky  maybe  dr  dobb  s  should 

list  the  companies 

that  preannounce  products  to 

outflank  their  competitors 

sparky  suggested 

or  do  their  beta  testing  in 


release  version  one  point  oh 
or  cripple  their  products  with 
copy  protection  tricks 
or  refuse  to  eliminate 
known  bugs  what  s  this  about 
bugs  i  asked  him  but 
he  just  barked 

one  thing  i  know  is  that  dr 
dobb  s  could  stand  to 
be  a  little  more  rough  and 
tumble  a  bit  more  capricious  and 
corybantic  there  s  a 
dance  in  the  old  dame  yet 
and  if  that  s  dangerous  it  s 
fine  by  me  please  save 
this  file  as  i  can  t  work 
the  mouse  nor  would  i  want  to  i 
haven  t  always  had  the  best  of 
relations  with  mice  yours  archy 

Scarab journalism was invented by Don Marquis, a New York Sun
columnist, when he found a cockroach composing in his office late
one night some seventy years ago, learned that the bug was a
reincarnated vers libre poet, and cannily conned the poet roach
into meeting copy deadlines for him for the next ten years.

The saga of archy the cockroach itself allegedly owes its
existence to the newspaper account of a typewriting rat who plied
his or her art by night circa 1916 in a Dobb's Ferry, New York,
garage. By some odd symmetry in the transmigration of souls, a
Dobb has once again played host to verse vermin.

Fine  by  me. 


Michael  Swaine 
editor-in-chief 




#129  JULY  1987 


2.95 (3.95 CANADA)


Dr. Dobb's Journal of Software Tools

FOR  THE  PROFESSIONAL  PROGRAMMER 


386  DEVELOPMENT  TOOLS: 
Within  Your  Lifetime 


Optimizing  8088  Code 
Rules  for 

Software  Developers 

Workarounds  for 
PROLOG 

Curses  for 
MS-DOS 

Languages: 

A  Curses  Package  in  C 
Forth:  Rules  to  Code  By 
Multitasking  in  Turbo  Pascal 
Inside  a  LISP  Machine 
BASIC  Data  Types 


JULY  1987 


CONTENTS 


VOLUME  12,  ISSUE  7 


ARTICLES

80386 PROGRAMMING: Developing 386 Applications Today
by Richard Relph
Despite what you may have heard, programming for the 80386
doesn't require waiting for a 386 version of Microsoft's OS/2.
Richard examines some of the development options currently
available.

CODING: 8088 Assembly-Language Programming Techniques
by Thomas Disque
Some not-so-obvious tricks to make 8088 code fast and tight

LANGUAGES: Logic and Knowledge Representation in PROLOG
by Richard Butrick
The differences between formal logic and PROLOG can lead to some
nasty surprises. Richard offers some warnings and workaround
solutions.

LANGUAGES: Multitasking With Turbo Pascal
by Craig A. Lindley
Turbo Pascal's procedures and stack handling offer a mechanism
for implementing a nonpreemptive multitasking scheme.


COLUMNS

C CHEST
by Allen Holub
Allen solves another problem in moving programs between MS-DOS
and Unix machines with a curses windowing package for MS-DOS.

16-BIT SOFTWARE TOOLBOX
by Ray Duncan
Ray takes a look at some of the tools available for developing
80386 applications. He also reviews some advanced Unix books and
resumes the discussion of assembly languages versus high-level
languages.

STRUCTURED PROGRAMMING
by Michael Ham
Mike offers some rules for software design based not on theory but
on experience.

ARTIFICIAL INTELLIGENCE
by Ernest R. Tello
In the first of two columns on the Xerox 1186 AI workstation,
Ernie examines the hardware and working environment of this new
LISP machine.


FORUM

EDITORIAL
by Michael Swaine

RUNNING LIGHT
by Tyler Sperry

ARCHIVES

LETTERS
by you

SWAINE'S FLAMES
by Michael Swaine


PROGRAMMER'S SERVICES

DR. DOBB'S CATALOG:
DDJ books and software

ADVERTISER INDEX:
Where to find those ads

BOOKS:
The Connection Machine and The C++ Programming Language

THE STATE OF BASIC:
Expanding the set of data types

OF INTEREST:
Products for programmers




About  the  Cover 

Most  of  the  props  for  this  shot, 
with  the  probable  exception  of 
the  skeleton,  are  items  that  any 
programmer  may  have  lying 
around.  The  real  killer,  of  course, 
is  the  missing  software.  (Special 
thanks  to  Donald  Knuth  for  an 
outstanding  reference  and  to 
Dennis  Brothers  for  the  brass  rat.) 


This  Issue 

Are you suffering from those
wait-another-year-for-the-gang-in-Redmond blues? This month both
Richard Relph and Ray Duncan examine some of the 386 development
tools already available.


Next  Issue 

Next  month's  issue  has  a  special 
emphasis  on  tools  for  C 
programmers.  The  lead  article 
examines  both  the  pleasant  and 
troublesome  surprises  with  the 
coming  ANSI  standard  C  and 
offers  some  prescriptions  and 
preventive  measures. 


Dr.  Dobb's  Journal,  July  1987 



FORUM 


EDITORIAL 


This month's cover whimsically illustrates the frustration some
programmers have expressed over the delivery by IBM of 286- and
386-based machines designed to run Microsoft's OS/2 multitasking
operating system when there will not be a commercially available
OS/2 this year.

The announced OS/2 timetable is: (1) Software Development Kits in
developers' hands by next month, including kernel software,
development languages and tools, and specs for the Windows
Presentation Manager and LAN Manager; (2) Windows Presentation
Manager and LAN Manager shipped as SDK updates by November; (3)
OS/2 kernel shipped to OEMs by the end of the year; and (4)
delivery dates for Windows Presentation Manager and LAN Manager to
OEMs and anything at all to end users to be announced. That's for
the 286/386 version of OS/2; no timetable has been announced for
the later release that will take advantage of all the capabilities
of the 386.

How you view that schedule depends, perhaps, on whether you view
OS/2 as the operating system of the 1990s or as Microsoft's
response to the AT.

Last year I watched a number of technical journalists racing with
Bill Gates to write a program to perform a simple task. Storm the
Gates, the contest was called, and Bill won it, using QuickBASIC,
hands down. As I see it, Bill's current challenge is to get OS/2
out before Don Knuth completes The Art of Computer Programming,
and that may be a tougher challenge.

Meanwhile, you don't have to wait for OS/2 to begin writing
software for the 386 machines. Our lead article this month details
some ways in which you can get started today.

And, meanwhile... well, there are a lot of meanwhiles. Digital
Research has reorganized, with a new CEO and with founder Gary
Kildall taking a more active role after a hiatus of three years.
DRI has ported its Concurrent DOS to the 386 and also has a
real-time OS, FLEXOS 386, in beta test now. DRI's products have
been a lot better than the company's marketing in the past; we'll
see if that continues. THEOS Software has ported its multiuser,
multitasking THEOS operating system to the 386. AT&T's Unix and
Microsoft's Xenix are converging to one product for the 386.
Microsoft perceives the 386 multiuser market to be small, but
there are a lot of single-user Unix workstations that may be
replaced by 386 machines, and there will be pressure to put Unix
on those 386s.

Still,  the  most  common  operating 
system  on  386  machines  in  1988  will 
probably  be  MS-DOS  3.x. 

None of this should be taken as evidence that we don't think OS/2
will be important for 286 and 386 machines. We are sending people
to the developer conferences and are even setting up an OS/2
development lab in our offices. DDJ editors who have worked with
OS/2 say it is an impressive piece of work. We do think OS/2 will
be important, when it arrives.

In my remaining space I'd like to introduce DDJ's new editor,
Tyler Sperry. Tyler meets all my criteria for a good editor: good
technical credentials, an appreciation for the effective use of
English, willingness to work long hours for low pay. . . . Tyler
is every bit as wonderful as OS/2, and he's here now. In fact, you
can meet him on page eight.

Michael  Swaine 
editor-in-chief 


Dr.  Dobb’s  Journal  of 

Software  Tools 

FOR  THE  PROFESSIONAL  PROGRAMMER 


Editorial 

Editor-in-Chief  Michael  Swaine 
Editor  Tyler  Sperry 
Managing  Editor  Vince  Leone 
Assistant  Editors  Sara  Noah  Ruddy 
Levi  Thomas 

Technical  Editor  Allen  Holub 
Consulting  Editor  Nick  Turner 
Contributing  Editors  Ray  Duncan 
Michael  Ham 
Bela  Lubkin 
Namir  Shammas 
Ernest  R.  Tello 

Copy  Editor  Rhoda  Simmons 
Production 

Production  Manager  Bob  Wynne 

Art  Director  Michael  Hollister 
Assoc.  Art  Director  Joe  Sikoryak 
Technical  Illustrator  Frank  Pollifrone 
Cover  Photographer  Michael  Carr 
Circulation 

Circulation  Director  Maureen  Kaminski 
Newsstand  Sales  Mgr.  Stephanie  Ericson 
Book  Marketing  Mgr.  Jane  Shaminghouse 
Circulation  Coordinator  Kathleen  Shay 
Administration 
Finance  Director  Kate  Wheat 
Business  Manager  Betty  Trickett 
Accounts  Payable  Supv.  Mayda  Lopez-Quintana 
Accts.  Receivable  Supv.  Laura  Di  Lazzaro 
Advertising  Director 
Ferris  Ferdon  (415)  366-3600 
Account  Managers 
see  page  117 

Promotions/Srvcs.  Mgr.  Anna  Kittleson 
Advertising  Coordinator  Charles  Shively 
Associate  Publisher 
Michael  Swaine 
Assistant  Sara  Noah  Ruddy 

Dr.  Dohb’s  Journal  of  Software  Tools  (USPS  307690) 
is  published  monthly  by  M&T  Publishing  Inc.,  501  Gal¬ 
veston  Dr.,  Redwood  City,  CA  94063;  (415)  366-3600. 
Second-class  postage  paid  at  Redwood  City  and  at  ad¬ 
ditional  entry  points.  DDJ  is  published  under  license 
from  People’s  Computer  Company,  2682  Bishop  Dr., 
Suite  107,  San  Ramon,  CA  94583,  a  nonprofit 
corporation. 

Article  Submissions:  Send  manuscripts  and  disk 
(with  article  and  listings)  to  the  Editor. 

DDJ  on  CompuServe:  Type  GO  DDJ 
Address  Correction  Requested:  Postmaster:  Send 
Form  3579  to  Dr.  Dobbs  Journal,  P.O.  Box  27809,  San 
Diego,  CA  92128.  ISSN  0888-3076 

Customer Service: For subscription problems call: outside CA (800) 321-3333; in CA (619) 485-9623 or 566-6947. For book/software order problems call (415) 366-3600.

Subscriptions:  $29.97  per  1  year;  $56.97  for  2  years. 
Canada  and  Mexico  add  $27  per  year  airmail  or  $10 
per  year  surface.  All  other  countries  add  $27  per  year 
airmail.  Foreign  subscriptions  must  be  prepaid  in  U.S. 
funds  drawn  on  a  U.S.  bank.  For  foreign  subscriptions, 
TELEX:  752-351. 

Foreign  Newsstand  Distributor:  Worldwide  Media 
Service  Inc.,  386  Park  Ave.  South,  New  York,  NY  10016; 
(212)  686-1520  TELEX:  620430  (WUI). 

Entire contents copyright © 1987 by M&T Publishing Inc. unless otherwise noted on specific articles. All rights reserved.


M&T Publishing Inc.

Chairman  of  the  Board  Otmar  Weber 
Director  C.  F.  von  Quadt 
President  and  Publisher  Laird  Foshay 




Dr. Dobb's Journal, July 1987



FORUM 


RUNNING  LIGHT 


ARCHIVES 


By way of introduction, let me say that the image on the right may look like the new editor of DDJ, but it isn't. Sure, we both share the improbable name of Tyler Sperry and we're both generally the same size and shape. But don't be fooled. Anyone who knows me will instantly recognize the fellow in the photograph as an imposter because he's wearing (horrors!) a tie. As a reformed hacker, I tend toward the view that t-shirt and jeans are adequate attire for nearly any social occasion.

Besides,  I’m  much  better  looking 
than  he  is.  (Trust  me.) 

If you were to visit the real me at the DDJ offices, you'd probably be shocked by a few other conflicts between that picture and reality. In contrast to the benign, seamless background of that photo, I spend my days (and sometimes nights) in an office crammed with books, papers, computers, and software. In short, it's your typical programmer's nest combined with editorial clutter.

Because this would be your first visit with the new editor, we'd probably spend most of our time getting to know each other. I'd tell you some wildly funny (and completely unprintable) stories culled from over 15 years of playing with computers. Stories about my hardware days playing with the innards of video games or my time working in Kaypro's engineering department. There'd be some libelous stories from my stint as editor and publisher of Profiles, Kaypro's user magazine.

After  all  that  talk  about  me,  I’d 
break  out  a  couple  of  cans  of  Jolt  and 
we’d  start  talking  about  you  and 
your  interests.  (I’m  always  on  the 
lookout  for  new  writers  for  DDJ.) 


We'd discuss the Intel and Motorola processors, and I might even confess some of my youthful indiscretions with the RCA 1802 and National's 16000 family.

DDJ being the kind of magazine it is, sooner or later we'd have a religious argument about languages. You'd probably start innocently enough with a comment about your favorite language, and I'd talk about how I've tried a dozen computer languages from ALGOL to Z80 assembly and never found one that satisfied all my desires. If that didn't work, I'd try provoking you by admitting that I do most of my programming in either Forth or assembly language. Eventually you'd rise to the bait and I'd have snookered you into writing an article that demonstrated the virtues of your favorite language in a way that none of our readers had ever seen before. A conniving lot, these DDJ editors.

Of course, you've only been visiting me metaphorically, so you'll have to take my word for how funny the stories are. And you'll also have to remind me by letter, CompuServe (76703,4266), or phone ([415] 366-3600) of what sort of article you were planning to write for us and if you need one of our writer's kits. The October issue will be all set by the time you read this, but there's probably still time to get an article into our November graphics issue. We're always looking for good articles, whether or not they reflect a particular issue's focus.

Now,  if  you’ll  excuse  me,  I  have  a 
magazine  to  get  back  to.  Thanks  for 
stopping  by,  and  stay  in  touch. 


Tyler  Sperry 
editor 


Windows

"Three examples: VisiCorp, Microsoft with Windows, Digital Research with Concurrent DOS. In all three cases we have consistent out-of-schedule behavior. These were basically all companies that went from very small development groups to significantly larger development groups, and probably without project management." -- Giacomo Marini, quoted in The Software Designer, DDJ, October 1984.

Learning from Mistakes

"One of the major frustrations in my life is my inability to remember to use facts and ideas I know perfectly well. When I look at programs written by my friends, I suspect the affliction may be well-nigh universal." -- Programming Pastimes and Pleasures, Charles Wetherell, DDJ, February 1979.

"Have you ever put two (or more) memory boards into the same address location? If they are in RAM, the computer will work fine, except that there will be a lack of memory. This program scans all addresses and displays a map of memory as seen by the processor. This 'lack of memory' will be caught quickly and hopefully save some time from head-scratching." -- "Memory Map Program for the Z80/8080," Robert Alkire, DDJ, January 1979.

Ten Years Ago in DDJ

"In the early days of personal computing (i.e. last year) hobbyists were happy to be able to display anything, jerkily or otherwise. Today, with lots of video displays available, byters can afford to be choosy. One of the features I'm looking for in my next video display is linear scrolling, where the whole page moves up one raster line at a time. Has anybody seen one?" -- Jim Day, DDJ, June/July 1977.

"A San Francisco dealer in used computer equipment announced at the July 20th Homebrew Computer Club meeting that he has dozens of properly functioning 2311-type hard-disc drives available for $350. Several people in both northern and southern California are currently completing controllers for these *7 1/4 megabyte* drives that will plug into a S-100 bus." -- DDJ, June/July 1977.

"TRIPPY COMPANY NAME NOTED

"Joining the ranks of such straight-laced companies as Parasitic Engineering and Barefoot Computer Store, we recently noted a new California company, specializing in repairing sickly equipment: Micro-Mouse. Now, with a name like that, you *know* they gotta be good people." -- DDJ, June/July 1977.


[Illustration: the magazine's original masthead, "Dr. Dobb's Journal of Computer Calisthenics & Orthodontia: Running Light Without Overbyte"]




FORUM

LETTERS 


Artificial  Neural  Network 
Pioneers 

Dear  DDJ, 

I read with considerable interest "An Artificial Neural Network Experiment" by Robert Brown (April 1987). This article is a classic example that nothing is really new; except for the modern implementation using state-of-the-art hardware, virtually everything described in this article was first published in the early 1960s.

In 1961 and 1962, Dr. Bernard Widrow of Stanford University first published the basic theory described in this article; he reduced it to practice in a device known in those days as Adaline and even formed a corporation to manufacture the adaptive elements used in the Adaline device.

Adaline was implemented using essentially the circuit described in Figure 4 of the article and used electrochemical cells whose resistance could be modified by plating or stripping material from the cell plates. Thus, an adaptive network exactly as described in the article was achieved. The mathematics involved was identical to that detailed in the article.
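
The adaptive rule usually associated with Adaline is Widrow and Hoff's least-mean-squares (LMS) update, in which each weight is nudged in proportion to the output error and to its own input. The following is only a minimal software sketch of that rule, not the circuit from the article or Widrow's hardware; the two-input size, the -1/+1 coding, the logical-AND training set, and all names are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define N_INPUTS 2              /* assumed size; the device described used a 4 x 4 array */

typedef struct {
    double w[N_INPUTS];         /* adjustable weights (the "cell resistances") */
    double bias;                /* threshold weight */
} adaline;

/* Weighted sum of the inputs plus the bias. */
double adaline_output(const adaline *a, const double x[N_INPUTS])
{
    double y = a->bias;
    size_t i;
    for (i = 0; i < N_INPUTS; i++)
        y += a->w[i] * x[i];
    return y;
}

/* One LMS step: move each weight by rate * error * input. */
void adaline_train_step(adaline *a, const double x[N_INPUTS],
                        double target, double rate)
{
    double err;
    size_t i;
    assert(rate > 0.0);
    err = target - adaline_output(a, x);
    for (i = 0; i < N_INPUTS; i++)
        a->w[i] += rate * err * x[i];
    a->bias += rate * err;
}

/* Train on logical AND, with inputs and targets coded as -1/+1. */
void adaline_train_and(adaline *a, int epochs, double rate)
{
    static const double xs[4][N_INPUTS] = {
        {-1.0, -1.0}, {-1.0, 1.0}, {1.0, -1.0}, {1.0, 1.0}
    };
    static const double ts[4] = {-1.0, -1.0, -1.0, 1.0};
    int e;
    size_t p;
    for (e = 0; e < epochs; e++)
        for (p = 0; p < 4; p++)
            adaline_train_step(a, xs[p], ts[p], rate);
}
```

Starting from zero weights, a call such as adaline_train_and(&a, 200, 0.05) should leave the unit with a positive output for the input (1, 1) and negative outputs for the other three patterns.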

Adaline was reduced to practice in a 4 x 4 array of cells and was demonstrated to show pattern recognition characteristics identical to those described in Table 1 of the article; an expanded Adaline (Madaline) was demonstrated to learn how to balance a broom on a model railroad car after being taught for a few minutes by an operator. The algorithm was implemented on a computer and demonstrated to have speech recognition characteristics that were remarkable for the state of the art existing at that time (1962).

Widrow is well published in this area; for example, see "Rate of Adaption in Control Systems," American Rocket Society Journal (September 1962): 1,378-1,385.

Subsequent to Dr. Widrow's work, a major extension of this work was published by Dr. John Voevodsky, which deals extensively with the neurological structure and its relationship to the work by Dr. Widrow.

I was a student of Dr. Widrow at the time the Adaline work was in process and have wondered over the years why nothing became of it. Much of the inhibition was caused by the cold water poured on this topic by Marvin Minsky in his book Perceptrons in 1969; the lack of suitable technology to scale the concept up to useful levels was also a practical inhibitor.

Dr. Voevodsky has continued to champion the concept over the years and has recently formed a corporation to advance this technology (see Electronic Engineering Times, [March 2, 1987]: 24).

I applaud Robert Brown for presenting this information clearly in a modern context, but it is important and instructive to understand the true origins of this work and to give appropriate credit to Bernard Widrow and John Voevodsky for their truly pioneering work in this field.

David  Lytle 

90  Sidney  Ct. 

San  Rafael,  CA  94903 

Educating  Programmers 

Dear  DDJ, 

I would like to make a few comments on Allen Holub's Viewpoint in April 1987.

He has a good point; however, how about going one step farther and taking issue with the very way math is taught in undergraduate classes? It's true that right now calculus means successive application of memorized rules until you find the right one. But that need not be. In fact, I suggest that it gives rise to poor math, not only poor programming. Undergraduate math (and science) education in the U.S. lags far behind the level of most other industrial countries, and that kind of syllabus (implying that you don't really have to apply logical thinking, understanding of the problem at hand, and so on) helps a lot toward this negative goal.

I agree completely that a good liberal arts curriculum is best for programmers; in fact, it's best for scientists, too. Incidentally, less memorizing and more understanding of what's going on could make calculus (or any math, for that matter) an integral part of such a curriculum and lead to better programmers, better scientists, and, maybe, just plain better human beings (not to speak of English majors). By the way, that's what people try,




with  varying  success,  to  do  when 
teaching  math  in  Europe. 

Federico  Marchetti 

2505  Jemsen  Ave.,  #438 

Ames,  IA  50010 

Dear  DDJ, 

This  is  a  response  to  the  Viewpoint 
"Education  and  Programming"  by 
Allen  Holub  in  your  April  1987  issue. 

This essay touched on a potent mystery: exactly what is the knowledge or state of mind required for good programming? That is, what is programming? Hackers, have you ever been stymied by a persistent novice who keeps asking you "Yes, but what is it that you do?" Mr. Holub may approve of what I consider the best answer so far: "I write novels that computers read" (Patrick Hogan, 1987). I, too, believe that programming is, above all, a literary skill, an art of written communication. Look at the coincidences in the qualities of good writing and good programming: concise (efficient), consequential (functional), powerfully expressive (good use of hardware), evocative of hidden truths, structurally harmonious at scales both large and small.

That's why I think the notion that mathematics is inessential to programming is wrong. On the contrary, mathematics can teach things needed by writers of software, newspapers, and poetry alike. The first of these is logic, the foundation of mathematics. Impatient novelists and frobbozzers would not shudder if they could learn how quirky and "illogical" mathematical logic really can be. But it don't mean a thing if that proof don't sing! From logic you learn that the truth, to be valuable, must be communicated, that it must be notated, that it pays to use powerful theorems, and that the shortest road home is the best. And, as an inspiration, there is the mystical promise of Gödel's theorem: that there is something so true that it cannot be captured in proof.

For programmers, logic offers such especially apt vocational training that it deserves extra study and practice, long after the premeds and lit jocks have moved on to Latin. Here, programmers can acquire a handy facility with abstract symbols and an assertion-and-proof style of thought that, as Dijkstra and Mills have demonstrated, is practical and effective. Mathematics courses are good practice in logic and following "the drill." Computers and math quizzes are equally stern in their judgment, so take plenty of math and get used to it. Plus, along the way, you'll pick up plenty of useful tools and tricks. I pity graphics programmers who have not studied linear algebra and real analysis, for they are doomed to write inferior code.

If, as Mr. Holub suggests, you limit your study of mathematics to one semester of basics, you will not only have failed to acquire a good liberal arts education but you will also have fallen behind in the art of programming.

Kerry  Kimbrough 

1710  Aggie  Ln. 

Austin,  TX  78757 

Dear  DDJ, 

Allen Holub fails to see the very deep connection between mathematics and programming. The facts of the matter are that mathematics and programming are inextricably related. Both share a common ground, namely that of solving problems. But before a solution to a given problem is achieved, the nature of the problem must be understood; that is, the specifications of the problem are given. It is here that mathematics and programming have the greatest similarity. Both methods produce their solutions from what is known about the problem in advance, and the more that is known, the easier the solution. Generation of the solution is performed in a logical and orderly fashion. Solution of math problems is not haphazard, despite Mr. Holub's statement to the contrary. In fact, many of the items in his assessment of essay-writing fundamentals are also found in mathematics. The mathematical analogue of the outline is the set of axioms, or those things given as true, of the problem. Also, solutions to mathematics problems are generated a step at a time, similar to the sections of an essay. In either case, methodical and ordered processes are performed in order to achieve the desired result.

Mr. Holub's suggestion to use Latin as the paradigm of complex systems, though laudable, is flawed. Latin, although rich in its complexity, is a dead language: its patterns are fixed and unchanging. Computer programs, on the contrary, are dynamic, evolving entities. Latin is, therefore, inadequate as a model from which to study the structure of programs. Human language is also fraught with inconsistencies (for example, exceptions to rules) and sometimes suffers from ambiguities and imprecision. Programming requires quite the opposite, as even novice programmers could testify. Mathematics is also intolerant of ambiguity and imprecision. Through the use of well-defined, consistent principles, solutions to mathematics problems are guaranteed to be true to a high degree of reliability. No human language can assert the same.

Mathematics takes several years to grasp fully. A one-year calculus class is sufficient to provide only a coarse introduction to mathematics. Rather than see mathematics requirements be watered down, curricula should be expanded to include upper-division mathematics requirements.

One final thought in this area: training in language and literature, although providing a broad knowledge base, does little to provide students with the necessary problem solving skills and methodology applicable to computer programming. What problems are there to be solved in Russian literature?

We do not need groups of people capable of writing text editors, database managers, and so on. Surely there are enough of these programs already. "Scientific" programming should be strongly emphasized in any programming curriculum. Mr. Holub is correct in his assessment that certain aspects of undergraduate education require reform, but he has failed to see the deeper underlying problem: the need to make mathematics, science, and computers, because of their increasing importance, more available to undergraduates, even those who, in Mr. Holub's words ". . . don't have an aptitude for
(continued  on  page  126) 




ARTICLES 


Developing 80386 Applications . . . Today


The arrival of the Intel 80386 promises a new level of sophistication in applications. With its speed and multigigabyte virtual, physical, and segment address spaces, developers can create applications requiring the resources of a mainframe but usable on personal computers. In addition to being able to run these new, sophisticated applications, the 386 can play host to several "old" 8086 programs, each believing it has the whole CPU to itself. It is this promise of being able to run newer, more powerful applications without losing the use of existing programs that makes the 386 so exciting.

But many software developers may feel the availability of a new CPU such as the 386 is difficult to take advantage of. There isn't an operating system supporting the 386 that people are willing to buy. And even when Microsoft's 386 DOS becomes real, the Lotuses of the software world will get first crack at it, taking away the small developer's edge of being able to react more quickly. It seems as if there's no way for a small company or individual to develop the first spreadsheet, for example, to take advantage of the 386's capabilities.

Well, that isn't actually the case anymore. Just about any programmer can develop code today that takes advantage of the full resources of the 386, using what I call an "environment." An environment is a layer of software that looks like a 386 DOS to your 386 application but that acts like a standard 8086 application to the host operating system, MS-DOS. This article describes two such environments that were available in March; at least two more should be available by the time you read this.

Richard Relph, 846 Salt Lake Dr., San Jose, CA 95133. Richard is a software and hardware consultant. He has written compilers and embedded systems.

The environments described here limit your 386 application to physical memory, and the application must be written in either Pascal or C (unless you feel up to developing multimegabyte programs in assembly language). Over the next few months, these restrictions may be eased. Environments supporting virtual memory and other languages will certainly be ported to the 386. Two environments supporting multitasking were planned for release in April.

Before you can use an environment, you need a compiler that supports the environment (and its host processor, the 386). First, you select your compiler; then an environment; and finally, the hardware to run it on. All three ingredients are available today.

The only vendor supplying 386 compilers for any environment I know about is MetaWare, which is now shipping both Professional Pascal and High-C. Phar Lap was the first company to ship an environment, DOS-Extender. Softguard followed soon after with VM/RUN, the pieces of VM/386 needed to run 386 programs. The Software Link provides PC/MOS 386, a multiuser, multitasking OS for the 386. The version of PC/MOS I tested supported only 8086 tasks, although the company provided documentation and release dates for the 386-tasking version. The hardware I used was the Compaq Deskpro 386.

The  Compiler 

Compiling your source code is the starting point for creating 386 programs. I used MetaWare's High-C 386 to perform this task. High-C 386 itself runs on any MS-DOS computer with a hard disk. In fact, no 386 is required anywhere in the development process until you actually need to execute (and debug) your code. Of course, like any MS-DOS program, High-C runs significantly faster on a well-designed, 386-based computer.

by Richard Relph

There are alternatives to waiting a year or two for Microsoft's 386 version of OS/2.

MetaWare High-C is a complete (and then some) implementation of the C language, regardless of which standard you choose to hold it up to. The ANSI C language specification, though not yet a standard, is supported in one of two similar languages accepted by High-C. The other language is perhaps best referred to as extended ANSI C. This is the default language. The strict ANSI language is supported in the form of different parse and scan table files supplied, which can be fed to the compiler in lieu of the ones built into it.

Some of the MetaWare extensions to ANSI C include case ranges, nested procedures (like those of Pascal), named parameter association (borrowed from Ada), access to unnamed members of unions (taken from C++), and interleaved declarations and statements.

Two unusual features allowed by ANSI and used by MetaWare are pragmas and intrinsic functions. Pragmas allow programmers to tell a compiler something about their code. MetaWare uses them to change calling conventions, specify segments for objects, enable various optimizations, and control other unusual but sometimes needed features. Intrinsic functions are function calls that the compiler recognizes and generates code for in-line (without a procedure call), using any special instructions the processor may support (such as rep, scas, movs, etc.). MetaWare provides intrinsics for absolute value; minimum and maximum of lists; common string and memory functions; and, if the 387 option is used, some transcendental functions. I didn't test support for Intel's 80387 numeric coprocessor (a faster, extended 80287) because the Compaq Deskpro 386 can't use one, but 287 support is provided and I tested it.
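
As a rough illustration of what an intrinsic buys you, the sketch below uses only standard C library calls (strlen, memcpy, abs) of the kind listed above. Whether a given compiler expands them in-line, and with which instructions, depends on the compiler and its options; the function names copy_name and distance are made up for this example, and no High-C pragma syntax is shown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy src into a fixed-size buffer, truncating if needed, and return
   the number of characters stored. strlen and memcpy are typical
   intrinsic candidates: a compiler may expand them into string
   instructions instead of emitting a procedure call. */
size_t copy_name(char *dst, size_t dst_size, const char *src)
{
    size_t n = strlen(src);
    assert(dst_size > 0);
    if (n >= dst_size)
        n = dst_size - 1;       /* leave room for the terminating '\0' */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}

/* abs() is another common intrinsic: it can usually be computed with
   a couple of register instructions and no call at all. */
int distance(int a, int b)
{
    return abs(a - b);
}
```

The source reads the same whether or not the calls are expanded in-line, which is the appeal of the approach: the program stays portable while the compiler picks the fastest instruction sequence it knows.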

To do language testing, MetaWare provided me with its test suite, which it also sells as a separate product. The test suite includes many strange but valid and mostly compiler-independent constructs in the C language. It consists of roughly 2,000 lines of code and includes both a language test and a library test. I expected MetaWare's compiler to pass the suite (because the company provided it) and it did. I was convinced by looking at the test suite that a compiler would have to be pretty sound to run it, but just to provide a basis of comparison, I stripped the suite of High-C-specific code and ran it through some other well-known compilers. The other compilers all failed, in various and surprising ways, but that is not the subject of this article.

The library provided with High-C is intended to conform to the ANSI specification, and it does. Additional, nonstandard (but common) functions are also supported via utility "packages." These packages are in the form of header files much like the normal ones, only with the extension .CF. Packages are provided for stack dumping, for interrupt trapping (yes, trapping), for using the MS-DOS int 21 functions, and for calling other system-dependent services. As an additional feature of the library, all portable, but system-dependent, functions rely on a core of functions that programmers can replace in order to support embedded or hostless applications.
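
The replaceable-core idea can be sketched in standard C. The names below (core_putc, set_core_putc, put_string, capture_putc) are hypothetical and are not MetaWare's actual library entry points; the sketch only assumes that the portable layer reaches the system through one small hook that an embedded program can swap out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical core hook: writes one character, returns it or a
   negative value on failure. */
typedef int (*core_putc_fn)(int c);

/* Default core routine for a hosted system. */
static int host_putc(int c)
{
    return putchar(c);
}

static core_putc_fn core_putc = host_putc;

/* Install a replacement core routine; returns the previous one. */
core_putc_fn set_core_putc(core_putc_fn fn)
{
    core_putc_fn old = core_putc;
    assert(fn != NULL);
    core_putc = fn;
    return old;
}

/* Portable layer: written only in terms of the core hook.
   Returns the number of characters written, or -1 on failure. */
int put_string(const char *s)
{
    int count = 0;
    while (*s != '\0') {
        if (core_putc((unsigned char)*s++) < 0)
            return -1;
        count++;
    }
    return count;
}

/* Example replacement for a hostless target: collect output in a
   memory buffer instead of calling the operating system. */
static char capture_buf[64];
static size_t capture_len = 0;

int capture_putc(int c)
{
    if (capture_len + 1 >= sizeof capture_buf)
        return -1;              /* buffer full */
    capture_buf[capture_len++] = (char)c;
    capture_buf[capture_len] = '\0';
    return c;
}

const char *captured_output(void)
{
    return capture_buf;
}
```

After set_core_putc(capture_putc), every routine built on put_string writes to the memory buffer with no change to its own source, which is the property that makes a small replaceable core attractive for embedded work.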

The documentation provided with High-C is, to say the least, complete. At some 700 or more pages, it probably leaves no question about the product unanswered. The documentation is divided into five sections, each a manual itself, each with a table of contents and an index. The manuals are License, Installation, Program, Library, and Language. Each is typeset, with the exception of the Language manual, which is printed on a dot-matrix printer (presumably awaiting final ANSI approval of the C standard). It will take new users some time to get familiar with the structure of the manuals and to feel at ease with them. This is especially true of the Language manual, which is precise but not particularly easy to read or understand.

When it comes to support for High-C, it is hard to imagine any better. Questions about the product are routed directly to the person who wrote the compiler, which ensures that you are talking to someone who has some familiarity with the product. MetaWare is always improving its compilers, both by fixing any discovered bugs and by adding new features. I have always found the company cooperative when it comes to providing updates, although its official policy is three months of free updates with additional support at 15 percent ($135) per year.

High-C is more expensive than MetaWare's other DOS compilers but not unreasonably so. The version I tested was 1.3, although this is not enough to exactly specify a particular set of compiler characteristics because MetaWare often tinkers with the compiler without changing the version number.

Installation of the compiler is easy if you proceed as MetaWare suggests. The first step is to run a batch script called, oddly enough, install. This script places several files and directories in the current directory. It is unfortunate that most of these files are packed in a single archive file spread across the four diskettes. This makes customized installations a two-step process: first installing per MetaWare guidelines, then rearranging for individual tastes. To MetaWare's credit, the entire installation section of the manual is duplicated on the first diskette for easy reference.

Besides the compiler, include files, and necessary libraries, the package contains several utilities, such as some standard Unix-style file manipulators. More unusual is a set of utilities that allows editing and detailed examination of .OBJ files and utilities for producing cross-references of multiple source files.

Performance 

The compiler isn't going to win any contests for speed of compilation, but users should find performance adequate for most programs. Most of the slowness comes from the size of the compiler and the time it takes to load from disk, so performance will appear particularly bad with small source files. The compiler does generate a lot of information about your program and issues many useful warnings when it finds questionable code, such as using a variable before assigning anything to it or failing to specify a return value from a nonvoid function. Listings, with or without generated assembly-language code, can be produced via a command-line switch.
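
The two kinds of questionable code mentioned above are easiest to see side by side with their fix. In this sketch (the function name sum is an arbitrary choice), the questionable version is kept inside a comment and the corrected version is the live code; the exact wording of any compiler's diagnostics is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Questionable version, kept in a comment so this file stays well defined:
 *
 *     int sum(const int *v, size_t n)
 *     {
 *         int total;                 // used before anything is assigned
 *         size_t i;
 *         for (i = 0; i < n; i++)
 *             total += v[i];
 *     }                              // nonvoid function, no return value
 */

/* Corrected version: total starts at zero and the result is returned. */
int sum(const int *v, size_t n)
{
    int total = 0;
    size_t i;
    assert(v != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

Both mistakes compile in many C implementations and fail only at run time, which is why a compiler that flags them early is worth some extra compilation time.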

The generated code is good considering that no global optimizations are made. Of course, the 386's more regular architecture doesn't hurt either, although there are nuances there as well. For instance, the fastest way to multiply something by 5 is to use the LEA instruction, of all things! High-C supports three register variables, and they are used by default, unlike High-C for the 8086 and 80286, which only supports two and then only if you enable them. The size of integers is 32 bits in High-C. I didn't spend a lot of time benchmarking the compiler because it is the only game in town right now. But I did compare programs compiled under 386 mode with programs compiled under 286 mode and found general improvements of 5 to 10 percent in the 386 version. In programs in which long arithmetic was used often, more dramatic increases were apparent.

The  Linker 

After  compiling  your  program,  you’ll  need  to  link  it.  I 
used  Phar  Lap’s  assembler  and  linker,  although  you  can 
get  the  same  assembler  and  linker  from  Softguard  (which 
licenses  them  from  Phar  Lap).  I  did  not  play  with  the 
assembler  much  beyond  running  the  installation  test,  but 
based  on  comments  in  the  update  notice,  it  seems  fairly 
sound.  The  linker,  on  the  other  hand,  I  used  a  great  deal. 

386LINK, as it is called, is fairly slow, mostly because of
the way in which libraries are managed. Unlike Microsoft's
DOS linker, which uses an (undocumented) index
table, Phar Lap's simply searches the library by scanning
it. Doing all the arithmetic in 32 bits using 8086 instructions
doesn't help performance either. The input form is
something Phar Lap calls Easy OMF-386, a simple extension
to OMF-86, the OBJ form both Intel and Microsoft use
for 8086 objects. Easy OMF-386 extensions are documented
in an appendix of the linker's manual. The linker can
produce several output forms. .EXE files are the default,
although the format is not exactly compatible with DOS
.EXE files because it doesn't use 8086-style segments. Intel
hex and Motorola S-records can also be generated. A special
output form is .REX, which is what the Softguard environment
uses.

The  Environments 

If you buy the assembler and linker from Phar Lap, you
also get MINIBUG and RUN386. The total package lets you
develop programs to run on the 386. If you want to sell
those programs, you must purchase a redistribution license.
You can then bind RUN386 to your program and
sell it. In this bound form, 386 programs are invoked just
as normal DOS applications.

MINIBUG is a debugger for 386 programs, similar to MS-DOS
DEBUG. Missing are the Assemble command, the Load
command, and the Name command. The Dump and Enter
commands are enhanced to support SYMDEB-style size
specifiers for ASCII, word, and double-word quantities.
The RX command displays all 80386 registers, while the R
command displays just the registers used by normal programs.
You can modify any 80386 register, including all
the protected registers (which include the debug
registers).

RUN386 is Phar Lap's 386 execution vehicle. It is a single,
standard MS-DOS program that accepts the name of a 386
.EXE file, loads it, and runs it, passing any additional
command-line parameters on to the 80386 application.
RUN386's job has really only begun when your 386 program
gets control, however. Presumably, your program
will want to do I/O and probably through the MS-DOS that
was running just before. The problem is that MS-DOS is an
8086 program and yours is not. Furthermore, you can
address gobs of memory, but MS-DOS and its underlying
hardware can only get at the first megabyte of physical
memory. RUN386 handles the details of getting your data
through to DOS and letting DOS get to its hardware. It does
this by intercepting your int 21s, examining the registers,
translating them and moving data if necessary, and finally
passing control to MS-DOS in real mode. When MS-DOS
finishes, RUN386 regains control and again translates registers
accordingly. RUN386 also fields hardware interrupts
and forwards them to their real mode handlers.

Softguard's VM/RUN does pretty much the same thing as
RUN386 does. VMRUN is a .COM file that loads many other
Softguard-supplied files (all of which must be in the current
directory) and then the application's .REX file. Each
application must have a profile, a description of how the
programmer would like certain features of the environment
configured. Mainly, these parameters specify how
much memory to allocate for specific uses. Memory can
be allocated for program-managed, low physical address
buffers (thus avoiding the expensive block move), for DOS
exec calls, and for the stack. All other memory is given
over to the 80386 application code and its data.

Another bit of information in the profile is whether to
start up in debug mode or not and whether to debug
using the display or a separate terminal connected to a
COM: port.

Softguard's  debugger  is  different  from  Phar  Lap's  and  is 
in  many  ways  better.  It  uses  the  80386's  debug  registers  to 
set  execution  and  data  breakpoints.  It  is  screen-oriented 
when  running  locally  and  can  use  a  remote  terminal  as 
the  debug  console  (in  which  case  it  does  start  to  resemble 
DEBUG).  When  running  locally,  screen  swapping  is  used  to 
avoid  the  normal  problems  associated  with  one  screen 
being  used  for  two  purposes.  The  debugger  is  more  or  less 
always  present.  If  a  trap  occurs  during  execution  of  your 
program,  the  debugger  is  invoked. 

Vendors

High-C 386
MetaWare Inc.
903 Pacific Ave., Ste. 201
Santa Cruz, CA 95060
(408) 429-6382
$895
Reader Service No. 40

386ASM, 386LINK, RUN386, MINIBUG
Phar Lap Software Inc.
60 Aberdeen Ave.
Cambridge, MA 02138
(617) 661-1510
$495
$995, redistribution license
Reader Service No. 41

VM/RUN
Softguard Systems Inc.
2840 San Thomas Expressway, Ste. 201
Santa Clara, CA 95051
(408) 970-9240
$595
Reader Service No. 42

PC/MOS 386
The Software Link, Inc.
8601 Dunwoody Pl., Ste. 632
Atlanta, GA 30338
(404) 998-0700
$595, single-user
$995, multiuser
Reader Service No. 43

One major difference between the environments these
two packages present is the memory model. Phar Lap's
continues to use segments, whereas Softguard's is based
on a flat-memory model. Although it is accurate to say
that Phar Lap supports the large model compared to Softguard's
small model, it is perhaps a bit misleading. The
small model, after all, supports up to 4 gigabytes per object
or program. One place this dichotomy of models is
apparent is in direct screen I/O (I do not recommend doing
direct screen I/O, but both environments support it
and it does provide a useful example of the difference
between flat and segmented models). For Softguard's, you
merely compute what the address would have been on
the IBM PC (which depends on what display adapter is in
use), put that in an index register, and access screen memory
through it. So Softguard would have the screen addressed
with EDX (for example) equal to 000b8000. Phar
Lap, on the other hand, provides a segment descriptor
that points at the base of the display adapter, so merely
computing the offset and accessing through the segment
with the offset does the equivalent thing. Segment 1c
points at the screen memory, so to access the first location
in it, you load a segment selector (register) with 1c and use
an offset of 0. Phar Lap's 1c:0 is equivalent to Softguard's
000b8000, assuming a color graphics adapter. If you use a
monochrome adapter, Phar Lap's environment still accesses
it using 1c:0, but Softguard's uses 000b0000. It
should be noted that Phar Lap does provide a segment
selector (34) that works the way Softguard's DS does for
the low 1 megabyte.

Softguard's VM/RUN is compatible with its future VM/386,
a multitasking 386 control program. VM/386 is not an
operating system but rather a layer between other operating
systems and the hardware (VM/370 programmers
should recognize this picture immediately). Under (or
over, depending on how you view things) VM/386, several
different 8086 operating systems may be running, each
believing it owns the machine. One virtual machine can
even be rebooted without affecting any other virtual
machine.

The Software Link has taken a different tack. Its PC/MOS 386
is a single 80386 operating system that can run
several DOS programs at one time. PC/MOS will also be able
to support 80386 native mode programs, but this feature
was not available at the time I wrote this review (it was
due to be released in April). Documentation provided by
The Software Link indicates that PC/MOS will support
large-model 386 programs, like RUN386 does, although it
doesn't appear that they'll be compatible.

Performance 

When comparing RUN386 and VM/RUN in use, RUN386
seems to be easier to use. Only one new file is introduced
(RUN386.EXE) and it can be anywhere along the path, although
the 386 .EXE file must be in the current directory.
VM/RUN requires at least five other files, four of which
must be in the current directory, along with the 386 application's
.REX file. VM/RUN takes longer to load these files
and to get to the task at hand (running your program)
than does RUN386. RUN386 seemed to cooperate with my
editor (Epsilon), whereas VM/RUN did not. Both VM/RUN
and RUN386 did run under my make utility (Polymake).
VM/RUN also clears the screen during initialization,
which did not seem appropriate. And finally, VM/RUN is
between 7 and 15 percent slower executing identical code
on the same Compaq Deskpro 386. This is presumably
because of hardware interrupt handling or overhead associated
with running at CPL 3 rather than CPL 0 as RUN386
does. (CPL is the current privilege level and represents
the level of privileges that should be granted to a program;
0 is the most privileged level.) Other explanations
are possible, and the results may not repeat on non-Compaq
386 machines.

Softguard's product does outperform Phar Lap's in one
important area: memory allocation. In addition to being
more flexible (through the profile discussed above), Softguard's
also seems to end up with more memory available.
Running a binary search between 0 and MAXINT (2
billion), Softguard's reported 1,656,288 bytes available in a
2-megabyte system, whereas Phar Lap's reported only
1,496,912 bytes available. Of course, the more memory
you have in the machine, the more memory will be available
for 386 programs (until you hit that 4-gigabyte limit).

The  Hardware 

None of this discussion would matter if there were not
computers on which to run these new environments.
The Compaq Deskpro 386 proved to be an excellent performer
on all counts. It was a joy to use. Everything about
this machine is fast, except the tape drive. The machine
has four speed modes: common, fast, high, and auto.
Common speed is 4 MHz and is comparable to a 6-MHz
IBM PC/AT. Fast is 8 MHz and is a bit faster than an 8-MHz
AT. High is 16 MHz and is unlike anything you have had
on your desk before. Auto mode switches between high
and fast, depending on the diskette motor-on signal, attempting
to ease speed incompatibilities in such areas as
floppy-disk access. I ran in high-speed mode exclusively
and had no problems, although I did not run any copy-protected
software. The bus version of Microsoft Mouse
also worked well.

Unless you believe the rumors about Intel building a
386 with 286 pinouts (shaving address bus and data bus
pins in the process), I believe your next MS-DOS machine
should have a 386 CPU. I know mine will. And more software
developers have announced 386-specific applications
in the six months since Compaq introduced its Deskpro 386
than all the 286-specific applications announced
since, well, ever.

Summary 

The  point  of  this  article  is  to  convince  you  that  you  can  get 
started  developing  386-based  applications  now,  without 
waiting  for  a  386  OS/2  to  arrive,  whenever  that  may  be.  If 
you  develop  software  in  a  high-level  language,  such  as 
MetaWare's  High-C  or  Professional  Pascal,  you  will  at 
most  have  to  recompile  and  relink  your  application  to  get 
it  to  run  under  some  new  environment. 

The tools to do these things are still young but are certainly
adequate for conventional application development.
They will undoubtedly get better as time goes on
and probably are already significantly better than the
tools I used to write this article in February and early
March. I hope that anyone who wishes to do development
for the 386 will contact each of the vendors mentioned
here to get more up-to-date information.

DDJ 



ARTICLES 


8088 Assembly-Language Programming Techniques


by  Tom  Disque 


Implementing a large software
system on a personal computer
requires that programmers recognize
the bottlenecks of that system
and reduce or eliminate them. It is
not enough simply to recode a high-level
language in assembly language;
the code must take advantage of the
special quirks of a host machine.

One of my responsibilities in the PC
Host Group at SAS Institute is to ensure
that PC host code is as small and
as fast as possible. During this process,
I have collected several optimization
techniques, some of which are
applicable to other architectures.
These techniques are use of special
instructions, unorthodox use of conventional
instructions, and rearrangement
of jump sequences. In the
following discussion, please note that
all timings are for the 8088 microprocessor
and assume that the 4-byte
8088 prefetch queue is empty at the
start of execution.

Repeat  Instructions 

One of the techniques involves using
the repeat move and repeat store instructions.
The idea is to move/store
as much data per instruction as possible.
Example 1, page 25, shows the
code to move a number of bytes,
where the source is in ds:si, the destination
is in es:di, and the count is in
cx. I have also applied this technique
to the MC68000, although the even
address requirements of that chip
complicate the logic somewhat (see
the box on page 26).


© 1986 by SAS Institute Inc., Box 8000,
SAS Circle, Cary, NC 27511-8000. Tom
Disque is a software developer at SAS
Institute Inc.


A collection of optimization techniques

Another repeat instruction, the repeat
compare, allows me to trim a
few cycles. This instruction is used in
a compare string routine that returns
0 if the strings are equal, 1 if string 1
> string 2, and -1 if string 1 < string
2 (this is the ordering the listings implement).
Originally, my code looked like
that shown in Example 2, page 25,
where string 1 is in ds:si, string 2 is in
es:di, and the count is in cx. With this
version, no matter which value is to
be returned, a 16-cycle jump has to be
executed. My final code is shown in
Example 3, page 25. This code uses
only nine cycles for above or below.

Probably the biggest bottleneck in
any screen-intensive code for the IBM
PC's color display is the wait for horizontal
retrace, that period when the
electron beam is turned off and
moved to the far left of the screen to
draw another scan line. This is the
only time data can be moved to
screen memory without causing
flicker on the screen unless vertical
retrace time is used (during vertical
retrace, the electron beam is moved
from the lower-right corner to the
upper-left corner). Vertical retrace
screen updates caused flickering during
scrolls, so I used horizontal retrace
screen updates. I discovered,
however, that using a movsw for a
character/attribute pair could cause
flicker on some screens, so I changed
the code in Example 4, page 25, to
that in Example 5, page 25. The improved
code enabled me to move one
word per retrace on the IBM PC/XT
and two words per retrace on the IBM
PC/AT, with no flicker. I was even
able to write three words per retrace
with a small amount of flicker
around the edges of the screen on the
AT but decided to stick with two
words.

Pointers 

Much  of  our  group’s  code  involves 
pointer  addition  and  subtraction,  so 
we  use  sequences  to  add  a  constant  to 
or  subtract  it  from  a  normalized 
pointer  and  to  produce  a  normalized 
pointer  (normalized  means  the  offset 
is  always  less  than  16). 

This code adds the constant 1234H to,
or subtracts it from, the pointer in
ax:bx:

Add

add  bx,0FFF4H     ; carry out when offset + 4 reaches 16
adc  ax,123H
and  bx,0FH

Subtract

sub  bx,4H         ; borrow when offset is less than 4
sbb  ax,123H
and  bx,0FH

We use the following sequences to
add to or subtract from an unnormalized
pointer, giving an unnormalized
result with an offset no greater than
32,767:

Add

add  bx,value
jns  label1        ; sign bit clear: offset is still below 8000H
add  ax,800H
and  bh,7FH
label1:

Subtract

sub  bx,value
jge  label2
sub  ax,800H
and  bh,7FH
label2:


We use the following sequence to
normalize a pointer (again, in ax:bx):

mov  dx,bx
and  bx,0FH
shr  dx,1
shr  dx,1
shr  dx,1
shr  dx,1
add  ax,dx

Note that this sequence could be coded as:

mov  dx,bx
and  bx,0FH
mov  cl,4
shr  dx,cl
add  ax,dx

This version would be smaller, but
the shifts would take longer on the
8086/80186, 80286, and so on. Note
that because of the smaller size of the
8088/80188 microprocessors' prefetch
queue, the time is approximately
the same.

It is common in high-level languages
such as C to assign a value to a
variable based on a condition, as in:

i = 1;

if (k > 0) i = 2;

In this example, the idea is to assign
the most likely value in the first statement
and the least likely in the conditional.
In some cases in assembly language,
the reverse turns out to be
most efficient, as follows:


or    ah,80H        ; set blinking
test  al,BLINK
jnz   blinking
and   ah,7FH        ; clear blinking
blinking:

In this example, which sets or clears
the blinking attribute for the screen
display, the least likely alternative is
for blinking to be set (not many people
enjoy looking at a blinking screen
eight hours a day).

If the jump is taken, it uses 16 cycles;
if not, the jump uses 4 cycles and
the instruction to clear blinking uses 4
cycles, half the amount of time of a
jump taken. Note that the "taken
jump/not taken jump" relative timings
change with different chips in
the 8086 family; the trend seems to be a
reduction in the timing of a taken
jump.

Miscellaneous  Techniques 

Finally, I have a few miscellaneous
optimization techniques. When copying
one segment register to another, a common
technique is:

push cs
pop  es

Much faster (4 cycles vs. 26 cycles) but
bigger (4 bytes vs. 2 bytes) is:

mov  ax,cs
mov  es,ax

Another sequence to watch out for
is:


        shr     cx,1            ;Convert byte count to word count
        jnc     word_move       ;If cx was odd, carry will be set
        movsb                   ;Move the odd byte
word_move:
        jcxz    exit            ;In case cx was equal to 1 originally
        rep     movsw           ;Move words
exit:

Example 1: Using the repeat move and repeat store instructions to move a
number of bytes


        xor     ax,ax           ;Assume equal
        repe    cmpsb           ;Compare strings
        je      exit            ;If equal, we're finished
        ja      above           ;If above, set ax to 1
        dec     ax              ;Below. Set flag to -1
        jmp     short exit
above:
        inc     ax
exit:

Example 2: Code for a compare string routine


        xor     ax,ax
        repe    cmpsb
        je      exit
        sbb     ax,ax           ;If above, cf=0, if below cf=1
        cmc                     ;If above, cf=1, if below cf=0
        adc     ax,0            ;If above ax=1, if below ax=-1
exit:

Example 3: Improved version of the compare string routine shown in
Example 2


lolab:
        in      al,dx           ;Get status
        test    al,1            ;Is it low?
        jnz     lolab           ;Wait until it is
        cli                     ;No more interrupts
hilab:
        in      al,dx           ;Get status
        test    al,1            ;Is it high?
        jz      hilab           ;Wait until it is
        movsw                   ;Write to the screen
        sti                     ;Reenable interrupts

Example 4: Screen memory update routine synchronized to horizontal retrace


        mov     bx,[si]         ;Load value "outside of" cli-sti
lolab:
        in      al,dx           ;Get status
        test    al,1            ;Is it low?
        jnz     lolab           ;Wait until it is
        cli                     ;No more interrupts
hilab:
        in      al,dx           ;Get status
        test    al,1            ;Is it high?
        jz      hilab           ;Wait until it is
        mov     es:[di],bx      ;Write to the screen
        sti                     ;Reenable interrupts

Example 5: Improved version of code shown in Example 4


        mov     ax,word ptr [bp].value
        mov     bx,ax
        mov     ax,word ptr [bp].value[2]       ;8 bytes, 30 cycles

[The remaining listings survive only as mnemonics (mov; mov, mov; les, mov); their operands are missing.]

Example 6: Three ways to load a pointer into ax:bx


Faster (8 cycles [plus 8 more to fetch
the two extra instructions] vs. 28 cycles)
but bigger (8 bytes vs. 4 bytes) is:


If the count had been 8:


32-Bit Moves on the Motorola 68000 Microprocessor


The Motorola 68000 microprocessor
poses a somewhat more difficult
problem than the Intel chips when
you want to move more than 8 bits of
data at a time. The 68000 can move 32
bits in a single instruction, but it must
move from an even address when
doing so; only 8-bit moves can have
an odd address.

Example 7 shows code for a
technique that allows 32-bit moves
most of the time. Here, register A0
contains the pointer from which the
data is to be moved, register A1 contains
the pointer to which the data is
to be moved, and register D0 contains
the number of bytes to move. The
overhead because of the length
check over a straight byte move (at 8
bytes) is 8.6 percent; the overhead if
one address is odd and the other is
even (at 16 bytes) is 4.9 percent. In order
to have the best of both worlds, I
wrote separate 8-byte and 16-byte
move routines that only move bytes
at a time. Because these are the only
common sizes below or close to the
threshold that I move, I have gained
an increase in speed at every level. In
fact, moves of as little as 100 bytes
produce a threefold increase in
speed over the simple byte move


          cmpi.l  #15,d0          If < 15, byte move is faster
          blt.s   bytemove
*         Special code to move words
*         Addresses must both be even or both be odd
          moveq.l #0,d2           d2 is a flag for code below
          move.l  a0,d3           Copy the address register
          lsr.b   #1,d3           Is from an odd address?
          addx.l  d2,d2           If from is odd, increment flag
          move.l  a1,d3           Cannot 'btst' an address register
          btst    #0,d3
          beq.s   evenaddr
          addq.l  #1,d2           To is odd. Increment flag
evenaddr  btst    #0,d2           If one addr is odd and the other even,
          bne.s   bytemove        we cannot do it
          lsr.b   #1,d2           If both were odd, d2 is 2; else it is 0
*                                 Now d2 = 1 indicates odd, d2 = 0 says even
          btst    #0,d0           Is n an odd number?
          beq.s   evenlen
          addq.l  #1,d2           N is odd. Set flag
evenlen   cmpi.l  #1,d2           Find out which even/odd
*                                 combination we have here
          beq.s   oddeven         One is odd, one is even
          blt.s   wordmove        Both are even. Take off!
          move.b  (a0)+,(a1)+     Both are odd. Fix the odd byte
wordmove  asr.l   #2,d0           Convert byte count to word count
          bcc.s   longmove        Any odd byte has been moved
          move.w  (a0)+,(a1)+     Now move extra half-word
longmove  subq.l  #1,d0           Decrement for dbf
longloop  move.l  (a0)+,(a1)+
          dbf     d0,longloop
          bra.s   exit
oddeven   btst    #0,d0           Was the count the odd one?
          bne.s   oddcnt
evencnt   subq.l  #1,d0           The address was odd, count even;
          move.b  (a0)+,(a1)+     Now the addr is even, count odd!
oddcnt    asr.l   #2,d0           Convert byte count to word count
          bcc.s   lngmove2        The odd byte will be moved later
          move.w  (a0)+,(a1)+     Now move extra half-word
lngmove2  subq.l  #1,d0           Decrement for dbf
lngloop2  move.l  (a0)+,(a1)+
          dbf     d0,lngloop2
          move.b  (a0)+,(a1)+     Move the odd byte
          bra.s   exit
bytemove  subq.l  #1,d0           Decrement for dbf
loop      move.b  (a0)+,(a1)+     *to++ = *from++
          dbf     d0,loop         while --n > 0
exit      rts                     Return

Example 7: Performing 32-bit moves on the 68000


xor  al,al

would be even faster and the same
size. An optimization of structure references
replaces:

xor  si,si          ;Reference the 0th structure
mov  ax,[si].value

with:

mov  ax,[0].value

A pointer can be loaded into ax:bx
in three ways, as shown in Example
6, page 26. Note that the first method
is faster because loads and stores using
the accumulator don't use any cycles
for effective address calculation.
On the 80188 microprocessor and later
chips, the effective address calculation
is taken care of by the hardware,
which means that the third
method will be fastest on those chips.

One quirk of the 8088 microprocessor
instruction set appears in the timing
of the movsw instruction as opposed
to the same instruction
prefaced by a repeat byte. The
movsw instruction is 26 cycles alone
and 9 + 25 cycles per repeat when
prefaced by a repeat byte. This
means that when fewer than nine repetitions
are to be done, the faster alternative
is to code the movsw instructions
one at a time. The same is true of
the movsb, stosw, and stosb instructions.
This seems odd because the repeat
instructions always have the
overhead of decrementing the cx register;
thus, you'd expect it to be
slower.

It is always important to check the
timings of instruction sequences on
the target machine; the relative magnitudes
of the timings given here, for
instance, are different from those for
the 80188 microprocessor. I have
used the 8088 microprocessor's timings
because the 8088 shows improvements
most dramatically. Common
assumptions about floating-point
arithmetic don't always apply
on the 8087 coprocessor, either. For
example, one common assumption is
that, if three adds could replace two
multiplies, the resultant code would
be faster. Not so with the 8087
coprocessor!

I  have  shown  that  code  can  be  opti¬ 
mized  by  use  of  special  instructions, 
unusual  usage  of  conventional  in¬ 
structions,  and  rearrangement  of 
code.  The  techniques  outlined  here, 
if  used  alone  and  in  infrequently 
called code, do not necessarily produce noticeable results, however. First, you must identify bottlenecks; even a small reduction in time is noticeable in a true bottleneck. If you don't notice any speed improvements, it's not worth optimizing your code.

Acknowledgments

I would like to thank John Toebes and Mike Jones for the use of their pointer techniques in this article.

DDJ

The repeat move given in the text is optimal for machines using an 8-bit data path, but the code can be improved in some cases for machines using a 16-bit path (that is, using CPUs such as the 8086, 80186, and 80286). When a movsw instruction executes on the 8088, it generates two external bus cycles regardless of whether the addresses are odd or even. When a movsw executes on the 8086, however, only one bus cycle is generated for even addresses. (Two bus cycles are still generated for odd addresses.)

At first glance, the code to convert from byte count to word count in the text might seem useless on the 8088. But bus cycles alone do not determine instruction speed. Empirical results on a Leading Edge Model D (with an 8088 CPU) show roughly a 20 percent increase in execution speed. Ideally, you would expect the speed to be almost double, so you can see what a large effect the bus size has on execution speed when accessing memory.

In order to correct for the odd address problem, the code sequence in Example 8 can be used.

Without this correction, the movsw code for even addresses is twice as fast as for odd addresses; in fact, the odd-address word move runs as slowly as a byte move! With the correction, the odd-address move runs at almost the same speed as the even-address move. Please note that this code is best used for moving large blocks of data; for small blocks, the overhead is a significant factor. Also keep in mind that both pointers will usually be even because most compilers try to align objects on the most efficient boundaries.


        shr     cx, 1           ;As in the text
        jnc     word_move
        movsb
word_move:
        jcxz    exit
        mov     ax, si          ;Use ax to test addresses
        and     ax, di          ;Put the addresses together
        shr     ax, 1           ;If both addresses were odd,
        jnc     not_odd         ;the carry flag will be set
        movsb                   ;Move to an even address
        dec     cx              ;One less word for the repeat
        jcxz    finish          ;If only one word
        rep     movsw
finish:
        movsb                   ;Move the last byte
        jmp     short exit
not_odd:
        rep     movsw
exit:

Example  8:  Correction  to  increase  speed  of  movsw  code  for  odd  addresses 
on  an  8086 
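
The bus-cycle argument can be checked with a little arithmetic. The following Python sketch is not from the original sidebar; it assumes only the simplified model stated in the text (an 8-bit bus always needs two cycles per word access, a 16-bit bus needs one cycle at an even address and two at an odd one) and ignores instruction fetches, wait states, and everything else that affects real timings. The function names are hypothetical.

```python
def cycles_per_word_access(address, bus_bits):
    """Bus cycles for one 16-bit memory access under the simplified model."""
    if bus_bits == 8:
        return 2                      # 8088: always two byte-wide cycles
    return 1 if address % 2 == 0 else 2   # 8086: one cycle only when aligned


def movsw_data_cycles(src, dst, words, bus_bits):
    """Total data bus cycles for a block move of `words` words.

    Each word moved costs one read at the source and one write at the
    destination; both addresses advance by two bytes per word.
    """
    total = 0
    for i in range(words):
        total += cycles_per_word_access(src + 2 * i, bus_bits)
        total += cycles_per_word_access(dst + 2 * i, bus_bits)
    return total


even = movsw_data_cycles(0x100, 0x200, 100, bus_bits=16)
odd = movsw_data_cycles(0x101, 0x201, 100, bus_bits=16)
narrow = movsw_data_cycles(0x100, 0x200, 100, bus_bits=8)
print(even, odd, narrow)
```

Under these assumptions the odd-address move on a 16-bit bus costs as many data cycles as the same move on an 8-bit bus, which is the effect the correction in Example 8 is meant to avoid.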


An Improved Move for Wider Buses




Dr.  Dobb's  Journal,  July  1987 



ARTICLES 


Logic  and  Knowledge 
Representation 
in  PROLOG 

by  Richard  Butrick 


The syntax of PROLOG is essentially limited to fact statements, which are simple noncompound sentences, and rule statements, which are if statements with the consequent, which must be a fact statement, placed on the left. The form of the rule statement is head if body. The head must be a simple noncompound statement, and the body can be either simple or a compound built up of conjunctions, disjunctions, and negations. The body cannot itself be a conditional containing if.

Even Procrustes would find this a severe limitation on human expression. Once you get past the cozy genealogical examples on which PROLOG texts rely to introduce logic programming, reformulation of human thinking into forms palatable to PROLOG can get pretty rough. Moreover, weighty considerations concerning SLD-resolution (which stands for Selecting a literal, using a Linear strategy, and searching the space of possible deductions Depth-first) and unsatisfiable sets of Horn clauses aside, PROLOG's arsenal of deductive weapons consists of a rapid-fire peashooter known to the scholastics as modus ponens. Thus casting your pearls of wisdom into the jaws of PROLOG syntax does not guarantee even the most obvious deductions being drawn by PROLOG's single-cylinder inference engine.

Consider Lao Tsu's chestnut: "When opposites supplement each other, everything is harmonious."


Richard  Butrick,  Ohio  University,  Ath¬ 
ens,  OH  45701.  Richard  is  a  professor 
in  the  Computer  Science  Department. 



This can be cast into PROLOG syntax rather handily as everything_is_harmonious if all_opposites_supplement_each_other. PROLOG, however, cannot even deduce from this that not all_opposites_supplement_each_other, given the fact that not everything_is_harmonious. Basically, the only inference that PROLOG can perform (extended by unification) is to infer R from:

R if S1 and S2 and S3 ... and Sn

and:

S1 and S2 and S3 ... and Sn

The R on the left cannot even be compound! This means that the R on the left cannot be of any of the following forms: not L, L or M, L and M. What then do you do with "If there is a sharp move in the market, then it is either short covering on the up side or a purely technical reaction on the down side" (that is, C or R if M)? Or even, more simply, "If it is an ordinary market, then it is not wise to buy in the first hour" (that is, not W if O)?

Knowledge representation in PROLOG comes down to formulating sentences in a syntax acceptable to PROLOG and in formulating the sentences in such a way that PROLOG will make the appropriate deductions. A lot of creative thinking is involved in coming up with the sentences to be represented in PROLOG (knowledge codification) and in organizing blocks of knowledge (modularization), but it still comes down to knowing how to enter such sentences to assure the level of deductive completeness desired. Two categories of problems have to be dealt with: compound fact/consequent representation problems and existential quantification representation problems.

Compound  Fact/Consequent 
Representation 

As indicated, the official syntax of PROLOG only allows the introduction of two types of sentences into a program. It accepts simple or noncompound statements (statements that do not contain any ifs, ands, ors, or nots), and it accepts conditionals of the form noncompound if simple-or-compound. The compound side of the conditional cannot itself be a conditional (it can contain only ands, ors, and nots). In practice, there are several ways around this restriction, and in fact it is crucial to get around it for full knowledge representation.

The points made here are with specific reference to the Simple syntax of micro-PROLOG, which is the syntax that is closest to classical logic and to English syntax. Considerations brought forward apply equally to C&M syntax, however, because they refer to the underlying logic of PROLOG based on SLD-resolution. I might also mention in passing that, among the better-known, interpreted PROLOGs, only micro-PROLOG implements tail recursion correctly with no limit on the recursion. Moreover, metalevel programming involving standard syntax can be incorporated into Simple programs, and in that sense Simple syntax is a full-fledged PROLOG system and not a Simple Simon syntax.

Negative  Facts 

There are three ways around the restriction on negative literals (negations of simple sentences):

fact if FAIL    (1)
not_fact        (2)
fact(not)       (3)

For example, the following PROLOG program snippet behaves just as if it contained negative facts in its fact-rule base:

might_be_bear_market if not bull_market    (1a)
bull_market if FAIL                        (1b)

PROLOG responds to the query is(bull_market) with no and to the query is(might_be_bear_market) with yes.

The corresponding program for the not_fact solution is:

might_be_bear_market if not_bull_market    (2a)
not_bull_market                            (2b)

This version can make the deduction might_be_bear_market and, obviously, not_bull_market, but it cannot make the deduction not bull_market. In this regard, solution (2) is weaker than solution (1).

The third solution is really a sleazy solution that only works because the dictionary maintained by the syntax analyzer does not keep the degree of the predicate:

might_be_bear_market if not bull_market    (3a)
bull_market(not)                           (3b)

Thus PROLOG responds to the query is(bull_market) with no even though bull_market was used as a one-place predicate and not as a fact (zero-place predicate). To make matters worse, Simple syntax accepts the postfix notation and so (3b) could have been not bull_market. Rather than not being treated as negation, however, it is treated as a thing that has the property bull_market. These conceptual confusions make little difference, however, as the correct responses and deductions will be made. The net effect is that of declaring bull_market to be a data-rel, which enters it in the dictionary but not the database.

It might seem at this point that solution (1) works fine and that negative facts are neatly handled by the built-in primitive FAIL. Unfortunately, there is a serious problem lying behind the negative-fact problem: PROLOG will make fallacious inferences with the introduction of the built-in not. Consider the following example:

sentence_1 if not sentence_2
sentence_2
sentence_3 if not sentence_1

PROLOG not only answers no to is(sentence_1) but it also deduces sentence_3. Let us hope the following sentences do not find their way into a Pentagon expert system:

pre-emptive_strike_will_occur if not supreme_caution_exercised
supreme_caution_exercised
pre-emptive_strike_defence_unnecessary if not pre-emptive_strike_will_occur

Yes will be the response to the query is(pre-emptive_strike_defence_unnecessary). Now there is a piece of artificial intelligence. The negation logic for PROLOG operates under what is called negation-by-failure, or the closed-world principle. This means that if a sentence S is not deducible, then not S is deducible. This type of argument is known in logic texts as the argument from ignorance, and it commits the fallacy of failing to distinguish between known truth and truth. In a complete system everything true is asserted to be true either directly or by implication. Hence that which is not asserted is false. Because complete systems are few and far between, and no consistent formal system containing elementary arithmetic is complete (this is known as Gödel's Incompleteness Result), what is needed is a form of negation for incomplete systems. For such systems, the built-in not should be used only in those cases in which failure to succeed implies falsity, as for example in not y ON (a b c d). Otherwise, the negation logic needed must be provided by the program. This is solution (2).
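
The difference between the built-in not and solution (2) can be made concrete with a toy propositional prover. The sketch below is Python rather than micro-PROLOG and is only an illustration under stated assumptions: the function name holds and the list-of-pairs rule format are invented here, a goal written "not X" (with a space) stands for the built-in negation-by-failure, and a goal written not_X (with an underscore) is an ordinary atom that must be asserted or derived like any other.

```python
def holds(goal, rules):
    """Depth-first proof attempt over propositional rules.

    rules is a list of (head, body) pairs; a fact has an empty body.
    A goal of the form "not X" succeeds exactly when X cannot be
    proved (negation by failure).
    """
    if goal.startswith("not "):
        return not holds(goal[4:], rules)
    return any(all(holds(g, rules) for g in body)
               for head, body in rules if head == goal)


# The three sentences from the text, using the built-in not.
by_failure = [
    ("sentence_1", ["not sentence_2"]),
    ("sentence_2", []),
    ("sentence_3", ["not sentence_1"]),
]

# The same sentences with negation supplied by the program (solution 2).
explicit = [
    ("sentence_1", ["not_sentence_2"]),
    ("sentence_2", []),
    ("sentence_3", ["not_sentence_1"]),
]

print(holds("sentence_1", by_failure))   # sentence_1 is not provable
print(holds("sentence_3", by_failure))   # yet sentence_3 is "deduced"
print(holds("sentence_3", explicit))     # no fact not_sentence_1, no deduction
```

With explicit negative atoms, sentence_3 is derived only if not_sentence_1 is actually entered or deduced, which is the behaviour an incomplete knowledge base needs.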

Compound  Facts 

Consider the following compound:

if sentence_1 then sentence_2 or sentence_3

The attempted representation in PROLOG as:

sentence_2 or sentence_3 if sentence_1

does not work because a compound cannot occur to the left of an if. The question of representation turns on the sorts of deductions that could be made from the sentence. The following are possible deductions: not_sentence_1, sentence_2 or sentence_3, sentence_2, sentence_3. The deduction of sentence_3, for example, would require the following entry:

sentence_3 if sentence_1 and not_sentence_2

Table 1, page 32, gives some useful conversions. Which conversion or conversions you use depends on which sentences you target as possible goals within the knowledge base. You must then ensure that you use a consistent representation within the entire knowledge base. Needless to say, this is no mean task, and generally speaking, unless all the sentences to be represented fit within the official syntax, attempting to ensure that all legitimate deductions can be drawn is an impossible task.
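
Several of the conversions in Table 1 follow one mechanical pattern: write the conditional as a single disjunction of literals, then let each literal in turn become the head of a rule whose body negates all the others. The Python sketch below illustrates that pattern only; it is not part of micro-PROLOG, the names neg and horn_variants are invented for this example, and it covers just conditionals with a conjunctive antecedent and a disjunctive consequent.

```python
def neg(literal):
    """Toggle the not_ prefix used for program-supplied negation."""
    return literal[4:] if literal.startswith("not_") else "not_" + literal


def horn_variants(antecedents, consequents):
    """Rules for 'if A1 and ... and An then C1 or ... or Cm'.

    The conditional is equivalent to the clause
    not_A1 or ... or not_An or C1 or ... or Cm.
    Each literal becomes a head; the negations of the rest form the body.
    """
    clause = [neg(a) for a in antecedents] + list(consequents)
    rules = []
    for i, head in enumerate(clause):
        body = [neg(lit) for j, lit in enumerate(clause) if j != i]
        rules.append(head + " if " + " and ".join(body))
    return rules


for rule in horn_variants(["sentence_1"], ["sentence_2", "sentence_3"]):
    print(rule)
```

For the compound above this yields three entries, including sentence_3 if sentence_1 and not_sentence_2. Which of them you actually enter still depends on the goals you expect to query.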



The following is an example of a knowledge representation problem at the level of sentence logic (as opposed to quantificational- or predicate-level logic):

1. If aggregate expenditure is unresponsive to changes in the money supply and unresponsive to changes in the interest rate, then monetarism offers little hope for controlling the economy (if AUM and AUI then MLH).

2. If there is a large infusion of money, then interest rates will fall and people will be induced to hold money upon a large infusion of money rather than invest or spend (if LIM then IRF and HLM).

3. If people are induced to hold money upon a large infusion of money, then aggregate expenditure is unresponsive to changes in the money supply (if HLM then AUM).

4. Because it cannot be the case that both interest rates fall and people are not induced to hold new money, then aggregate expenditure is unresponsive to changes in the interest rate (if either not IF or HNM then AUI).

Logic                       PROLOG

not L                       not_L

L and M                     L
                            M

if L then M                 M if L
                            not_L if not_M

L or M                      L if not_M
                            M if not_L

not(L and M)                not_M if L
                            not_L if M

if L then M and K           M if L
                            not_L if not_M
                            K if L
                            not_L if not_K

if L then M or K            M if L and not_K
                            K if L and not_M
                            not_L if not_M and not_K

if L or M then K            K if L
                            K if M
                            not_L if not_K
                            not_M if not_K

if L and M then K           K if L and M
                            not_M if not_K and L
                            not_L if not_K and M

if L or M then K or H       K if L and not_H
                            K if M and not_H
                            H if L and not_K
                            H if M and not_K
                            not_L if not_K and not_H
                            not_M if not_K and not_H

if L and M then K or H      K if L and M and not_H
                            H if L and M and not_K
                            not_L if not_K and not_H and M
                            not_M if not_K and not_H and L

Table 1: Some useful conversions between logic and PROLOG

Representation of these "rules" in PROLOG depends on the sorts of deductions that are of potential interest. Full representation is prohibitively large. The following representation allows the principal deductions of MLH or of not_AUM, depending on the facts added to the rule base:

MLH if AUM and AUI
not_AUM if not_MLH and AUI
IRF if LIM
not_LIM if not_IRF
HLM if LIM
not_LIM if not_HLM
AUM if HLM
AUI if not_IF
AUI if HNM

Given the facts LIM and not_IF, PROLOG can deduce MLH. Given the facts not_MLH and HNM, PROLOG can deduce not_AUM, and so forth.
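
The two deductions just mentioned can be checked mechanically. The following Python sketch is a stand-in, not micro-PROLOG: it uses simple forward chaining (PROLOG itself works backward from the query), and the names RULES and closure are invented for this illustration. The rules are copied from the representation above.

```python
RULES = [
    ("MLH", ["AUM", "AUI"]),
    ("not_AUM", ["not_MLH", "AUI"]),
    ("IRF", ["LIM"]),
    ("not_LIM", ["not_IRF"]),
    ("HLM", ["LIM"]),
    ("not_LIM", ["not_HLM"]),
    ("AUM", ["HLM"]),
    ("AUI", ["not_IF"]),
    ("AUI", ["HNM"]),
]


def closure(facts):
    """Return every atom derivable from the given facts using RULES."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in RULES:
            if head not in known and all(b in known for b in body):
                known.add(head)
                changed = True
    return known


print("MLH" in closure({"LIM", "not_IF"}))
print("not_AUM" in closure({"not_MLH", "HNM"}))
```

Because negation is carried by explicit not_ atoms, nothing is concluded from the mere absence of a fact: with only LIM given, AUI is not derivable and so neither is MLH.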

Representation  at  the 
Quantificational  Level 

Variables in simple sentences and variables that occur on both the left and right of a conditional are treated as universally quantified. Variables that occur only on the right side of the if are treated as existentially quantified. Although this sums up in a nutshell PROLOG's treatment of quantification, a great deal needs to be done to explain this policy and how to work within the constraints of the policy.

In PROLOG the sentences:

x is_pernicious
x is_pernicious if x is_devious
x is_pernicious if y is_devious and x admires y

are treated as if they were the following:

Every x is pernicious
(Everyone [at Harry's Place] is pernicious)

Every x is pernicious if that x is devious
(Everyone who is devious is pernicious)

Every x is pernicious if some y is devious and x admires that y
(Anyone who admires a devious person is pernicious)

PROLOG provides no explicit notation to indicate whether an x in a sentence is supposed to mean "some x," or "an x," or "all x." Expressions such as "all," "some," and "any," which are used to indicate quantification, are simply not part of PROLOG's vocabulary. How then do you say that there are some pies in the sky as opposed to saying that x is a pie in the sky and have PROLOG treat this statement as if it meant that every x is a pie in the sky? To a large extent the logic for indicating "some" (existentially quantified variables) must be provided by the programmer.

Logic                               PROLOG

(Ex)Fx                              F(alpha)
(There is an x, Fx)
(Something is an F)

(Ex)Gx                              G(beta)

Table 2: Representing existing but unknown objects in PROLOG

Logic                               PROLOG

(x)(Ey)x < y                        Less(x (alpha x))
(Every no. is less than some no.)

Table 3: Representing dependencies among unknown objects in PROLOG

Thoralf Skolem, the great Norwegian logician, is credited with providing the first systematic method for representing quantifiers without using quantifiers. His method utilizes what are aptly called Skolem functions. Fortunately, his ideas are intuitive enough to avoid having to refer directly to his formal presentation. You might feel inclined at this point to inquire politely, without denigrating the wonders of Skolemizing quantifiers, why you would wish to represent quantificational relationships without using quantifiers. The answer is that modus ponens, the heart of the PROLOG inference engine, can't work with quantifiers. Besides, logicians love to frighten their hapless dinner companions by telling them, as if indifferent to moral outrage, that they have spent the day in their office Skolemizing quantifiers. But then, quantifiers will do anything for a modus ponens.

To say that something is an F is not to say that any specific thing is an F. Thus in a domain of three things, {a b c}, you cannot infer, say, that c is an F because something is an F. One of them is an F, but which one is not known. The idea here is to invent a fictitious name that by convention is not a name of anything in the domain of discourse. This corresponds to the standard mathematical practice that runs as follows: "Something is an F. Call it alpha." Giving a fictitious name or moniker to unknown persons is also common in ordinary language, for example, "Kilroy" or "Jack the Ripper." From a logical point of view, the important thing is, first of all, to distinguish between the name of an unknown object and the name of a known object. Using Greek letters is a convenient way of doing this. The second thing is to make sure that different Greek letters are assigned to objects whose existence is asserted by different "somes" (existential quantifiers), as shown in Table 2, page 34.

It is important to block the inference that something is both an F and a G from something is an F and something is a G. You do this by using different Greek letters that may or may not refer to the same object. The following PROLOG statements provide the information necessary for PROLOG to make the inference that alpha and beta have at least one property in common:

F(alpha)        (* There are Fs *)
G(beta)         (* There are Gs *)
G(x) if F(x)    (* All Fs are Gs *)

Given the query is(G(alpha)), PROLOG answers yes.

When existential quantifiers are used with universal quantifiers, the logic for eliminating quantifiers becomes more complicated. Thus "every number is less than some number" cannot be rendered as "every number is less than alpha." Clearly the intention of the first statement is not that every number is less than one and the same number, alpha. This would make alpha less than itself! The intention is that the alphas are different for different numbers. This is done by making alpha x-dependent, as shown in Table 3, page 34. Instead of using alpha, the term (alpha x) is used. The variable dependencies for alpha are only those universally quantified variables that precede it when written in quantified form (see Table 4, above).
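
The dependency rule in the last sentence is mechanical enough to sketch in a few lines. The Python below is an illustration only, not Skolem's formal procedure and not part of any PROLOG system; the function name skolemize, the ("all" | "some", variable) prefix format, and the fixed list of Greek names are assumptions made for this example.

```python
def skolemize(prefix, names=("alpha", "beta", "gamma", "delta")):
    """Map each existential variable to its Skolem term.

    prefix is the quantifier prefix read left to right, for example
    [("all", "x"), ("some", "y")] for (x)(Ey). Each existential
    variable gets a fresh Greek name that depends on exactly the
    universal variables preceding it.
    """
    universals = []
    fresh = iter(names)
    terms = {}
    for kind, var in prefix:
        if kind == "all":
            universals.append(var)
        else:
            name = next(fresh)
            if universals:
                terms[var] = "(" + " ".join([name] + universals) + ")"
            else:
                terms[var] = name
    return terms


print(skolemize([("all", "x"), ("some", "y")]))   # every number is less than some number
print(skolemize([("some", "x"), ("all", "y")]))   # someone loves everyone
```

In the first case y is replaced by the x-dependent term (alpha x); in the second, where no universal quantifier precedes the existential one, x is replaced by the plain constant alpha.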

To get an idea of the inferences PROLOG can make with these sorts of sentences, consider the following:

Loves(alpha x)       (* Someone loves everyone *)
Loves(x (beta x))    (* Everyone loves someone *)

To each of the queries is(Loves(alpha alpha)), is(Loves(alpha (beta x))), and is(Loves((beta x) (beta (beta x)))), PROLOG responds yes.

Logic                               PROLOG

(x)(Ey)Lxy                          Loves(x (alpha x))

(x)(Ey)Lyx                          Loves((alpha x) x)
(Everyone is loved by someone)

(Ex)(y)Lxy                          Loves(alpha y)
(Someone loves everyone)

(Ex)(y)Lyx                          Loves(y alpha)
(Someone is loved by everyone)

These sentences are taken as isolated examples. Hence the use of the same Greek letter in all the examples rather than using new letters for each existential quantifier.

Table 4: Resolving ambiguities of affection in PROLOG

Logic                               PROLOG

(x)(Px --> (Ey)(Cy & Syx))          S((alpha x) x) if P(x)
(Every passenger was searched       C((alpha x)) if P(x)
by some customs officer)

(Ex)(Cx & (y)(Py --> Sxy))          C(alpha)
(Some customs officer searched      S(alpha y) if P(y)
every passenger)

(Ex)(Px & (y)(Cy --> Syx))          P(alpha)
(Some passenger was searched        S(y alpha) if C(y)
by every customs officer)

(x)(Cx --> (Ey)(Py & Sxy))          S(x (alpha x)) if C(x)
(Every customs official             P((alpha x)) if C(x)
searched a passenger)

These are treated as isolated sentences, so alpha is used in all cases instead of using new Greek letters for each new existential quantifier. The PROLOG sentences are subject to the truth functional conversions indicated in the first section of this article (compound fact/consequent representation problems).

Table 5: Representing relationships among classes of objects in PROLOG

Unfortunately, PROLOG doesn't directly make the legitimate inference "everyone is loved by someone" (Loves((delta y) y)) from "someone loves everyone" (Loves(alpha y)). It
might be argued that in effect it makes the inference because the query "is everyone loved by someone?" could be said to take the form which(x: Loves(y x)). The answer will be x, and this could be taken as meaning "everyone is loved by someone." The y in the query is understood by PROLOG as being existentially quantified.

Some  Standard 
Quantificational  Statements 

It is typical of human discourse to identify two classes of objects and claim some sort of relationship between the two classes. Those sentences that advance all-some or some-all relationships have particularly interesting logical properties and illustrate the use of Skolem functions. Consider the sentence "All passengers were searched by customs officers." This statement identifies two classes of objects, passengers and customs officers, and asserts that each of the former was searched by at least one (not necessarily one and the same) of the latter. This differs from the statement "Some customs officers searched all of the passengers," but a relationship is still being asserted between two classes of objects. Table 5, above, considers variations of these "dual-class" sentences.

The following is a well-known test problem for formulation into an inference system based on modus ponens and unification:

The customs officials searched everyone who entered the country who was not a VIP.

Some of the drug pushers entered this country and they were only searched by drug pushers.

No drug pusher was a VIP.

This problem is to be represented in such a way as to support the inference that some of the officials were drug pushers. The representation in PROLOG is as follows:

C((alpha x)) if E(x) and not_V(x)
S((alpha x) x) if E(x) and not_V(x)
D(beta)
E(beta)
D(y) if S(y beta)
not_V(x) if D(x)

To the query is(S((alpha beta) beta)), PROLOG answers yes, and to which(x: C(x) and D(x)), PROLOG responds (alpha beta).
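
To see that modus ponens plus unification really is enough here, the six sentences can be run through a very small backward-chaining prover. The Python sketch below is a stand-in for micro-PROLOG, not a model of its internals: variables are written with a leading "?", terms are tuples, there is no occurs check, and the names unify, solve, and resolve as well as the depth limit are choices made for this illustration.

```python
import itertools

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(t, s):
    """Follow variable bindings in substitution s."""
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    """Return an extended substitution, or None if a and b do not unify."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def rename(t, n):
    """Give a rule's variables fresh names for each use."""
    if is_var(t):
        return f"{t}_{n}"
    if isinstance(t, tuple):
        return tuple(rename(x, n) for x in t)
    return t

def resolve(t, s):
    """Apply substitution s throughout term t."""
    t = walk(t, s)
    return tuple(resolve(x, s) for x in t) if isinstance(t, tuple) else t

RULES = [  # (head, body), in the order given in the text
    (("C", ("alpha", "?x")), [("E", "?x"), ("not_V", "?x")]),
    (("S", ("alpha", "?x"), "?x"), [("E", "?x"), ("not_V", "?x")]),
    (("D", "beta"), []),
    (("E", "beta"), []),
    (("D", "?y"), [("S", "?y", "beta")]),
    (("not_V", "?x"), [("D", "?x")]),
]
_fresh = itertools.count()

def solve(goals, s, depth=0):
    """Depth-first, left-to-right search; yields one substitution per proof."""
    if not goals:
        yield s
        return
    if depth > 20:          # crude guard against runaway recursion
        return
    first, rest = goals[0], goals[1:]
    for head, body in RULES:
        n = next(_fresh)
        s2 = unify(first, rename(head, n), s)
        if s2 is not None:
            yield from solve([rename(g, n) for g in body] + rest, s2, depth + 1)

searched = next(solve([("S", ("alpha", "beta"), "beta")], {}), None)
answer = next(solve([("C", "?who"), ("D", "?who")], {}), None)
print(searched is not None, resolve("?who", answer))
```

The second query binds ?who to the term (alpha beta): the customs official who searched beta is, by the fifth sentence, a drug pusher.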

With the use of Skolem functions, quantificational-level logic can be worked into the patois of PROLOG. But this is only first-order quantificational logic without identity. Introducing identity, whereby S(y) is inferred from S(x) and x = y, is another story in itself and requires second-order programming.

Summary  Remarks 

From a logical point of view, attempting to make all deductions on the basis of modus ponens and unification seems a needless and stultifying limitation on human reasoning capacity. Indeed, deduction engines that employ a full complement of inference rules are legion in academe. But they suffer from several problems, not the least of which is speed or lack thereof. The killer problem is that of incorporating recursion with deduction. PROLOG accomplishes this, and the incorporation of recursion search methods within a deductive framework is really what makes PROLOG valuable and powerful. Its logical limitations are the price paid for this incorporation, but the trade-off is likely to seem a worthwhile one for some time.

DDJ 



ARTICLES 


Multitasking  with 
Turbo  Pascal 

by  Craig  A.  Lindley 


Multitasking is the ability to distribute the resources of a single CPU to several processes or tasks. This leads to better utilization of a CPU: while one process is stuck waiting for an event to occur before it can continue (a response from the user, for example), another process can continue running. The CPU can thus perform more useful work with greater efficiency. To implement multitasking requires certain hardware and software resources. Although the IBM PC has the necessary hardware, MS-DOS, the operating system of the masses, currently lacks any support for multitasking. A new DOS with the requisite features has been promised by IBM and Microsoft "real soon now."

Fortunately, you're not completely abandoned when it comes to software support. Some of the more recent computer languages such as Ada and Modula-2 inherently support multitasking. Even earlier languages such as Pascal and C have the necessary resources if you're willing to write a few routines. In this article I describe how to add multitasking to standard Turbo Pascal. Specifically, I illustrate how to add a cooperative, round-robin task scheduler (kernel) to any Turbo Pascal program using only a few short procedures. The approach I take allows cookbook use of the multitasking routines; that is, you don't have to understand fully how the procedures work to use them productively. Further, you won't need to write any assembly-language code; all the code can be written in high-level Pascal.

Craig A. Lindley, 6 Sutherland Pl., Manitou Springs, CO 80829. Craig works for ROLM Corp. as a software engineer involved in real-time telephony control.

After describing the basics of the implementation, I cover the data structures used for interprocess communication and synchronization, hardware interrupt servicing in the multitasking context, and finally an example that illustrates how they all fit together. Please note, the techniques presented in this article apply only to the MS-DOS implementation of Turbo Pascal. A CP/M or Macintosh version would require major modification.

This multitasking kernel was developed as part of the design of a PC-based serial protocol analyzer. The software allows asynchronous acquisition and display of serial data in two directions simultaneously. The serial protocol analyzer application illustrates both of the classical reasons for using multitasking because it provides a minimum response time to external stimuli (serial data) and a natural separation of largely unrelated processes, that is, the acquisition of serial data and management of the computer display.

Implementation 

Before delving into the details of the implementation, it's necessary to understand the properties of the multitasking kernel presented here.

The kernel utilizes a cooperative as opposed to a preemptive task-switching mechanism. That is, each task that runs voluntarily gives up control of the CPU so that other tasks can run. This is quite different from the time-slicing preemptive algorithm used in many multitasking schemes. (The term multitasking originally implied time slicing.) In the MS-DOS/Turbo Pascal environment, the use of cooperative task switching is beneficial in two ways: first, it solves the reentrancy problem inherent in MS-DOS, and second, it removes the need to save the CPU registers during a task switch, because Turbo Pascal does not expect CPU registers to be preserved between procedure invocations.

This multitasking kernel uses a round-robin, equal-priority task scheduler. The scheduler moves from one task to the next ready task without regard for task priority. If a task is ready, it will run. Task priority could easily be added to the kernel, but the application for which the kernel was designed did not require it.

Each task utilizes two data structures for its operation. The first is the task control block, or TCB. Figure 1, page 43, shows the layout of a task control block. Example 1, page 43, shows the equivalent Turbo Pascal code. Bptr is the offset, within the stack segment, of the location in this task's stack frame where the BP register is stored. Bptr is declared an integer instead of a pointer because the segment into which it points is already known to be the stack segment. Link is a pointer to the next TCB in a circularly linked list of TCBs. State indicates the current state of this task with:




Dr.  Dobb's  Journal,  July  1987 


0  =  ready  to  run 

1  =  waiting  for  a  signal  to  run 

2  =  running 

Id  is  an  identifying  number  assigned 
to  this  task  when  it  was  created.  It  is 
currently  used  only  in  debugging  of 
a  multitasking  application. 

All task control blocks are located in Turbo Pascal's heap area. The second kernel data structure is the stack frame. As you might expect, the stack frame for each task is located in the 808x stack segment. The stack frame is the stack area for a single task. Each task has its own unique stack frame, the size of which is determined when a task is first executed. Many different types of information can be stored in a task's stack frame, including:

•  procedure  return  addresses 

•  parameters  passed  to  procedures 

•  local  variables 

•  CPU  registers  during  the  servicing 
of  hardware  interrupts 

Figure 2, right, shows how the data structures relate in a program that has three tasks competing for the use of the CPU. In this snapshot, task 1 is currently running. This is indicated by its state being set to 2 and the current pointer, or CP, pointing at its task control block. When and if task 1 relinquishes control of the CPU, task 2 will then begin running as it is the next ready task in the linked list. Notice how the operation of this multitasking kernel spans all three of the 808x CPU segments.
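
The snapshot just described can be modeled in a few lines. The sketch below is in Python rather than Turbo Pascal, purely so it can be run anywhere; the class and function names (TCB, link_after, next_ready) are illustrative and do not appear in the kernel. It assumes that tasks 2 and 3 are both ready, and the "no ready task" guard is an addition for the model.

```python
# Illustrative model only; the kernel itself is the Turbo Pascal of Listing One.
READY, WAITING, RUNNING = 0, 1, 2   # state codes given in the article


class TCB:
    """Python stand-in for the task control block of Figure 1."""

    def __init__(self, task_id):
        self.bptr = 0        # saved BP offset (not used in this model)
        self.link = self     # a lone TCB links to itself
        self.state = READY
        self.id = task_id


def link_after(current, new_tcb):
    """Splice new_tcb into the circular list right after current."""
    new_tcb.link = current.link
    current.link = new_tcb


def next_ready(current):
    """Walk the ring from current and return the first READY TCB."""
    probe = current.link
    while probe.state != READY:
        if probe is current:
            raise RuntimeError("no ready task")   # guard added for the model
        probe = probe.link
    return probe


# Rebuild the snapshot: task 1 running, tasks 2 and 3 assumed ready.
t1, t2, t3 = TCB(1), TCB(2), TCB(3)
link_after(t1, t2)
link_after(t2, t3)
t1.state = RUNNING
cp = t1                       # CP, the current task pointer

# Task 1 gives up the CPU: mark it ready and advance CP.
cp.state = READY
cp = next_ready(cp)
cp.state = RUNNING
print(cp.id)                  # prints 2
```

Because the list is circular, the search always comes back around to the task that gave up the CPU if no other task is ready.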

Multitasking  Kernel 
Routines 

Five Turbo Pascal procedures form the basis of the multitasking kernel. They are Fork, Yield, Wait, Send, and Pause. Each of these routines manipulates the structure shown in Figure 2. The multitasking kernel's code is shown in Listing One, page 52.

Fork is used to create, or spawn, another task for the CPU to execute. It is modeled closely on the Unix procedure of the same name. Fork sets a global variable called child_process to indicate whether or not the fork operation was successful. This variable is used to modify program flow depending on the status of the fork operation. If child_process is returned true, the fork was successful. It is important to realize that, when Fork is successfully executed, control is not returned to the parent task but to the new task just forked: the child process. The parent task is suspended (the task state is set to ready) until given another chance to run. A look at Example 2, page 44, will help clarify the operation of the Fork procedure.

On return from the Fork procedure, Newtask is started in an environment that is different from the environment of the main program. The structure of all tasks is the same, as is discussed later. When Newtask gives up control of the processor via a yield or wait, you revert to the main program's environment via a task switch and reexecute the conditional statement shown in Example 2. This time, however, the child_process variable will be false (reset by the Newtask code), so the invocation of task 1 is skipped and the execution of the main program code continues.

Figure 1: Task control block

Figure 2: Program snapshot

tcbptr = ^tcb;           {a pointer to a TCB}
tcb = RECORD
  Bptr:  integer;        {Bptr storage}
  Link:  tcbptr;         {Link storage}
  State: byte;           {Current state}
  Id:    byte;           {Task Id}
End;

Example 1: Equivalent Turbo Pascal code for Figure 1

The processor can execute only one program at a time, so while the main program code is executing, the newly forked Newtask routine is quiescent. If the main program yielded, however, its execution would pause while Newtask continued to run. In this multitasking kernel you can fork as many tasks as the stack segment has stack space available for and the data segment has TCB space available for.

To help you understand the kernel's operation, here is a detailed breakdown of Fork's operation:

1. It checks to see if there is enough stack space available for a new task. The size of the allocated stack is determined by the global variable task_stack_size, which can be changed between calls to Fork if the stack requirements change. Always allocate more stack area than you think you'll need because, if a task crosses its stack boundary, your program will crash and burn. It is advisable to allocate at least 128 bytes of stack more than your program will require to allow for MS-DOS calls and hardware interrupts.

2. It saves the Bptr of the currently executing task in the current TCB so that it can be restored when this task runs again.

3. It calls the Pascal procedure New to allocate space for the new task's TCB in Turbo Pascal's heap.

4. It links the new TCB into the linked list of TCBs.

5. It sets the state of the new task's TCB to running, as it will be upon return from Fork.

6. It points CP at this new TCB.

7. It gets the next task's ID number and stores it in the new TCB.

8. It copies a portion of the old task's stack contents into the new stack frame. This allows return from Fork into the new task's environment.

9. It updates the variable frame_ptr to reflect the hunk of stack area just allocated to this new task.

10. It sets the child_process variable to true.

When all these operations have been completed, the return from Fork starts the new task running in its own, newly created environment.

{the main program code is running here}
Fork;                          {fork a new task}
If child_process = true then   {if fork was successful}
  Newtask;                     {start new task running}

Example 2: Fork procedure, which manipulates structure in Figure 2

Yield is used by the current task to give up control of the CPU voluntarily to the next ready task. If only a single task is running, Yield cannot do anything except output an error message and halt because a programming error has been made. When a task that has yielded runs again, it will continue its execution at the Pascal statement following the yield. It is very important for a task to yield periodically because otherwise it will gain control of the CPU and might never relinquish it. Under these conditions none of the other tasks would be given any time to run. Yield statements should be placed liberally in a task's code, especially in loops that might take quite a while to complete.

The format of all tasks should resemble that shown in Example 3, right. In other words, every task should be in the form of an infinite loop that never returns to the main program but that does yield to the other tasks.

The actions performed when Yield is executed are:

1. The Bptr into the current task's stack frame is saved in the current task's TCB.

2. The state in the TCB is changed from running to ready.

3. The linked list of TCBs is traversed until the next ready task is located. CP is then made to point at this new TCB.

4. The Bptr for the task to run is retrieved from its TCB and stored in the 808x processor's BP register, as required by Turbo Pascal.

When the return from Yield is performed, the environment is changed to that of the next task and execution continues from where that task previously yielded. It is quite possible, for example, for task 3 to yield and for task 15 to begin execution if task 15 is the next ready task in the linked list.
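
The cooperative hand-off can be imitated with Python generators, where a bare yield plays the role of the kernel's Yield call. This is only an analogy: the kernel switches tasks by swapping the BP register, whereas the sketch lets Python keep each task's state. The names task and run_round_robin are made up for the illustration, and the tasks here are finite (unlike the infinite loops the kernel expects) so that the example terminates.

```python
from collections import deque


def task(name, steps, log):
    """A task does a little work, then gives up the CPU voluntarily."""
    for n in range(steps):
        log.append(f"{name}:{n}")
        yield                      # plays the role of the kernel's Yield


def run_round_robin(tasks):
    """Resume each task in turn until every task has finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the task's next yield
        except StopIteration:
            continue               # finished: drop it from the rotation
        ready.append(current)      # back of the line, still ready


log = []
run_round_robin([task("A", 2, log), task("B", 3, log)])
print(log)   # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

Each task resumes at the statement after its yield, just as a kernel task continues at the Pascal statement following Yield.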

Wait is used by a task to voluntarily suspend its execution indefinitely until some external stimulus is applied. The external stimulus is a Send operation, which I describe next. Note that a waiting task does not consume any CPU time: it is not polling for the external stimulus to be applied (which would require CPU time) but is completely dormant. As far as the CPU is concerned, a waiting task is nonexistent.

The code, and therefore the operation, of Wait is similar to that of Yield. The differences are:

1. The state of the current task is changed from running to waiting instead of to ready.

2. Wait requires a parameter to indicate what stimulus the task is to wait for. The parameter is not passed in the call itself; it is a pointer to a task control block pointer (tcbptr), which is stored in the global variable waitfor. Waitfor must be initialized before the Wait procedure call is executed.

Send  is  used  to  wake  up  a  waiting 
task.  It  changes  the  state  of  the  task 
being  signaled  from  waiting  to  ready 
so  that  it  will  run  the  next  time  the 
processor  gets  to  it. 
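
The Wait/Send handshake amounts to two state changes around a shared pointer. The Python sketch below models only that bookkeeping; it does not switch tasks. Slot stands in for a tcbptr variable, the names Task, Slot, wait, and send are illustrative, and the check for an empty slot in send is an addition for the sketch.

```python
READY, WAITING, RUNNING = "ready", "waiting", "running"


class Task:
    def __init__(self, task_id):
        self.id = task_id
        self.state = READY


class Slot:
    """Stands in for a tcbptr variable that a task can wait on."""

    def __init__(self):
        self.waiter = None         # Nil: nobody is waiting


def wait(current, slot):
    """Park the current task on slot; it gets no CPU time until sent."""
    slot.waiter = current
    current.state = WAITING


def send(slot):
    """Make the parked task ready again and clear the slot."""
    if slot.waiter is not None:    # guard added for this sketch
        slot.waiter.state = READY
        slot.waiter = None


consumer = Task(1)
consumer.state = RUNNING
data_available = Slot()

wait(consumer, data_available)
state_while_waiting = consumer.state
send(data_available)
print(state_while_waiting, consumer.state, data_available.waiter)
# prints: waiting ready None
```

Note that Send only marks the task ready; the task actually resumes when the round-robin scheduler next reaches it.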

Procedure Newtask;
Var
  any required local variables
Begin
  child_process := false;   {resets the global variable}
  Repeat
    New task code goes here
    Yield;
  Until False;
End;

Example 3: Task format


Pause is used to suspend a task's execution for a specified number of one-quarter-second intervals, or ticks. The tick count is a signed integer value with a maximum count of 32,767, representing a maximum delay of approximately 2.3 hours (32,767 × 0.25 seconds). This implementation of Pause is crude but effective. It relies on the fact that all tasks yield periodically, so Pause can check to see if the tick count has been satisfied and therefore whether the paused task is to be made ready to run.
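
The arithmetic behind Pause can be checked with a short Python sketch. It assumes the usual PC timer rate of about 18.2 BIOS ticks per second, and it mirrors the loop in Listing One that adds 4 ticks for odd intervals and 5 for even ones to approximate 4.55 ticks per quarter second. The function names are illustrative only.

```python
BIOS_TICKS_PER_SECOND = 18.2       # approximate PC timer rate (assumption)


def quarter_seconds_to_ticks(t):
    """Alternate 4 and 5 ticks per interval, as the listing's loop does."""
    ticks = 0
    for i in range(1, t + 1):
        ticks += 4 if i % 2 == 1 else 5
    return ticks


def ideal_ticks(t):
    """Exact (fractional) number of BIOS ticks in t quarter seconds."""
    return t * BIOS_TICKS_PER_SECOND / 4


def max_pause_hours(max_count=32767):
    """Longest delay a signed 16-bit count of quarter seconds can express."""
    return max_count * 0.25 / 3600


print(quarter_seconds_to_ticks(1), quarter_seconds_to_ticks(4))  # 4 18
print(ideal_ticks(4))                                            # 18.2
print(round(max_pause_hours(), 2))                               # 2.28
```

An even number of intervals averages 4.5 ticks each, which is why the listing notes that even values of t give more accurate timing.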

Intertask Communication
and Synchronization

Because of the asynchronous nature of tasks running in the multitasking environment, data sharing requires special considerations. Two of the more important problems that must be overcome to allow a multitasking system to be viable are:

1. Data passed back and forth between tasks must be absorbed and buffered. This is necessary to equalize the different rates at which tasks produce and consume data.

2. Important data areas or system resources must be protected from use by more than one task at a time.

The classical method used to solve the first problem is with a data structure known as the first in, first out (FIFO) list, or queue. The second problem can be solved using a data structure called a semaphore. Both of these structures, along with the appropriate support routines, are implemented in this kernel.

FIFOs 

The implementation of FIFOs in this multitasking kernel is very generalized. By this I mean that the same general techniques can be used to maintain a queue without regard for the type of data stored. A simple FIFO using bytes is shown in the example program of Listing One. FIFOs for storage of other types of data (including real numbers, strings, or records) could just as easily be implemented using the general techniques presented here.

In the example program, a byte FIFO is used to buffer intertask data. To use the generalized FIFO routines, you must perform the following steps:

1. Generate a record structure describing the type of data you want to store in the FIFO. The structure for the byte FIFO is:

bytefifo = RECORD
  ovd:  overhead;        {another record used to}
                         {manage the fifo data}
  data: ARRAY[1..bytefifosize] OF BYTE;
End;

It is important to use the same overhead record regardless of the type of data stored in the data array. This allows the same generalized techniques to be used to manage the queue.

2. Declare an instance of the FIFO:

inbuffer: bytefifo;

3. Initialize the FIFO overhead data structure as follows:

Initialize_fifo(inbuffer.ovd);

This sets the fields in the overhead record to indicate an empty FIFO. The fields as initialized are shown in Table 1, right.

4. The final step is to write routines similar to put_byte and get_byte, which can store/retrieve items of data from the FIFO.

Notice the use of the kernel wait and signal routines in the get_byte and put_byte procedures. Putting the Not_empty and Not_full signals into the overhead record for all FIFOs makes use of the FIFO routines extremely convenient in this multitasking environment. If, for example, your program calls put_byte to store a byte of data in the inbuffer FIFO and inbuffer is currently full, your task will automatically be put to sleep until a byte is removed from inbuffer. As soon as there is room in inbuffer, put_byte will store the byte in inbuffer and return to your program. Conversely, if a task is waiting for data to be placed in inbuffer, it will automatically awaken as soon as put_byte places data into the previously empty FIFO.
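
The queue bookkeeping itself is a ring buffer driven by the Count, Inptr, and Outptr fields. The Python sketch below models just that part, with the same 1-based indexing as the Pascal array. It is a simplification: where the kernel routines would put the calling task to sleep on a full or empty FIFO, this model returns False or None instead. The method names are borrowed from the article; their bodies here are assumptions, not a transcription of the kernel code.

```python
class ByteFifo:
    """Ring buffer with the overhead fields of Table 1 (1-based indexes)."""

    def __init__(self, size):
        self.size = size
        self.count = 0                 # number of items in the FIFO
        self.inptr = 1                 # where the next item will be stored
        self.outptr = 1                # where the next item will be fetched
        self.data = [0] * (size + 1)   # slot 0 unused, mirrors ARRAY[1..size]

    def put_byte(self, value):
        """Store a byte; return False when full (the kernel would wait)."""
        if self.count == self.size:
            return False
        self.data[self.inptr] = value & 0xFF
        self.inptr = self.inptr % self.size + 1    # wrap from size back to 1
        self.count += 1
        return True

    def get_byte(self):
        """Fetch the oldest byte; return None when empty (kernel would wait)."""
        if self.count == 0:
            return None
        value = self.data[self.outptr]
        self.outptr = self.outptr % self.size + 1
        self.count -= 1
        return value


inbuffer = ByteFifo(3)
for b in (10, 20, 30):
    inbuffer.put_byte(b)
rejected = inbuffer.put_byte(40)       # FIFO is full, so this is refused
first = inbuffer.get_byte()
inbuffer.put_byte(40)                  # room again; the write wraps to slot 1
rest = [inbuffer.get_byte() for _ in range(3)]
print(rejected, first, rest)           # False 10 [20, 30, 40]
```

Only the data array depends on the item type; the three counters work the same way for reals, strings, or records.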

Semaphores 

Semaphores are generally used as a tool for synchronization in a multitasking environment. They can be used to initiate a synchronized action or to provide mutual exclusion for a system resource. Three routines are provided in the kernel for use with semaphores: Initialize_semaphore, which initializes the semaphore data structure; Alloc, which is used to claim a resource for a task's private use; and Dealloc, which releases a resource from a task's control. Many other semaphore manipulation procedures are possible but have yet to be required in my serial protocol analyzer application.

Semaphore usage requires the following steps: first, an instance of a semaphore must be declared:

printer_lock:  semaphore; 

and  second,  the  semaphore  must  be 
initialized: 

Initialize_semaphore(printer_lock); 

As an example of how the semaphore procedures Alloc and Dealloc could be used, consider the use of a printer in a multitasking environment. Suppose many tasks needed to produce printed output on a printer. If some form of printer control were not imposed on which tasks have access to the printer, the resulting printed output could become a jumbled mess of intermingled characters. To prevent this, you could declare a semaphore (printer_lock, for example) for use with the Alloc and Dealloc procedures. Each task that required access to the printer would first Alloc the printer semaphore before attempting to print and would Dealloc the printer when finished. This would prevent the intermixing of printed output as each task would have exclusive access to the printer as long as necessary to complete its output. Another task awaiting the use of the printer would be put to sleep until the printer was available.

Field       Initial value   Meaning
Count       0               number of items in the FIFO is 0
Inptr       1               array index of where items placed in the FIFO will be stored
Outptr      1               array index of where items removed from the FIFO will be obtained
Not_empty   Nil             specialized field, discussed later
Not_full    Nil             specialized field, discussed later

Table 1: Fields in overhead record as initialized

The same techniques used to control access to a system device such as a printer can also be used to control access to shared memory areas and system data structures. By using Alloc and Dealloc in code that modifies an important data structure, you can be assured the complete structure will be modified (even though the task doing the modification repeatedly yields) before access is granted to another task.
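
The printer scenario can be acted out with cooperative Python generators. This is a model of the idea, not of the kernel's Alloc and Dealloc: the lock here has a single owner field (an assumption of the sketch), and a task that finds the printer busy simply yields and retries, whereas the kernel puts such a task to sleep until the resource is released.

```python
from collections import deque


class Semaphore:
    """Binary lock; the field name `owner` is an assumption of this sketch."""

    def __init__(self):
        self.owner = None


def printing_task(name, lines, lock, printer):
    """Claim the printer, print every line (yielding between lines), release."""
    while lock.owner is not None:      # Alloc: retry until the lock is free
        yield
    lock.owner = name
    for line in lines:
        printer.append(f"{name}: {line}")
        yield                          # other tasks run, but cannot print
    lock.owner = None                  # Dealloc


def run(tasks):
    """Round-robin over the tasks until all have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)
            ready.append(current)
        except StopIteration:
            pass


printer = []
printer_lock = Semaphore()
run([printing_task("A", ["1", "2"], printer_lock, printer),
     printing_task("B", ["1", "2"], printer_lock, printer)])
print(printer)   # ['A: 1', 'A: 2', 'B: 1', 'B: 2']
```

Task A yields in the middle of its output, yet task B's lines never interleave with it. Testing and claiming the lock needs no extra protection here because, under cooperative switching, nothing else can run between the two statements.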


Interrupt Processing

Any real-time multitasking system must, by its very nature, provide a close coupling between real-time interrupts and the execution of tasks. Interrupts, like tasks, are another method for dealing with the asynchronous nature of real-world events. Whatever the source of the interrupt, there should be a uniform, well-defined method for handling it in a multitasking environment, and the method chosen should ideally be transparent to the programmer.

In keeping with this philosophy, interrupt handling within the multitasking environment presented here is performed in Turbo Pascal instead of in assembly language. This is made possible by the use of two small in-line code procedures that must bracket the procedure call to your Turbo Pascal interrupt service routine. The first in-line routine, called the interrupt service routine preamble, contains the 808x code shown in Example 4.

Example 4: Preamble to interrupt service

The function of this code is to save all the registers used by Turbo Pascal in servicing your interrupt, to establish access to the data area used by your program, and finally to reenable the interrupts.

Turbodseg must be initialized with the data segment value used by Turbo Pascal for storing the data created in your program. This value is required by the interrupt service routine to locate and have access to all the data in your program. Turbodseg is actually a typed constant rather than a variable. I used a typed constant because it is stored in the code segment instead of the data segment portion of memory. If a variable were used, it would be stored in the very area of memory (the data segment) you are trying to locate. Storing Turbodseg in the code segment allows it to be accessed when an interrupt occurs because the interrupt procedure is contained in the same code segment as the rest of your program. Once an interrupt occurs, the value stored in Turbodseg is moved into the processor's DS register, allowing your interrupt service routine access to all your program's data.

After the preamble code has run, control is passed to your interrupt service routine. Your code can perform any function you desire as long as you keep a few facts in mind. First, the processor registers and flags have been saved onto the stack of an interrupted task, so be sure you don't nest procedures or create local data to such an extent that the stack overflows. Second, do not use any DOS functions (for example, int 21h) as they are not reentrant; on the other hand, you can use BIOS functions.

It is a good programming practice to keep interrupt routines as small as is practical. This rule is especially true in the context of multitasking. If large amounts of processing need to be performed, it might be wiser to signal a waiting task to do the required work. How interrupt service routines are written, however, is largely mandated by the type of processing that must be accomplished.

After your Turbo Pascal interrupt service routine terminates, the interrupt postamble code must execute. It performs the inverse function of the preamble routine by restoring all the processor registers to their pre-interrupt state and then doing all the miscellaneous housecleaning necessary to clean up after the interrupt code. The postamble routine code is shown in Example 5, below.

The structure of your interrupt service routine should then be similar to the following:

Procedure your_interrupt_service_routine;
Begin
  preamble code;
  interrupt_service_routine;
  postamble code;
End;

It is very important that the interrupt service routine (your_interrupt_service_routine in the structure shown above) does not allocate any local data. This is necessary because the preamble and postamble routines are not smart enough to manage the stack correctly when locals are present. The actual interrupt procedure (interrupt_service_routine) can use as much local and global data as it requires.

After you have written your interrupt service routine, you must install it in the PC environment before you can execute it. Two methods exist for installing an interrupt routine: you can either store the address offset and code segment values for the routine in the processor's interrupt table directly, or you can call MS-DOS to do the installation for you. The second method is preferred, and you should use it whenever you install interrupt service routines. The example program shown in Listing One shows an interrupt-driven serial input routine written in Turbo Pascal along with the MS-DOS code necessary to install it.
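
For readers curious about the first (direct) method, the layout of the 808x interrupt table is simple enough to compute by hand: each vector occupies four bytes starting at physical address 4 × the interrupt number, with the 16-bit offset stored first and the 16-bit segment second, both low byte first. The Python sketch below only does that arithmetic; it does not touch any real table, and the handler address in it is hypothetical. Vector 0Ch is used as the example because it is the one conventionally assigned to the PC's first serial port (IRQ4).

```python
import struct


def vector_address(interrupt_number):
    """Physical address of an interrupt's 4-byte entry in the 808x table."""
    return interrupt_number * 4


def vector_entry(offset, segment):
    """Entry layout: 16-bit offset, then 16-bit segment, little-endian."""
    return struct.pack("<HH", offset, segment)


# Hypothetical handler address, chosen only for illustration.
entry = vector_entry(0x1234, 0x5678)
print(hex(vector_address(0x0C)), entry.hex())   # 0x30 34127856
```

Letting MS-DOS perform the installation avoids writing these four bytes yourself, which is one reason the second method is the safer choice.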

The  Example  Program 

I have used a dumb terminal program throughout this article to illustrate some of the multitasking kernel's features. The complete program is shown in Listing Two, page 62. Listing Three, page 70, includes all the RS-232 support routines necessary to compile and run the example program.

The  example  program  illustrates 
the  following  concepts: 

• creation of four concurrent tasks

• the use of FIFOs to pass data between tasks

• processing of serial interrupts in Turbo Pascal

This program can be used as a template for the generation of your multitasking programs.

Conclusions 

In this article I've presented a multitasking kernel for use with MS-DOS and Turbo Pascal. Included are many functions useful either for a real-time control application or just for experimenting with the concepts presented. From a programmer's perspective, the kernel can be used in a cookbook fashion in which all code (for both the tasks and the interrupt service routines) can be written entirely in high-level Turbo Pascal. Feel free to incorporate these concepts and code in your own programs. I'd appreciate hearing from you if you come up with any novel uses or extensions for the kernel.

Availability 

All the source code for articles in this issue is available on a single disk. To order, send $14.95 to Dr. Dobb's Journal, 501 Galveston Dr., Redwood City, CA 94063, or call (415) 366-3600, ext. 216. Please specify the issue number and format (MS-DOS, Macintosh, Kaypro).

DDJ 

(Listings  begin  on  page  52.) 


Cli                             {interrupts off for a second}
Pop   DS,ES,SI,DI,DX,CX,BX,AX   {restore the registers}
Pop   BP                        {throw away the Sp value}
Pop   BP                        {restore the Bp register}
Iret                            {return from interrupt}

Example 5: Postamble to interrupt service




TURBO  PASCAL  MULTITASKING 

Listing One (Text begins on page 42.)

{$K-}  {Compiler switch - never change}

{************************************************}
{***               Listing One                ***}
{***               Turbo Pascal               ***}
{***           Multitasking Kernel            ***}
{***                written by                ***}
{***             Craig A. Lindley             ***}
{***                                          ***}
{***     Ver: 1.3  Last update: 03/11/87      ***}
{************************************************}

CONST
  task_stack_size = 256;       {stack size for each task}
  turbodseg: integer = 0;      {storage for turbos data segment value}

TYPE
  {possible states for a task}
  task_state = (ready, waiting, running);

  {808X register set}
  register_type = RECORD
    CASE integer OF
      1: (ax,bx,cx,dx,bp,si,di,ds,es,flags: integer);
      2: (al,ah,bl,bh,cl,ch,dl,dh: byte);
  END;

  {Task control block (tcb) structure}
  tcbptr = ^tcb;               {ptr to tcb}
  tcb = RECORD
    link:  tcbptr;             {link to next tcb in dseg}
    bptr:  integer;            {base ptr offset in sseg}
    state: task_state;         {ready, waiting, running}
    id:    byte;               {task number}
  END;

  waitptr = ^tcbptr;           {ptr to ptr to tcb}
                               {used for passing parms}
                               {to wait}

  {This fifo overhead structure is the same for}
  {all fifo types regardless of the items to be}
  {stored in the fifo. The byte fifo is an example}
  {of just one possible type of fifo.}

  overhead = RECORD            {fifo overhead data structure}
    count,                     {# of items in fifo}
    inptr,                     {ptr to where items are stored}
    outptr:    integer;        {ptr to where items are fetched}
    not_empty,
    not_full:  tcbptr;         {ptrs to waiting tasks}
  END;

  bytefifo = RECORD            {definition of a byte fifo}
    ovd:  overhead;            {fifo overhead}
    data: ARRAY[1..bytefifosize] OF byte;   {byte fifo data area}
  END;

  semaphore = RECORD           {Semaphore data type}
    count:  integer;           {number of times signaled}
    signal: tcbptr;            {pointer to waiting task if there is one}
  END;

{********  Begin Multitasking Variables  *********}

VAR
  cp,                          {current task pointer}
  new_tcb_ptr,                 {ptr to new tcb in dseg}
  temp_ptr:      tcbptr;
  waitfor:       waitptr;      {address of item to wait on}
  stk,bp:        integer;      {variables for setting 808X sp and bp}
  frame_ptr:     integer;      {stack frame pointer}
  next_id:       integer;      {next task id number}
  i:             integer;
  child_process: boolean;      {fork successful flag}

{********  Begin Multitasking Procedures  ********}


PROCEDURE Fork;                {fork off a new task}
{This procedure manipulates Turbo Pascal's stack}
{frame as required to fool it into operating in}
{a new task's environment.}
BEGIN
  child_process:=false;        {indicate the parent}
                               {process until proven}
                               {otherwise}
  {check if enough stack space for a new task}
  IF abs(frame_ptr - task_stack_size) > 0 THEN
  BEGIN                        {if enough}
    INLINE($89/$26/stk);       {get 808X Sp to}
                               {calculate Bp pointer}
    cp^.bptr:=stk+2;           {save Bp ptr in this}
                               {frame}
    new(new_tcb_ptr);          {allocate new tcb}

    {link new tcb into scheduler loop}
    {make its state running and give it an id}
    new_tcb_ptr^.link:=cp^.link;
    cp^.link:=new_tcb_ptr;
    new_tcb_ptr^.state:=running;
    next_id:=next_id+1;
    new_tcb_ptr^.id:=next_id;
    cp^.state:=ready;          {old frame is ready}

    {copy old stack to new stack frame}
    FOR i:=0 TO 5 DO
      mem[sseg:frame_ptr-6+i]:=mem[sseg:stk+i];

    {make Bp storage in stack frame point at}
    {this frame}
    memw[sseg:frame_ptr-4]:=frame_ptr;
    bp:=frame_ptr-4;           {calculate Bp pointer}
    INLINE($8B/$2E/bp);        {set 808X Bp reg to}
                               {this new value}

    {reserve stack frame space}
    frame_ptr:=frame_ptr-task_stack_size;
    cp:=new_tcb_ptr;           {cp points at new task}
    child_process:=true;       {indicate child process}
  END;
END;


PROCEDURE Yield;
{This procedure causes the executing task to}
{relinquish control of the CPU to the next ready}
{task.}
BEGIN
  child_process:=false;        {reset variable}
  IF cp^.link <> cp THEN       {must have more than}
                               {one task forked to be}
                               {able to yield}
  BEGIN
    INLINE($89/$26/stk);       {get 808X sp}
    cp^.bptr:=stk+2;           {save Bp ptr in}
                               {current task frame}
    cp^.state:=ready;          {yielded task ready}
    temp_ptr:=cp;

    {look for next ready task in scheduler loop}
    {there must be at least one or else}
    WHILE (temp_ptr^.link^.state <> ready) DO
      temp_ptr:=temp_ptr^.link;
    cp:=temp_ptr^.link;        {cp points at new task}
    cp^.state:=running;        {indicate running}
    bp:=cp^.bptr;              {get the bp of task}
    INLINE($8B/$2E/bp);        {restore it to 808X bp}
  END
  ELSE
  BEGIN
    writeln('Cannot yield only single task running');
    halt;
  END;
END;

PROCEDURE  Walt;  (put  current  task  in  wait  mode) 

{until  a  send  makes  it  ready} 

{Due  to  constraints  of  this  kernel,  parameters} 

{cannot  be  passed  directly  to  the  wait  procedure.} 

{To  overcome  this  limitation,  a  global  variable} 

{called  wait for  is  used.  The  address  of  the} 

{tcbptr  on  which  to  wait  should  be  stored  in} 

{waitfor.  See  the  fifo  routines  for  an  example  of} 

{the  proper  usage  of  Wait. 

BEGIN 

child  process: -false; 

{reset  variable} 

IF  cpA.link  <>  cp  THEN 

{must  have  more  than} 

{one  task  forked  to  be} 

{able  to  wait} 

BEGIN 

wait for A  :=  cp; 

(waitfor  points  at  the) 

(current  cask) 

INLINE ($89/$26/stk); 

{get  808X  sp} 

cpA .bptr:-stk+2; 

{save  it  in  current} 

{task  frame} 

cpA . state : -wait ing; 
temp_ptr:=cp; 

{task  now  waiting} 

{look  for  next  ready 

task  in  scheduler  loop} 

{there  must  be  at  least  one  or  else} 

WHILE  (temp_ptr A . linkA . state  <>  ready)  DO 

t  emp_pt  r : =temp_pt r  A . 1 i nk ; 

cp: -tempjptr  A . link ; 

{cp  points  at  new  task} 

cpA . state : -running; 

(indicate  running) 

bp:-cpA.bptr; 

(get  bp  for  this  task) 

INLINE ($8B/$2E/bp) ; 

{restore  it  to  808X  bp} 

END 

ELSE 

BEGIN 

writeln ( 'Cannot  wait 
halt; 

only  single  task  running'); 

END; 

END; 

(continued  on  page  57) 

Dr.  Dobb's  Journal,  July  1987 




TURBO  PASCAL  MULTITASKING 

Listing One (Listing continued, text begins on page 42.)

PROCEDURE Send(VAR s:tcbptr);
{Make the specified task ready for next scheduler}
{go around}
BEGIN
  s^.state:=ready;               {task state is ready}
  s:=NIL;                        {clear pointer}
END;


PROCEDURE Pause(t:integer);
{Pause the execution of a task for t 1/4 sec}
{intervals. Note even t results in more}
{accurate timings.}

FUNCTION tic_count : integer;
{Get the current tic count from the Bios}
VAR
  regs: register_type;
BEGIN
  regs.ax:=0;                    {request clock tic read}
  intr($1A,regs);
  tic_count:=regs.dx;            {LSB of count in dx}
END;

VAR
  tics,i: integer;
BEGIN
  tics:=0;                       {initial tic count to 0}
  IF t > 0 THEN                  {if a legal tic count}
  BEGIN
    FOR i:=1 TO t DO             {250 msec = 4.55 tics}
      IF odd(i) THEN             {use this algorithm for}
                                 {approximation}
        tics:=tics+4             {250 msec = 4.5 tics}
      ELSE
        tics:=tics+5;
    {add tics to current tic_count to get}
    tics:=tics+tic_count;        {target time}
    REPEAT
      yield;                     {return when tic count is}
                                 {reached or exceeded}
    UNTIL tics <= tic_count;
  END
  ELSE
    writeln('Bad tic count specified');
END;


PROCEDURE Init_Kernel;
{This procedure initializes the multitasking}
{for use. It sets up the TCB for task 0 and}
{indicates that it is running.}
Begin
  turbodseg:=dseg;               {save turbo data segment}
  frame_ptr:=$FFFE;              {initial stack location}
  next_id:=0;                    {first task id}
  new(new_tcb_ptr);              {get new tcb in dseg}
  cp:=new_tcb_ptr;               {cp points at tcb}
  cp^.link:=cp;                  {points at itself}
  cp^.state:=running;            {in running state}
  cp^.id:=next_id;               {id = 0}
  {now allocate 1st frame for task 0}
  frame_ptr:=frame_ptr-task_stack_size;
End;


{************ Begin FIFO Procedures ************}

PROCEDURE Initialize_fifo(VAR o:overhead);
{Initialize a fifo's overhead data structure.}
{This procedure will work with any type fifo.}
{This makes the fifo appear empty.}
BEGIN
  o.count:=0;                    {count is empty}
  o.inptr:=1;                    {ptrs to 1st entry}
  o.outptr:=1;                   {put in and take out at}
                                 {entry 1}
  o.not_empty:=NIL;              {signals to nil}
  o.not_full:=NIL;
END;


PROCEDURE Put_byte(b:byte; VAR f:bytefifo);
{This procedure manages the input of data into}
{a byte fifo. If the fifo is full when this}
{procedure is called, the task that called it}
{will be put to sleep automatically until there}
{is room in the fifo for the data byte.}
{The fifo overhead data structure is modified}
{whenever a byte is placed into the fifo}
BEGIN
  WITH f.ovd DO
  BEGIN                          {check if fifo full}
    IF count = bytefifosize THEN
    BEGIN                        {if so go to sleep}
      waitfor := addr(not_full);
      wait;
    END;                         {when not full add}
    count:=count+1;              {one more to count}
    f.data[inptr]:=b;            {store the byte}
    inptr:=inptr+1;              {bump input pointer}
    IF inptr > bytefifosize THEN
      inptr:=1;                  {wrap ptr if necessary}
    {if waiters for this fifo wake them}
    IF not_empty <> NIL THEN
      send(not_empty);
  END;
END;


FUNCTION Get_byte(VAR f:bytefifo) : byte;
{This procedure manages the output of data from}
{a byte fifo. If the fifo is empty when this}
{procedure is called, the task that called it}
{will be put to sleep automatically until there}
{is data in the fifo to retrieve.}
{The fifo overhead data structure is modified}
{whenever a byte is removed from the fifo}
BEGIN
  WITH f.ovd DO
  BEGIN                          {check if fifo empty}
    IF count = 0 THEN
    BEGIN                        {if so go to sleep}
      waitfor := addr(not_empty);
      wait;
    END;
    {when data is available}
    count:=count-1;              {one less to count}
    get_byte:=f.data[outptr];    {get the byte}
    outptr:=outptr+1;            {bump output pointer}
    IF outptr > bytefifosize THEN
      outptr:=1;                 {wrap ptr if necessary}
    {if waiters for this fifo wake them}
    IF not_full <> NIL THEN
      send(not_full);
  END;
END;


{********* Begin Semaphore Procedures **********}

PROCEDURE Initialize_semaphore(VAR s:semaphore);
{Initialize a semaphore data structure}
BEGIN
  s.count:=0;                    {indicate resource is}
                                 {available}
  s.signal:=NIL;                 {and that there are no}
                                 {waiters}
END;

PROCEDURE Alloc(VAR s:semaphore);
{This procedure allocates exclusive use of a}
{resource to the task that executes it. This}
{claim is maintained even though the task}
{gives up control of the CPU via a yield etc.}
BEGIN
  WHILE s.count <> 0 DO          {wait for semaphore}
  BEGIN                          {controlled resource}
                                 {to become available}
    waitfor := addr(s.signal);
    wait;
  END;                           {then}
  s.count:=1;                    {claim it}
END;

PROCEDURE Dealloc(VAR s:semaphore);
{This procedure deallocates a resource.}
{Note this routine yields so the deallocated}
{resource has a chance of being used}
{immediately}
BEGIN
  s.count:=0;                    {remove claim on resource}
  send(s.signal);                {and awaken the waiting task}
  yield;                         {give other tasks a chance}
END;

{End of kernel procedures}

End Listing One
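The FIFO bookkeeping in Put_byte and Get_byte (a count plus 1-based input and output pointers that wrap back to 1) can be sketched in C as follows. This is only an illustration of the index arithmetic: the names fifo_init, fifo_put, and fifo_get are hypothetical, and where the kernel puts the calling task to sleep with Wait, the sketch simply returns 0.

```c
#include <assert.h>
#include <stddef.h>

#define BYTEFIFOSIZE 100                 /* mirrors bytefifosize in Listing Two */

typedef struct {
    int count;                           /* bytes currently stored             */
    int inptr;                           /* 1-based slot for the next put      */
    int outptr;                          /* 1-based slot for the next get      */
    unsigned char data[BYTEFIFOSIZE + 1];/* slot 0 unused (1-based, as Pascal) */
} bytefifo;

/* Make the fifo appear empty, as Initialize_fifo does. */
void fifo_init(bytefifo *f)
{
    assert(f != NULL);
    f->count  = 0;
    f->inptr  = 1;
    f->outptr = 1;
}

/* Store one byte. Returns 1 on success, 0 if the fifo is full
 * (assumption: the kernel version would Wait here instead).   */
int fifo_put(bytefifo *f, unsigned char b)
{
    assert(f != NULL);
    if (f->count == BYTEFIFOSIZE)
        return 0;
    f->count++;
    f->data[f->inptr] = b;
    if (++f->inptr > BYTEFIFOSIZE)       /* wrap ptr if necessary */
        f->inptr = 1;
    return 1;
}

/* Fetch one byte into *b. Returns 1 on success, 0 if the fifo is empty. */
int fifo_get(bytefifo *f, unsigned char *b)
{
    assert(f != NULL && b != NULL);
    if (f->count == 0)
        return 0;
    f->count--;
    *b = f->data[f->outptr];
    if (++f->outptr > BYTEFIFOSIZE)      /* wrap ptr if necessary */
        f->outptr = 1;
    return 1;
}
```

Keeping a separate count, rather than comparing the two pointers, is what lets the listing tell a full fifo from an empty one when inptr equals outptr.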


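The Pause procedure in Listing One approximates a quarter second (about 4.55 BIOS clock tics) by alternately adding 4 and 5 tics. A small C sketch of that arithmetic follows. The function names are hypothetical. The second function is not from the listing: it shows one common way to compare against a 16-bit tick counter that may wrap, whereas the listing compares the target and the current count directly.

```c
#include <assert.h>

/* Tics to wait for t quarter-second intervals: 4 tics on odd steps,
 * 5 on even steps, averaging 4.5 tics per 250 msec.               */
int quarter_seconds_to_tics(int t)
{
    int tics = 0;
    int i;

    assert(t > 0);                       /* the listing rejects t <= 0 */
    for (i = 1; i <= t; i++)
        tics += (i % 2 != 0) ? 4 : 5;
    return tics;
}

/* Assumption, not in the listing: a wrap-tolerant deadline test.
 * Elapsed tics are computed modulo 65536, so the test still works
 * when the low word of the BIOS count rolls over.                 */
int deadline_reached(unsigned short now, unsigned short start,
                     unsigned short tics)
{
    return (unsigned short)(now - start) >= tics;
}
```

With an even t the 4s and 5s pair up, which is why the listing's comment notes that even values give more accurate timings.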




Listing Two (Text begins on page 42.)

PROGRAM Multitasking_Demonstration_Program;

{************************************************}
{***              Listing Two                 ***}
{***       Multitasking Demonstration         ***}
{***        A dumb terminal program           ***}
{*** utilizing 4 tasks and a serial interrupt ***}
{***           service routine.               ***}
{***                                          ***}
{***              written by                  ***}
{***           Craig A. Lindley               ***}
{***                                          ***}
{***   ver: 1.0    Last update: 03/11/87      ***}
{***                                          ***}
{************************************************}

CONST
  bytefifosize = 100;            {max size of byte fifos}

{include the multitasking kernel routines}
{$I multi.pas}

{include the RS-232 functions}
{$I serial.pas}

VAR
  inbuffer,
  outbuffer: bytefifo;

{***** Serial Interface Support Procedures *****}

PROCEDURE Get_serial_char;
VAR
  b: byte;
BEGIN
  {Get the character from the UART. Place it in}
  {inbuffer if there is room, throw it away if}
  {not. Signal end of interrupt (EOI level 4 on}
  {8259.}
  b := port[portaddress];
  IF inbuffer.ovd.count < bytefifosize THEN
    Put_byte(b,inbuffer);
  port[$20] := $20;
END;

PROCEDURE Serial_Interrupt_Service_Routine;
{This is the new interrupt service routine.}
{It replaces the one MsDos normally uses.}
{See text for details.}
BEGIN
  {standard interrupt service routine preamble}
  INLINE ($50/$53/$51/$52/$57/   {Push ax,bx,cx,dx,}
          $56/$06/$1e/           {di,si,es,ds}
          $2e/$a1/turbodseg/     {mov ax,cs:turbodseg}
          $8e/$d8/               {mov ds,ax}
          $fb);                  {sti}

  Get_serial_char;

  {standard interrupt service routine postamble}
  INLINE ($fa/$1f/$07/$5e/$5f/   {interrupts off}
                                 {Pop ds,es,si,di,}
          $5a/$59/$5b/$58/       {dx,cx,bx,ax}
          $5d/$5d/$cf);          {trash sp, restore}
                                 {Bp and iret}
END;

{************ Begin Task Procedures ************}

PROCEDURE Task_0;
{Task 0 gets keyboard input and puts it into}
{outbuffer. If no input available task 0 yields.}
{Note infinite loop structure.}
VAR
  ch: char;
BEGIN
  writeln('Starting Task 0');
  writeln;
  REPEAT
    IF NOT keypressed THEN
      Yield
    ELSE
    BEGIN
      read(kbd,ch);
      Put_byte(byte(ch),outbuffer);
    END;
  UNTIL false;
END;

PROCEDURE Task_1;
{Task 1 takes characters from outbuffer using}
{Get_Byte and sends them out the serial port.}
BEGIN
  writeln('Starting Task 1');
  REPEAT
    Serialout(Get_byte(outbuffer));
  UNTIL false;
END;

PROCEDURE Task_2;
{Task 2 retrieves characters placed in inbuffer}
{by the serial interrupt routine and displays}
{them on the screen. Note:}
{ 1) If no characters are available this routine}
{    yields.}
{ 2) Interrupts must be disabled while inbuffer}
{    is being accessed. Otherwise the fifo}
{    counter will get confused and this program}
{    will eventually crash.}
BEGIN
  writeln('Starting Task 2');
  REPEAT
    IF inbuffer.ovd.count <> 0 THEN
    BEGIN
      INLINE($FA);               {interrupts off}
      write(chr(Get_byte(inbuffer)));
      INLINE($FB);               {interrupts on }
    END
    ELSE
      Yield;
  UNTIL false;
END;


PROCEDURE Task_3;
{Task 3 monitors and displays the fifo and cursor}
{status. It wakes up every 1/2 second to do so.}
{The cursor position is saved and retrieved while}
{the fifo status is being updated on the screen}
VAR
  cursorx,
  cursory: byte;
BEGIN
  writeln('Starting Task 3');
  REPEAT
    pause(2);                    {wake every 1/2 sec}
    cursorx := wherex;           {save cursor position}
    cursory := wherey;
    window(1,1,80,25);
    gotoxy(21,25);               {write cursor position}
    write(cursorx:2);
    gotoxy(35,25);
    write(cursory:2);
    gotoxy(58,25);               {write fifo counts}
    write(inbuffer.ovd.count:2);
    gotoxy(72,25);
    write(outbuffer.ovd.count:2);
    window(1,1,80,23);
    gotoxy(cursorx,cursory);
  UNTIL false;
END;


PROCEDURE Initialize_Display;
{This procedure initializes the screen for the}
{demo program. It builds a status line on screen}
{line 25 and then establishes a terminal window}
{so the status line will not be overwritten.}
BEGIN
  window(1,1,80,25);             {window is full screen}
  CLRSCR;                        {clear the screen}
  writeln('Multitasking Demonstration');
  writeln('  A dumb serial terminal program');
  writeln('  Use ^C to abort');
  writeln;
  {build the status line}
  gotoxy( 1,25); write('Status -- Cursor X:');
  gotoxy(25,25); write('Cursor Y:');
  gotoxy(49,25); write('Incount:');
  gotoxy(62,25); write('Outcount:');
  window(1,1,80,23);             {establish terminal window}
  gotoxy(1,8);                   {home the cursor}
END;


{************* Begin Main Program **************}
BEGIN {main}
  {The main program builds the screen status}
  {line, initializes the input and output fifos,}
  {initializes the multitasking kernel, installs}
  {the serial interrupt handler and then begins}
  {forking the individual tasks.}
  Initialize_Display;
  Initialize_fifo(inbuffer.ovd);
  Initialize_fifo(outbuffer.ovd);
  Init_Kernel;

  {Initialize and install the serial interrupt}
  {handler. Install our interrupt routine in}
  {place of the original system IRQ4 handler}
  WITH regs DO
  BEGIN
    ah:=$25;
    al:=$0C;
    ds:=cseg;
    dx:=ofs(Serial_Interrupt_Service_Routine);
    msdos(regs);
  END;

  {Set the serial format to 1200 baud}
  {1 stop bit, 8 data bits and no parity}
  Setserial(1200,1,8,none);

  {Fork off tasks one through three}
  Fork;
  IF child_process THEN
    Task_1;
  Fork;
  IF child_process THEN
    Task_2;
  Fork;
  IF child_process THEN
    Task_3;

  {Enable the serial interrupt}
  Enable_serial_int;

  {Start Task 0}
  Task_0;
END.


End  Listing  Two 
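Every task in Listing Two gives up the CPU through Yield (directly, or indirectly through Pause and the fifo routines), and the kernel then walks its circular list of task control blocks looking for the next ready task. The search itself can be sketched in C as below. The struct layout and the name yield_to_next are assumptions made for illustration, and the sketch deliberately leaves out the stack and BP switching that the Pascal kernel does with INLINE code. Unlike the listing, it does not halt when only one task exists: it simply returns the same task.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { READY, RUNNING, WAITING } task_state;

typedef struct tcb {
    struct tcb *link;                    /* next TCB in the circular list */
    task_state  state;
    int         id;
} tcb;

/* Mark the current task ready, then scan the ring, starting with
 * the task after it, for the first task whose state is READY.
 * Because the current task was just marked ready, the scan always
 * terminates, at worst on the current task itself.               */
tcb *yield_to_next(tcb *cp)
{
    tcb *p;

    assert(cp != NULL && cp->link != NULL);
    cp->state = READY;                   /* yielded task ready   */
    p = cp;
    while (p->link->state != READY)      /* skip waiting tasks   */
        p = p->link;
    p = p->link;                         /* p points at new task */
    p->state = RUNNING;                  /* indicate running     */
    return p;
}
```

Starting the scan at the successor of the current task is what makes the schedule round-robin: a task that has just yielded is considered last.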


Listing Three (Text begins on page 42.)

{************************************************}
{***             Listing Three                ***}
{***       Multitasking Demonstration         ***}
{***           support routines               ***}
{***                                          ***}
{***              written by                  ***}
{***           Craig A. Lindley               ***}
{***                                          ***}
{***   Ver: 1.0    Last update: 03/11/87      ***}
{***                                          ***}
{************************************************}

CONST
  COM1 = 1;                      {com one PC port}
  portaddress = $3f8;            {address of UART for}
                                 {COM1}

TYPE
  parity_type = (odd,even,none);

VAR
  regs: register_type;

PROCEDURE Int14(portnumber,command,
                parameter:integer);
{Procedure to initialize the com ports}
BEGIN
  WITH regs DO
  BEGIN
    dx := portnumber - 1;
    ah := command;
    al := parameter;
    flags := 0;
    intr($14,regs);
  END;
END;

PROCEDURE Setserial(baudrate,stopbits,
                    databits: integer;
                    parity: parity_type);
{Configure COM1 with the specified parameters}
VAR
  parameter: integer;
BEGIN
  writeln('Configuring the serial parameters');
  writeln;
  CASE baudrate OF
    300:  baudrate := 2;
    1200: baudrate := 4;
    ELSE  baudrate := 4;         {default is 1200 baud}
  END;
  IF stopbits = 2 THEN
    stopbits := 1
  ELSE
    stopbits := 0;               {default is 1 stop bit}
  IF databits = 7 THEN
    databits := 2
  ELSE
    databits := 3;               {default is 8 bit words}
  parameter := (baudrate SHL 5)+
               (stopbits SHL 2)+databits;
  CASE parity OF
    odd:  parameter := parameter + 8;
    even: parameter := parameter + 24;
    none: ;
  END;
  Int14(COM1,0,parameter);       {do the configuration}
END;

FUNCTION Serialstatus : integer;
{Get the status of COM1 port}
BEGIN
  Int14(COM1,3,0);
  serialstatus := regs.ax;
END;

PROCEDURE Serialout(b:byte);
{Send a byte out the COM1 port}
BEGIN
  {wait till UART is ready}
  WHILE (Serialstatus AND $2000) = 0 DO;
  {then send the byte out}
  port[portaddress] := b;
END;

PROCEDURE Enable_serial_int;
BEGIN
  {clear the serial interface of any garbage}
  INLINE($BA/portaddress/$EC/
         $BA/portaddress+5/$EC/
         $BA/portaddress+6/$EC);
  INLINE($E4/$21/$24/$EF/$E6/$21);  {IRQ4 enabled}
  port[portaddress+4] := $0B;    {set DTR, RTS}
                                 {and OUT2}
  port[portaddress+1] := 1;      {receiver}
                                 {interrupt}
                                 {enabled}
END;

{End of RS-232 procedures}

End  Listings 
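Setserial packs its four arguments into the single parameter byte that BIOS interrupt 14h expects in AL: the baud-rate code in the top three bits, parity in the next two, then the stop-bit flag, then the word-length code. The same packing can be sketched in C as shown here. The function name serial_param and the enum are hypothetical; the codes and defaults (1200 baud, 1 stop bit, 8 data bits) follow the listing.

```c
#include <assert.h>

typedef enum { PARITY_ODD, PARITY_EVEN, PARITY_NONE } parity_type;

/* Build the INT 14h initialization byte the way Setserial does.
 * Only 300 and 1200 baud are recognized; anything else falls
 * back to 1200, as in the listing.                             */
unsigned char serial_param(int baud, int stopbits, int databits,
                           parity_type parity)
{
    unsigned baud_code = (baud == 300)   ? 2u : 4u;  /* default 1200 baud  */
    unsigned stop_code = (stopbits == 2) ? 1u : 0u;  /* default 1 stop bit */
    unsigned data_code = (databits == 7) ? 2u : 3u;  /* default 8 bits     */
    unsigned param;

    param = (baud_code << 5) + (stop_code << 2) + data_code;
    if (parity == PARITY_ODD)
        param += 8;                      /* parity bits = 01 */
    else if (parity == PARITY_EVEN)
        param += 24;                     /* parity bits = 11 */

    assert(param <= 0xFFu);
    return (unsigned char)param;
}
```

For the call made by the demonstration program, Setserial(1200,1,8,none), the packed value works out to $83.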




C  CHEST 


Listing One (Text begins on page 94.)

Listing 1 -- vbios.c

#include <stdio.h>
#include <dos.h>        /* (Microsoft file) includes for int86() */

/* VBIOS.C: Various cursor and i/o routines using
 * the bios interrupts (see below for greater detail):
 * Copyright (C) 1987 Allen I. Holub. All rights reserved.
 *
 * Externally accessible routines:
 *
 * int   vb_getpage   ()           Get active video page #
 * void  vb_putchar   (c)          write a single character
 * void  vb_getchar   (c)          get a key from the bios.
 * void  vb_puts      (s, move)    write a string
 * void  vb_replace   (c)          write char w/o moving cursor
 * int   vb_inchar    (attrib)     Get character & attribute
 *
 * void  vb_setcur    (posn)       Set cur pos as int on cur page
 * int   vb_getcur    ()           Get cur pos as int from cur page
 * void  vb_ctoyx     (y,x)        Set cursor position to (y, x)
 * void  vb_getyx     (&y, &x)     Get cursor position
 *
 * int   vb_iscolor   ()           color monitor installed
 * void  vb_cursize   (top,bot)    Set cursor size
 * void  vb_blockcur  ()           make a block cursor
 * void  vb_normalcur ()           revert to a normal cursor
 *
 * void  vb_scroll(l,r,t,b,a)      Scroll region
 */

extern int int86( int, union REGS *, union REGS * );

/*------------------------------------------------------------*/

#define VIDEO_INT   0x10        /* Video interrupt          */
#define KB_INT      0x16        /* Keyboard interrupt       */
#define CUR_SIZE    0x1         /* Set cursor size          */
#define SET_POSN    0x2         /* Modify cursor posn       */
#define READ_POSN   0x3         /* Read current cursor posn */
#define WRITE       0x9         /* Write character          */
#define WRITE_TTY   0xe         /* Write char & move cursor */
#define GET_VMODE   0xf         /* Get video mode & disp pg */

static union REGS  Regs;        /* Used to talk to DOS */
static int         Attribute;   /* Current attribute   */


void    vb_scroll( x_left, x_right, y_top, y_bottom, amt )
{
        /* Scroll the indicated region on the screen.
         * If amt is negative, scroll down.
         */

        if( amt < 0 )
        {
                Regs.h.ah = 7 ;
                Regs.h.al = -amt ;
        }
        else
        {
                Regs.h.ah = 6 ;
                Regs.h.al = amt ;
        }

        Regs.h.bh = 0x07 ;
        Regs.h.cl = x_left ;
        Regs.h.ch = y_top ;
        Regs.h.dl = x_right ;
        Regs.h.dh = y_bottom ;
        int86( 0x10, &Regs, &Regs );
}


int     vb_inchar( attrib )
int     *attrib;
{
        /* Return the character at the current cursor
         * position and, if attrib is non-NULL, put the
         * attribute there. Note that vb_getpage() will mess
         * up the fields in the Regs structure so it must
         * be called first.
         */

        Regs.h.bh = vb_getpage() ;
        Regs.h.ah = 8 ;
        int86( VIDEO_INT, &Regs, &Regs );
        if( attrib )
                *attrib = Regs.h.ah & 0xff ;
        return( Regs.h.al & 0xff );
}

int     vb_getpage()
{
        /* Returns the currently active display page number
         */

        Regs.h.ah = GET_VMODE;
        int86( VIDEO_INT, &Regs, &Regs );

        return (int) Regs.h.bh ;
}

void    vb_cursize( top_line, bot_line )
{
        /* Scan lines are numbered 0 at the top and 7 at the
         * bottom on the color card. On the monochrome card
         * they're 0-12. If top & bot are reversed you'll
         * get a 2 part cursor. Top_line determines the
         * position of the top scan line of the cursor.
         * bot_line is the bottom. A normal cursor can be
         * created with vb_cursize(6,7). Cursize(0,7) will
         * fill the entire area occupied by a character.
         * Cursize(0,1) will put a line over the character
         * rather than under it.
         */

        Regs.h.ch = top_line ;
        Regs.h.cl = bot_line ;
        Regs.h.ah = CUR_SIZE ;
        int86( VIDEO_INT, &Regs, &Regs );
}

int     vb_iscolor()    /* Returns true if a color card is active */
{
        Regs.h.ah = GET_VMODE ;
        int86( VIDEO_INT, &Regs, &Regs );
        return( Regs.h.al != 7 );
}

void    vb_blockcur()   /* Make the cursor a block cursor */
{
        vb_cursize( 0, vb_iscolor() ? 7 : 12 );
}

void    vb_normalcur()  /* Make it an underline cursor */
{
        if( vb_iscolor() )
                vb_cursize( 6, 7 );
        else
                vb_cursize( 11, 12 );
}

void    vb_setcur( posn )
int     posn;
{
        /* Modify current cursor position. The top byte of
         * "posn" value holds the row (y), the bottom byte,
         * the column (x). The top-left corner of the screen
         * is (0,0). Pagenum is the video page number. Note
         * that vb_getpage() will mess up the fields in the
         * Regs structure so it must be called first.
         */

        Regs.h.bh = vb_getpage() ;
        Regs.x.dx = posn ;
        Regs.h.ah = SET_POSN ;
        int86( VIDEO_INT, &Regs, &Regs );
}

int     vb_getcur()
{
        /* Get current cursor position. The top byte of the
         * return value holds the row, the bottom byte the
         * column. Pagenum is the video page number. Note
         * that vb_getpage() will mess up the fields in the
         * Regs structure so it must be called first.
         */

        Regs.h.bh = vb_getpage() ;
        Regs.h.ah = READ_POSN ;
        int86( VIDEO_INT, &Regs, &Regs );
        return( Regs.x.dx );
}

/*
 * vb_ctoyx() and vb_getyx() also set and get the cursor position.
 * They use x and y values, however.
 */

void    vb_ctoyx( y, x )
{
        vb_setcur( (y << 8) | (x & 0xff) );
}

void    vb_getyx( yp, xp )
int     *yp, *xp;
{
        register int posn;

        posn = vb_getcur();
        *xp  = posn & 0xff ;
        *yp  = (posn >> 8) & 0xff ;
}

/*------------------------------------------------------------*/


vb_replace(c)
{
        /* Overwrite the character at the current cursor
         * position without moving the cursor.
         */

        Regs.h.ah = 10 ;
        Regs.h.al = c;          /* write c               */
        Regs.h.bl = 0x07;       /* Normal characters     */
        Regs.h.bh = 0;          /* Display page 0        */
        Regs.x.cx = 1;          /* # of times to write   */

        int86( VIDEO_INT, &Regs, &Regs );
}

/*------------------------------------------------------------*/

vb_putchar( c )
{
        /* Write a character to the screen in TTY mode.
         * Only normal printing characters, BS, BEL, CR and
         * LF are recognized. The cursor is automatically
         * advanced and lines will wrap.
         */

        Regs.h.bl = 0x07;
        Regs.h.al = c;
        Regs.h.ah = WRITE_TTY ;
        int86( VIDEO_INT, &Regs, &Regs );
}

/*------------------------------------------------------------*/

vb_puts( str, move_cur )
register char *str;
{
        /* Write a string to the screen in TTY mode. If
         * move_cur is true the cursor is left at the end
         * of string. If not the cursor will be restored to
         * its original position (before the write).
         */

        register int posn;

        if( !move_cur )
                posn = vb_getcur();

        while( *str )
                vb_putchar( *str++ );

        if( !move_cur )
                vb_setcur( posn );
}

/*------------------------------------------------------------*/

int     vb_getchar()
{
        /* Get a character with a direct video bios call.
         * This routine can be used to complement stderr as
         * it can be used to get characters from the keyboard,
         * even when input is redirected. The typed character
         * is returned in the low byte of the returned
         * integer, the high byte holds the auxiliary byte
         * used to mark ALT keys and such. See the Technical
         * Reference for more info.
         */

        Regs.h.ah = 0 ;
        int86( KB_INT, &Regs, &Regs );
        return( (int)Regs.x.ax );
}

/*------------------------------------------------------------*/
#ifdef MAIN

main()
{
        vb_replace( 'X' );
        vb_putchar( '\n' );
        vb_putchar( '\r' );
}

#endif

End  Listing  One 
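vb_setcur() and vb_getcur() pass the cursor position as one integer with the row in the top byte and the column in the bottom byte, and vb_ctoyx() and vb_getyx() convert between that form and separate y and x values. The conversion can be tried without any BIOS calls using the sketch below. The names pack_yx and unpack_yx are hypothetical stand-ins, not routines from vbios.c.

```c
#include <assert.h>
#include <stddef.h>

/* Combine row y and column x the way vb_ctoyx() does before
 * calling vb_setcur(): row in the high byte, column in the low. */
int pack_yx(int y, int x)
{
    return ((y & 0xff) << 8) | (x & 0xff);
}

/* Split a packed position the way vb_getyx() does. */
void unpack_yx(int posn, int *yp, int *xp)
{
    assert(yp != NULL && xp != NULL);
    *xp = posn & 0xff;
    *yp = (posn >> 8) & 0xff;
}
```

The packed form matches the DX register layout used by the BIOS cursor services (DH holds the row, DL the column), which is why vb_setcur() can store the value straight into Regs.x.dx.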


Listing Two (Text begins on page 94.)

Listing 2 -- /include/box.h

/*
 * BOX.H: Copyright (c) 1987, Allen I. Holub.
 *        All rights reserved.
 *
 * #defines for the box-drawing characters on the IBM PC.
 *
 * The names are:
 *
 *      UL      Upper left corner
 *      UR      Upper right corner
 *      LL      lower left corner
 *      LR      lower right corner
 *      CEN     Center (intersection of two lines)
 *      TOP     Tee with the flat piece on top
 *      BOT     Bottom tee
 *      LEFT    Left tee
 *      RIGHT   Right tee
 *      HORIZ   Horizontal line
 *      VERT    Vertical line.
 *
 *
 *      UL ----TOP---- UR       HORIZ
 *      |       |       |
 *      L       |       R       V
 *      E       |       I       E
 *      F -----CEN----- G       R
 *      T       |       H       T
 *      |       |       T
 *      |       |       |
 *      LL ----BOT---- LR
 *
 *
 * The D_XXX  defines have double horizontal and vertical lines
 * The HD_XXX defines have double horizontal lines only
 * The VD_XXX defines have double vertical lines only
 *
 * If your terminal is not IBM compatible, #define all of these
 * as '+', except for the VERT #defines, which should be a |,
 * and the HORIZ #defines, which should be a -
 */

#define VERT            179
#define RIGHT           180
#define UR              191
#define LL              192
#define BOT             193
#define TOP             194
#define LEFT            195
#define HORIZ           196
#define CEN             197
#define LR              217
#define UL              218

#define D_VERT          186
#define D_RIGHT         185
#define D_UR            187
#define D_LL            200
#define D_BOT           202
#define D_TOP           203
#define D_LEFT          204
#define D_HORIZ         205
#define D_CEN           206
#define D_LR            188
#define D_UL            201

#define HD_VERT         179
#define HD_RIGHT        181
#define HD_UR           184
#define HD_LL           212
#define HD_BOT          207
#define HD_TOP          209
#define HD_LEFT         198
#define HD_HORIZ        205
#define HD_CEN          216
#define HD_LR           190
#define HD_UL           213

#define VD_VERT         186
#define VD_RIGHT        182
#define VD_UR           183
#define VD_LL           211
#define VD_BOT          208
#define VD_TOP          210
#define VD_LEFT         199
#define VD_HORIZ        196
#define VD_CEN          215
#define VD_LR           189
#define VD_UL           214

End  Listing  Two 
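To show how the corner and line names in box.h fit together, here is a small sketch that draws a rectangle into a character buffer. It is an illustration only: the macros are redefined locally as plain ASCII stand-ins ('+', '-', and '|', chosen here so the result is readable on any terminal) rather than the IBM code-page values from the header, and draw_box is a hypothetical helper, not part of the curses package.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ASCII stand-ins for a non-IBM terminal (assumed values). */
#define VERT    '|'
#define HORIZ   '-'
#define UL      '+'
#define UR      '+'
#define LL      '+'
#define LR      '+'

/* Write a rows-by-cols box outline into buf, one '\n' after each
 * row and a terminating '\0'. buf must hold rows*(cols+1)+1 bytes. */
void draw_box(char *buf, int rows, int cols)
{
    int r, c;

    assert(buf != NULL && rows >= 2 && cols >= 2);
    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            int top   = (r == 0),  bot   = (r == rows - 1);
            int left  = (c == 0),  right = (c == cols - 1);
            char ch = ' ';

            if      (top && left)   ch = UL;
            else if (top && right)  ch = UR;
            else if (bot && left)   ch = LL;
            else if (bot && right)  ch = LR;
            else if (top || bot)    ch = HORIZ;
            else if (left || right) ch = VERT;
            *buf++ = ch;
        }
        *buf++ = '\n';
    }
    *buf = '\0';
}
```

Because the drawing code refers only to the names, swapping the stand-ins for the numeric values in box.h (or for the D_, HD_, or VD_ families) changes the line style without touching the loop.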

Listing Three

Listing 3 -- /include/curses.h

/*------------------------------------------------------------*
 *  CURSES.H:  Copyright (c) 1987, Allen I. Holub.            *
 *             All rights reserved.                           *
 *------------------------------------------------------------*
 */

typedef struct
{
        int x_org;      /* X coordinate of upper-left corner    */
        int y_org;      /* Y coordinate of upper-left corner    */
        int x_size;     /* Horizontal size of text area.        */
        int y_size;     /* Vertical size of text area.          */
        int row;        /* Current cursor row (0 to y_size-1)   */
        int col;        /* Current cursor column (0 to x_size-1)*/
        int scroll_ok;  /* Scrolling permitted in this window   */
}
WINDOW;

#define ...             unsigned int
#define ...             register
#define ...             (1)
#define ...             (0)
#define ...             (0)
#define ...             (1)

/*
 * Reminder: The comma operator goes left to right and
 *           evaluates to the rightmost thing in the list.
 *
 * The following macros implement many of the curses functions
 * note that stdscr only has meaning when passed to getyx.
 */

#define stdscr                  0
#define getyx(win,y,x)          (win ? ((x)=win->col, (y)=win->row) \
                                     : getpos(&x, &y))
#define mvinch(y,x)             ( move(y,x), inch() )
#define mvwinch(w,y,x)          ( wmove(w,y,x), winch(w) )
#define addch(c)                putchar(c)
#define endwin()
#define erase()                 clear()
#define initscr()
#define printw                  printf
#define refresh()
#define scroll(w)               wscroll(w,1);
#define scrollok(win,flag)      ((win)->scroll_ok = (flag))
#define subwin(w,a,b,c,d)       newwin(a,b,c,d)
#define wclear                  werase
#define wrefresh(win)

extern  WINDOW  *newwin    (int, int, int, int);
extern  int     box        (WINDOW *, int, int);
extern  int     clear      (void);
extern  int     crmode     (void);
extern  int     echo       (void);
extern  int     getch      (void);
extern  int     move       (int, int);
extern  int     nl         (void);
extern  int     nocrmode   (void);
extern  int     noecho     (void);
extern  int     nonl       (void);
extern  int     waddch     (WINDOW *, int);
extern  int     waddstr    (WINDOW *, char *);
extern  int     wclrtoeol  (WINDOW *);
extern  int     werase     (WINDOW *);
extern  int     wgetch     (WINDOW *);
extern  int     wmove      (WINDOW *, int, int);
extern  int     wprintw    (WINDOW *, char *, ...);
extern  int     wscroll    (WINDOW *, int);

End  Listing  Three 
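The header's reminder about the comma operator is what makes the getyx macro work: both assignments happen left to right inside a single expression, and the expression takes the value of the last one. The stand-alone sketch below demonstrates the idea. Everything in it is an assumption made for illustration: win_t is a cut-down stand-in for WINDOW, GETYX is a renamed and more heavily parenthesized copy of the macro, and getpos() is given a made-up body and return value, since the listing shows only how it is called.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int row; int col; } win_t;  /* cut-down stand-in for WINDOW */

int screen_row = 5;                      /* pretend hardware cursor row    */
int screen_col = 9;                      /* pretend hardware cursor column */

/* Hypothetical stand-in for getpos(): reports the screen cursor. */
int getpos(int *xp, int *yp)
{
    assert(xp != NULL && yp != NULL);
    *xp = screen_col;
    *yp = screen_row;
    return *yp;
}

/* If win is non-null, copy the window's own cursor into y and x
 * using the comma operator; otherwise ask the screen. Note that
 * y and x are named directly, with no & at the call site.       */
#define GETYX(win, y, x) ((win) ? ((x) = (win)->col, (y) = (win)->row) \
                                : getpos(&(x), &(y)))
```

A call such as GETYX(w, y, x) with w pointing at a window whose row is 3 and col is 7 leaves y equal to 3 and x equal to 7, and the whole expression evaluates to 3, the value of the rightmost assignment.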


Listing Four

Listing 4 -- curses.c

#include <stdio.h>
#include <ctype.h>
#include <curses.h>     /* routines in this file,          */
#include <box.h>        /* of IBM box-drawing characters   */
#include <stdarg.h>     /* va_list and va_start (ANSI)     */


/*
 * Copyright (c) 1987, Allen I. Holub.
 * All rights reserved.
 *
 * This file is a DOS implementation of some of the Unix
 * CURSES functions. It is Unix compatible but is a proper
 * subset, not a full implementation, of curses. It works
 * on the IBM-PC. In all of these y is the row number and
 * x is the column number. The upper left corner is (0,0):
 *
 * I/O functions -
 *
 * waddch( win, ch )         Works like putc()
 * waddstr( win, str )       Works like puts()
 * wprintw(win,fmt,arg...)   Like printf but writes to the
 *                           indicated window.
 * wrefresh(win)             See below.
 * box(win, vert, horiz)     Draws a box around the window.
 *
 * Cursor movement and screen control -
 *
 * werase(win)               erase the entire window
 * wclrtoeol(win)            erase from cursor position to the end
 *                           of line
 * wmove(win,y,x)            Move the cursor to position (y,x)
 *                           relative to the origin of the
 *                           indicated window.
 * getyx(win,y,x)            MACRO: puts the current cursor position
 *                           into y and x. Note that this is a macro,
 *                           don't put an & in front of y and x in
 *                           the invocation.
 * wgetch(win)               works like getchar but echoes to the
 *                           indicated window. If crmode is
 *                           inactive, it is activated for the
 *                           duration of this subroutine.
 * scroll(win)               Scrolls the window up one line.
 * wscroll( win, amt )       NOT A CURSES FUNCTION. Scrolls
 *                           window by indicated amount. A
 *                           negative amt scrolls down.
 *
 * Initialization stuff -
 *
 * initscr()                 Initialize
 * endwin()                  Clean up
 * scrollok(win, flag);      enable/disable scrolling for window.
 * newwin(lines,cols,begin_y,begin_x)   Create a new window.
 *
 * Terminal control -
 *
 * Because of the perversities of DOS, these work in a
 * slightly nonstandard way. In particular echo, noecho,
 * nl, and nonl only work if crmode() is active.
 * Moreover, they are ignored for the non-w functions.
 * For portability reasons, it's best to always set
 * crmode() at the top of your program.
 *
 * crmode();                 Turn off input buffering.
 * nocrmode();               Turn it back on again. (default)
 * noecho();                 Turn off automatic echo.
 * echo();                   Turn it back on again. (default)
 * nl();                     Turn on CR-NL mapping (default)
 * nonl();                   turn it off again.
 *
 * Functions that affect the whole screen.
 *
 * move(y,x)                 move cursor to abs. position (y,x)
 * addch(c)                  Write a character.
 * clear()                   Clear the screen
 * printw()                  works like printf
 * getch()                   get a character from the keyboard.
 * refresh()                 See below.
 *


*  The  real  curses  keeps  an  two  internal  representations  of 

*  the  screen,  when  you  change  something  it  just  modifies 

*  one  of  these  reprsentations .  You  must  issue  a  refresh () 

*  or  wrefresh()  call  to  actually  modify  the  screen.  My 

*  version  of  curses  refreshes  the  screen  immediately  after 

*  every  write,  refresh ()  and  wrefresh()  macros  have  been 

*  provided  for  UNIX  compatability,  however.  These  macros 

*  don't  do  anything,  but  you  should  scatter  them  liberally 

*  about  your  code  if  you  want  it  to  be  portable. 

*  I've  corrected  one  bug  in  the  real  curses  that  might 

*  cause  problems  when  you  port  your  code.  The  real  curses 

*  (at  least  the  one  at  Berkeley),  doesn't  scroll  properly 

*  in  that  it  leaves  junk  on  the  bottom  line  of  the  window 

*  after  a  scroll.  I've  corrected  the  problem  here  but, 

*  again,  if  you  want  real  portability  you  should  do  a 

*  wclrtoeol (win)  after  every  scroll.  Unfortunately,  there's 

*  no  way  to  determine  that  the  screen  has  scrolled  without 

*  actually  keeping  track  of  the  characters  that  are  written 

*  to  the  screen.  Ugh. 

*  Other  differences:  curses  doesn't  know  about  characters 

*  that  it  hasn't  actually  put  on  the  screen  with  an  addch () . 

*  So,  if  echo  is  enabled,  curses  won't  erase  the  echoed 

*  characters  when  it  scrolls  the  screen  and  so  forth.  The 

*  curses  presented  here  doesn't  exibit  this  behaviour,  but 

*  if  you  want  compatablility  with  Unix,  turn  off  echo  (with 

*  a  noecho ()  call)  and  then  echo  all  characters  yourself. 

*  The  purpose  of  the  four  ♦defines  immediately  below  is  to 

*  make  it  easy  to  modify  this  package.  They  all  map  function 

*  calls  to  video-bios  subroutines  [vb_xxx()].  If  you  want  to 

*  use  your  own  functions  (that  do  direct  video  access  or  send 

*  out  escape  sequences  to  a  normal  terminal,  for  example) 

*  just  change  the  ♦defines  and  recompile.  The  functions  must 

*  behave  as  follows: 

* 

*  cmove(y, x)  Move  the  cursor  to  position  (y, x)  (y  is  row) 

where  (0,0)  is  the  upper-left  corner  of  the 

*  screen. 

*  curpos(y,x)  Put  the  (y,x)  cursor  position  into  the 

integers  pointed  to  by  y  and  x. 

*  replace (c)  Print  c  at  the  current  cursor  position  without 

moving  the  cursor.  This  routine  must  be  able 
to  put  a  character  into  the  80th  column 
without  scrolling  the  screen  or  wrapping 

*  around. 


doscroll (1, r, t, b, a)  Scroll  a  region  of  the  screen  with  the 
top  left  corner  at  (t,  1)  and  the  bottom 
right  corner  at  (b,r) .  "a"  is  the  number  of 
lines  to  scroll  (up  if  a  is  positive,  down  if 
it's  negative). 




Dr.  Dobb's  Journal,  July  1987 



 *  inchar()      Returns the character at the current cursor
 *                position.
 */

#define cmove(y,x)              vb_ctoyx(y,x);
#define curpos(y,x)             vb_getyx(y,x);
#define replace(c)              vb_replace(c);
#define doscroll(l,r,t,b,a)     vb_scroll(l,r,t,b,a);
#define inchar()                vb_inchar(NULL)

#define min(a,b)                ((a) < (b) ? (a) : (b))

static int  Echo   = 1;     /* Echo enabled                         */
static int  Crmode = 0;     /* If 1, input is unbuffered (crmode)   */
static int  Nl     = 1;     /* If 1, map \r to \n on input          */
                            /* and map both to \n\r on output       */

noecho  ( )       { Echo = 0; }
echo    ( )       { Echo = 1; }
nl      ( )       { Nl   = 1; }
nonl    ( )       { Nl   = 0; }
getpos  (yp,xp)   { return curpos(yp, xp); }
move    (y, x )   { cmove(y, x); }
inch    ( )       { return inchar(); }

crmode ()
{
    FILE    *console;

    setvbuf(stdin, NULL, _IONBF, 0);    /* Turn off buffering */
    Crmode = 1;
}

nocrmode ()
{
    freopen( "/dev/con", "r", stdin );
    Crmode = 0;
}


/*------------------------------------------------------------*/

WINDOW  *newwin( lines, cols, begin_y, begin_x )
int     cols;       /* Horizontal size (including border)  */
int     lines;      /* Vertical size (including border)    */
int     begin_y;    /* Y coordinate of upper-left corner   */
int     begin_x;    /* X coordinate of upper-left corner   */
{
    WINDOW  *win, *malloc();

    if( !(win = malloc( sizeof(WINDOW) )) )
        ferr("Out of memory\n");

    win->x_org     = begin_x;
    win->y_org     = begin_y;
    win->x_size    = cols;
    win->y_size    = lines;
    win->row       = 0;
    win->col       = 0;
    win->scroll_ok = 0;

    werase(win);
    return win;
}


int     waddch( win, c )
WINDOW  *win;
int     c;
{
    /*  Print a character. The following are handled
     *  specially:
     *
     *  \n  Clear the line from the current cursor position
     *      to the right edge of the window. Then:
     *          if nl() is active:
     *              go to the left edge of the next line
     *          else
     *              go to the current column on the next line
     *      In addition, if scrolling is enabled, the window
     *      scrolls if you're on the bottom line.
     *
     *  \t  is expanded into an 8-space field. If the tab
     *      goes past the right edge of the window, the
     *      cursor wraps to the next line.
     *
     *  \r  gets you to the beginning of the current line.
     *
     *  \b  backs up one space but may not back up past
     *      the left edge of the window. Nondestructive. The
     *      curses documentation doesn't say that \b is
     *      handled explicitly but it does indeed work.
     *
     *  The following is not supported by Unix. Don't use
     *  explicit escape sequences if portability is a
     *  consideration:
     *
     *  ESC This is not standard curses but is useful. All
     *      characters between an ASCII ESC and an alphabetic
     *      character are sent to the output but are otherwise
     *      ignored. This lets you send escape sequences
     *      directly to the terminal if you like. I'm
     *      assuming here that you won't change windows in the
     *      middle of an escape sequence.
     *
     *  Return ERR if the character would have caused the
     *  window to scroll illegally.
     */

    static int      saw_esc = 0;
    static WINDOW   *oldwin = NULL;
    int             rval    = OK;

    if( oldwin != win )
    {
        cmove(win->y_org + win->row, win->x_org + win->col);
        oldwin = win;
    }

    if( saw_esc )
    {
        if( isalpha( c ) )
            saw_esc = 0;

        putchar(c);
    }
    else
    {
        switch( c )
        {
        case '\033':
            saw_esc = 1;
            putchar('\033');
            break;

        case '\b':
            if( win->col > 0 )
            {
                putchar('\b');
                --( win->col );
            }
            break;

        case '\t':
            do
            {
                waddch( win, ' ' );
            }
            while( win->col % 8 );
            break;

        case '\r':
            win->col = 0;
            cmove(win->y_org + win->row, win->x_org);
            break;

        default:
            putchar(c);
            if( ++(win->col) < win->x_size )
                break;

            /* fall through to newline */

        case '\n':
            wclrtoeol( win );

            if( Nl )
                win->col = 0;

            if( ++(win->row) >= win->y_size )
            {
                rval = wscroll( win, 1 );
                --( win->row );
            }

            cmove( win->y_org + win->row,
                   win->x_org + win->col);
            break;
        }
    }

    return rval;
}


waddstr( win, str )
WINDOW  *win;
char    *str;
{
    while( *str )
        waddch( win, *str++ );
}

static int  Errcode = OK;

static wputc(c, win) WINDOW *win;
{
    Errcode |= waddch(win, c);
}

wprintw(win, fmt)
WINDOW  *win;
char    *fmt;
{
    /*  The doprnt() function used here is explained in depth
     *  in Allen Holub, The C Companion (Prentice-Hall: 1987),
     *  pp. 213-237. If you don't have this book, an alternate
     *  approach is explained in the text.
     */

    va_list args;
    va_start( args, fmt );

    Errcode = OK;
    doprnt( wputc, win, fmt, args );
    return Errcode;
}


box( win, vert, horiz )
WINDOW  *win;
{
    /*  Draws a box in the outermost characters of the window
     *  using vert for the vertical characters and horiz for
     *  the horizontal ones. I've extended this function to
     *  support the IBM box-drawing characters. That is,
     *  if IBM box-drawing characters are specified for vert
     *  and horiz, box() will use the correct box-drawing
     *  characters in the corners. These are defined in
     *  box.h as:
     *
     *      HORIZ   (0xc4)  single horizontal line
     *      D_HORIZ (0xcd)  double horizontal line.
     *      VERT    (0xb3)  single vertical line
     *      D_VERT  (0xba)  double vertical line.
     */

    int     i, nrows;
    int     ul, ur, ll, lr;

    if( !( (horiz == HORIZ || horiz == D_HORIZ) &&
           (vert  == VERT  || vert  == D_VERT ) ) )
    {
        ul = ur = ll = lr = vert ;
    }
    else if( vert == VERT )
    {
        if( horiz == HORIZ )
            ul=UL, ur=UR, ll=LL, lr=LR;
        else
            ul=HD_UL, ur=HD_UR, ll=HD_LL, lr=HD_LR;
    }
    else
    {
        if( horiz == HORIZ )
            ul=VD_UL, ur=VD_UR, ll=VD_LL, lr=VD_LR;
        else
            ul=D_UL, ur=D_UR, ll=D_LL, lr=D_LR;
    }

    cmove( win->y_org, win->x_org );        /* Top line */
    putchar( ul );

    for( i = win->x_size-2; --i >= 0 ; )
        putchar( horiz );

    putchar( ur );

    nrows = win->y_size - 2 ;               /* Two sides */
    i     = win->y_org + 1 ;

    while( --nrows >= 0 )
    {
        cmove( i, win->x_org );
        putchar( vert );
        cmove( i++, win->x_org + (win->x_size - 1) );
        putchar( vert );
    }

    cmove(i, win->x_org);                   /* Bottom line */
    putchar( ll );

    for( i = win->x_size-2; --i >= 0 ; )
        putchar( horiz );

    putchar( lr );
}

/*------------------------------------------------------------*/

werase( win )
WINDOW  *win;
{
    vb_scroll( win->x_org, win->x_org + (win->x_size - 1),
               win->y_org, win->y_org + (win->y_size - 1),
               win->y_size );

    cmove( win->y_org, win->x_org );
    win->row = 0;
    win->col = 0;
}


/*------------------------------------------------------------
 *  Scroll the window if scrolling is enabled. Return 1 if we
 *  scrolled. (I'm not sure if the Unix function returns 1
 *  on a scroll but it's convenient to do it here. Don't
 *  assume anything about the return value if you're porting
 *  to Unix.) Wscroll() is not a curses function. It lets you
 *  specify a scroll amount and direction (scroll down by -amt
 *  if amt is negative); scroll() is a macro that evaluates to
 *  a wscroll call with an amt of 1. Note that the Unix curses
 *  gets very confused when you scroll explicitly (using
 *  scroll()). In particular, it doesn't clear the bottom line
 *  after a scroll but it thinks that it has. Therefore, when
 *  you try to clear the bottom line, it thinks that there's
 *  nothing there to clear and ignores your wclrtoeol()
 *  commands. Same thing happens when you try to print spaces
 *  to the bottom line; it thinks that spaces are already there
 *  and does nothing. You have to fill the bottom line with
 *  non-space characters of some sort, and then erase it.
 */

wscroll(win, amt)
WINDOW  *win;
{
    if( win->scroll_ok )
        doscroll( win->x_org, win->x_org + (win->x_size-1),
                  win->y_org, win->y_org + (win->y_size-1),
                  amt );

    return win->scroll_ok ;
}

/*------------------------------------------------------------*/

wmove( win, y, x )
WINDOW  *win;
{
    /*  Seek into the window. It's not permitted to seek
     *  outside of the window area.
     */

    cmove( win->y_org + (win->row = min(y, win->y_size-1)),
           win->x_org + (win->col = min(x, win->x_size-1)) );
}


/*------------------------------------------------------------*/

static getcon( win )
WINDOW  *win;
{
    /*  Get a character from DOS without echoing. We need
     *  to do this in order to support (echo/noecho). We'll
     *  also do noncrmode input buffering here. Maximum input
     *  line length is 132 columns.
     *
     *  In nocrmode(), DOS is used to get a line and all the
     *  normal command-line editing functions are available.
     *  Note that since there's no way to turn off echo in
     *  this case, characters will be echoed to the screen
     *  regardless of the status of echo().
     *  In order to retain control of the window, input
     *  fetched for wgetch() is always done in crmode, even
     *  if Crmode isn't set.
     *  If nl() mode is enabled, carriage return (Enter, ^M)
     *  and linefeed (^J) are both mapped to '\n', otherwise
     *  they are not mapped.
     */

    static unsigned char    buf[ 133 ] = { 133, 0 };
    static unsigned char    *p         = buf;
    static int              numchars   = 0;
    register int            c;

    if( Crmode || win )
    {
        if( (c = bdos(8,0,0) & 0xff) == ('Z'-'@') )
            return EOF ;

        if( c == '\r' && Nl )
            c = '\n' ;

        if( Echo )
        {
            if( win ) waddch( win, c );
            else      addch ( c );
        }

        return c;
    }
    else if( numchars )
    {
        --numchars;
        return *p++ ;
    }
    else
    {
        bdos(10, buf, 0);
        numchars = buf[1];
        p        = &buf[2];
    }
}

wgetch  (win) WINDOW *win;  { return getcon( win  ); }
getch   ( )                 { return getcon( NULL ); }

/*------------------------------------------------------------*/

clear ()
{
    doscroll( 0, 79, 0, 24, 25 );
    move( 0, 0 );
}

/*------------------------------------------------------------*/

wclrtoeol( win )
WINDOW  *win;
{
    /*  Clear from cursor to end of line; the cursor isn't
     *  moved. The main reason that this is included here is
     *  because you have to call it after printing every
     *  newline in order to compensate for a bug in the real
     *  curses. This bug has been corrected in the curses
     *  presented here, however, so you don't have to use
     *  this routine if you're not interested in portability.
     *  Note that you must use a replace function on the
     *  rightmost character to prevent scrolling.
     */

    register int    i;

    for( i = win->x_size - win->col - 1; --i >= 0 ;)
        putchar(' ');

    replace(' ');
    cmove( win->y_org + win->row, win->x_org + win->col );
}

/*------------------------------------------------------------*/
#ifdef MAIN

WINDOW  *boxwin( lines, cols, y_start, x_start )
{
    /*  This routine works just like newwin() except that
     *  the window has a box around it that won't be destroyed
     *  by writes to the window. It accomplishes this feat by
     *  creating two windows, one inside the other, with a box
     *  drawn around the outer one.
     */

    WINDOW  *outer, *inner;

    outer = newwin(lines, cols, y_start, x_start);

#ifdef MSDOS
    box( outer, VERT, HORIZ );
#else
    box( outer, '|', '-' );
#endif

    wrefresh( outer );
    return newwin( lines-2, cols-2, y_start+1, x_start+1 );
}

pattern()
{
    clear();
    printf("01234567890123456789012345678901234567890  0\n");
    printf("01234567890123456789012345678901234567890  1\n");
    printf("01234567890123456789012345678901234567890  2\n");
    printf("01234567890123456789012345678901234567890  3\n");
    printf("01234567890123456789012345678901234567890  4\n");
    printf("01234567890123456789012345678901234567890  5\n");
    printf("01234567890123456789012345678901234567890  6\n");
    printf("01234567890123456789012345678901234567890  7\n");
    printf("01234567890123456789012345678901234567890  8\n");
    printf("01234567890123456789012345678901234567890  9\n");
    printf("01234567890123456789012345678901234567890  10\n");
    printf("01234567890123456789012345678901234567890  11\n");
    printf("01234567890123456789012345678901234567890  12\n");
    printf("01234567890123456789012345678901234567890  13\n");
    printf("01234567890123456789012345678901234567890  14\n");
    printf("01234567890123456789012345678901234567890  15\n");
    printf("01234567890123456789012345678901234567890  16\n");
    printf("01234567890123456789012345678901234567890  17\n");
    printf("01234567890123456789012345678901234567890  18\n");
    printf("01234567890123456789012345678901234567890  19\n");
    printf("01234567890123456789012345678901234567890  20\n");
    printf("01234567890123456789012345678901234567890  21\n");
    printf("01234567890123456789012345678901234567890  22\n");
    printf("          1         2         3         4\n");
}

main()
{
    /* All coordinates are (y, x) */

    WINDOW  *win1, *win2;
    char    str[128];
    int     c;

    initscr();      /* Initialize curses                    */
    noecho();       /* Echo off (it screws up the screen)   */
    crmode();       /* Put terminal into CBREAK mode        */

    pattern();

    win1 = boxwin(10, 20, 0,  0  );
    win2 = boxwin(10, 20, 21, 11 );

    scrollok( win1, TRUE );
    wprintw(win1, "This is window one, doo wha\n" );
    wrefresh( win1 );
    wprintw(win2, "This is window 2.\nPress a key\n" );
    wrefresh( win2 );

    c = wgetch(win2);
    wmove(win1, 5, 0 );

    wprintw(win1, "Got %c, 0x%x\n", c, c );
    wrefresh(win1);

    while( (c = wgetch(win1) & 0x7f) != 'q' )
    {
        if( c == 'X'-'@' )
            wclrtoeol( win1 );
        else
            waddch( win1, c );

        wrefresh( win1 );
    }

    move(23,0);
    refresh();
    endwin();
}
#endif

End  Listings 



COLUMNS 


C  CHEST 


Curses:  Unix-Compatible  Windowing  Output  Functions 


As I've mentioned before, I write a lot of code that has to work in
both the MS-DOS and Unix environments. Because I use a very
Unix-compatible compiler (Microsoft 4.0), porting a program is usually
pretty easy. Nonetheless, incompatibilities occasionally arise, usually
when I'm doing some sort of low-level I/O. The Microsoft compiler
doesn't have Unix-compatible fcntl() or ioctl() functions, it doesn't
support the /dev/tty device for the console (you have to use /dev/con
or con:), and it doesn't provide any sort of termcap- or
curses-compatible function library. Of these, the most serious omission
is the lack of a curses library.

For the uninitiated, curses is a collection of terminal-independent,
low-level I/O functions. These subroutines let you do things such as
move the cursor around on the screen, create and delete windows, write
text and seek to specific window-relative cursor positions, and so
forth. The windows can be overlapping, and they support individual
wraparound, scrolling, and so on. The curses functions can talk to
virtually any terminal. They accomplish this feat by using the termcap
terminal database, which contains definitions for the various escape
sequences needed to get around on specific terminals. Moreover, they
talk to the terminals efficiently. That is, they always send out the
minimum number of characters necessary to modify the current screen.


by  Allen  Holub 

Curses functions keep two internal images of the screen: one of these
reflects what's actually on the screen; the other is a scratch space
that you modify using the various curses functions. When you tell the
curses functions to do a refresh, they compare the scratch buffer with
the actual screen image and then send out the minimum number of
characters necessary to get these images to match. This behavior is
especially important when you're running a program via a modem and
characters are coming at 1,200 baud. Redrawing the entire screen every
time you scroll a 4-line by 10-character wide window is just
unacceptable behavior. It takes too long. Curses functions solve the
problem by redrawing only those parts of the screen that have actually
changed. Curses functions are described in depth in Volume 2 of the
Unix Programmer's Manual (see the bibliography).
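The two-image scheme described above can be sketched in a few lines of C. This is only an illustration of the idea, not code from the real curses library or from the listings in this column: the name refresh_diff() and the 4-by-10 buffer size are assumptions made for the example, and where a real implementation would send cursor-motion sequences and characters to the terminal, the sketch simply copies the changed cell and counts it.

```c
#include <assert.h>
#include <string.h>

#define ROWS 4      /* assumed window height for the sketch */
#define COLS 10     /* assumed window width for the sketch  */

/* Hypothetical helper: bring `shown` (what is on the screen) up to
 * date with `scratch` (what the program has written) and return the
 * number of cells that had to be rewritten. Rows that already match
 * are skipped without looking at individual cells. */
int refresh_diff(char shown[ROWS][COLS], char scratch[ROWS][COLS])
{
    int y, x, sent = 0;

    assert(shown != NULL && scratch != NULL);

    for (y = 0; y < ROWS; ++y) {
        if (memcmp(shown[y], scratch[y], COLS) == 0)
            continue;                       /* row unchanged */

        for (x = 0; x < COLS; ++x) {
            if (shown[y][x] != scratch[y][x]) {
                shown[y][x] = scratch[y][x];    /* "send" one cell */
                ++sent;
            }
        }
    }
    return sent;
}
```

Writing two characters into the scratch image and calling refresh_diff() touches only those two cells; an immediate second call finds nothing to do.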

This month I'm going to look at a set of curses-compatible I/O
functions that run on the IBM PC. I've written several complex programs
using these functions, programs that maintain several windows on-screen
simultaneously, all of which are being updated at different rates.
Moreover, the finished programs have ported to Unix with literally no
modification. I have not implemented the entire curses library,
however. My version of the curses package doesn't let you delete
windows, nor does it support overlapping windows. Finally, it is bolted
into the IBM PC. I've made no attempt to do the write optimizations
discussed earlier, and I use several of the video BIOS functions to do
things such as scroll the screen. If your terminal supports regional
scrolling, however, it shouldn't be too difficult to modify the
lowest-level scrolling function to send the proper escape sequences.
Nevertheless, the package does give you quite a bit of DOS-to-Unix
compatibility and has proved adequate for my own needs.

Interfacing  to  the  IBM  PC 

All the curses functions talk to the screen via a well-defined video
BIOS interface. I chose this approach for two reasons: first, the BIOS
is somewhat faster than the normal DOS interface and is dramatically
faster than the ANSI.SYS driver; and second, by concentrating the
lowest-level routines in one place, I've (hopefully) made it easy to
adapt these functions to terminals. The one problem that's likely to
arise when porting this code to a terminal is with scrolling. I use the
BIOS scroll function to scroll individual regions of the screen. You
pass this function the coordinates of two diagonal corners of a
rectangular region on the screen and a scroll amount. The BIOS then
scrolls only that region by the indicated amount. If your terminal
doesn't have an escape sequence that does this, you'll have to do what
the real curses package actually does: keep an internal image of the
screen and refresh selected portions of it as part of the scroll. You
could also do a series of character-read, move-cursor, character-write
escape sequences, but that approach would be pretty slow.
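For a terminal port that keeps its own screen image, the regional scroll might look something like the following. This is a sketch under stated assumptions rather than part of the package: region_scroll() is a hypothetical name, the screen is assumed to be a 25-by-80 array of characters, and the argument order (left, right, top, bottom, amount) simply mirrors the doscroll() macro in Listing Four. As with the way werase() uses the BIOS call, an amount at least as large as the region's height blanks the whole region.

```c
#include <assert.h>
#include <string.h>

#define SCR_ROWS 25     /* assumed screen height */
#define SCR_COLS 80     /* assumed screen width  */

/* Hypothetical stand-in for the BIOS scroll: scroll the region whose
 * corners are (top,left) and (bot,right) by amt lines. A positive amt
 * scrolls up, a negative amt scrolls down; vacated lines are blanked. */
void region_scroll(char scr[SCR_ROWS][SCR_COLS],
                   int left, int right, int top, int bot, int amt)
{
    int width  = right - left + 1;
    int height = bot - top + 1;
    int y;

    assert(0 <= top  && top  <= bot   && bot   < SCR_ROWS);
    assert(0 <= left && left <= right && right < SCR_COLS);

    if (amt >= height || -amt >= height) {  /* everything scrolls away */
        for (y = top; y <= bot; ++y)
            memset(&scr[y][left], ' ', width);
        return;
    }

    if (amt > 0) {                          /* scroll up */
        for (y = top; y <= bot - amt; ++y)
            memcpy(&scr[y][left], &scr[y + amt][left], width);
        for (; y <= bot; ++y)
            memset(&scr[y][left], ' ', width);
    } else if (amt < 0) {                   /* scroll down */
        for (y = bot; y >= top - amt; --y)
            memcpy(&scr[y][left], &scr[y + amt][left], width);
        for (; y >= top; --y)
            memset(&scr[y][left], ' ', width);
    }
}
```

After a call like this, only the rows inside the region would need to be retransmitted to the terminal; everything outside the rectangle is untouched.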

The BIOS routines are all in vbios.c (Listing One, page 74). These
subroutines are a reworking of the routines used by my shell (described
in On Command, see the bibliography). They're sufficiently different
from the original subroutines that I've reprinted them here. The actual
mechanics of using the BIOS are discussed extensively by both Ray
Duncan and Peter Norton (see the bibliography), so I won't go into the
mechanics here. The supported functions are shown in Table 1, page 95.

The only other IBM-related file is box.h (Listing Two, page 80), which
holds #defines for the various IBM PC box-drawing graphics characters.

Curses 

The curses package itself is part macros and part subroutines. The file
curses.h (Listing Three, page 80) should be #included at the top of
every file that uses the curses functions.

Supported functions are listed below, grouped functionally.

Initializing 

Initialization  functions  are: 

initscr( ) 
endwin( ) 

Initscr() initializes the curses package. It should be called at the
head of your main() subroutine, before any other curses functions are
called. Endwin() cleans up. It should always be called before your
program exits.

In Unix programs, the terminal can be left in an unknown state if you
abort your program with a Break. If you exit abnormally from a program
that uses curses, only to find your terminal acting funny (not echoing,
not handling tabs or new lines properly, and so forth), you can usually
correct the problem by typing tset with no arguments. If that doesn't
work, try <NL>reset<NL> where <NL> is a new line or Ctrl-J. If that
doesn't work, try stty cooked echo nl, and if that doesn't work, hang
up and log on again. To avoid this sort of flailing around, it's much
better for your program to trap the SIGINT signal and to call endwin()
from within the service subroutine. Use the following:

#include <signal.h>

onintr()
{
    endwin();
    exit(1);
}

main()
{
    signal( SIGINT, onintr );
}

None  of  the  foregoing  is  a  problem  if 
you’re  not  interested  in  porting  your 
code  to  Unix,  however. 


Responding  to  Typed  Characters 

Once the curses package is initialized, you should determine how your
terminal is going to respond to typed characters. Six subroutines are
supported.

Two subroutines control input buffering:

int crmode();
int nocrmode();

Crmode() disables buffering. Characters will be available as soon as
they're typed. A nocrmode() call cancels a previous crmode(). Here, an
entire line is read before the first character is returned. The DOS (or
Unix) command-line editing functions are all available if nocrmode() is
active. Most curses programs use crmode().

The following two subroutines control character echo:

int echo();
int noecho();

If echo() is called, characters are echoed as they're typed; noecho() suppresses the echoing, so you'll have to do the echo yourself. The real curses package gets very confused when echo() is enabled. The problem here is that the curses package doesn't know about any character that it has not written to the screen itself. Because characters are echoed by the operating system (not by curses), the package doesn't know they're there. As a consequence, when the curses package does a screen refresh, it won't delete the characters that it doesn't know about, and the screen rapidly fills with unwanted and unerasable characters. Always call noecho() at the top of your program and echo characters yourself. Another echo-related problem is caused by DOS. In order to get buffered input, you have to use a DOS function that always echoes. So, if nocrmode() is active, the echo status is ignored.

int vb_getpage()
    Get active video page #.

void vb_putchar(c)
int c;
    Write a single character to the screen at the current cursor position. Only printing characters, backspace, new-line, and bell are supported.

void vb_getchar(c)
int d;
    Get a typed character directly from the BIOS. You cannot redirect input that is fetched using vb_getchar().

void vb_puts(s, move)
char *s;
int move;
    Write a string out to the screen. Move the cursor only if move is true.

void vb_replace(c)
int c;
    Write a character to the screen without moving the cursor.

int vb_inchar(attrib)
int *attrib;
    Return the character at the current cursor position. Modify *attrib to hold the attribute byte associated with that character.

int vb_getcur()
    Return an integer representing the current cursor position. The top byte holds the row; the bottom holds the column.

void vb_setcur(posn)
int posn;
    Send the cursor to a position fetched with a previous vb_getcur() call.

void vb_ctoyx(y, x)
int y, x;
    Set cursor position to (y,x) (row y, column x). The upper-left corner of the screen is (0,0).

void vb_getyx(y, x)
int *y, *x;
    Put the current cursor position into *y (the row) and *x (the column).

int vb_iscolor()
    Return true if the color monitor (as compared to the monochrome monitor) is installed.

void vb_cursize(top, bot)
int top, bot;
    Set cursor size to extend from the top scan line to the bottom scan line of an individual character.

void vb_blockcur()
    Make the cursor a block cursor.

void vb_normalcur()
    Make the cursor a normal (underline) cursor.

void vb_scroll(left, right, top, bot, amt)
    Scroll a region of the screen. Left is the column number of the leftmost column in the region, right of the rightmost; top is the top line, bot is the bottom, and amt is the number of lines to scroll. If amt is positive, the region scrolls up; otherwise it scrolls down.

Table 1: Functions in vbios.c

Dr. Dobb's Journal, July 1987
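Table 1 describes vb_getcur() as returning the row in the top byte and the column in the bottom byte. A minimal sketch of that packing follows, assuming a 16-bit value; pack_cursor(), cursor_row(), and cursor_col() are hypothetical helper names used only for illustration and are not part of vbios.c.

```c
/* Hypothetical helpers (not part of vbios.c): pack and unpack a
 * cursor position in the layout Table 1 gives for vb_getcur(),
 * assuming a 16-bit value with the row in the top byte and the
 * column in the bottom byte. */
int pack_cursor(int row, int col)
{
    return ((row & 0xff) << 8) | (col & 0xff);
}

int cursor_row(int posn)
{
    return (posn >> 8) & 0xff;
}

int cursor_col(int posn)
{
    return posn & 0xff;
}
```

A value packed this way can be saved and later handed back unchanged, which is how the vb_getcur() and vb_setcur() pair is described as being used.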

The final two configuration subroutines are:

int nl();
int nonl();

When nl() is active, a newline ('\n') is converted to a carriage-return, line-feed sequence on output, and a carriage return ('\r') is mapped to a newline on input; otherwise, no mapping is done. It's usually convenient to set nl() at the top of your program.
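The output half of the nl() mapping can be sketched in a few lines. This is only an illustration of the conversion described above, not code from the package; map_newlines() and its parameters are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: copy `in` to `out`, expanding '\n' to "\r\n"
 * when nl_active is nonzero (the behavior described for nl()).
 * Returns the number of characters written, not counting the
 * terminating '\0'. Output is truncated if it would not fit. */
size_t map_newlines(const char *in, char *out, size_t outsize, int nl_active)
{
    size_t n = 0;

    if (outsize == 0)
        return 0;

    for (; *in != '\0'; ++in) {
        size_t need = (nl_active && *in == '\n') ? 2 : 1;

        if (n + need >= outsize)    /* keep room for the '\0' */
            break;
        if (need == 2)
            out[n++] = '\r';        /* carriage return goes first */
        out[n++] = *in;
    }
    out[n] = '\0';
    return n;
}
```

With nl_active set to zero the string is copied unchanged, which corresponds to the nonl() case.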

Initializing  Windows 

Five functions are supported for initializing windows. You don't have to use any of them if your screen is one big window that occupies the whole screen. The first function is:

WINDOW *newwin(lines, cols, begin_y, begin_x)
int cols;
int lines;
int begin_y;
int begin_x;

which creates a new window lines rows high and cols columns wide with the upper-left corner at (begin_y, begin_x). [All coordinates here are (y,x), where y is the row number and x is the column number. The upper-left corner of the screen is (0,0).] The window is both created and cleared. A pointer to a WINDOW structure, declared in curses.h, is returned in a manner analogous to fopen(). You must save this pointer to pass to other curses functions.

A variant on the newwin() subroutine is:

WINDOW *subwin(win, lines, cols, begin_y, begin_x)

Here, win is a pointer to a window created with a previous newwin() or subwin() command. My implementation of curses treats the subwin() command just like it does newwin(). The real curses package, however, creates a subwindow. When a parent window is refreshed by curses, all subwindows are refreshed too. By the same token, if you read characters from a parent window, you'll be able to get characters from the subwindow as well. Similarly, when you delete a parent window, all the subwindows are deleted too.

The curses package supports a special stdscr window that represents the entire screen. This superwindow is created for you automatically by initscr(). It's convenient to declare all other windows as subwindows to stdscr so that you can use the global functions discussed later. Note, however, that you may not pass stdscr as a WINDOW pointer to any of the other subroutines that take WINDOW pointers as arguments. The real curses package lets you do this, but mine doesn't support the practice. In fact, because no error checking is done in this situation, passing stdscr to a function results in a "Null pointer assignment" error message when your program terminates.

Three other subroutines affect an entire window. The macro:

scrollok(win, flag)
WINDOW *win;
int flag;

is passed a WINDOW pointer and a flag. If the flag is true, the indicated window is allowed to scroll; otherwise, the window does not scroll and characters that go off the bottom of the window are discarded. Note that line wrap (when you go off the right side of the window, you end up on the left edge of the next line) is always enabled. Scrolling is always enabled on the stdscr window.

The macros:

refresh()
wrefresh(win)
WINDOW *win;

are used by the real curses to signal a screen refresh. They force the screen to coincide with the internal representations of the screen. No characters are actually written out to the terminal until a refresh occurs. My own curses package writes to the screen immediately, so both of these macros expand to null strings; in other words, they are ignored. You'll need them to be able to port code to Unix, however. Refresh() refreshes the whole screen (the stdscr window and all subwindows of stdscr); wrefresh(win) is passed a WINDOW pointer and refreshes only the indicated window. Note that the refresh() command only works under the real curses if all windows are subwindows of stdscr.

The function:

int box(win, vert, horiz)
WINDOW *win;
int vert, horiz;

draws a box in the outermost characters of the window using vert for the vertical characters and horiz for the horizontal ones. I've extended this function to support the IBM box-drawing characters. That is, if IBM box-drawing characters are specified for vert and horiz, box() uses the correct box-drawing characters for the corners. The box-drawing characters are defined in box.h as:

HORIZ    (0xc4)  single horizontal line
D_HORIZ  (0xcd)  double horizontal line
VERT     (0xb3)  single vertical line
D_VERT   (0xba)  double vertical line

Boxes can have double horizontal lines and single vertical ones, or vice versa. The Unix box() function uses the vertical character for the corners.
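One way to pick matching corners is a small lookup table indexed by whether each line style is single or double. The sketch below is not the code from the listings; box_corner() and the corner enumeration are hypothetical, and the corner values are the IBM PC (code page 437) box-drawing characters that pair with the four line characters quoted from box.h.

```c
/* Line characters as quoted from box.h in the text. */
#define HORIZ   0xc4
#define D_HORIZ 0xcd
#define VERT    0xb3
#define D_VERT  0xba

enum corner { UPPER_LEFT, UPPER_RIGHT, LOWER_LEFT, LOWER_RIGHT };

/* Hypothetical helper: return the corner character that matches the
 * given vertical and horizontal line characters. If either argument
 * is not an IBM line character, fall back to using vert for the
 * corner, which is what the text says the Unix box() does. */
int box_corner(int vert, int horiz, enum corner which)
{
    /* Indexed [double vertical?][double horizontal?][corner]. */
    static const unsigned char corners[2][2][4] = {
        { { 0xda, 0xbf, 0xc0, 0xd9 },     /* single vert, single horiz */
          { 0xd5, 0xb8, 0xd4, 0xbe } },   /* single vert, double horiz */
        { { 0xd6, 0xb7, 0xd3, 0xbd },     /* double vert, single horiz */
          { 0xc9, 0xbb, 0xc8, 0xbc } }    /* double vert, double horiz */
    };
    int ibm_vert  = (vert == VERT || vert == D_VERT);
    int ibm_horiz = (horiz == HORIZ || horiz == D_HORIZ);

    if (!ibm_vert || !ibm_horiz)
        return vert;

    return corners[vert == D_VERT][horiz == D_HORIZ][which];
}
```

A table keeps the four style combinations in one place, so mixed boxes (double horizontal with single vertical, or the reverse) need no special-case code.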

Note that box() doesn't draw a box around the window, as the Unix documentation would have you believe; rather, it draws the box in the outermost characters of the window itself. This means you can overwrite the border if your output lines are too wide. When you scroll the window, the box scrolls too. A function that creates a bordered window in which the border is not part of the window itself is shown on lines 592-613 of Listing Four, page 81. Here, two windows are created, one nested inside the other. The outer window just holds the box, and the inner window is used normally (for characters). This way, you can overflow the inner window and not affect the outer one (that holds the border).

Moving  the  Cursor 

Three functions are supported for cursor movement:

int move(y, x)
int wmove(win, y, x)
getyx(win, y, x)
WINDOW *win;
int y, x;

Move() moves the cursor to the indicated absolute position on the screen. The upper-left corner of the screen is (0,0). Wmove() moves the cursor to the relative position within a specific window (pointed to by win). The upper-left corner of the window is (0,0). If you try to move past the edge of the window, the cursor will be positioned on the edge. The getyx() macro loads the current cursor position for a specific window into y and x. Note that this is a macro, not a subroutine, so you should not precede y or x with an address-of operator (&). This one command (only) accepts stdscr as a win argument. A getyx(stdscr,y,x) call loads the current absolute cursor position into y and x.
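The reason getyx() takes plain variables rather than addresses is easiest to see from a macro of the same shape. The sketch below is an illustration only: struct demo_win and demo_getyx() are hypothetical stand-ins, and the real WINDOW structure in curses.h has different members.

```c
/* Hypothetical stand-in for the cursor fields of a window. */
struct demo_win {
    int row;
    int col;
};

/* Because this is a macro, y and x are substituted by name and
 * assigned directly; the caller passes the variables themselves,
 * not their addresses. */
#define demo_getyx(win, y, x)  ((y) = (win)->row, (x) = (win)->col)
```

A call such as demo_getyx(&w, y, x) expands to two assignments into y and x, so writing &y would try to assign to an address and would not compile.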

Keyboard  Input 

Two keyboard-input functions are supported:

int getch()
int wgetch(win)
WINDOW *win;

Getch() just gets a character from the keyboard, and wgetch() echoes the character to the indicated window (if echo() is enabled, that is). Note that crmode() has to be enabled to get the character as soon as it's typed; otherwise, the entire line will be buffered. It's unfortunate that many compiler manufacturers (Microsoft included) have chosen to use getch() as the name of their standard direct keyboard-input function, but so it goes.

Reading  Back  from  the  Screen 

Three functions are supported for reading back characters that are already on the screen:

inch()
mvinch(y,x)
mvwinch(win,y,x)
WINDOW *win;
int y,x;

Inch() returns the character at the current cursor position. Mvinch(y,x) moves the cursor to the indicated position and then returns the character at that position; mvwinch(win,y,x) does the same but the cursor position is relative to the specified window. Some older versions of curses don't support the mv versions of this command.

Formatting  Output 

Two formatted output functions are supported:

printw(fmt, args ...)
int wprintw(win, fmt, args ...)
WINDOW *win;
char *fmt;

Printw() works just like printf() does; wprintw() is the same but it prints to the indicated window, moving the cursor to the correct position in the new window if necessary. (That is, it's moved to the position immediately following the character most recently written to the indicated window.) Printw() ignores window boundaries, but wprintw() wraps when you get to the right edge of the window and the window scrolls when you go past the bottom line (provided that scrollok() has been called for the current window).

Single  Characters, 

Writing  Strings 

The single-character and string-write functions are:

addch(c)
int waddch(win, c)
int waddstr(win, str)
WINDOW *win;
int c;
char *str;

Addch() works like putchar() does; waddch() writes a character to the indicated window (and advances the cursor); and waddstr() works like fputs(), writing a string out to the indicated window. Waddstr() does not add a '\n' at the end of the string.
Waddch() treats several characters specially:

'\n': clear the line from the current cursor position to the right edge of the window. If nl() is active, you go to the left edge of the next line; otherwise, you go to the current column on the next line. In addition, if scrolling is enabled, the window scrolls if you're on the bottom line.

'\t': expand to an eight-space field. If the tab goes past the right edge of the window, the cursor wraps to the next line.

'\r': move to the left edge of the window, on the current line.

'\b': back up one space but not past the left edge of the window. Nondestructive. The curses documentation doesn't say that \b is handled explicitly, but it does indeed work.

The escape character is not handled specially by Unix, but my waddch() does do so. (Don't use explicit escape sequences if portability is a consideration.) In particular, all characters between an ASCII ESC and an alphabetic character (inclusive) are sent to the output but are otherwise ignored. This lets you send escape sequences directly to the terminal if you like. I'm assuming here that you won't change windows in the middle of an escape sequence.
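The escape-sequence rule amounts to a two-state filter: once an ESC is seen, every character up to and including the next alphabetic one is passed through untouched. The following is a sketch of that rule only, not the code in waddch(); escape_filter() and its state argument are hypothetical.

```c
#include <ctype.h>

#define ESC 0x1b

/* Hypothetical filter. *in_escape holds the state between calls and
 * must start at 0. Returns 1 if c belongs to an escape sequence (to
 * be sent to the output with no cursor bookkeeping) and 0 if c is an
 * ordinary character. */
int escape_filter(int *in_escape, int c)
{
    if (*in_escape) {
        if (isalpha(c))
            *in_escape = 0;    /* the letter ends the sequence, inclusive */
        return 1;
    }
    if (c == ESC) {
        *in_escape = 1;        /* start of a new sequence */
        return 1;
    }
    return 0;
}
```

Keeping the state in a caller-supplied variable makes the stated assumption explicit: if the output switched windows in the middle of a sequence, a single shared flag would be left in the wrong state.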

Erasing 

Five erase functions are available:

werase(win)
erase()
wclear(win)
clear()
wclrtoeol(win)
WINDOW *win;

Clear() and erase() both clear the entire screen, wclear() and werase() both clear only the indicated window, and wclrtoeol() clears the line from the current cursor position in the indicated window to the right edge of the indicated window.




Scrolling

Finally, two scrolling functions are supported:

scroll(win)
wscroll(win, amt)
WINDOW *win;
int amt;

Scroll() scrolls the indicated window up one line, and wscroll() scrolls by the indicated amount: up if amt is positive, down if it's negative. This last function is not supported by the Unix curses.

There's one caveat about scrolling. The Unix functions have a bug in them in that, when a window scrolls, the bottom line is not cleared, leaving a mess on the screen. This problem is not restricted to the scroll() subroutine but occurs any time that the window scrolls (as when you send a \n at the bottom line of the window or when a character wraps, causing a scroll). As a consequence, if you're porting to Unix, you should always do a wclrtoeol() immediately after either scrolling or printing a new line. Unfortunately, there's no easy way to tell if a window has scrolled because of a character wrap. My curses package doesn't have this problem; the bottom line of the window is always cleared on a scroll.

Implementation

For the most part, the code is straightforward and needs little comment. The WINDOW structure is declared on lines 7-17 of curses.h (Listing Three). The macros for bool, reg, TRUE, FALSE, ERR, and OK are defined in the Unix curses.h file, so I've put them here too. Be careful of:

if( foo() == TRUE )

TRUE is #defined as 1, but in fact any nonzero value is true. As a consequence, foo() could return a perfectly legitimate true value that didn't happen to be 1, and the test would fail. The test:

if( foo() != FALSE )

is safe, however. Most of the output functions return ERR if scrolling is disabled and the write would have caused a scroll.

The mvinch() and mvwinch() macros on lines 40 and 41 use the comma operator, often called the sequence operator. The comma operator evaluates from left to right, and the entire expression evaluates to the rightmost object in the list. For example, mvinch() looks like:

#define mvinch(y,x) \
    (move(y,x), inch())

An equivalent subroutine is:

mvinch(y,x)
{
    move(y,x);
    return inch();
}

The comma operator is used because two statements have to be executed: the move() call and the inch() call. Were you to define the macro as:

#define mvinch(y,x) move(y,x); inch()

the following code wouldn't work:

if( condition )
    mvinch(y,x);

because it would expand to:

if( condition )
    move(y,x);
inch();

Putting curly braces around the statements doesn't help. For example:

#define mvinch(y,x) \
    {move(y,x); inch();}

if( condition )
    mvinch(y,x);
else
    something();

expands to:

if( condition )
{
    move(y,x);
    inch();
}
;
else
    something();

Here  the  else  will  try  to  bind  with  the 
semicolon,  which  is  a  perfectly  legiti¬ 
mate  statement  in  C,  causing  a  "No 
matching  if  for  else”  error  message. 
Though  the  comma  operator  solves 
both  of  these  problems,  it  isn’t  very 
readable.  I  don't  recommend  using  it 
unless  you  must.  Never  use  it  if  curly 
braces  will  work  in  a  particular  ap¬ 
plication. 
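The comma-operator form can be tried without curses by substituting stand-in functions. The sketch below is an illustration under that assumption: demo_move(), demo_inch(), demo_mvinch(), and read_if() are hypothetical names, and the stand-ins only record the position they were given.

```c
/* Stand-ins for the curses calls so the sketch compiles on its own. */
static int cur_y, cur_x;

static int demo_move(int y, int x)
{
    cur_y = y;
    cur_x = x;
    return 0;
}

static int demo_inch(void)
{
    return 'A' + cur_y + cur_x;    /* a fake "character at the cursor" */
}

/* Comma-operator form: a single expression whose value is the value
 * of the rightmost operand, demo_inch(). */
#define demo_mvinch(y, x)  (demo_move((y), (x)), demo_inch())

int read_if(int condition, int y, int x)
{
    int c;

    if (condition)
        c = demo_mvinch(y, x);     /* both calls are governed by the if */
    else
        c = -1;                    /* the else still pairs with the if */
    return c;
}
```

Because the macro expands to one parenthesized expression, it can sit on the right of an assignment and before an else, neither of which works with the semicolon or curly-brace versions.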

The next problem is the wprintw() function. (Printw() is just a macro that evaluates to printf(), so it isn't a problem.) In order to keep control of the window, you can't just blast characters to standard output; rather, characters must be sent through waddch() so that you can position the cursor correctly, scroll the window when necessary, and so forth. The problem is solved by using a special formatting output function, called vfprintf() in the ANSI standard and _doprnt() by Unix. They both take three arguments:

_doprnt( fmt, args, stream )
vfprintf( stream, fmt, args )

Fmt is the format string, stream is the output stream, and args is the address of the first argument in the argument list. For example, fprintf() looks like this:

fprintf(stream, fmt, args)
FILE *stream;
char *fmt;
char *args;
{
    _doprnt( fmt, &args, stream );
}

Note that I've cheated here and not followed either the official ANSI or Unix methods of passing arguments to a subroutine with a variable number of arguments. The above example is easier to figure out, however. I've done it correctly in the code (on lines 345-361 of Listing Four). The process is also discussed in depth in my book The C Companion (see the bibliography), in which a complete source for _doprnt() is presented. In fact, I've used the version from this book in curses (on line 359). This version is nonstandard in that it is passed a pointer to an output function rather than a stream pointer.

It is possible to use the standard _doprnt() or vfprintf() functions, even though both of these are passed pointers to output streams rather than pointers to output functions. The problem is solved by the versions of doprnt() shown in Examples 1 and 2.

The first of these is ANSI compatible. It solves the output-function problem by formatting into a string, using the ANSI vsprintf() function, and then writing the string out, one character at a time, using the function pointer that was passed as the first argument.
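The same format-into-a-buffer idea can be sketched with current standard C. This is not the code from the listings: fn_printf(), out_fn, struct strbuf, and append_char() are hypothetical names, and the sketch assumes C99's vsnprintf(), which bounds the write where vsprintf() does not.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical sink type: receives one character and a context pointer. */
typedef int (*out_fn)(int c, void *arg);

/* Format into a local buffer, then hand the result to `out` one
 * character at a time. Returns what vsnprintf() returns. */
int fn_printf(out_fn out, void *arg, const char *fmt, ...)
{
    char buf[133];              /* same size as the buffer in Example 1 */
    va_list ap;
    const char *p;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, sizeof buf, fmt, ap);   /* truncates if too long */
    va_end(ap);

    for (p = buf; *p != '\0'; ++p)
        out(*p, arg);
    return n;
}

/* Hypothetical sink used to try the function: appends to a buffer. */
struct strbuf {
    char   text[64];
    size_t len;
};

int append_char(int c, void *arg)
{
    struct strbuf *sb = (struct strbuf *)arg;

    if (sb->len + 1 < sizeof sb->text) {
        sb->text[sb->len++] = (char)c;
        sb->text[sb->len] = '\0';
    }
    return c;
}
```

In a window package the sink would be a character-output routine such as waddch(), with the window pointer passed as the context argument.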

The solution for Unix systems (Example 2) is harder because there's no Unix equivalent to vsprintf(). Here you have to send the formatted output to a file, rewind the file, and then read the file back, one character at a time, calling your output function to do the actual printing. The temporary file is created in the if statement at the top of the subroutine; tmp_file will be NULL the first time that doprnt() is called. The temporary file name is created with a call to the Unix (and ANSI) mktemp("yyXXXXXX") function, which creates a unique file name. Mktemp() is passed a template for the name, and it replaces the last six characters in that template with a number guaranteed to form a unique name (one that isn't currently in use). If you aren't using a Unix- or ANSI-compatible compiler, just replace the mktemp() call with some reasonable string (such as "$$$$$$$$.tmp"). Having created a target file, doprnt() calls _doprnt() to write out to that file. Finally, it writes out a '\0' to mark the end of string, rewinds the file, and then reads it back, printing each character. I'm assuming that your program will call exit() to close the temporary file.

Note that, even though this code looks pretty awful, it's not as inefficient as it seems. Because I'm using the buffered read and write functions and because most strings are shorter than a buffer, there's virtually no disk activity. That is, all your reads and writes are really going to the internal disk buffer, not to the disk itself. Only those strings that are longer than the buffer should cause a disk read or write.
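The write, rewind, and read-back sequence can also be sketched with standard C only. The version below is an assumption-laden illustration, not the article's code: it swaps mktemp() and _doprnt() for the standard tmpfile() and vfprintf(), and file_printf(), out_fn, struct strbuf, and append_char() are hypothetical names.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical sink type: receives one character and a context pointer. */
typedef int (*out_fn)(int c, void *arg);

/* Format into a temporary file, then read it back through `out`.
 * Returns what vfprintf() returns, or -1 if no file could be opened. */
int file_printf(out_fn out, void *arg, const char *fmt, ...)
{
    static FILE *tmp = NULL;    /* opened on the first call, then reused */
    va_list ap;
    int c, n;

    if (tmp == NULL && (tmp = tmpfile()) == NULL)
        return -1;

    va_start(ap, fmt);
    n = vfprintf(tmp, fmt, ap);
    va_end(ap);

    putc('\0', tmp);            /* mark the end of this string */
    rewind(tmp);
    while ((c = getc(tmp)) != EOF && c != '\0')
        out(c, arg);
    rewind(tmp);                /* get ready for the next call */
    return n;
}

/* Hypothetical sink used to try the function: appends to a buffer. */
struct strbuf {
    char   text[64];
    size_t len;
};

int append_char(int c, void *arg)
{
    struct strbuf *sb = (struct strbuf *)arg;

    if (sb->len + 1 < sizeof sb->text) {
        sb->text[sb->len++] = (char)c;
        sb->text[sb->len] = '\0';
    }
    return c;
}
```

The '\0' marker matters on the second and later calls: a short string written over a longer one leaves old characters in the file, and the marker is what stops the read-back before it reaches them.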

#include <stdio.h>
#include <stdarg.h>

static doprnt(ofunct, funct_arg, fmt, argp)
int     (*ofunct)();
char    *funct_arg;
char    *fmt;
va_list argp;
{
    /* A doprnt() for ANSI */
    /* (c) Copyright 1987, Allen I. Holub. */

    char buf[133], *p;

    vsprintf( buf, fmt, argp );    /* format the whole string into buf */

    for( p = buf; *p; (*ofunct)( *p++, funct_arg ) )
        ;                          /* write it out a character at a time */
}

Example 1: A doprnt() for ANSI

#include <stdio.h>
#include <varargs.h>

static doprnt(ofunct, funct_arg, fmt, argp)
int     (*ofunct)();
char    *funct_arg;
char    *fmt;
va_list argp;
{
    /* A doprnt() for Unix. */
    /* (c) Copyright 1987, Allen I. Holub. */

    int         c;
    extern char *mktemp();
    static char *tmp_name;
    static FILE *tmp_file = NULL;

    if( !tmp_file )
    {
        tmp_name = mktemp( "yyXXXXXX" );
        if( !(tmp_file = fopen(tmp_name, "w+")) )
        {
            fprintf(stderr, "Can't open temporary file %s\n", tmp_name );
            exit( 1 );
        }
    }

    _doprnt( fmt, argp, tmp_file );    /* format into the temporary file */
    putc( 0, tmp_file );               /* mark the end of the string */
    rewind( tmp_file );

    while( (c = getc(tmp_file)) != EOF && c )
        (*ofunct)( c, funct_arg );

    rewind( tmp_file );    /* Get ready for next call */
}

Example 2: A doprnt() for Unix

Erratum

Jack Whitney of Walnut Creek, California found an error in Table 1 of the February C Chest (page 96). The if statement in the HASHPJW function should look like this:

if( g = h & ~((unsigned)(~0) >> 4) )

Nonetheless, the contents of the table are correct. The somewhat convoluted code is discussed in this month's Flotsam and Jetsam.

Bibliography

Arnold, Kenneth. "Screen Updating and Cursor Movement Optimization: A Library Package." Unix Programmer's Manual, vol. 2. This article is the documentation for the real curses package.

Duncan, Ray. Advanced MS-DOS. Redmond, Wash.: Microsoft Press, 1986.

Holub, Allen. On Command: Writing a Unix-like Shell for MS-DOS. Redwood City, Calif.: M&T Publishing, 1987.

Holub, Allen. The C Companion. Englewood Cliffs, N.J.: Prentice-Hall, 1987. Printf() is discussed in depth in Chapter 9.

Norton, Peter. The Peter Norton Programmer's Guide to the IBM PC. Redmond, Wash.: Microsoft Press, 1985.

Availability 

The code presented this month is destined for a book I'm writing (on the subject of compiler design). It is copyrighted by myself. Though you're welcome to both download it and use it yourself, you may not distribute it in any form (including binary) to anyone else or use it for any commercial purposes.

All the source code for articles in this issue is available on a single disk. To order, send $14.95 to Dr. Dobb's Journal, 501 Galveston Dr., Redwood City, CA 94063, or call (415) 366-3600, ext. 216. Please specify issue number and format (MS-DOS, Macintosh, Kaypro).

(Listings begin on page 74.)


Flotsam and Jetsam

Portable Masks

Bit masks are used to set or clear individual bits in a number. For example, n & 0x1 clears all but the bottom bit of n. By the same token, n | 0x1 sets the bottom bit (to 1). It's easy for bit masks to be nonportable, however. For example, if you want to clear only the bottom bit of a number, it's tempting to say n & 0xfffe. This statement makes an important assumption, however. It assumes that an int is 16 bits wide. If you tried to use it on a machine that had a 32-bit int, the top 16 bits of the number would be cleared too. You'd have to say n & 0xfffffffe on a 32-bit machine.

You can correct this problem by using the one's complement operator (~), which reverses the sense of all bits in a word (maps the 1s to 0s and vice versa). For example, the expression n & ~1 clears only the bottom bit of a word, regardless of the word width. On a 16-bit machine, ~1 evaluates to 0xfffe; on a 32-bit machine, it evaluates to 0xfffffffe.

A similar problem arises when you want to set or clear only the top bit of a word. For example, n | 0x8000 sets the top bit of n, but it also only works with a 16-bit int. You'd have to say n | 0x80000000 on a 32-bit machine. This problem can be solved with a little convoluted coding. The expression

~((unsigned)(~0) >> 1)

evaluates to 0x8000 on a 16-bit machine and 0x80000000 on a 32-bit machine. The (~0) evaluates to an int-size number in which all the bits are set. The cast to unsigned tells the compiler not to sign extend the number on the right shift. The shift then moves all the bits one notch to the right, shifting in a zero from the left (because it's unsigned). Finally, the outermost ~ reverses the sense of the mask.

Note that the cast is required here because many compilers treat ~0 as a signed integer (having the value -1). These compilers process the shift in one of two ways, both incorrect. If the compiler looks at >> 1 as a divide by 2, ~0 >> 1 will evaluate to 0 (-1/2 should be 0). If the compiler treats the >> 1 as an arithmetic right shift, it's likely to duplicate the top bit rather than shifting in a 0 (so ~0 >> 1 will do nothing).

The process can be generalized. The top_n_bits(n) macro given below evaluates to a constant with the top n bits set:

#define top_n_bits(n) \
    (~((unsigned)(~0) >> (n)))

Note that all this shifting and inverting will be done at compile time by most compilers. That is, the expression will actually evaluate to a single constant in the generated code, not to a series of shift and invert instructions. Consequently, it's no less efficient than the more straightforward-looking, but less portable, variant.
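The width-independent masks can be checked against the actual width of unsigned on whatever machine compiles them. The sketch below wraps the expressions from the text in small functions; clear_low_bit(), top_bit(), and top_bits() are hypothetical names introduced for the illustration.

```c
/* Clear only the bottom bit, whatever the word width. */
unsigned clear_low_bit(unsigned n)
{
    return n & ~1u;
}

/* A mask with only the top bit set: all ones, shifted right one
 * place as an unsigned value (so a zero comes in), then inverted. */
unsigned top_bit(void)
{
    return ~((unsigned)(~0) >> 1);
}

/* A mask with the top n bits set. n must be greater than 0 and less
 * than the width of unsigned; shifting by the full width is
 * undefined in C. */
unsigned top_bits(int n)
{
    return ~((unsigned)(~0) >> n);
}
```

On a machine with a 32-bit unsigned, top_bits(4) is 0xf0000000; on one with a 16-bit unsigned, the same call gives 0xf000, with no change to the source.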

Malloc()

Enough people have written about the March Flotsam and Jetsam, in which I discussed problems with the malloc() and calloc() system calls, that a little more discussion of the problem seems in order. First, malloc() is unique in that it is guaranteed to return a pointer that can point at any sort of object. You can't always assume that pointers are the same size, regardless of the object to which they point. For example, in the 8086 medium model, you can have a 16-bit pointer to a subroutine and a 32-bit pointer to data (or vice versa). For the same reason, it's a mistake to cast a pointer into an int because, if your machine has a 32-bit pointer and a 16-bit int, the value will be truncated. In some machines you can't even assume that pointers to two data objects will be the same. The problem here is alignment. A compiler for a machine that requires longs to be aligned on 4-byte boundaries, and ints on 2-byte boundaries, is likely to round a pointer to long when it is cast into a pointer to int (in order to maintain alignment). In general, it's not portable to cast a pointer into anything. Of course, it's not always possible to avoid this sort of cast. Printf(), for example, often has to make certain assumptions about pointers and these assumptions are often nonportable. The foregoing notwithstanding, you can always cast the return value of malloc() into any kind of pointer, but only because malloc() was written with this sort of portability in mind. By the same token, an extern statement can always be used to declare malloc() or calloc() as returning a pointer to any type of object. Don't do this with other subroutines, however, at least not if you want your code to be portable.
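The one cast the text calls safe looks like this in practice. The sketch is only an illustration: struct point and new_points() are hypothetical, and the cast shown is the kind the text describes rather than something current C requires.

```c
#include <stdlib.h>

/* Hypothetical object type with members of different alignments. */
struct point {
    long   x;
    double y;
};

/* Allocate `count` zero-filled points. The memory calloc() returns
 * is suitably aligned for any object type, so converting its return
 * value to struct point * is safe. */
struct point *new_points(size_t count)
{
    return (struct point *)calloc(count, sizeof(struct point));
}
```

The caller still has to check for a NULL return and release the block with free() when it is done.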




16-BIT SOFTWARE TOOLBOX

80386 Programming Tools

by Ray Duncan

Perhaps Santa's reindeer are getting old, or perhaps all the personal computers hanging off the back of his sleigh create too much drag. At any rate, he didn't arrive with my Compaq 386 Deskpro until the end of January, but the suspense was worth it. Remember when you switched from a 4.77-MHz IBM PC to a PC/AT or compatible? For the first few days, I was continually reminded of what a two to three times speed increase can do for one's productivity as a software developer. Fortunately, the human organism is highly adaptable, and I have already learned to take the speed of the 386 for granted.

Microsoft has warned the computing industry not to expect an 80386-specific version of protected mode DOS until at least late 1988. Programmers, like other children of nature, abhor a vacuum, and several companies are attempting to exploit this window of opportunity and carve out a niche for themselves in the 80386 marketplace. MetaWare is marketing C and Pascal compilers, Quarterdeck Systems is shipping an 80386-specific version of the DESQview control program, several versions of 80386 Unix/Xenix are underway, and The Software Link (a company previously known for its copy-protection schemes) is about to release a multitasking operating system called PC/MOS 386.

The first real 80386 programming tool to fall into my clutches (it actually arrived before the Compaq did) was the Phar Lap 80386 assembly-language development package 386ASM / 386LINK, which includes the following programs:

• 386ASM.EXE: a full-fledged macro assembler supporting all 8086, 80186, 80286, 80386, 8087, 80287, and 80387 opcodes.

• 386LINK.EXE: a linker that produces a load module in the "old" .EXE format.

• MINIBUG.EXE: a program debugger capable of running 80386 real mode or protected mode applications, with a command set equivalent to MS-DOS DEBUG (except for the A command).

• RUN386.EXE: a protected mode run-time environment for 32-bit 80386 applications running under MS-DOS; Phar Lap calls this program the 386 DOS-Extender.

The  distribution  disks  also  contain 
various  example  programs  and  some 
object  modules  that  allow  the  Phar 
Lap  linker  to  be  used  with  the  Micro¬ 
soft  C  compiler.  The  375-page  man¬ 
ual  is  clearly  written  with  lots  of  ex¬ 
amples.  Like  the  Microsoft  MASM 
manual,  it  concerns  itself  primarily 
with  the  assembler  pseudo-ops  or 
directives;  readers  are  referred  to  the 
Intel  80386  Programmer's  Reference 
for  information  about  the  CPU  in¬ 
struction  set  and  the  syntax  for  the 
instruction  mnemonics. 

Writing  a  new  protected  mode  32- 
bit  80386  application  with  these  tools, 
or  converting  an  existing  application 
to  take  advantage  of  32-bit  protected 
mode,  is  quite  simple.  The  attribute 
USE32  must  be  included  in  each  of 
the  segment  declarations  in  the  mod¬ 
ule.  This  tells  the  assembler  that  the 
32-bit  override  prefix  byte  is  not  re¬ 
quired  before  instructions  that  per¬ 
form  32-bit  operations.  After  assem¬ 
bly, the program is linked with a
special module (START386.OBJ), sup¬
plied by Phar Lap, that flags the ap¬
plication as capable of being run in
32-bit  protected  mode  and  prevents  it 
from  being  accidentally  run  in  real 
mode  (if  the  resulting  .EXE  file  is  run 
directly  from  the  MS-DOS  command 
line,  it  simply  displays  an  error  mes¬ 
sage  and  exits). 

The  MINIBUG  and  RUN386  tools  are 
then  used  to  debug  or  run  the  32-bit 
protected  mode  application  under 
MS-DOS.  The  applications  can  request 
MS-DOS  system  services  in  the  normal 
way — that  is,  by  loading  the  80386 
registers  with  appropriate  values 
and  performing  an  Int  21h.  The  de¬ 
bugger  or  RUN386  intercepts  the  soft¬ 
ware  interrupt,  performs  any  ad¬ 
dress  translation  that  may  be 
required,  switches  the  CPU  into  real 
mode,  and  transfers  to  MS-DOS  to  car¬ 
ry  out  the  desired  function.  Upon  re¬ 
turn  from  MS-DOS,  the  debugger  or 
RUN386  performs  any  necessary 
translation  on  returned  values, 
switches  the  CPU  back  into  protected 
mode,  and  gives  control  back  to  the 
application.  MINIBUG  and  RUN386  also 
supply  the  application  with  segment 
selectors  that  allow  it  to  access  com¬ 
mand  tail  arguments,  physical  mem¬ 
ory  from  0  to  640K,  and  the  video  re¬ 
fresh  buffer  directly. 
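
The round trip described above can be sketched in C as a small simulation. This is only an illustration under stated assumptions, not Phar Lap's code: the transfer buffer and its size, the cpu_protected flag, and every function name are hypothetical, and "address translation" is modeled simply as copying the caller's data through a buffer that real mode MS-DOS can reach.

```c
#include <stddef.h>
#include <string.h>

#define XFER_SIZE 512                /* hypothetical transfer buffer size */

unsigned char xfer_buf[XFER_SIZE];   /* stands in for memory below 640K */
unsigned char dos_log[4096];         /* stands in for the file DOS writes to */
size_t dos_log_len = 0;
int cpu_protected = 1;               /* 1 = protected mode, 0 = real mode */

/* Stand-in for the real mode MS-DOS write service (hypothetical). */
size_t real_mode_dos_write(const unsigned char *buf, size_t len)
{
    if (cpu_protected)                       /* DOS runs only in real mode */
        return 0;
    if (len > sizeof dos_log - dos_log_len)  /* simulated "disk full" */
        len = sizeof dos_log - dos_log_len;
    memcpy(dos_log + dos_log_len, buf, len);
    dos_log_len += len;
    return len;
}

/* What an extender might do when the application requests a write:
   copy a chunk into low memory, drop to real mode, call DOS, return
   to protected mode, and repeat until the whole buffer is written. */
size_t extender_write(const unsigned char *app_buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        size_t chunk = len - done;
        size_t wrote;

        if (chunk > XFER_SIZE)
            chunk = XFER_SIZE;
        memcpy(xfer_buf, app_buf + done, chunk);  /* translate the address */
        cpu_protected = 0;                        /* switch to real mode */
        wrote = real_mode_dos_write(xfer_buf, chunk);
        cpu_protected = 1;                        /* back to protected mode */
        done += wrote;
        if (wrote < chunk)                        /* short write: stop */
            break;
    }
    return done;
}
```

The application sees a single call that accepts a buffer of any size; the chunking and the two mode switches per chunk stay hidden inside the intermediary.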

Phar  Lap  will  soon  be  releasing  a 
"binding”  for  32-bit  protected  mode 
applications  that  performs  the  same 
function  as  the  RUN386  program  but 
that  can  be  linked  right  into  the  ap¬ 
plication — making  the  presence  of 
the  RUN386.EXE  file  unnecessary.  The 
price  of  this  binding  will  be  $995,  and 
it  will  include  the  right  to  distribute 
the  Phar  Lap  code  as  an  integral  part 
of  an  application  without  further 
royalties  or  license  fees. 

By  now,  you  are  undoubtedly 
wondering  about  the  Phar  Lap  tools’ 
compatibility  and  performance  com¬ 
pared  to  the  Microsoft  Macro  Assem¬ 
bler  (MASM)  and  Linker  (LINK.EXE).  On 
paper,  386ASM  is  nearly  100  percent 
upward  compatible  with  the  Micro¬ 
soft Macro Assembler; the only ex¬
ceptions are a lack of support for the
.ERR1,  .ERR2,  and  .CREF  directives. 
The  few  new  80386-specific  direc¬ 
tives—  .386C,  .386P,  .387,  DP  (a  6-byte 
data  type  for  32-bit  protected  mode 
address  pointers) — and  the  segment 
attributes  USE16  and  USE32  are  stylis¬ 
tically  consistent  with  the  Microsoft 
MASM  syntax. 
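
To make the 6-byte pointer concrete, here is a minimal C sketch of how such a value is laid out in memory on the 80386: the 32-bit offset comes first, least significant byte first, followed by the 16-bit segment selector. The struct and function names are hypothetical and chosen only for illustration; they are not part of the Phar Lap package.

```c
#include <stdint.h>

/* A 48-bit protected mode pointer: 32-bit offset plus 16-bit selector. */
struct far48 {
    uint32_t offset;     /* offset within the segment */
    uint16_t selector;   /* segment selector */
};

/* Write the pointer in its 6-byte memory form: offset, then selector,
   each stored least significant byte first. */
void far48_pack(const struct far48 *p, uint8_t out[6])
{
    out[0] = (uint8_t)(p->offset);
    out[1] = (uint8_t)(p->offset >> 8);
    out[2] = (uint8_t)(p->offset >> 16);
    out[3] = (uint8_t)(p->offset >> 24);
    out[4] = (uint8_t)(p->selector);
    out[5] = (uint8_t)(p->selector >> 8);
}

/* Rebuild the pointer from its 6-byte memory form. */
struct far48 far48_unpack(const uint8_t in[6])
{
    struct far48 p;

    p.offset = (uint32_t)in[0]
             | ((uint32_t)in[1] << 8)
             | ((uint32_t)in[2] << 16)
             | ((uint32_t)in[3] << 24);
    p.selector = (uint16_t)(in[4] | (in[5] << 8));
    return p;
}
```

Packing byte by byte, rather than copying the struct, avoids any dependence on the compiler's structure padding.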

In  practice,  the  upward  compati¬ 
bility  of  386ASM  from  Microsoft 
MASM  for  8086/286  applications  in¬ 
deed  proved  to  be  excellent.  I  assem¬ 
bled  and  linked  several  large  source 
files  for  my  own  company’s  MS-DOS- 
based  products,  and  the  only  differ¬ 
ences  I  detected  lay  in  the  area  of 
more  stringent  error  checking  on  the 
part of 386ASM. For example, 386ASM
reported the source code line:

esc equ 01bh

as  an  error  because  of  the  collision 
between  the  symbol  esc  and  the  Intel 
instruction  mnemonic  ESC.  Microsoft 
MASM  let  the  same  line  pass  without 
comment  and  interpreted  the  equate 
"as  expected”  when  it  was  refer¬ 
enced  later  in  a  line  such  as: 

cls db esc,'[2J'

The  Phar  Lap  assembler  has  two 
particularly  nice  features  that  are 
not  supported  by  the  Microsoft  as¬ 
sembler.  The  first  is  the  ability  to  re¬ 
direct  error  messages  to  a  file  or  de¬ 
vice  distinct  from  the  file  or  device 
that  receives  the  entire  program  list¬ 
ing.  The  other  is  support  for  local  la¬ 
bels;  the  scope  of  any  label  starting 
with  the  *  character  is  limited  to  the 
current  procedure. 

To test 386LINK's compatibility
with  Microsoft  LINK,  I  used  it  to  re¬ 
build  a  relatively  large  assembly-lan¬ 
guage  application  containing  some 
250  object  modules  from  existing  li¬ 
braries  that  had  been  created  with 
Microsoft's LIB, MASM, and C and
LMI's UR/FORTH object module com¬
piler.  The  resulting  .EXE  file  was  not 
identical  on  a  byte  for  byte  basis  with 
the  .EXE  file  produced  by  Microsoft 
LINK,  presumably  because  of  slightly 
different  library  search  or  segment 
building  strategies  in  the  two  linkers, 
but  it  was  exactly  the  same  size 
(19,414  bytes)  and  ran  correctly. 

It should be noted that in order to
support the 80386's 32-bit (native) in¬
structions, Phar Lap has extended
the  standard  Intel  Object  Module  For¬ 
mat  (OMF-86)  with  new  classes  of  32- 
bit  offsets,  displacements,  and  fix¬ 
ups.  These  extensions  are  similar  but 
not  identical  to  the  OMF  extensions 
made  by  Microsoft  in  its  XENIX/386 
Toolkit. 

Command-line  compatibility  be¬ 
tween  the  Phar  Lap  tools  and  the  Mi¬ 
crosoft  tools  is  something  else  again. 
Phar  Lap  opted  to  design  a  complete¬ 
ly  new  command-line  syntax  for 
386ASM and 386LINK and did not
make any attempt to follow the Mi¬
crosoft  conventions.  For  example, 
the  MASM  command: 

C>MASM  FILE1,FILE2; 

which  assembles  FILE1.ASM  to  create 
FILE2.OBJ and does not create a listing
or cross-reference file, would look
like this:

C>386ASM FILE1 -OBJECT FILE2 -NOLIST -8086

for  the  Phar  Lap  assembler  (unlike 
MASM,  it  creates  a  listing  file  by  de¬ 
fault).  Similarly,  the  Microsoft  LINK 
command  line: 

C>LINK /NOI MAIN + MENU,MYFILE,,NUCLEUS

which links the modules MAIN.OBJ
and MENU.OBJ with the modules in
the library NUCLEUS.LIB and pro¬
duces the executable module
MYFILE.EXE, would be entered as:

C>386LINK MAIN MENU -EXE MYFILE -LIB NUCLEUS -8086 -TWOCASE

for the Phar Lap linker. I feel that
this lack of command-line compati¬
bility  was  not  a  sound  strategic  deci¬ 
sion  by  Phar  Lap.  Aside  from  the  fact 
that  the  Microsoft  command  syntax 
is  considerably  terser  and  thus  more 
efficient,  everyone  is  already  famil¬ 
iar  with  it.  Adoption  of  the  same  syn¬ 
tax  would  have  made  transition  to 
the  Phar  Lap  tools  that  much  easier, 
and  existing  MAKE  files  could  be  used 
without  modification.  Perhaps  if 
enough  people  complain,  Phar  Lap 
will  cave  in  and  make  the  change  be¬ 
fore  it  builds  up  too  big  a  user  base. 
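
Until then, a small wrapper could bridge the two syntaxes for existing MAKE files. The following C function is a minimal sketch that handles only the one MASM argument form shown above ("source,object;") and emits only the 386ASM switches that appear in this column. The function name is hypothetical, and a real wrapper would have to cover the listing and cross-reference fields and MASM's own switches as well.

```c
#include <stdio.h>
#include <string.h>

/* Translate MASM arguments of the form "source,object;" into the
   equivalent 386ASM command line.  Returns 0 on success, or -1 if the
   arguments are not in that form or the output buffer is too small. */
int masm_to_386asm(const char *args, char *out, size_t outsz)
{
    const char *comma = strchr(args, ',');
    const char *semi  = strchr(args, ';');
    int n;

    if (comma == NULL || semi == NULL || semi < comma)
        return -1;
    if (comma == args || semi == comma + 1)
        return -1;                     /* empty source or object name */
    if (memchr(comma + 1, ',', (size_t)(semi - comma - 1)) != NULL)
        return -1;                     /* listing field not handled here */

    n = snprintf(out, outsz, "386ASM %.*s -OBJECT %.*s -NOLIST -8086",
                 (int)(comma - args), args,
                 (int)(semi - comma - 1), comma + 1);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Given "FILE1,FILE2;" the function produces the 386ASM command line shown earlier; anything it does not recognize is rejected rather than guessed at.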

Performance  of  the  Phar  Lap  tools 
is  acceptable,  but  they  are  not  nearly 
as  lean  and  mean  as  their  Microsoft 
equivalents.  386ASM.EXE  weighs  in  at 
215K,  compared  to  85K  for  MASM  4.0, 
and  386LINK  is  75K,  as  opposed  to  48K 
for  Microsoft  LINK.  As  might  be  ex¬ 
pected  for  such  plump  programs, 
they  are  also  slow.  The  following 
timings  were  obtained  on  a  Compaq 
386  Deskpro  (16-MHz,  2-megabyte 
RAM,  70-megabyte  fixed  disk): 

Assembly  of  a  600-line  assembler 
source  file  with  a  few  simple  macros 
and  three  segments: 

•  Microsoft  MASM,  Version  4.0:  6.2  sec¬ 
onds 

• Phar Lap 386ASM, Version 1.1d: 10.8
seconds 

Linkage  of  a  19K  .EXE  file  containing 
approximately  250  object  modules: 

•  Microsoft  LINK,  Version  3.51:  18.6 
seconds 

•  Phar  Lap  386LINK,  Version  1.1c:  2 
minutes  14.4  seconds 

A  spokesman  for  Phar  Lap  says  that 
the  company’s  own  timings  on  its 
linker  indicate  it  is  only  about  30  per¬ 
cent  slower  than  the  Microsoft  link¬ 
er,  so  the  application  I  was  linking 
may  represent  some  sort  of  worst 
case.  Still,  I  suspect  that  the  perform¬ 
ance  of  386LINK,  together  with  the  in¬ 
compatible  command  syntax,  will 
deter  people  from  converting  all 
their  work  onto  the  Phar  Lap  tools — 
including  the  development  of  8086 
real  mode  applications — a  conver¬ 
sion  that  would  be  perfectly  feasible 
given  the  otherwise  high  degree  of 
compatibility  between  Phar  Lap  and 
Microsoft's  products. 
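
For readers who want the ratios behind these timings, the arithmetic can be written as two small C helpers (the function names are hypothetical). On the figures above, 386ASM took roughly 1.7 times as long as MASM, and 386LINK took roughly 7.2 times as long as Microsoft LINK, against a ratio of about 1.3 implied by Phar Lap's "30 percent slower" estimate.

```c
/* Convert a "minutes and seconds" reading to seconds. */
double to_seconds(int minutes, double seconds)
{
    return minutes * 60.0 + seconds;
}

/* How many times longer the candidate took than the baseline. */
double slowdown(double baseline_sec, double candidate_sec)
{
    return candidate_sec / baseline_sec;
}
```

For example, slowdown(6.2, 10.8) gives the assembler ratio, and slowdown(18.6, to_seconds(2, 14.4)) gives the linker ratio.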

For  the  first  release  of  such  sophis¬ 
ticated  programming  tools,  the  Phar 
Lap  products  are  remarkably  sound, 
and we can reasonably expect that
they will continue to evolve and im¬
prove.  Phar  Lap's  technical  support 
is  above  reproach,  too.  I  had  several 
occasions  to  send  E-mail  questions  to 
Phar  Lap  via  its  account  on  BIX,  and 
in  each  case  I  received  helpful  re¬ 
plies  within  the  same  day. 

In  any  event,  386ASM  and  386LINK 
are  currently  the  only  game  in  town 
if  you  want  to  start  working  with  the 
80386  in  its  native  32-bit  mode  and 
don’t  hanker  to  throw  away  half 
your  CPU  cycles  and  hard  disk  on 
Xenix.  The  price  of  the  MS-DOS  ver¬ 
sion  of  the  Phar  Lap  assembler/link¬ 
er/debugger  package  is  $495.  Ver¬ 
sions  that  run  under  VAX  VMS  or  Unix 
and  cross-assemble  and  link  to  the 
386  are  also  available.  Phar  Lap  can 
be  contacted  at  60  Aberdeen  Ave., 
Cambridge,  MA  02138;  (617)  661-1510. 

Book  Corner 

Prompted  by  visionaries  such  as  Jean 
Yates,  who  prophesied  that  Unix 
would  be  running  on  everything  but 
your  toaster  oven  by  1990,  publishers 
have  been  flooding  the  bookstores 
with  "advanced”  Unix  books  for  the 
last  two  years.  Most  of  these  books 
pan  out  to  be  collections  of  Unix  E- 
mail  tricks,  warnings  not  to  leave 
your  terminal  unattended  while 
logged  in,  or  guides  to  bigger  and  bet¬ 
ter  shell  scripts  (batch  files  to  you  MS- 
DOS  types).  A  relatively  few  books  de¬ 
liver  something  more  substantial  and 
address  the  Unix  application  pro¬ 
gram  interface. 

Books  that  actually  describe  the  in¬ 
ternals  of  Unix  in  any  detail,  especial¬ 
ly  those  that  do  it  in  a  way  compre¬ 
hensible  to  a  normal  mortal,  have 
been  virtually  nonexistent,  however. 
The  only  readily  available  resource 
of  any  worth,  aside  from  the  source 
code  for  Unix  itself  (if  you  have 
$50,000  to  spare  for  it),  has  been  the 
two  special  issues  of  the  AT&T  Bell 
Labs Technical Journal (July/August
1978  and  October  1984),  which  were 
devoted  to  Unix  articles  by  various 
Unix  program  authors,  pioneers,  gu¬ 
rus,  and  mystics. 

This  deficiency  has  been  decisively 
remedied  with  the  appearance  of 
The  Design  of  the  UNIX  Operating  Sys¬ 
tem,  by  Maurice  J.  Bach.1  Mr.  Bach 
works at Bell Labs and based his book
on Unix System V, Release 2, source
code,  though  some  coverage  is  given 
to  BSD  Unix  variants  as  well.  He  cov¬ 
ers  the  kernel  architecture,  file  sys¬ 
tem,  control  of  processes,  interpro¬ 
cess  communication,  device  drivers, 
and  even  multiprocessor  Unix  sys¬ 
tems.  The  book  is  thorough  and  well 
written,  but  it  will  be  heavy  going  for 
readers  lacking  previous  exposure  to 
Unix  and  a  general  understanding  of 
operating  system  and  hardware  con¬ 
cepts  such  as  processes,  scheduling, 
kernel  and  user  mode,  interrupt 
handlers,  memory  protection,  swap¬ 
ping,  page  faults,  and  so  on. 

An  even  more  welcome  book  is  Op¬ 
erating  Systems:  Design  and  Imple¬ 
mentation, by Andrew S. Tanen¬
baum,2 which easily qualifies as the
best  general  book  on  operating  sys¬ 
tems  I  have  ever  seen.  It  covers  all 
the  necessary  subjects  (processes,  file 
systems,  interprocess  communica¬ 
tion,  memory  management,  mass 
storage,  and  so  on)  in  the  context  of  a 
Unix-like  operating  system  for  the 
IBM  PC  called  MINIX. 

MINIX  has  the  same  system  calls  as 
Version  7  Unix,  as  well  as  a  shell  and 
some  60  other  system  utilities  com¬ 
patible  at  the  user  level  with  Unix, 
but  the  source  code  is  completely 
original  and  is  included  in  the  book 
(also  available  on  diskette).  An  inter¬ 
esting  feature  of  MINIX  is  that  the  file 
system  manager  runs  outside  the  op¬ 
erating  system  as  a  user  process.  Con¬ 
sequently,  it  can  be  easily  modified 
and  even  replaced  with  a  file  server 
process  that  accesses  un-Unix-ish  file 
structures  (MS-DOS  perhaps)  or  per¬ 
forms  reads  and  writes  across  a  com¬ 
munications  link  or  network. 
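
The idea of a replaceable file server can be illustrated with a toy C sketch. Nothing here is taken from the MINIX source: the message layout, the request codes, and all names are hypothetical, and a plain function pointer stands in for real message passing between processes. The point is only that a client that talks to "the file system" through messages does not care which server answers.

```c
#include <string.h>

enum { FS_READ = 1, FS_WRITE = 2 };   /* hypothetical request codes */

struct message {
    int type;                         /* FS_READ or FS_WRITE */
    int count;                        /* number of bytes to transfer */
    unsigned char data[32];           /* payload carried in the message */
};

/* Any function with this shape can play the file server. */
typedef int (*file_server)(struct message *m);

unsigned char ram_file[32];           /* storage used by the first server */
unsigned char foreign_file[32];       /* storage used by the second server */

/* Server 1: keeps the "file" as plain bytes in memory. */
int ram_server(struct message *m)
{
    if (m->count < 0 || m->count > (int)sizeof ram_file)
        return -1;
    if (m->type == FS_WRITE) {
        memcpy(ram_file, m->data, (size_t)m->count);
        return m->count;
    }
    if (m->type == FS_READ) {
        memcpy(m->data, ram_file, (size_t)m->count);
        return m->count;
    }
    return -1;
}

/* Server 2: stores every byte inverted, standing in for a server that
   keeps its data in some foreign on-disk format. */
int foreign_server(struct message *m)
{
    int i;

    if (m->count < 0 || m->count > (int)sizeof foreign_file)
        return -1;
    if (m->type != FS_READ && m->type != FS_WRITE)
        return -1;
    for (i = 0; i < m->count; i++) {
        if (m->type == FS_WRITE)
            foreign_file[i] = (unsigned char)~m->data[i];
        else
            m->data[i] = (unsigned char)~foreign_file[i];
    }
    return m->count;
}

file_server current_server = ram_server;   /* whichever server is installed */

/* The client's only entry point: send a message to the current server. */
int send_to_fs(struct message *m)
{
    return current_server(m);
}
```

Pointing current_server at foreign_server changes how the bytes are stored without changing a line of client code, which is the property the book's design makes possible.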

Unlike  many  Unix  authors,  Tanen- 
baum  has  broad  experience  with 
other  operating  systems  and  fre¬ 
quently  draws  examples  and  com¬ 
parisons  from  his  knowledge  of  VM/ 
370,  MULTICS,  and  so  on.  The  discus¬ 
sions  of  processes,  scheduling,  inter¬ 
process  communications,  and  dead¬ 
locks  are  particularly  coherent.  I 
recommend  this  book  without  reser¬ 
vation  to  anyone  interested  in  the 
principles  of  operation  of  modern 
operating  systems. 

MASM  Equates 

In  the  February  1987  column,  I  point¬ 
ed out some subtle differences be¬
tween equ and = that can lead to un¬
expected  problems.  Steve  Russell  of 
SLR  Systems  writes: 

"There  are  lots  of  peculiarities  be¬ 
tween  equ  and  =.  The  differences 
are  barely  hinted  at  in  the  manual. 
The  example  in  the  manual  [MASM 
4.0  Reference  Manual,  page  55]  that 
says: 

clearax  equ  xor  ax, ax 

doesn't  work  if  you  try  to  use  it, 
though  clearax  does  show  up  in  the 
symbol  table  as  being  equal  to  the 
text  value. 

"The  distinction  between  the  equ 
and  =  pseudo-ops  is  a  little  more  sub¬ 
tle  than  suggested  in  your  column. 
For  instance,  if  you  remove  the  sup¬ 
posedly  redundant  offsets  and  just 
use: 

dw  $-test 

you  find  that  equ  and  =  yield  identi¬ 
cal  results  (to  each  other,  not  to  your 
example) — that  is,  equ  stores  the  val¬ 
ue,  not  the  text.  On  the  other  hand,  if 
you  simply  write: 

t1 equ offset test
t2 = offset test

then try:

mov ax,t1

and: 

mov  ax,t2 

you  get  two  different  results.  It  looks 
as  though  offset  cannot  be  stored  as 
part  of  a  symbol,  so  equ  stores  text  if 
the  expression  does  not  yield  a  'val¬ 
ue'  and  the  =  directive  throws  away 
offset  because  it  can't  store  text. 

"So  the  question  is,  is  the  latter  ac¬ 
tion  a  bug  in  =  because  it  faithfully 
throws  offset  away  without  saying 
so  or  is  it  an  undocumented  feature?" 

The  Dialogue  Goes  On 

Frank  Albe,  of  Houston,  Texas,  takes 
up  the  cudgel  this  month  on  the  sub¬ 
ject  of  high-level  vs.  assembly  lan¬ 
guages:

"Charles Lyall's letter in your
March column has been nagging me
for  a  week  now,  so  I  decided  it’s  time 
to  respond.  As  you  pointed  out,  Mr. 
Lyall's  letter  is  well  crafted  and  states 
some  informed  opinions. 

"The  letter  makes  four  major 
points: 

"1.  You  can't  master  as  many  assem¬ 
bly  languages  as  you  can  high-level 
languages  (HLLs). 

"2.  It  takes  orders  of  magnitude 
more  time  to  write  in  assembly  lan¬ 
guages  than  it  does  in  HLLs. 

"3. The difference in execution time
does  not  warrant  investing  the  extra 
programmer  time. 

"4.  Fourth-generation  languages 
(4GLs)  and  HLLs  will  trample  assem¬ 
bly  languages  in  the  dirt  because  the 
cost-accountant  mentality  says,  ‘Go 
for  the  lowest  bid  for  equivalent 
function.’ 

"To  me,  this  is  the  voice  of  an  ap¬ 
plication  analyst.  From  his  vantage 
point,  his  arguments  are  very  per¬ 
suasive  and  full  of  common  sense. 
My  position  is  that  of  an  opinionated 
systems  software  developer  who 
hacked his first piece of code in 1963
on  a  CDC  1604: 

"1.  It  is  a  lot  more  difficult  to  be  flu¬ 
ent  in  many  assembly  languages  be¬ 
cause  we  have  to  know  more  than 
the  assembly  language,  per  se.  We 
must  develop  deeper  understanding 
of  the  target  operating  system  and 
hardware  than  is  required  of  the 
person  who  writes  exclusively  in 
HLLs.  In  my  opinion,  HLL  applications 
can  benefit  dramatically  from  care¬ 
ful  attention  to  these  details,  but  far 
too  few  HLL  programmers  are  will¬ 
ing  to  invest  the  time  to  gain  the 
knowledge and make good use of it.

"2. Programmers always underesti¬
mate  the  time  it  takes  to  write  a  func¬ 
tioning  program  and  to  burnish  it  to 
their  satisfaction.  I  don't  know  any¬ 
thing  about  the  example  TEE,  but  un¬ 
less  it's  pretty  trivial,  I  doubt  that 
most  people  could  write  it  from 
scratch  in  15  minutes  in  an  HLL  such 
as  C,  Pascal,  COBOL  (snicker,  snicker), 
or  FORTRAN.  For  the  purposes  of  ar¬ 
gument,  assume  the  relationships  of 
8  to  1  for  coding  and  10  to  1  for  execu¬ 
tion  time  hold  across  the  board.  I 
know  this  is  invalid,  but  these  num¬ 
bers are as good as any others to
make my point. If the program will
run  infrequently  and  not  in  conjunc¬ 
tion  with  others  in  an  interactive 
suite,  the  10-second  load  and  execute 
time  is  probably  acceptable.  If  the 
program  is  a  tool  such  as  the  DIR  com¬ 
mand  in  MS-DOS,  a  perceivable  delay 
is  intolerable. 

"3.  The  total  elapsed  time  to  get  code 
operational  is  a  significant  factor.  In 
the  example,  it  is  reasonable  to  as¬ 
sume  that  one  person  can  accom¬ 
plish  either  task  in  a  single  sitting  at  a 
terminal.  It  is  a  matter  of  taste  and 
professional  judgment  which  is  bet¬ 
ter:  blindingly  fast  execution  or  in¬ 
credibly  quick  production  of  the 
source  code. 

"Let's  jump  from  the  sublime  to 
the  ridiculous  and  consider  a  com¬ 
plex  cost-accounting  application  that 
must  be  run  daily  on  existing  hard¬ 
ware.  It  will  take  2  years  to  develop 
in  assembly  language  and  will  run  3 
hours.  If  you  need  it  in  3  months,  can 
you  accept  a  30-hour  daily  run  time? 
On  the  other  hand,  can  you  stay  in 
business  2  years  while  the  software 
engineers  develop  the  Great  Golden 
Wheel,  and  will  you  still  need  the  ap¬ 
plication  then?  I  know  there  are  fal¬ 
lacies  in  this  hypothesis,  but  it  illus¬ 
trates  the  point  that  it's  not  all  black 
and  white.  There  are  many  colors 
and  shades  of  gray  out  there. 

"We  can’t  discount  performance 
and  compactness  just  because  our 
personal  computers  are  getting  big¬ 
ger  and  faster,  lest  we  be  doomed  to 
relive  the  third-generation  main¬ 
frame  era.  I  hope  there  are  enough 
responsible  newcomers  who  will 
learn  the  history  and  folklore  of  this 
age  of  dinosaurs  and  profit  from  it. 

"4.  I  always  prefer  to  program  in  an 
HLL  wherever  practical,  but  it  will 
take  at  least  one  more  generation  of 
hardware  and/or  software  before 
we  reach  the  point  where  assem¬ 
blers  are  unnecessary.  Reduced  in¬ 
struction  set  computers  look  like  the 
best  bet  right  now.  We  seem  to  be  us¬ 
ing  the  RISC  models  as  an  excuse  to 
justify  an  evolution  in  compiler  tech¬ 
nology  that  reduces  the  compiler’s 
scope  and  pushes  the  work  down  to 
a  common  object  code  optimizer. 
The  theory,  of  course,  is  that  you  can 
write  one  of  these  to  support  all  your 
compilers.  This  concept  is  not  new, 
it’s  just  getting  a  lot  of  good  press 
right  now. 


"The  most  important  benefit  will 
be  realized  when  the  vast  majority  of 
executable  code  is  truly  reentrant.  I 
have  an  inflatable  soapbox  that  I've 
been  carrying  around  since  1971, 
upon  which  I  have  preached  this  ser¬ 
mon  many  times. 

"When  my  HLL  code  is  properly 
optimized  and  reentrant,  I  will  glad¬ 
ly  clear  storage  and  disavow  all 
knowledge  of  any  assembly  lan¬ 
guage.  Until  that  time,  however,  I  re¬ 
serve  the  right  to  write  in  an  assem¬ 
bly  language  of  my  choice  when  I 
feel  the  results  are  justified.  The 
same  holds  true  for  the  HLL  of  my 
choice.” 

Erratum 

In  my  May  discussion  of  Command 
Plus  from  ESP  Software  Systems  I 
gave  an  incorrect  phone  number  for 
the  company.  The  correct  numbers 
are  (213)  390-7408  (in  California)  and 
(800)  992-4377  (outside  California). 

Notes 

1. Maurice J. Bach, The Design of the
UNIX Operating System, Engle¬
wood Cliffs, N.J.: Prentice-Hall, Inc., 1986.
ISBN 0-13-201799-7 025.

2.  Andrew  S.  Tanenbaum,  Operating 
Systems:  Design  and  Implementation, 
Englewood  Cliffs,  N.J.:  Prentice-Hall, 
1987.  ISBN  0-13-637406-9  025. 


DDJ 




COLUMNS 


STRUCTURED PROGRAMMING


Software Design Rules

by Michael Ham

Forth was forged for the purpose
of programming: not to teach
programming, not to serve as an en¬
vironment for learning, not to match
pre-existing notational conventions.
Charles Moore and the Forth devel¬
opers designed tools, and tools to
create tools. Forth's ad hoc develop¬
ment, rooted in experimentation and
use, resulted in a comfortable fit
with both the structure of the com¬
puter and the process of problem
solving.

I was reminded of this as I read
Programmers at Work, a recent book
by Susan Lammers (Redmond,
Wash.: Microsoft Press, 1986). This
collection of intriguing interviews
with a selected group of innovative
programmers reveals in their work
some Forth-like tracks from time to
time. I previously had encountered
instances of similar convergent evo¬
lution, when programmers demon¬
strating a software development
package with a surprisingly Forth-
like structure stressed that they had
not known or copied Forth but had
arrived at the structure independent¬
ly, finding it the optimal solution to
their programming problem.

Given Forth's power as a general
tool and its use-based development, I
believe them. A solution found once
is likely to arise again, especially if it
is optimal: the various paths to a solu¬
tion converge at the optimum.

An instance of convergence is
found in Bob Carr's comments on his
work and design ideas in creating
Framework. He talks, for instance,
about granularity: "Users must be
able to break their work, and the
program, into separate pieces rather
than dealing with a single, giant enti¬
ty." Granularity and "homogeneity"
(meaning that the architecture does
not have a lot of exceptions or special
cases) amount to what is called fac¬
toring in the Forth culture.

Factoring  is  not  unique  to  Forth,  of 
course.  My  speculation  is  that  in 
Forth  some  tools  are  presented  par¬ 
ticularly  clearly  because  of  Forth's 
workaday  evolution.  Because  these 
tools  approach  the  optimal,  they  are 
likely  to  have  been  discovered  in 
other  contexts  as  well. 

Carr  mentions  that  he  "had  to  steal 
a  terrific  design  notion  Xerox  origi¬ 
nated:  All  commands  should  act  on 
data  already  selected  or  highlighted 
by  the  user.  It  is  called  the  object- 
verb  design,  versus  the  verb-object 
design.”  Forth  users  will  recognize 
that  Forth  uses  the  object-verb  design 
extensively — putting  the  object  on 
the  stack  and  then  executing  the 
verb.  But  Forth  did  not  take  the  idea 
from  Xerox.  The  idea  arises  in  the 
search  for  solutions,  and  experience 
directed  different  paths  to  that  com¬ 
mon  optimum. 

As  Carr  points  out,  the  object-verb 
sequence  shows  its  power  by  the  de¬ 
gree  to  which  it  can  do  more  work 
with  fewer  mechanisms.  "Intuitive” 
often  means  “what  I  am  used  to,”  so  I 
don't  call  the  sequence  intuitive.  The 
sequence  can  be  learned,  of  course, 
to  the  extent  that  it  becomes  second 
nature  (witness  the  number  of  fans 
of  Hewlett-Packard  calculators),  but 
the  real  argument  in  its  favor  comes 
from  experience,  which  shows  the 
power  of  the  object-verb  sequence 
and  explains  its  emergence  in  vari¬ 
ous  contexts  other  than  Forth  (and 
under  various  names). 
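
The object-verb sequence is easy to demonstrate with a tiny stack machine. The C sketch below is not Forth and is not drawn from any Forth implementation; the names and the fixed stack depth are hypothetical. It shows only the ordering: objects are pushed first, then a verb runs and takes whatever it needs from the stack.

```c
#include <stddef.h>

#define STACK_MAX 16                  /* hypothetical stack depth */

struct stack {
    long   item[STACK_MAX];
    size_t depth;                     /* number of items currently held */
};

/* Put an object on the stack.  Returns 0, or -1 if the stack is full. */
int push(struct stack *s, long v)
{
    if (s->depth == STACK_MAX)
        return -1;
    s->item[s->depth++] = v;
    return 0;
}

/* Verb: replace the top two objects with their sum. */
int verb_add(struct stack *s)
{
    if (s->depth < 2)
        return -1;                    /* not enough objects selected */
    s->item[s->depth - 2] += s->item[s->depth - 1];
    s->depth--;
    return 0;
}

/* Verb: replace the top two objects with their product. */
int verb_mul(struct stack *s)
{
    if (s->depth < 2)
        return -1;
    s->item[s->depth - 2] *= s->item[s->depth - 1];
    s->depth--;
    return 0;
}
```

Pushing 2 and 3, adding, pushing 4, and multiplying corresponds to the sequence 2 3 + 4 * and leaves 20 on the stack. No verb needs parameters of its own, which is one way of doing more work with fewer mechanisms.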

Programmers at Work will be in¬
teresting reading for any program¬
mer.  Forth  programmers  will  find 
not  only  various  ideas  familiar  from 
Forth  (though  perhaps  under  differ¬ 
ent  names)  but  also  some  mentions  of 
Forth  itself.  Jef  Raskin,  for  instance, 
discusses  using  Forth  in  his  recent 
projects  "because  Forth  is  a  rather 
compact  language  and  is  inexpensive 
to  implement.  It's  not  my  favorite 
language,  but  I  thought  it  was  suit¬ 
able  for  this  particular  application.  I 
always  believe  you  should  use  the 
right  tool  for  the  job." 

Reading  the  book  made  me  rumi¬ 
nate  on  the  design  rules  that  I  follow. 
Here  they  are,  hewn  from  my  own 
experience. 

The  User  Interface 
Is  Everything 

No  matter  how  elegant  the  code, 
how  ingenious  the  data  structures, 
how  efficient  the  file  access,  if  the 
user  interface  is  crude,  clumsy,  or 
confusing,  the  system  will  fail.  The 
user  interface  refers  to  all  interac¬ 
tions  between  users  and  the  system: 
paper  forms,  computer  messages,  in¬ 
put  procedures — all  communication 
between  users  and  the  programs  you 
have  designed. 

The  system  procedures  must  fit  us¬ 
ers’  inclinations.  Things  must  be 
done  in  a  "natural”  way,  with  no 
penalties  for  guessing  or  experiment¬ 
ing.  How  do  you  know  what  is  natu¬ 
ral  for  users?  You  can't  ask  them;  you 
have  to  watch  them.  Watch  them  be¬ 
fore  you  design  the  system,  and 
watch  them  as  they  use  your  system. 
Pay  careful  attention  to  what  causes 
them  to  stumble  or  hesitate.  You 
must  distinguish  hesitations  that 
stem  from  accidental  awkwardness 
(for  example,  the  arrangement  of  the 
furniture,  which  can  be  rearranged, 
or  habits  derived  from  the  current 
procedure)  and  hesitations  that  stem 
from intrinsic awkwardness (for ex¬
ample, an unreachable key combina¬
tion  or  a  complex  instructional  se¬ 
quence  for  a  common  task). 

The  system  must  be  comfortable 
for  users  for  two  reasons:  first,  to 
minimize  error,  and  second,  to  en¬ 
courage  users  to  trust  the  system.  If 
the  system  doesn't  feel  right  and  reli¬ 
able,  it  will  not  be  used. 

This  rule  presents  a  particular 
challenge  to  programmers  whose 
work  habits  were  acquired  writing 
batch  programs  in  a  mainframe  en¬ 
vironment.  There,  the  users  are  com¬ 
puter  professionals  and  semiprofes¬ 
sionals:  operators,  production  clerks, 
and  even  other  programmers.  Al¬ 
though  the  requirement  for  a  good 
user  interface  still  holds,  the  sophisti¬ 
cation  and  long-time  experience  of 
that  user  group  can  compensate  to 
some  degree  for  poor  interface  de¬ 
sign.  Moreover,  the  batch  environ¬ 
ment  typically  lacks  interactive  in¬ 
terfaces,  in  which  users  are 
confronted  immediately  and  directly 
with  the  effects  of  poor  design. 

Users  of  microcomputer  pro¬ 
grams,  on  the  other  hand,  are  often 
not  computer  professionals.  They 
are  apt  to  be  unfamiliar  with  the  pe¬ 
culiarities  of  computer  operation. 
Moreover,  microcomputer  programs 
are  typically  interactive  in  format 
and  work  with  users  face-to-face,  as 
it  were,  instead  of  sending  and  re¬ 
ceiving  notes.  These  users  depend  on 
your  design  to  make  things  go  right 
for  them. 

Use  Hindsight 
Early  and  Often 

Design  awhile,  then  stop  a  bit  and  re¬ 
view  what  you  have  done.  With 
hindsight,  you  can  see  how  you 
could  have  done  something  better. 
Revise,  do  more,  and  use  hindsight 
again. 

Note  that,  to  use  hindsight,  you 
have  to  do  something.  It  is  good  to 
plan  and  to  know  where  you're  go¬ 
ing,  but  if  you  spend  too  much  time 
in  planning  the  details,  you  won’t 
have  time  to  redo  parts  of  the  system 
after  experience  shows  where  it 
needs  revision  and  enhancement. 
Don't  forget:  no  matter  how  well  you 
plan,  when  you  show  the  final  prod¬ 
uct  to  the  users,  you  will  be  told, 
"That  is  wonderful.  But  I  guess  I  for¬ 
got  to  tell  you  this,  and  we  just  found 
out we're going to have to have that,
and could you move that over, and I
don't  think  we'll  use  this  part  after 
all.” 

To  produce  software  that  truly  fits 
users’  needs,  you  will  learn  to  value 
iteration.  You’ll  find  that  you  must 
make  closer  and  closer  approxima¬ 
tions  to  what  your  customers  or  cli¬ 
ents  want,  or  to  what  the  situation 
requires,  simply  because  the  first  fit 
(or  the  second)  is  never  perfect. 

Plan  for  iteration  and  use  it.  Don’t 
hole  up  and  work  alone  until  you 
have  the  final,  polished  product  and 


It's  art 

excellent  idea 
to  write  the  manual 
before  you  start 
the  system. 


then  show  that  to  the  users:  it  will 
not  in  fact  be  the  final  product,  and 
you  will  have  wasted  time  complet¬ 
ing  one  developmental  cycle  when 
you  could  have  completed  several  it¬ 
erations  that  work  toward  what  is 
really  needed. 

Design  the  Output  First 

Using  an  iterative  approach  does  not 
mean  that  you  start  with  no  plan  at 
all.  You  must  have  some  idea  of 
where  you’re  going,  and  that  means 
that  you  and  the  user  should  first 
agree  on  the  destination  (the  output), 
even  if  you  subsequently  agree  to 
change  it.  If  you  work  through  sever¬ 
al  drafts  of  the  output,  until  both  you 
and  the  user  are  satisfied  with  the 
format,  content,  definitions,  se¬ 
quence,  and  timing,  you  won't  find 
yourself  designing  input  forms  or 
procedures  that  collect  data  you  nev¬ 
er  use  or  (even  worse)  that  fail  to  col¬ 
lect  data  that  you  need.  Design  pro¬ 
ceeds  backward  through  the  system, 
implementation  forward. 

Document  for  You 

If  you  have  clear  documentation, 
well  organized  and  readable,  it 
means  you  truly  understand  the  sys¬ 
tem  and,  because  it  is  understand¬ 
able,  it  will  be  robust  and  easy  to  de¬ 
velop  and  maintain.  The  primary 


value  of  documentation  is  the  pro¬ 
cess — working  out  in  your  own 
mind  what  you  are  doing  and  how 
you  will  do  it.  The  secondary  value  is 
in  having  a  record  of  what  you  have 
done,  when  you  return  to  the  system 
to  make  the  inevitable  changes. 

And  what  about  users'  documenta¬ 
tion?  It  is  an  excellent  idea  to  write 
the  users'  manual  before  you  start 
the  system  and  let  users  read  it  to 
verify  that  the  system  operation 
makes  sense  in  terms  of  the  proce¬ 
dures  and  system  objectives.  Users 
will  indeed  read  it:  not  yet  having 
the  system  and  curious  about  what 
you're  planning,  most  users  will 
pore  over  the  document.  But  after 
the  system  is  up,  the  users’  documen¬ 
tation  should  (ideally)  be  unneces¬ 
sary.  Users  will  then  generally  go  to 
the  system,  not  to  the  manual.  The 
acid  test  is  whether  the  system  itself 
(without  documentation)  can  lead  na¬ 
ive  users  correctly  through  standard 
operations  while  protecting  itself 
against  casual  errors. 

I  am  not  suggesting  that  all  applica¬ 
tion  programs  can  meet  this  stand¬ 
ard.  Many  programs  have  special 
features  that  make  documentation 
necessary.  But  even  those  programs 
will  fare  better  if  their  essential  doc¬ 
umentation  is  only  a  page. 

If  users  say  your  documentation  is 
hard  to  read  and  confusing,  believe 
them.  Don't  go  through  the  odd  (but 
unfortunately  common)  exercise  of 
trying  to  prove  to  them  that  the  doc¬ 
umentation  is  clear.  If  users  cannot 
read  or  understand  the  documenta¬ 
tion,  the  documentation  is  unaccept¬ 
able.  If  users  find  the  program  hard 
to  use,  the  program  is  indeed  hard  to 
use.  If  someone  doesn’t  like  liver,  the 
statement  "You  just  never  had  it 
cooked  properly”  seldom  convinces 
them  that  liver  is  delicious. 

Make  the  Program 
Bulletproof 

Random  pecking  of  keys  should  not 
cause  a  catastrophe.  The  input  rou¬ 
tines  should  filter  out  all  dangerous 
or  irrelevant  input,  and  the  program 
should  also  be  alert  for  every  inter¬ 
nal  awkwardness  it  might  encounter 
in  operational  use:  zero  divisors,  try¬ 
ing  to  append  records  from  file  A  to 
file  B  when  A  =  B,  the  wrong  dis¬ 
kette  being  used  as  the  input  master 
file, trying to sort a single record, and so forth.

Dr. Dobb's Journal, July 1987

Avoid  maddening  tricks.  For  in¬ 
stance,  don’t  ask  users  to  confirm  an 
action  that  subsequently  proves  im¬ 
possible  to  do.  Example:  A  user  en¬ 
ters  a  request  to  delete  record  357. 
The program responds "You wish to
delete record 357 (Y/N)?" and the
user  dutifully  responds  "Y."  The  pro¬ 
gram  then  responds,  "Record  357  not 
on  file.  Request  aborted.”  The  user 
rightfully  feels  tricked.  If  the  record 
can't  be  deleted,  the  program  should 
have  said  so  in  the  first  place  instead 
of going through the miniquiz on intention.
Even better, the confirmation question
should display information
from the record to the user,
asking  whether  this  is  the  record  to 
delete.  This  approach  satisfies  both 
needs:  the  user  is  informed  if  there  is 
no  record  to  delete  or,  if  the  record  is 
present,  can  readily  confirm  that  it  is 
the  right  record  to  delete. 
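
As a minimal sketch of this check-first rule (written in Python rather than Forth, with a hypothetical dictionary standing in for the record file), the existence test comes before the question, and the question shows the record itself:

```python
def delete_prompt(records, record_id):
    """Return the message to show before any deletion happens.

    `records` is a hypothetical in-memory file: a dict mapping IDs to text.
    """
    if record_id not in records:
        # Report the problem first; never ask the user to confirm the impossible.
        return f"Record {record_id} not on file."
    # Show the record itself so the user can verify it is the right one.
    return f"Delete record {record_id}: {records[record_id]} (Y/N)?"


def delete_record(records, record_id, answer):
    """Delete only when the record exists and the user answered yes."""
    if record_id in records and answer.strip().lower() in ("y", "yes"):
        del records[record_id]
        return True
    return False
```

The names `delete_prompt` and `delete_record` are illustrative only; the point is the order of the two steps.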

Another  example:  In  one  program 
the  user  has  the  option  of  appending 
one  file  to  another,  and  several 
things  can  go  wrong  with  that  re¬ 
quest:  the  resulting  file  might  be  larg¬ 
er  than  legal  for  this  particular  appli¬ 
cation;  the  user  may  have  specified 
the  same  file  twice  (absentmindedly 
trying  to  append  a  file  to  itself);  or  the 
two  files  may  have  had  one  or  more 
elements  in  common,  not  allowed  in 
this  application.  The  program  should 
check  for  all  three  possibilities  be¬ 
fore  asking  the  user  to  confirm  that 
file  A  is  to  be  appended  to  file  B;  if  the 
files  cannot  be  appended,  the  appro¬ 
priate  error  message  should  be  dis¬ 
played  instead  of  the  request  for  con¬ 
firmation.  This  is  only  common 
sense,  of  course,  but  it  is  common 
sense  applied  from  the  user's  view¬ 
point.  The  user’s  viewpoint  must  also 
be  one  of  the  designer's  viewpoints. 
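
The three checks just described might be sketched as follows. This is an illustration under stated assumptions, not the program referred to above: files are modeled as Python lists of record keys, and `max_records` is a hypothetical size limit.

```python
def append_problem(name_a, file_a, name_b, file_b, max_records):
    """Return an error message, or None if file A can be appended to file B.

    Files are modeled as lists of record keys; `max_records` is a
    hypothetical size limit for the application.
    """
    if name_a == name_b:
        return f"Cannot append {name_a} to itself."
    if len(file_a) + len(file_b) > max_records:
        return f"Combined file would exceed {max_records} records."
    shared = sorted(set(file_a) & set(file_b))
    if shared:
        return f"Files share {len(shared)} record(s), first: {shared[0]}."
    return None


def append_prompt(name_a, file_a, name_b, file_b, max_records=1000):
    """Show the error if there is one; otherwise ask for confirmation."""
    problem = append_problem(name_a, file_a, name_b, file_b, max_records)
    if problem is not None:
        return problem
    return f"Append {name_a} to {name_b} (Y/N)?"
```

The user sees the confirmation question only when the append can actually be carried out.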

I  particularly  dislike  messages 
such  as  "Invalid  option”  or  "Nonnu¬ 
meric.  Reenter.”  A  good  program 
should  tactfully  ignore  inappropri¬ 
ate  data.  If  only  numeric  input  is  val¬ 
id,  only  numeric  keys  should  func¬ 
tion.  Pressing  any  other  key  should 
produce  no  effect  at  all.  The  same 
goes  for  menu  selection.  "Invalid  op¬ 
tion”  should  never  be  necessary  be¬ 
cause  the  program  should  recognize 
only  valid  keys. 


Users  should  be  able  to  tell  from 
the  lack  of  action  that  something  is 
wrong.  If  A,  B,  and  C  are  the  only  val¬ 
id  data,  then  only  the  A,  B,  and  C  keys 
should  work — and  they  should  work 
equally  well  whether  capitalized  or 
not. 
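
A key filter of this kind can be sketched in a few lines of Python. Reading single keystrokes is platform specific, so the sketch assumes the keys arrive as an iterable of characters; `filter_keys` is a hypothetical name.

```python
def filter_keys(keystrokes, valid="ABC"):
    """Return the first valid key from a stream of keystrokes, ignoring the rest.

    `keystrokes` is any iterable of single characters (a stand-in for a
    real keyboard read). Invalid keys produce no message and no effect;
    upper and lower case are treated alike.
    """
    allowed = set(valid.upper())
    for key in keystrokes:
        if key.upper() in allowed:
            return key.upper()
    return None  # the stream ended without a valid key
```

Passing `valid="0123456789"` gives the numeric-only behavior described earlier.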

The  program  must  meet  users'  rea¬ 
sonable  expectations.  For  example, 
in  one  program  it  is  necessary  to  de¬ 
termine  the  user's  sex.  For  consisten¬ 
cy  with  earlier  menus,  the  user  is 
asked  to  respond  to  the  menu  "1 — 
Female,  2 — Male,”  but  the  program 
accepts  F,  f,  M,  and  m  in  addition  to  1 
and  2.  Similarly,  a  menu  consisting  of 
"1 — Yes,  2 — No”  should  accept  1,  2,  Y, 
y,  N,  or  n — and  even  a  carriage  re¬ 
turn  if  the  default  answer  is  clearly 
specified.  An  L  typed  when  a  numer¬ 
al  is  expected  is  undoubtedly  meant 
as  a  1;  why  not  accept  it?  Don’t  let  the 
user  complain,  "It  should  have 
known  what  I  meant.”  When  the 
program  should  have  known,  make 
sure  it  does  know. 
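
One way to build in this tolerance is to normalize every reply before comparing it with the menu. The sketch below is an assumption-laden illustration: `read_choice` is a hypothetical helper, and the menu is a Python dict of numbers to labels.

```python
def read_choice(text, choices, default=None):
    """Map a raw reply to a menu number, or None if it cannot be understood.

    `choices` maps menu numbers to labels, e.g. {1: "Yes", 2: "No"}.
    """
    reply = text.strip().lower()
    if reply == "":
        return default  # a bare carriage return selects the stated default
    initials = {label[0].lower(): number for number, label in choices.items()}
    if reply == "l" and "l" not in initials:
        reply = "1"  # an L typed where a numeral is expected is meant as 1
    numbers = {str(number): number for number in choices}
    if reply in numbers:
        return numbers[reply]
    return initials.get(reply)  # initial letter of a label, or None
```

With `{1: "Female", 2: "Male"}` the same routine accepts 1, 2, F, f, M, and m.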

The  program  thus  hides  within  it¬ 
self  responses  to  all  anticipated  user 
inputs,  including  users'  errors.  The 
better  job  the  designer  does  in  antici¬ 
pating  users’  moves,  the  more  pleas¬ 
ant  the  program  is  to  use.  Users  typi¬ 
cally  are  not  even  aware  that  the 
program  has  responded  from  some 
option that lay in wait for the anticipatable
error or intention. Success is
achieved  when  the  program  re¬ 
sponds  as  users  expect,  even  when 
they  do  not  do  precisely  what  was 
asked.  The  options  that  users  recog¬ 
nize  will  be  only  the  tip  of  the 
iceberg. 

Let Users Know
What's Happening

Invalid  input  should  evoke  a  re¬ 
sponse  if  it  helps  users.  For  example, 
if  users  attempt  an  illegal  deletion  of 
some  sort,  the  program  must  so  in¬ 
form  them,  lest  they  get  the  impres¬ 
sion  that  the  deletion  was  accom¬ 
plished.  Or  if  the  datum  is  a  number 
that  must  lie  within  a  range,  the  pro¬ 
gram  should  respond  to  an  invalid 
entry  by  asking  for  reentry.  The  re¬ 
quest  for  reentry  should  state  the  val¬ 
id  range  because  the  users’  invalid  in¬ 
put  suggests  they  may  not  know  the 
valid  range. 

A  nice  example  is  the  defining 
word  LIMITS  that  a  friend  added  to 
his  Forth.  LIMITS  defines  data  input 


commands  and  expects  at  the  time  of 
definition  the  limits  on  the  input 
range:  1  10  LIMITS  PITCH  defines  the 
command  PITCH  with  limits  1  and  10. 
When  PITCH  is  executed,  it  displays 
"Enter  PITCH”  and  waits  for  a  num¬ 
ber  (using  a  word  such  as  DIGITS,  de¬ 
fined  in  the  April  1987  Structured 
Programming  column).  The  entered 
number  is  accepted  only  if  it  is  with¬ 
in  the  limits.  If  it  is  not,  an  informa¬ 
tive  error  message  is  displayed:  "In¬ 
put  out  of  range:  lower  bound  1, 
upper  bound  10.  Reenter  number.” 
LIMITS  is  used  to  define  any  com¬ 
mand  expecting  bounded  numeric 
input. 
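
I do not have the Forth source for LIMITS, so what follows is only a rough Python analogue of the idea: a function that builds bounded-input commands. The `read` and `show` parameters are assumptions added so the command can be driven without a keyboard, and non-numeric replies are silently ignored here, where the original relied on a word such as DIGITS.

```python
def limits(low, high, name, read=input, show=print):
    """Build a command that prompts for a whole number between low and high."""
    def command():
        show(f"Enter {name}")
        while True:
            try:
                value = int(read())
            except ValueError:
                continue  # ignore non-numeric replies without comment
            if low <= value <= high:
                return value
            # State the valid range: the bad entry suggests the user does not know it.
            show(f"Input out of range: lower bound {low}, "
                 f"upper bound {high}. Reenter number.")
    return command
```

Here `pitch = limits(1, 10, "PITCH")` plays the part of `1 10 LIMITS PITCH`, and calling `pitch()` runs the prompt.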

Cute  messages  pall  quickly.  Be 
businesslike  and  brief,  and  always 
keep  users  informed.  Long,  silent 
waits — such  as  5  seconds — make  us¬ 
ers  uneasy.  Tell  them  what’s  going 
on:  "Program  loading”;  "Checking 
records”;  whatever.  When  possible, 
display  a  countdown  so  users  can  es¬ 
timate  the  rate  of  progress. 
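
A countdown message needs very little code. As a small sketch (the function name and wording are assumptions, not taken from any particular program):

```python
def progress_line(task, done, total):
    """Format a one-line status message with a countdown of remaining items."""
    remaining = max(total - done, 0)
    return f"{task}: {remaining} of {total} remaining"
```

A long-running loop would display this line after each record or block it finishes.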

Be  Consistent 

No  matter  how  much  trouble  it  is, 
take  pains  to  be  consistent  in  every 
way  possible:  punctuation,  signifi¬ 
cance  of  colors,  mode  of  input,  loca¬ 
tion  of  messages,  and  so  on.  If  your 
menus  are  numbered  lists  from 
which  users  make  a  selection  by  en¬ 
tering  the  appropriate  number,  don’t 
suddenly  switch  to  a  list  in  which 
they  must  enter  the  initial  letter  of 
the  command.  If  the  message  "Press 
space  bar  to  continue”  appears  in  po¬ 
sition  25  of  line  20  in  one  screen,  it 
should  be  in  the  same  place — if  possi¬ 
ble — on  every  screen  in  which  it  is 
used.  Users  will  quickly  become  ac¬ 
customed  to  its  location  and  expect  to 
see  it  there.  Users  feel  most  secure 
when  their  expectations  prove 
reliable. 

Consistency  doesn’t  come  easily. 
For  one  thing,  different  sections  of 
the  program  will  have  been  written 
at  different  times.  As  a  result,  you 
must  run  through  the  “final”  draft 
many  times  to  be  sure  that  it  is  consis¬ 
tent — that  it  feels  the  same  in  all  the 
subroutines,  that  it  embodies  a  con¬ 
sistent  design  and  approach.  Consis¬ 
tency  comforts  users  and  makes 
your  product  seem  more  trustwor¬ 
thy.  It  also  makes  the  system  easier  to 
use  and  less  likely  to  be  a  cause  of 
error. 




Give  Every  Routine 
a  Safe  Exit 

Because  users  are  (probably)  using 
the  program  without  reading  the 
documentation  (or  with  only  a  vague 
recollection  of  it),  the  program  must 
have  no  traps:  routines  that,  once  en¬ 
tered,  cannot  be  escaped  from  until 
something  is  done — for  instance, 
adding  a  record  or  deleting  one.  If  us¬ 
ers  select  "Add  record”  from  the 
menu,  the  add-record  template 
should  offer  in  the  first  entry  a  possi¬ 
ble  "escape”  value  that,  when  used, 
returns  them  to  the  menu.  The  Es¬ 
cape  key  can  be  used  as  a  generic  es¬ 
cape. 

Suppose  that  the  first  entry  in  add¬ 
ing  a  record  is  the  serial  or  ID  num¬ 
ber.  A  blank  serial  or  ID  number  is  an 
obvious  way  to  escape  the  routine. 
The  most  general  escape  command 
is,  of  course,  the  “undo”  key,  which 
retracts  the  last  command  given.  The 
undo  may  not  be  feasible  in  a  given 
application,  but  some  escape  mecha¬ 
nism  must  be  provided. 

The  escape  value  also  makes  it  easy 
for  users  to  work  through  batched 
input.  Few  users  will  add  one  record, 
delete  a  second,  revise  a  third,  add  a 
fourth,  run  a  list,  back  to  another  de¬ 
letion,  and  so  on.  They  will  normally 
work  through  a  group  of  new  re¬ 
cords,  adding  them  all,  then  turn  to  a 
batch  of  revisions  and,  finally,  dele¬ 
tions — working  through  all  cases  of  a 
particular  type.  In  such  applications, 
the  system  should  automatically  re¬ 
turn  to  do  another  action  of  the  last 
type  until  users  signify  a  desire  to  es¬ 
cape  that  routine. 
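
The two ideas, a blank first entry as the escape value and automatic return to the same action, combine naturally in one loop. This Python sketch assumes a hypothetical dict of records and injectable `read` and `show` functions:

```python
def add_records(records, read=input, show=print):
    """Add records repeatedly until the user enters a blank ID.

    `records` is a hypothetical dict of ID -> name. Returns the number added.
    """
    added = 0
    while True:
        show("ID (blank to return to menu):")
        record_id = read().strip()
        if record_id == "":
            return added  # safe exit: nothing has been changed by this pass
        show("Name:")
        records[record_id] = read().strip()
        added += 1
```

The user who chose "Add record" by mistake presses Return once and is back at the menu with the file untouched.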

One  observation  from  experience: 
users  usually  consider  deletion  as  a 
kind  of  revision,  so  your  revision 
routine  should  usually  include  a  way 
to  delete  the  record  instead  of 
(or  in  addition  to)  a  separate  delete 
routine. 

Watch  Users 
Use  Your  System 

I suggested this before, but it bears
repeating.  You  cannot  design  a  good 
system  solely  through  your  ideas  of 
how  things  should  be  done.  You  also 
need  to  learn  how  they  are  done. 
You  may  learn  that  a  particular  ap¬ 
plication  has  users  who  do  not  batch 
input.  By  seeing  how  the  system  is 
used,  you  can  polish  it  to  remove  any 
impediments  it  offers  them,  and  you 


may  also  be  able  to  suggest  substan¬ 
tial  changes  that  will  make  their 
work  easier.  Don't  wait  for  users  to 
think  up  improvements. 

Obviously,  if  users  do  suggest  an 
idea,  listen  closely.  Users  almost  cer¬ 
tainly  understand  the  ins  and  outs  of 
the  job  better  than  you  do.  But  you 
can't  shift  the  burden  of  good  design 
onto  their  shoulders.  Your  job  is  to 
make  the  system  responsive  to  their 
needs.  This  always  requires  close  ob¬ 
servation,  which  will  sometimes  lead 
not  just  to  refinements  but  also  to  ma¬ 
jor  changes. 

Many  users,  for  example,  don't 
know  what  they  really  need;  most 
will  talk  about  the  means  rather  than 
the  goal.  Because  the  designer  is  of¬ 
ten  thinking  also  about  how  to  do 
things,  it  is  easy  to  assume  the  goal 
and  design  the  "how,”  rather  than 
rigorously  focusing  first  on  “why.” 
Systems  that  simply  automate  the  ex¬ 
isting  clerical  functions  are  an  exam¬ 
ple  of  looking  only  at  the  how.  Sys¬ 
tems  that  radically  redefine  a 
process  or  final  product,  simplifying 
the  entire  procedure,  are  the  result 
of  repeatedly  asking,  "Why?  Why  do 
you  need  that?  What  do  you  do  with 
it?  What  is  its  purpose?  Who  uses  it? 
What  do  they  do  with  it?  What  is  it 
the  means  to?”  and  then  finding  the 
most  efficient  way  to  achieve  the 
overall  goals  and  objectives. 

Design  Programs  from 
the  Top  Down,  Experiment 
from  the  Bottom  Up 

I  sometimes  call  this  "sideways  de¬ 
velopment”  because  it  is  not  com¬ 
pletely  top-down  or  bottom-up.  The 
top-down  design  goes  well  if  you 
write  a  functional  analysis  of  the  en¬ 
tire  program  in  a  chunk  of  several 
words,  then  a  functional  analysis  of 
those  words,  and  so  on: 

PROGRAM: INTRO BEGIN INITIALIZE MAIN UNTIL DONE ;
INTRO: GET.DATE CHECK.DATA.DISKETTE ;
INITIALIZE: FIRST.MENU CASE NEW.TEST OLD.TEST MERGE.TEST QUIT.WORK ENDCASE ;
MAIN: BEGIN SECOND.MENU CASE APPEND REVISE REF.LIST SUMMARY QUIT.TEST ENDCASE UNTIL ;

and  so  forth. 
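
For readers who do not use Forth, the same functional analysis can be sketched in Python with placeholder routines. This is one loose reading of the phrases above, not a translation: every low-level word is a stub that only records that it was called, and the menu choices are supplied as data rather than read from the keyboard.

```python
def stub(name):
    """Return a placeholder routine that only records that it was called."""
    def routine(log):
        log.append(name)
    return routine


get_date = stub("GET.DATE")
check_data_diskette = stub("CHECK.DATA.DISKETTE")
FIRST_MENU = {n: stub(n) for n in ("NEW.TEST", "OLD.TEST", "MERGE.TEST", "QUIT.WORK")}
SECOND_MENU = {n: stub(n) for n in ("APPEND", "REVISE", "REF.LIST", "SUMMARY", "QUIT.TEST")}


def intro(log):
    # INTRO: GET.DATE CHECK.DATA.DISKETTE ;
    get_date(log)
    check_data_diskette(log)


def initialize(log, choice):
    # INITIALIZE: FIRST.MENU CASE ... ENDCASE ;
    FIRST_MENU[choice](log)


def main(log, choices):
    # MAIN: BEGIN SECOND.MENU CASE ... ENDCASE UNTIL ;
    for choice in choices:
        SECOND_MENU[choice](log)
        if choice == "QUIT.TEST":
            break


def program(sessions):
    # PROGRAM: INTRO BEGIN INITIALIZE MAIN UNTIL DONE ;
    log = []
    intro(log)
    for first_choice, second_choices in sessions:
        initialize(log, first_choice)
        if first_choice == "QUIT.WORK":
            break
        main(log, second_choices)
    return log
```

Each stub can later be replaced by a tested routine without touching the levels above it.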

If your programming language
provides the tools, your design and
exploratory  implementation  of  the 
elements  of  the  design  should  be  con¬ 
current:  bottom-up  coding  to  proto¬ 
type  and  test  solutions,  which  are 
then  woven  into  the  top-down  de¬ 
sign  process. 

The  top-down  descriptive  analysis 
encourages  you  to  see  the  program 
or  system  in  terms  of  its  major  logical 
divisions,  to  see  those  divisions  in 
terms  of  their  major  parts,  and  so  on. 
This  leads  to  a  clean  design  that  is 
easy  to  write,  to  read,  and  to  main¬ 
tain  and  that  keeps  you  from  getting 
lost,  unable  to  see  the  forest  for  the 
trees.  Writing  the  analysis  will  not  be 
easy,  and  you  will  probably  have  to 
go  through  several  drafts  of  each  ma¬ 
jor  function  before  they  fit  comfort¬ 
ably  together. 

The  bottom-up  exploratory  pro¬ 
gramming  will  help  you  understand 
the  problem  better  by  allowing  you 
to  test  tentative  solutions.  Explor¬ 
atory  programming  works  well  only 
in  truly  interactive  languages,  in 
which  the  write-compile-test-revise 
cycle  is  uninterrupted  by  waits.  In 
noninteractive  languages,  explora¬ 
tion  is  hindered  by  the  overhead  of 
preparing  the  source  file,  running 
the  compiler  to  create  an  object  file, 
linking  the  object  file,  and  so  on. 
These  mechanical  requirements  dis¬ 
rupt  your  train  of  thought  and  dis¬ 
courage  quick  checking  of  short  pro¬ 
cedures.  Exploration  in  such 
languages  tends  to  become  a  prema¬ 
ture  production  of  long  procedures 
rather  than  a  quick  interplay  with 
small  and  simple  prototypes. 

Note  that  moving  the  colons  in  the 
phrases  shown  earlier  to  the  start  of 
each  phrase  produces  Forth  defini¬ 
tions.  Forth  was  designed  to  lend  it¬ 
self  to  exploratory  programming.  Its 
structure and ability to accept new
commands make for an easy transition
from the functional analysis to
source  code.  The  exploratory  process 
produces  a  tested  set  of  elemental 
commands  and  functions  fitted  to 
the  problem,  and  the  top-down  anal¬ 
ysis  provides  the  high-level  words 
for  the  final  program.  You  then  can 
start  at  the  bottom  level  of  the  analy¬ 
sis  and  enter  the  elements  defined  in 
your  exploration  and  the  phrases  de¬ 


fined  in  your  analysis,  testing  your 
way  back  up  the  chain  until  you 
reach  the  final  definition,  which  is 
the  program. 

See  How  the  Flow  of 
Activities  Wants  to  Go 

Look  under  the  surface  to  find  the 
natural  sequence  and  direction  of 
events.  The  actual  procedure  in 
place  is  only  an  approximation  (and 
sometimes  a  poor  approximation)  of 
the  “true”  procedure,  the  ideal  cen¬ 
ter  that  has  pulled  the  actual  proce¬ 
dure  into  its  current  configuration. 
The  true  procedure  is  the  procedure 
in  perfect  focus;  the  actual  proce¬ 
dure  is  always  an  approximation: 
slightly  blurred,  slightly  off  center. 
As  the  implemented  procedure  ap¬ 
proaches  the  true  procedure,  things 
work  more  smoothly  because  effort 
is  not  spent  in  countering  the  natural 
tendencies  of  the  work. 

If  users  consistently  make  some  er¬ 
ror  in  the  data-entry  procedures,  it  is 
a  sign  that  the  procedures  are  wrong. 
If  users  must  stop  to  calculate  wheth¬ 
er  a  set  of  data  meets  the  require¬ 
ments  for  the  next  program  step,  you 
should  immediately  suspect  that  the 
program  itself  should  make  the 
check. 

You  can  readily  find  examples  in 
noncomputer  systems  as  well.  The 
Army  Corps  of  Engineers  stopped 
the  flow  of  sand  down  the  Atlantic 
Coast,  and  then  they  found  the  south¬ 
erly  beaches  vanishing  as  the  sand 
washed  away  and  was  not  replaced 
by  sand  from  the  north.  Now  bull¬ 
dozers  and  dollars  work  to  maintain 
beaches  once  renewed  through  nat¬ 
ural  processes.  Because  the  system's 
natural  flow  was  disrupted,  much  ef¬ 
fort  is  devoted  to  a  poor  approxima¬ 
tion  of  beach  renewal. 

Sometimes,  to  encourage  a  certain 
flow  of  events,  you  can  design  seem¬ 
ing  inefficiencies  into  a  procedure. 
For  example,  papercutting  machines 
used  in  binderies  can  also  cut  people. 
One  way  to  prevent  accidents  is  to 
hire  floor  supervisors  who  constant¬ 
ly  watch  the  operators  and  jerk  them 
back  if  their  hands  stray  too  near  the 
blade.  A  better  way  is  to  build  the 
machines  so  that  two  switches,  in¬ 
stead  of  one,  must  be  closed  to  make 
the  cut.  Although  it  is  easy  and  even 
cheaper  to  design  the  machine  so 
that only a single switch is used, using
two switches occupies both the
operator’s  hands  while  the  blade 
cuts.  The  second  switch  acts  as  an  ai¬ 
leron  in  the  flow  of  events,  pulling 
the  natural  sequence  in  the  direction 
the  designer  wants  it  to  go. 

Poorly  designed  systems,  which  di¬ 
verge  markedly  from  their  natural 
center  and  course,  exhibit  great  tur¬ 
bulence  from  the  continual  efforts 
required  to  keep  the  procedure  on 
course.  In  a  large  system,  the  turbu¬ 
lence  may  be  manifested  as  many  su¬ 
pervisors  or  frequent  meetings  or  re¬ 
runs  or  down  time  or  correction 
passes.  Well-designed  systems,  on 
the  other  hand,  will  flow  smoothly 
and  swiftly  and  with  almost  no  su¬ 
pervision  or  visible  effort.  The  sys¬ 
tem  itself  pulls  any  inadvertent  devi¬ 
ations  back  into  the  natural  flow  and 
thus  is  self-correcting.  The  feedback 
into  well-designed  systems  keeps 
them  on  course;  the  feedback  in 
poorly  designed  systems  pushes 
them  further  off  course,  making  us¬ 
ers  exercise  constant  vigilance  and 
effort  to  keep  the  system  running. 

These  rules  are  stated  as  if  you 
were  designing  a  system  for  a  cus¬ 
tomer  or  client,  but  in  fact  the  user 
will  often  be  yourself.  An  iterative 
approach  will  almost  always  pro¬ 
duce  the  most  satisfactory  system  or 
program.  You  certainly  want,  for 
yourself,  a  system  that  doesn't  re¬ 
quire  extensive  documentation  to 
run,  a  system  that  is  robust  and  easy 
to  maintain.  Treat  work  for  yourself 
with  the  same  professional  care  that 
you  would  devote  to  work  meant  for 
others.  Not  only  will  you  get  better 
software  but  you  will  also  acquire 
the  habit  of  thoughtful  and  alert 
design. 

Envoi 

A  new  job  and  new  responsibilities 
have  forced  me  to  choose  among  the 
activities  I  can  do.  I  have  enjoyed  the 
opportunity  to  discuss  Forth-related 
topics  in  these  pages,  and  in  the  fu¬ 
ture  I  may  have  occasion  to  speak  out 
again.  But  my  contributions  as  a  reg¬ 
ular  columnist  end  with  this  column. 
My  current  work  includes  an  intro¬ 
ductory  book  on  programming  in 
Forth,  so  my  involvement  with  this 
fascinating  language  continues. 



COLUMNS 


ARTIFICIAL  INTELLIGENCE 


The Xerox 1186 LISP Machine

by Ernest R. Tello

This month I'll describe the Xerox 1186, an AI workstation, or
LISP machine, I've been working with recently. In the process I hope
to show why and how a piece of dedicated hardware is optimized for a
language such as LISP and share some programming insights that have
come out of LISP machine development.

The Xerox 1186, nicknamed Daybreak, provides several unique, powerful
features at a relatively low cost. It is the result of several years of
experience Xerox has had in producing advanced workstations intended
specifically for interactive, exploratory programming in LISP for AI
applications and research.

But LISP machines didn't start at Xerox.

How LISP Machines
Came to Be

LISP machines grew out of the hacker culture that prevailed at the MIT AI
lab in the 60s. You'll find the story of those days told in accessible
terms in Steven Levy's book Hackers (Dell 1984). Richard Greenblatt, one
of the main "hacker heroes" in Levy's book, invented the LISP machine.
You'll find the technical history in his article, "The LISP Machine," in
the book Interactive Programming Environments (Barstow, David, et al.,
McGraw-Hill 1984).

But  why  design  a  special  computer 
just  to  run  the  LISP  programming  lan¬ 
guage?  The  motivation  was  partly 
technical,  partly  pure  hacker  cul¬ 
ture.  The  main  technical  problem 
that  Greenblatt  and  his  colleagues 
were  trying  to  solve  was  that,  be¬ 
cause  most  serious  AI  programs  tend 
to  be  rather  large,  they  need  a  partic¬ 
ularly  efficient  environment  for  exe¬ 
cution  and  development.  The  other 


consideration  that  strongly  influ¬ 
enced  the  LISP  machine  developers 
was  the  hacker  antipathy  to  time 
sharing.  The  LISP  machine  had  to  be 
a  personal  computer  rather  than  a 
time-shared  machine  so  that  the  LISP 
hackers  could  have  full  control  of  the 
machine's  resources.  They  had  expe¬ 
rienced  what  this  meant  with  the 
older  PDP-1  and  PDP-6  machines,  and 
they  had  seen  and  been  impressed 
by  the  Alto  personal  computer  that 
had  been  built  at  Xerox  PARC,  the 
first  microcoded  personal  computer 
and  the  precursor  of  both  the  cur¬ 
rent  generation  of  LISP  machines  and 
personal  computers  alike. 

One  way  in  which  the  machine 
could  be  optimized  for  LISP  was  by 
providing  a  large  virtual  memory  ar¬ 
chitecture.  Because  of  the  large  size 
of  many  AI  programs,  it  was  (at  the 
time)  a  necessity  for  such  programs 
to reside in virtual memory. Because
the large disk drives used on large
time-shared machines performed little
better than those that could be used on
small, personal machines, Greenblatt
argued that, for virtual memory systems,
personal, single-user machines were more
appropriate and cost-effective.

The  most  peculiar  aspect  of  LISP 
implementation  to  emerge  from  the 
development  of  LISP  machines  was 
the  CDR-coding  scheme  used  for  list 
storage.  The  linked  lists  that  are  the 
heart  of  LISP  code  and  data  storage 
typically  take  up  twice  the  space  that 
comparable  arrays  require.  The 
strategy  of  CDR  coding  is  based  on  the 


fact  that  many  of  the  lists  in  a  LISP 
program  are  never  or  seldom  modi¬ 
fied.  In  this  case,  why  waste  the 
memory  for  the  extra  cell?  The  stor¬ 
age  system  Greenblatt  arrived  at  gets 
the  best  of  both  worlds.  As  long  as  a 
list  is  not  modified,  it  is  stored  as  an 
array.  Once  the  list  is  modified,  fast 
microcode  routines  reassign  the  sec¬ 
tion  of  the  list,  from  the  point  of  the 
modification  on,  into  the  normal 
linked-list  cell  format  in  high 
memory. 

CDR  Coding,  Tagged 
Architectures,  and  Invisible 
Pointers 

There are at least five major requirements
for efficient symbolic processing systems
(that is, for an efficient LISP system):
compact storage of linked-list structures;
fast function-calling mechanisms; rapid
run-time type checking; fast, incremental
garbage collection; and efficient and
powerful virtual memory management.
The current generation of LISP
machines  is  based  on  an  architecture 
that  addresses  all  these  issues  with  an 
ingenious  and  elegant  strategy.  The 
basic  ingredients  of  this  strategy  are 
a  tagged  architecture,  CDR  coding,  in¬ 
visible  pointers,  custom  processors 
designed  to  enable  microcoding  of 
high-level  instructions,  and  incre¬ 
mental  garbage  collection  algo¬ 
rithms. 

The  original  CONS  and  CADR  ma¬ 
chines  developed  at  MIT  used  a  32-bit 
word  size  to  implement  the  tag  field 
CDR-coding  strategy.  As  illustrated  in 
Figure  1,  page  121,  the  compressed 
list  storage  format  was  based  on  di¬ 
viding  the  32-bit  data  word  into  four 
different  segments:  a  24-bit  data  area 
for  representing  the  first  element  of 
a  list  or  CAR,  a  5-bit  data  type  tag  field, 
a  2-bit  CDR-code  tag  field,  and  a  1-bit 
garbage collection (GC) tag. The 2-bit
CDR-code tags are used to signify the
four values CDR-NEXT, CDR-NIL, CDR-NORMAL,
and CDR-ERROR. This scheme allows list
structures to be stored as ordinary
vectors with a space savings of close to
50 percent. When a list is stored as an
ordinary array, each word in the array
carries the CDR-NEXT tag, except for the
last item in the list, which is tagged
CDR-NIL. When the list must be modified
and when the sequential storage can no
longer be maintained, the tag of the last
word prior to the point of modification is
changed to CDR-NORMAL, which indicates
that the normal list storage format is now
being used. In this case, the next word
forms the second cell in the standard
CONS cell pointer structure.
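
To make the word format concrete, here is a small Python model of CDR-coded storage. It is a sketch under several assumptions that the description above does not settle: the order of the fields within the word, the numeric values of the four codes, the `NIL` address, and the use of CDR-ERROR to mark the pointer word of a two-word cell are all choices made for illustration.

```python
CDR_NEXT, CDR_NIL, CDR_NORMAL, CDR_ERROR = range(4)  # numeric values are arbitrary here
NIL = 0xFFFFFF  # hypothetical address standing for the empty list


def pack(data, cdr_code, dtype=0, gc=0):
    """Pack one 32-bit word as [gc:1][cdr:2][type:5][data:24] (field order assumed)."""
    return (gc << 31) | (cdr_code << 29) | (dtype << 24) | (data & 0xFFFFFF)


def get_cdr_code(word):
    return (word >> 29) & 0b11


def get_data(word):
    return word & 0xFFFFFF


def store_compact(memory, items):
    """Store a non-empty list as consecutive words; return the address of the first."""
    start = len(memory)
    for i, item in enumerate(items):
        last = i == len(items) - 1
        memory.append(pack(item, CDR_NIL if last else CDR_NEXT))
    return start


def read_list(memory, addr):
    """Walk a list that may mix compact runs and ordinary two-word cells."""
    items = []
    while addr != NIL:
        word = memory[addr]
        items.append(get_data(word))
        code = get_cdr_code(word)
        if code == CDR_NEXT:
            addr += 1  # the rest of the list starts at the next word
        elif code == CDR_NIL:
            addr = NIL  # this was the last element
        elif code == CDR_NORMAL:
            addr = get_data(memory[addr + 1])  # next word points to the rest
        else:
            raise ValueError("CDR-ERROR word reached as a list element")
    return items
```

A compact list of n elements occupies n words, where ordinary cells would need 2n, which is where the saving of close to 50 percent comes from.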

With  this  storage  scheme,  howev¬ 
er,  there  is  still  a  major  drawback. 
When  list  modifications  are  made, 
the  accessing  functions  can  be  slow 
and  inefficient  because  many  of  the 
list  elements  must  be  moved  into  a 
fresh  area  of  memory.  This  problem 
was  solved  by  the  invention  of  the  in¬ 
visible  pointer.  An  invisible  pointer  is 
an  indirect  addressing  scheme  that  is 
implemented  on  the  level  of  the  data 
itself  rather  than  through  an  instruc¬ 
tion.  It's  called  invisible  because 
there  is  no  way  for  most  of  the  sys¬ 
tem  to  see  that  the  indirection  is  oc¬ 
curring.  Only  the  lowest-level,  mem¬ 
ory-referencing  operations  handle 
the  invisible  pointers. 

The  invisible  pointers  make  "clo¬ 
sure”  possible — the  linking  of  inter¬ 
nal  value  cells  to  external  value  cells. 
No  macroinstructions  are  required  to 
do  this.  When  the  system  seeks  to 
read  or  write  to  an  element  that  has 
been re-stored in the CONS cell format,
it  is  automatically  sent  by  the  invisi¬ 
ble  pointer  to  the  new  location.  In 
this  way  the  linkage  of  elements  is 
maintained  as  efficiently  as  possible 
with  no  need  to  be  concerned  with  it 
on  the  programming  level. 
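
The mechanism can be imitated in software, although on a LISP machine it lives in microcode. In the following Python sketch the `Memory` class, the `FORWARD` tag, and the method names are all hypothetical; the point is only that `read` and `write` are the sole places that know about forwarding.

```python
FORWARD = "forward"  # hypothetical tag marking an invisible pointer


class Memory:
    """Toy memory in which a relocated cell leaves a forwarding pointer behind."""

    def __init__(self):
        self.cells = []

    def allocate(self, value):
        self.cells.append(("value", value))
        return len(self.cells) - 1

    def _resolve(self, addr):
        # Follow invisible pointers until an ordinary cell is reached.
        while self.cells[addr][0] == FORWARD:
            addr = self.cells[addr][1]
        return addr

    def read(self, addr):
        return self.cells[self._resolve(addr)][1]

    def write(self, addr, value):
        self.cells[self._resolve(addr)] = ("value", value)

    def relocate(self, addr):
        """Move a cell to fresh storage and leave an invisible pointer at the old address."""
        old = self._resolve(addr)
        new = self.allocate(self.cells[old][1])
        self.cells[old] = (FORWARD, new)
        return new
```

Code that still holds the old address keeps working after `relocate`, because every access goes through `_resolve`.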

Incremental  garbage  collection 
means  taking  out  a  little  bit  of  gar¬ 
bage  frequently  so  that  you  never 
reach  the  point  at  which  things  have 
to  completely  shut  down  while  a 
large  pileup  of  garbage  is  taken  out. 
Most  LISP  systems  today  claim  to 
have  this  capability. 


The  Xerox  1186 

The  1186  closely  resembles  an  earlier 
machine  from  Xerox — the  1108,  or 
Dandelion.  It  runs  all  the  same  soft¬ 
ware,  but  the  hardware  has  been 
streamlined  using  up-to-the-minute 
VLSI  technology  to  provide  a  surpris¬ 
ing  amount  of  power  for  such  a  small 
package.  The  processing  unit  of  the 
1186  is  a  compact,  vertical  standing 
module  21%  inches  high,  9%  inches 
wide, and 12% inches deep. The large
19-inch, high-resolution monochrome
display monitor is about the
size  of  four  Macintosh  screens,  offer¬ 
ing  a  resolution  of  862  X 1152  pixels.  A 
15-inch  monitor  is  also  available  with 
a  632 X 833-pixel  resolution.  Both  of 
these  monitors  have  the  same  pixel 
density,  though — 80  pixels  per  inch. 
Color  bit-map  routines  are  supplied 
with  the  software  for  using  graphics 
with  a  color  display.  The  keyboard 
on  the  1186  is  the  same  as  that  used 
on  the  Xerox  word-processing  work¬ 
station  and  can  be  readily  assigned 
all  the  keys  of  the  PC  series 
computers. 

The  main  CPU  of  the  1186  uses  a 
TTL  bit-slice  processor  based  on  a 
high-speed  version  of  the  Advanced 
Micro  Devices  2901C  chip,  which  has 
been  augmented  with  custom  LSI  and 
gate  arrays  for  microinstruction 
latching  and  bus  decoding  and  arbi¬ 
tration.  It  also  uses  a  front-end  pro¬ 
cessor,  based  on  the  Intel  80186,  and 
an  option  is  available  that  uses  a  sec¬ 
ond  80186  to  permit  IBM  PC  emula¬ 
tion.  The  1186  comes  with  a  mini¬ 
mum  of  1.6  megabytes  of  RAM, 
which is expandable to 3.5 megabytes, and hard disks are available in
20-, 40-, and 80-megabyte configurations. A variety of bus interface
options are available that allow use of IBM PC, Multibus, and IEEE-488
peripherals. The power requirement for the 1186 is about 800 watts.

As  with  all  LISP  machines,  the  1186 
employs  a  virtual  memory  system 
whose  behavior  is  largely  responsi¬ 
ble  for  the  performance  quality  us¬ 
ers  experience.  Virtual  memory  is 
provided in pages of 512 bytes each and is
allocated  in  two-page  chunks  called 
quanta.  A  total  of  4  to  5  megabytes  of 
virtual  memory  is  required  for  the 
system to "say hello," and typical applications
need between 8 and 10 megabytes. There is no such thing as
enough RAM with this kind of machine;
the only way to get maximum
performance is with maximum RAM.

With  a  16-bit  address  bus,  the  1186 
cannot  use  the  sort  of  tagged  archi¬ 
tecture  that  I  described  earlier  and 
that  is  found  on  many  LISP  machines. 
The  virtual  memory  architecture  is 
built  on  an  ingenious  32-bit  address¬ 
ing  scheme,  however.  The  1186 
reads  in  the  32  bits  in  two  chunks,  16 
bits  at  a  time,  but  the  CPU  does  not 
have  to  wait  for  the  second  16  bits  to 
know  which  address  is  referenced. 
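
For comparison, the tagged layout shown in Figure 1 can be sketched as plain bit-field packing. This is only an illustration in Python: the field widths come from the figure, but treating bit 0 as the least significant bit is an assumption, and the function names and sample field values are made up.

```python
# Field layout taken from Figure 1; treating bit 0 as the least
# significant bit is an assumption of this sketch.
DATA_BITS, TYPE_BITS, CDR_BITS, GC_BITS = 24, 5, 2, 1
TYPE_SHIFT = DATA_BITS                # 24
CDR_SHIFT = TYPE_SHIFT + TYPE_BITS    # 29
GC_SHIFT = CDR_SHIFT + CDR_BITS       # 31


def _mask(bits):
    return (1 << bits) - 1


def pack_word(data, type_code, cdr_code, gc_tag):
    """Combine the four fields into one 32-bit integer."""
    for value, bits in ((data, DATA_BITS), (type_code, TYPE_BITS),
                        (cdr_code, CDR_BITS), (gc_tag, GC_BITS)):
        if not 0 <= value <= _mask(bits):
            raise ValueError(f"{value} does not fit in {bits} bits")
    return (data | (type_code << TYPE_SHIFT)
            | (cdr_code << CDR_SHIFT) | (gc_tag << GC_SHIFT))


def unpack_word(word):
    """Split a 32-bit word back into (data, type_code, cdr_code, gc_tag)."""
    return (word & _mask(DATA_BITS),
            (word >> TYPE_SHIFT) & _mask(TYPE_BITS),
            (word >> CDR_SHIFT) & _mask(CDR_BITS),
            (word >> GC_SHIFT) & _mask(GC_BITS))


# Hypothetical field values, chosen only to exercise every field.
word = pack_word(data=0x00ABCD, type_code=3, cdr_code=1, gc_tag=0)
fields = unpack_word(word)
```

With the tag fields taking 8 of the 32 bits, only 24 bits remain for the data or pointer field.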

The  1186  uses  a  form  of  incremen¬ 
tal  garbage  collection,  which  is  cur¬ 
rently  the  feature  that  everyone 
claims  to  have. 

The  Programming  Environment 

On  the  1186,  the  main  interpreter,  or 
LISP  listener,  is  called  the  Executive 
and  is  accessed  in  its  own  window. 
One  of  the  special  features  of  the  in¬ 
teractive  LISP  Executive  is  the  Pro¬ 
grammer’s  Assistant. 

Pressing  the  right  mouse  button  in 
a  desktop  bit-map  area  always  re¬ 
sults  in  the  main  pop-up  menu  open¬ 
ing,  which  allows  you  to  access  a  va¬ 
riety  of  different  facilities, 
depending  on  the  options  installed. 
On  the  machine  I  evaluated,  which 
had  the  LOOPS  AI  system  installed, 


Field:  Data     Type      CDR Code   GC Tag
Bits:   0-23     24-28     29-30      31

Figure 1: Original MIT implementation of tag field CDR coding in a 32-bit word



Dr.  Dobb's  Journal,  July  1987 


ARTIFICIAL  INTELLIGENCE 


the  menu  looked  like  that  shown  in 
Example  1,  below.  Menu  items  with 
an  arrow  head  to  the  right  sprout 
submenus  off  to  the  right  if  you  drag 
the  mouse  across  them.  Windows  on 
the  1186  are  completely  independent 
of  one  another  and  can  be  written  to, 
in  principle,  by  any  number  of 
processes. 

Any  window  or  icon  on  the  1186 
also  has  an  associated  standard  menu 
that  permits  operations  such  as  open¬ 
ing  and  closing  and  moving  the 
window. 

INTERLISP-D

The  version  of  LISP  that  has  been 
available  on  Xerox  LISP  machines  is  a 
special  version  of  INTERLISP,  a  dialect 
of  the  language  that  goes  back  to  a 
version of LISP that was first implemented
on the PDP-1 by Bolt, Beranek,
and Newman (BBN) in 1967. In
1972,  the  name  INTERLISP  was  first 
used  to  describe  the  version  of  LISP 
that  was  implemented  as  a  joint  ef¬ 
fort  of  BBN  and  Xerox  PARC.  This  dia¬ 
lect  caught  on  and  resulted  in  the  im¬ 
plementation  of  INTERLISP  on  the 
Xerox  Alto  in  1974.  INTERLISP-D  is  the 
dialect  of  LISP  that  has  grown  out  of 
this  as  the  Alto  gave  rise  to  a  series  of 
custom-microcoded  D  machines,  to 
which  the  Daybreak  is  the  most  re¬ 
cent  addition. 

Loops Icon >
Dinfo
Sketch
VStats
AR Edit >
FileBrowser
CHAT
Idle >
SaveVM
Snap
HardCopy >
PSW
Tedit
SendMail
PCEmulation

Example 1: The Xerox 1186's main menu with LOOPS installed

The syntax of INTERLISP-D combines several different constructions,
old and new. The way one version of the factorial program looks with
this syntax is:

(DEFINEQ (FACTORIAL (X)
   (IF (ZEROP X) THEN 1
    ELSE (ITIMES X (FACTORIAL (SUB1 X]

Here  ITIMES  is  a  function  for  integer 
multiplication.  The  value  it  returns  is 
always  an  integer;  if  you  pass  it  a  dec¬ 
imal  as  an  argument,  it  rounds.  This 
function also illustrates the IF ... THEN ... ELSE
macro. You can, of
course,  use  the  traditional  COND,  but 
INTERLISP-D  also  provides  this  more- 
readable  syntax  for  writing  condi¬ 
tional  tests. 

Any  time  you  may  need  documen¬ 
tation  on  a  function,  Dinfo  is  right 
nearby.  Dinfo  is  the  complete  docu¬ 
mentation  for  INTERLISP-D  that  is 
available  in  a  flexible  on-line  facility 
that  includes  a  graphic  tree  display 
of  all  the  topics  in  the  documentation 
system.  Besides  being  accessible 
through  the  graphic  browser  inter¬ 
face,  the  detailed  documentation  can 
be  accessed  dynamically  for  topics  as 
they  arise. 

Here  is  a  feature  I  really  like:  if  you 
begin  to  enter  a  LISP  function  in  the 
Executive,  then  type  a  ?,  the  system 
looks  up  the  appropriate  topic  in 
Dinfo,  opens  a  new  window,  and  dis¬ 
plays  in  it  detailed  documentation 
for  the  function.  The  graphics 
browser  is  also  displayed  with  the 
current  node  highlighted  so  that  you 
can  select  related  topics  if  you  like. 
This  is  a  remarkably  convenient  and 
useful  way  to  provide  such  elaborate 
documentation  for  a  programming 
environment. 

By  the  time  this  column  appears, 
Common  LISP  will  also  be  available 
on  the  1186  in  its  own  package  or 
separate  namespace.  The  imple¬ 
mentation  of  Common  LISP  is  a  full 
implementation,  developed  by  ex¬ 
tending  the  kernel  of  INTERLISP  to 
include  all  the  features  of  the  Com¬ 
mon  LISP  standard.  Because  of  this, 
statements  in  INTERLISP  and  Com¬ 
mon  LISP  can  be  intermixed  freely 
in  applications,  and  existing  pro¬ 
grams  written  in  either  of  the  two 
dialects  can  be  run  without  making 
any  major  changes.  Also  available 
for  the  1186  is  Quintus  PROLOG,  a 
standard  version  of  PROLOG  making 
use  of  microinstructions  that  allow 
a  processing  speed  of  about  50,000 
LIPS (logical inferences per second).
The PROLOG implementation
is  configured  in  such  a  way  that  ap¬ 
plications  can  be  written  partly  in 
LISP  and  partly  in  PROLOG. 


Structure  Editors 

LISP  lends  itself  well  to  an  editor  that 
knows  something  about  the  lan¬ 
guage.  For  editing  program  code  on 
the  1186,  you  use  the  SEdit  editor,  an 
advanced  LISP  structure  editor.  This 
approach  departs  significantly  from 
the  usual  Emacs-like  editors  used  on 
other  LISP  machines.  A  LISP  structure 
editor, as the name suggests, is an editor
that knows about the structure of
LISP  programs  in  the  sense  that  edit¬ 
ing  operations  are  performed  on 
simple  or  nested  LISP  expressions 
rather  than  on  textual  structures 
such  as  characters,  words,  lines,  and 
paragraphs.  SEdit  replaces  the  earli¬ 
er  DEdit  structure  editor  on  the  1100 
series  machines.  Those  who  have 
used  command-oriented  LISP  struc¬ 
ture  editors  will  probably  not  realize 
the  enormous  difference  it  makes  to 
use  an  editor  of  this  kind  that  is  fully 
integrated  with  a  mouse  and  win¬ 
dow-oriented  environment.  Once 
you  get  the  hang  of  this  editor,  writ¬ 
ing  and  debugging  LISP  code  is  much 
quicker  than  with  a  regular,  full¬ 
screen  text  editor. 
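
The difference between editing structure and editing text can be sketched in a few lines. The Python below stores a LISP expression as nested lists and addresses a subexpression by a path of indices, so that a whole expression is selected and replaced in one step. The function names are hypothetical and are not SEdit commands; this only illustrates the general approach.

```python
def get_at(expr, path):
    """Return the subexpression reached by following `path` (a list of indices)."""
    for index in path:
        expr = expr[index]
    return expr


def replace_at(expr, path, new):
    """Return a copy of `expr` with the subexpression at `path` replaced."""
    if not path:
        return new
    head, rest = path[0], path[1:]
    return [replace_at(child, rest, new) if i == head else child
            for i, child in enumerate(expr)]


def wrap_at(expr, path, operator):
    """Wrap the subexpression at `path` in a new list headed by `operator`."""
    return replace_at(expr, path, [operator, get_at(expr, path)])


# (ITIMES X (FACTORIAL (SUB1 X))) held as a nested list of symbols
call = ["ITIMES", "X", ["FACTORIAL", ["SUB1", "X"]]]

# Select the whole (SUB1 X) expression and replace it in one operation.
edited = replace_at(call, [2, 1], ["IDIFFERENCE", "X", 1])

# Wrap the second argument of ITIMES in another call.
wrapped = wrap_at(call, [2], "ABS")
```

Because every operation works on a complete expression, an edit of this kind cannot leave the parentheses unbalanced, which is one reason structure editing suits LISP.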

While  speaking  about  editors,  I 
should  say  something  about  the  Files 
facility  on  the  1186.  When  you  use 
the  structure  editor  to  create  some 
code,  that  code  is  not  just  stored  in  a 
buffer — it  becomes  part  of  the  LISP 
environment.  Saving  your  work  to 
disk,  therefore,  is  not  just  a  matter  of 
writing  a  text  file  to  disk.  It  relies  on 
an  intelligent  facility  that  keeps  track 
of any changes that have been made
to the environment. When it comes
time  to  save  anything  to  disk  (func¬ 
tions,  variables,  windows,  bit  maps, 
records,  or  all  of  these),  you  use  the 
same  basic  procedure.  If  you  say 
(FILES?)  in  the  Exec  window,  you  get 
a  list  of  variables  and  functions  that 
are  not  yet  a  part  of  any  file.  After 
these  are  displayed,  the  system  asks 
"want  to  say  where  the  above  go?”. 
You  then  type  Y  and  are  prompted 
for  what  to  do  with  each  item  in 
turn.  An  alternate  way  of  ending  a 
session  and  saving  work  is  provided 
by  the  CLEANUP  command,  which 
handles  all  this  automatically,  if  it  is 
just  a  matter  of  updating  already  ex¬ 
isting  files,  and  compiles  all  the  code 
as  well. 

Also  in  this  connection,  I  should 
mention  the  way  special  editing  keys 
on  the  keyboard  can  interact  with 
the  mouse  to  perform  powerful  op¬ 
erations  anywhere  in  the  1186  envi¬ 
ronment.  An  example  of  this  is  the 
COPY  key.  To  use  it,  you  first  place 
the  editing  caret  in  the  place  to 
which  you  want  to  copy  the  text. 
You  then  hold  down  the  COPY  key 
and  use  the  mouse  to  highlight  the 
text  to  be  copied.  You  can  use  this  to 
copy  text  from  any  window  to  any 
other.  One  of  the  most  delightful  uses 
of  this  operation  I've  experienced  on 
the  1186  is  to  copy  a  function  right 
from  the  editor  window  into  the 
Exec  window,  where  it  becomes  im¬ 
mediately  available  for  use  without 
having  to  be  loaded  from  disk. 

Masterscope, DWIM,
and Other Power Tools

In  the  course  of  developing  large 
programs,  particularly  ones  in 
which  a  team  of  programmers  col¬ 
laborates,  it  often  happens  that  you 
forget  about  various  details  of  func¬ 
tions  and  variables  you  have  written 
or  you  need  to  know  these  details 
about code others have written. Masterscope
is a tool that provides several
facilities for making it easy to analyze
the structure of complex
programs.  To  use  Masterscope,  you 
first  call  upon  it  to  analyze  the  partic¬ 
ular  files  in  which  the  sources  to  a 
program  reside.  Once  you  have  done 
this,  several  commands  are  available 
for  investigating  the  code.  So,  for  ex¬ 
ample,  there  is  the  WHO  CALLS  com¬ 
mand,  which  takes  the  name  of  a 
function as an argument. Masterscope
then obediently prints a list of
all  those  functions  that  call  the 
named  function.  A  related  facility, 
called  Databasefns,  automatically 
constructs and maintains Masterscope
databases of program files.
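
The analyze-then-query pattern behind a command such as WHO CALLS can be sketched as an inverted call graph. The example below is an assumption-laden stand-in: it analyzes Python source with the standard `ast` module rather than INTERLISP files, and the function names and sample source are invented for the illustration. Masterscope's real database records far more than this.

```python
import ast
from collections import defaultdict


def build_callers(source):
    """Map each called name to the set of functions whose bodies call it."""
    callers = defaultdict(set)
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callers[inner.func.id].add(node.name)
    return callers


def who_calls(callers, name):
    """Answer a WHO CALLS style query as a sorted list of function names."""
    return sorted(callers.get(name, ()))


# Made-up program text to analyze.
SOURCE = '''
def area(w, h):
    return multiply(w, h)

def volume(w, h, d):
    return multiply(area(w, h), d)

def multiply(a, b):
    return a * b
'''

database = build_callers(SOURCE)
multiply_callers = who_calls(database, "multiply")
area_callers = who_calls(database, "area")
```

The analysis pass runs once, and each later query is a simple lookup, which is why keeping the database up to date automatically is worthwhile.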

DWIM,  standing  for  Do  What  I 
Mean,  is  one  of  the  best-known  facili¬ 
ties  in  the  INTERLISP  environment. 
What  it  does  is  to  try  to  match  unrec¬ 
ognized  variable  and  function  names 
with  ones  it  knows.  This  amounts  to 
the  same  thing  as  a  partial  match  in¬ 
terpreter  that  is  tolerant  of  misspell¬ 
ings  and  actually  corrects  typos  on 
the  fly. 
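
A rough approximation of this kind of partial matching is available in Python's standard `difflib` module. The sketch below is not DWIM's actual algorithm. The `dwim_lookup` name, the list of known names, and the 0.75 similarity cutoff are all assumptions chosen for the illustration.

```python
import difflib


def dwim_lookup(name, known_names, cutoff=0.75):
    """Return `name` if known, else the closest known spelling, else None."""
    if name in known_names:
        return name
    matches = difflib.get_close_matches(name, known_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None


KNOWN = ["FACTORIAL", "ITIMES", "ZEROP", "SUB1", "DEFINEQ"]

fixed = dwim_lookup("FACTORAIL", KNOWN)   # transposed letters
unknown = dwim_lookup("QQQ", KNOWN)       # nothing close enough
```

The cutoff is the design choice that matters: set too low, the corrector silently substitutes the wrong name; set too high, it corrects nothing.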

Someday  we  may  see  an  entirely 
different  kind  of  interpreter  that  is 
truly  semantically  oriented — that 
would  have  expectations  about  what 
it  was  going  to  receive  next  and  that 
would  actively  attempt  to  read  input 
that  way  and  even  query  the  user  or 
programmer  to  get  what  it  still  need¬ 
ed.  That  would  be  a  really  intelligent, 
forgiving  environment — a  real  DWIM 
feature — but  today  all  the  DWIM  we 
have  is  tolerance  of  misspellings. 

One  of  the  real  delights  in  the  Xe¬ 
rox  environment  is  the  SPY  window. 
It  is  visibly  present  when  the  system 
boots  as  a  large  icon  of  an  eye  that  is 
tightly  closed.  If  you  mouse  click  on 
the  eye  icon,  it  instantly  bursts  to  life 
and  you  see  the  eye  open  and  freeze 
in  the  opened  position  as  if  it  were 
looking  right  at  you.  The  open-eye 
icon  indicates  that  the  SPY  facility  is 
active  and  occupied  with  keeping 
track  of  the  time  used  by  various 
processes. 

Conclusions 

The  Xerox  1186  is  an  important  step 
toward  providing  low-cost,  dedicat¬ 
ed  AI  workstations.  Although  the 
learning  curve  for  getting  up  and 
running  with  any  LISP  machine 
should  not  be  underestimated,  there 
are  things  about  the  1186  that  make  it 
more  accessible  than  its  competitors 
to  new  users.  Although  it  took  me 
longer  than  I  expected  to  gain  a 
working  knowledge  of  the  user  in¬ 
terface,  from  the  time  I've  spent 
working  with  this  machine,  I  feel  it 
offers  a  supportive  environment  for 
programmers  and  does  what  other, 
more  expensive  machines  do  but 
with  noticeably  better  efficiency.  On 
the other hand, I would not recommend
that anyone decide to purchase
this machine with the idea of
economizing  on  the  peripherals,  par¬ 
ticularly  system  memory.  The  ma¬ 
chine  I  evaluated  had  about  the  max¬ 
imum  amount  of  RAM  it  can  take — 
3.5  megabytes — and  I  would  not  sug¬ 
gest  using  any  less. 

Keeping  this  in  mind,  I  would  say 
that  this  environment  is  probably  the 
best  buy  right  now  in  advanced  AI 
hardware  and  software  technology. 
Just  about  all  the  important  higher- 
end  AI  tools,  such  as  ART  and  KEE, 
run  on  it.  And  it  is  quiet.  In  spite  of 
their  role  as  personal  workstations, 
many  LISP  machines  are  rather 
noisy. 

But  perhaps  the  best  reason  to  get 
your  hands  on  a  Xerox  AI  worksta¬ 
tion  is  the  impressive  Xerox  LOOPS 
environment.  In  my  next  column  I 
will  introduce  you  to  this  object-ori¬ 
ented  AI  programming  environ¬ 
ment,  which  Xerox  has  just  made 
available  as  a  commercial  product. 

DDJ 



LETTERS 


mathematics.” 

Neil  D.  Pignatano 

280  S.  Euclid  #329 

Pasadena,  CA  91101 

New  Wave  BASICS? 

Dear  DDJ, 

The  State  of  BASIC  in  the  April  1987 
issue  of  DDJ  was  excellent,  but  the 
term  new  wave  may  be  misunder¬ 
stood.  Many  of  the  new  features 
cited  as  included  in  two  specific  ex¬ 
amples  of  current  BASIC  compilers 
have  been  available  for  nearly  a 
decade. 

The  first  new  feature  touted  was 
alphanumeric labels. Interpreted BASIC
language dialects and many BASIC
compilers  do  require  line  numbers. 
Microsoft's  QuickBASIC  supports  al¬ 
phanumeric  labels.  This  doesn't 
make  alphanumeric  labels  in  BASIC 
new.  Digital  Research’s  CBASIC  com¬ 
piler  has  supported  alphanumeric  la¬ 
bels  for  nearly  a  decade.  (I  used  a 
copy  of  Version  1.10,  copyright  1981, 
in  the  CP/M  environment.)  So,  alpha¬ 
numeric  labels  in  BASIC  have  been 
available  for  a  long  time.  What  is 
new  is  the  apparent  trend  toward 
the  use  of  alphanumeric  labels  in  BA¬ 
SIC  compilers. 

I  loved  the  reference  to  alphanu¬ 
meric-named  subroutines  with  pa¬ 
rameter  passing  as  part  of  this  new 
wave.  I  classify  these  subroutines  as 
merely  multiple-line,  user-defined 
functions  with  parameter  passing. 
Again,  this  is  not  a  new  feature  in  BA¬ 
SIC.  CBASIC  has  supported  multiple¬ 
line,  user-defined  functions  with 
parameter  passing  for  a  long  time. 
CBASIC  also  has  an  assembly-lan¬ 
guage  interface  with  object  file  li¬ 
brarian  and  an  overlay  linker.  New  is 
the  trend  toward  standard  BASIC  lan¬ 
guage  products  that  have  the  fea¬ 
tures  that  CBASIC,  Pascal,  FORTRAN, 
and  C  have  had  for  many  years. 

For  those  of  us  who  learned  FOR¬ 
TRAN  as  our  first  language  and  found 
Dartmouth  BASIC  an  abomination, 
CBASIC  was  a  breath  of  fresh  air. 
There  was  little  that  CBASIC  could  not 
do,  in  a  structured  way,  for  those  of 
us  who  started  with  mainframes  and 
migrated  to  desktop  personal  com¬ 
puters  running  the  CP/M  operating 
system  in  the  1970s.  The  real  surprise 
to  me  is  why  it  took  Microsoft  so  long 


to  include  the  features  of  CBASIC  in  a 
BASIC  compiler.  It  has  not  yet  includ¬ 
ed  all  the  object  file  librarian  and 
memory  overlay  features.  Although 
you  can  now  write  multiple-line, 
user-defined  functions,  you  still  can't 
separately  compile  them,  library 
them,  and  link  them  in  at  compile 
time.  Perhaps  the  new  wave  will 
eventually  catch  decade-old  CBASIC 
yet. 

The  real  new  wave  in  BASIC  lan¬ 
guages  is  buried  by  the  high-priced 
advertising  of  the  big  software 
houses.  There  is  an  obscure,  unad¬ 
vertised  BASIC  language  with  real 
power.  This  product  is  the  Minnow 
Bear  BASIC  compiler,  MB86.  It  uses 
the  CBASIC  language  syntax  but  sup¬ 
ports  long  integers,  color,  window¬ 
ing  primitives,  BCD  arithmetic,  DOS 
system calls, the 8087, and full 640K
memory  usage,  to  cite  a  few  of  its  fea¬ 
tures.  MB86  compiles  to  Microsoft  C 
language  source,  which  uses  include 
files  for  standard  routines.  This 
means  that  those  of  us  who  used  CBA¬ 
SIC  and  migrated  to  C  have  a  BASIC 
that  is  full  featured  and  powerful. 

MB86  uses  the  Microsoft  C  compil¬ 
er  and  linker  to  produce  .EXE  files  un¬ 
der  PC-DOS  and  MS-DOS.  Users  can 
write  C  modules  and  include  them  in 
libraries  or  directly  in  the  C  source 
generated.  The  compiled  programs 
are as fast as or faster than those produced by Digital
Research's CBASIC compiler. MB86 has
eliminated  the  memory,  8087  sup¬ 
port,  and  file  size  limitations  of  CBA¬ 
SIC.  It  has  virtually  all  the  characteris¬ 
tics  of  CBASIC  without  the  limitations 
and  is  a  real  contribution  to  the  ad¬ 
vanced  state  of  BASIC. 

There  is  ample  reason  for  Micro¬ 
soft  languages  to  support  protected 
mode  under  advanced  DOS  on  the 
80286  and  80386  processors.  Micro¬ 
soft  C  should  be  one  of  the  first  com¬ 
pilers  written  to  use  ADOS.  This 
means  that  Minnow  Bear  BASIC  will 
be  able  to  utilize  protected  mode 
shortly  after  ADOS  and  the  C  compil¬ 
er  are  available.  For  some  of  us,  MB86 
is  the  cutting  edge  of  the  new  wave 
in BASIC languages. It is far ahead of
the  products  used  as  examples. 

Keith  R.  Plossl 

George  Plossl  Educational  Services 

One  Parkway  75  Center 

1850  Parkway  PL,  Ste.  335 


Marietta,  GA  30067 

Still  Searching  for  a  Sine 

Dear  DDJ, 

I  wish  to  make  a  comment  on  the 
running  discussion  of  the  "best”  ap¬ 
proximation  to  any  given  function  or 
collection  of  data  points.  This  has 
most  recently  concerned  techniques 
for  approximating  the  sine  function. 
My  comment  is  that  the  word  best 
should  not  be  applied  to  any  method 
independently  of  the  context  within 
which  the  method  is  to  be  applied. 
Despite  our  desire  to  believe  that  sci¬ 
ence  and  mathematics  provide  abso¬ 
lute  answers  to  such  questions,  prag¬ 
matic  and  even  subjective  values 
arise  regularly. 

When  doing  a  linear  least-squares 
fit  to  empirical  data,  for  example,  we 
compute  the  slope  and  intercept  of  a 
straight  line  that  is  chosen  to  mini¬ 
mize  a  certain  sum.  Each  term  in  the 
summation  is  the  squared  difference 
between  the  line  ordinate  at  an  ob¬ 
served  abscissa  and  the  correspond¬ 
ing  observed  ordinate.  Why  is  this 
quantity  minimized?  The  answer  is 
no  less  subjective  than  the  answer  to 
the  question  of  why  a  straight  line  is 
used  in  the  fit  instead  of  a  parabola  or 
some  other  function.  But  minimizing 
the  sum  of  the  squared  deviations 
has  become  so  standardized  we  sel¬ 
dom  ask  whether  it  is  the  "best”  way, 
and  that  is  why  it  has  become  a 
standard.  It  is  usually  the  best  way 
because the sum involved is positive
definite and easily differentiated (at
least  in  the  case  of  polynomial  fits),  so 
the  minimization  conditions  are  easy 
to  compute.  Besides  these  practical 
considerations,  what  we  are  really 
doing  when  we  use  least-squares  fits 
is  we  are  claiming  that  the  price  of 
being  wrong  in  our  approximation  is 
proportional  to  the  sum  of  the 
squared  deviations  from  it. 
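
As a concrete instance of the straight-line case, the following Python sketch computes the slope and intercept from the usual closed-form expressions and then evaluates the cost being minimized. The function names and the sample data are made up for the illustration.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_deviations(xs, ys, slope, intercept):
    """The cost being minimized: squared vertical distances to the line."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]   # made-up data lying exactly on y = 2x + 1
slope, intercept = least_squares_line(xs, ys)
cost = sum_squared_deviations(xs, ys, slope, intercept)
```

Swapping `sum_squared_deviations` for another cost function changes what "best" means, but the slope and intercept then no longer have this simple closed form.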

This  is  usually  a  good  cost  function 
to  use,  but  we  could  use  the  sum  of 
the  cubed  deviations,  or  the  fourth 
power  of  the  deviations,  or  many 
other  functions  of  the  deviations. 
Note  that  the  cubed  deviations  are  a 
poor  choice,  as  are  all  odd-numbered 
powers,  because  negative  deviations 
contribute negative cost. Another advantage
of the least-squares approach
is that we obtain the actual
value of the root-mean-square deviation
of the fit quite easily, and we
have  come  to  regard  this  as  a  good 
figure of merit for judging the quality
of the fit. Other cost functions may
not yield the RMS deviation as easily.

Why use these things called Chebyshev
polynomials, then? The answer
lies in what their cost function
is:  the  maximum  absolute  deviation 
of  the  fit.  In  general,  Chebyshev 
polynomials  yield  higher  RMS  devi¬ 
ations,  but  this  is  often  a  small  price 
to  pay  to  minimize  the  maximum  ab¬ 
solute  deviations.  They  are  also 
slightly  more  difficult  to  compute 
than  least-squares  polynomials  but 
not  enough  to  make  this  a  significant 
consideration.  When  designing  an  al¬ 
gorithm  to  be  used  on  a  computer, 
we  are  dealing  with  limited  preci¬ 
sion;  if  we  can  find  an  algorithm 
with  a  maximum  absolute  error 
below  the  precision  cutoff,  then  we 
know  that  no  better  accuracy  can  be 
attained  for  the  given  precision.  This 
is  not  as  easy  to  determine  when  all 
we  know  is  that  the  RMS  deviation 
has  been  minimized;  there  may  be  a 
range  in  which  the  absolute  error  is 
shockingly  high,  with  excellent  be¬ 
havior  elsewhere  masking  the  weak¬ 
ness.  This  also  points  out  the  necessi¬ 
ty  for  great  care  in  selecting  points  to 
fit  and  ranges  over  which  to  apply  a 
single  fit. 
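
The gap between the two figures of merit is easy to see numerically. The Python sketch below samples the error of an approximation over a range and reports both the RMS deviation and the maximum absolute deviation. The cubic Taylor polynomial for sine is used only as a convenient stand-in; it is neither a least-squares nor a Chebyshev fit, and the function names and sample count are assumptions of this illustration.

```python
import math


def error_report(approx, exact, lo, hi, samples=1001):
    """Sample the error on [lo, hi]; return (rms, max_abs, x_of_max)."""
    step = (hi - lo) / (samples - 1)
    points = [lo + i * step for i in range(samples)]
    errors = [approx(x) - exact(x) for x in points]
    rms = math.sqrt(sum(e * e for e in errors) / samples)
    worst = max(range(samples), key=lambda i: abs(errors[i]))
    return rms, abs(errors[worst]), points[worst]


def taylor_sine(x):
    """Cubic Taylor polynomial for sine about 0 (a stand-in, not a Chebyshev fit)."""
    return x - x ** 3 / 6


rms, max_abs, worst_x = error_report(taylor_sine, math.sin, 0.0, math.pi / 2)
```

Here the worst error sits at the far end of the range and is several times the RMS figure, so a report that quoted only the RMS deviation would hide the weak region.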

So  far,  this  addresses  only  numeri¬ 
cal  accuracy  as  a  consideration. 
There  is  also  the  question  of  compu¬ 
tational  speed.  Sometimes  very 
rough  answers  are  all  that  are  need¬ 
ed,  but  speed  is  critical. 

There  are  many  more  methods, 
cost  functions  to  minimize,  subjec¬ 
tive  aspects  to  be  weighed,  and  so  on. 
I  hope  only  to  have  stimulated  some 
ideas  that  may  prove  useful  in  deter¬ 
mining  what  is  best  for  the  problem 
you  are  working  on  at  the  moment. 
John  W.  Fowler 
Global  Solutions 
230  Pacific  St.,  #205 
Santa  Monica,  CA  90405 

DDJ 




PROGRAMMER'S  SERVICES 


BOOKS 


Hillis,  W.  Daniel.  The  Connection  Ma¬ 
chine.  Cambridge,  Mass.:  MIT  Press, 
1985. 

Imagine  a  computer  that  could 
change  its  internal  structure  to  han¬ 
dle  different  types  of  problems — a 
massively  parallel  computer  whose 
data  paths  could  be  configured  to 
form  a  tree  structure  for  finding 
maxima  or  minima  or  a  directed 
graph  for  solving  "shortest  path” 
problems.  Daniel  Hillis  imagined 
such  a  beast,  and  now,  as  president  of 
Thinking  Machines  Corp.,  he  is  build¬ 
ing  it.  But  before  he  was  a  company 
president,  he  was  a  student  at  MIT, 
and  the  Ph.D.  thesis  he  wrote  there 
has  been  published  in  book  form  as 
The  Connection  Machine. 

In  this  book,  Hillis  explains  why 
conventional  von  Neumann  architec¬ 
tures  are  inadequate.  He  graphically 
illustrates  the  "von  Neumann  bottle¬ 
neck”  by  asking  you  to  visualize  all 
the  elements  of  a  modern  mainframe 
computer  on  a  single  piece  of  silicon. 
This  "megachip”  would  be  a  square 
meter  in  size  and  would  contain 
about  1  billion  transistors.  The  CPU, 
however,  would  take  up  only  2  or  3 
square  centimeters. 

Hillis  proposes  a  computer  archi¬ 
tecture  based  on  two  requirements: 
parallel  processing  and  a  dynamical¬ 
ly  configurable  communications  net¬ 
work  between  the  processors.  He 
outlines  the  structure  of  the  CM-1,  a 
prototype machine with 65,536
nodes,  each  containing  its  own  pro¬ 
cessor  and  4,096  bits  of  memory.  The 
nodes  are  coupled  by  special  routing 
hardware  that  allows  messages  to  be 
passed  from  processor  to  processor. 

Hillis  next  describes  how  the  ma¬ 
chine  is  programmed.  It  is  connected 
to a conventional mainframe or supermini
that can write directly to the
memory  of  the  Connection  Machine. 
A  version  of  LISP  running  on  the  host 
computer  causes  data  and  instruc¬ 
tions  to  be  passed  to  the  Connection 
Machine.  Hillis  describes  three  exten¬ 
sions  to  the  LISP  language  that  form 
the  basis  for  exploiting  the  parallel¬ 
ism  of  the  Connection  Machine  archi¬ 
tecture.  He  examines  the  implemen¬ 
tation  of  the  architecture  in  detail 
and  follows  this  with  a  description  of 
how  various  algorithms  and  data 
structures  can  be  adapted  to  the 
machine. 

One  of  this  book’s  greatest  strengths 
is  that  Hillis  brings  his  vision  to  the 
reader  in  a  clear  and  logical  manner. 
From  an  abstract  discussion  of  the 
need  for  parallel  architectures  to  the 
specifics  of  the  CM-1,  he  covers  his 
topic  thoroughly.  Although  you  do 
not  have  to  be  an  expert  to  under¬ 
stand  the  book,  you  will  probably 
need  some  knowledge  of  computer 
architecture,  algorithms,  and  LISP.  I 
have  no  hesitation  in  recommending 
it  to  anyone  interested  in  the  future  of 
computer  architecture. 

Stroustrup,  Bjarne.  The  C++  Pro¬ 
gramming  Language.  Reading,  Mass.: 
Addison-Wesley,  1986. 

The  computer  science  research 
community  has  been  experimenting 
with  object-oriented  programming 
languages  (OOPLs)  for  more  than  15 
years,  but  the  benefits  of  working 
with  these  languages  have  only  re¬ 
cently  been  made  available  to  com¬ 
mercial  programmers.  Unfortunate¬ 
ly,  most  OOPLs  have  been  interpretive 
in  nature;  thus,  they  have  served  well 
as  design  and  prototyping  tools,  but 
they have not generally been appropriate
for marketable, end-user applications.
The C++ Programming Language
introduces you to a compiled
OOPL  that  was  designed  to  combine 
the  numerous  advantages  of  object- 
oriented  programming  style  with  the 
efficiency  of  a  compiled  language. 
This  book  was  written  by  Bjarne 
Stroustrup  of  Bell  Labs,  who  designed 
C++ and its predecessor, C with
Classes.  The  language  has  roots  in 
both  C  and  Simula67. 

The  structure  of  the  book  is  similar 
in many ways to the classic C reference
by Kernighan and Ritchie. It
even begins with C++ variants of the
"Hello, world" and "English-to-metric"
conversion programs. Readers familiar
with C may find these programs
and several other sample programs
frustrating because they do not
illustrate object-oriented programming
at all. Their inclusion can be justified
on the grounds that programmers
approaching C++ for the first
time  will  need  to  learn  the  standard 
forms  for  looping,  data  structuring, 
and  so  on.  I  think,  however,  that  ex¬ 
amples  of  the  object  paradigm  are  just 
as  important  as  the  language  syntax. 

It  is  not  until  Chapter  5  that  Strous¬ 
trup  really  begins  to  discuss  the  most 
important  feature  of  C++ — the  class 
mechanism.  From  this  point  on,  the 
book  covers  object  programming  in 
C++ in detail. The text and examples
illustrate  the  features  of  C++  that 
take  it  well  beyond  C,  including  infor¬ 
mation  hiding,  inheritance,  and  oper¬ 
ator  overloading.  Stroustrup  is  also 
good  at  pointing  out  idiosyncrasies  in 
the  language  that  are  liable  to  be  mis¬ 
understood.  Unfortunately,  in  his  ex¬ 
ample  programs,  he  uses  some  short¬ 
hand  notations  that  I  feel  are 
inappropriate.  He  frequently  substi¬ 
tutes  the  keyword  struct  for  class  { 
public:  when  defining  classes.  This  is 
correct,  but  it  can  lead  to  confusion 
over  what  is  an  object  vs.  what  is  a 
simple  data  structure. 

As  a  reference  manual,  this  book  is 
indispensable.  You  should  not  pur¬ 
chase  this  book  as  an  introduction  to 
object-oriented  programming,  how¬ 
ever.  Stroustrup  does  not  cover  the 
rationale  for  any  of  the  design  deci¬ 
sions  or  contrast  the  implementation 
of  C++  with  any  other  OOPLs.  Yet 
without  providing  this  background, 
he  expects  readers  to  recognize  that 
procedure  calls  (with  objects  as  pa¬ 
rameters)  replace  message  passing 
and that all message-to-object binding
is done at compile time via function
prototyping. If C++ becomes as
popular as C is now, however, The
C++ Programming Language will be
popular  indeed. 

—  Ross  Nelson 

DDJ 



PROGRAMMER'S  SERVICES 

THE  STATE  OF  BASIC 


Fundamental  Data  Types 
in  the  New  BASICS 

Microsoft has followed a relatively
consistent scheme in implementing
data types in MS-BASIC,
BASICA,  and  many  other  microcom¬ 
puter  BASIC  dialects.  In  MS-BASIC  and 
BASICA,  the  types  supported  are  inte¬ 
ger,  floating  point,  double-precision 
floating  point,  and  string.  Characters 
are  considered  to  be  strings  contain¬ 
ing  a  single  character.  You  can  asso¬ 
ciate  the  data  type  with  a  BASIC  vari¬ 
able  in  two  ways:  use  a  symbol  at  the 
end  of  the  variable  name  to  explicitly 
indicate the type, or use a DEFxxx
declaration  to  perform  implicit  type 
declaration. 

QuickBASIC  uses  exactly  the  same 
data  typing  method  as  BASICA  and 
MS-BASIC  do.  QuickBASIC  strings  can 
be  as  large  as  32K  in  size,  however. 

Turbo  BASIC  follows  the  same 
scheme  but  adds  two  new  items:  long 
integers  and  integer  constants.  Long 
integers  support  a  range  between 
minus  and  plus  two  billion.  The  sym¬ 
bol  used  for  long  integers  is  the  & 
character.  Named  constants  in  Turbo 
BASIC  are  integer-type  identifiers  that 
begin  with  the  %  character.  For 
example: 

%Max.Size  =  100 

defines a named constant Max.Size
and  assigns  it  a  fixed  value  of  100. 
Named  constants  are  useful  for  per¬ 
forming  changes  in  a  program  with¬ 
out  hunting  for  specific  numbers  (an 
operation  that  sometimes  can  be 
hazardous). 

Other  BASIC  implementations  are 
able  to  mimic  named  constants  by  us¬ 
ing  functions  that  return  a  value.  For 
example,  in  QuickBASIC  you  can  de¬ 
fine  the  following  function: 


DEF  FNMax.Size%  =  100 

Although  named  constants  can  be 
faster  than  function  calls,  when  you 
use  the  function  call  technique,  you 
can  simulate  constants  that  are  of 
other  types,  such  as: 

DEF FNActor$ = "Don Johnson"

DEF FNWeekly.Salary! = 150000.00

True  BASIC  takes  an  entirely  differ¬ 
ent  approach — it  supports  both  num¬ 
bers  and  strings.  The  internal  storage 
format  for  numeric  variables  uses  a 
variant  of  integers  and  floating 
points.  As  long  as  the  number  has  no 
fractional  part,  it  is  stored  internally 
as  an  integer.  Add  a  fraction,  and  the 
number  is  stored  as  a  real: 

LET  N  =  0  !  N  is  stored  as  an  integer 
LET  N  =  SQR(N) !  N  is  stored  as  real 

Variables  and  functions  in  True  BA¬ 
SIC  are  numeric  if  their  names  do  not 
end  with  a  $  character.  The  $  sign  is 
the  only  data  type  symbol  True  BASIC 
uses — it  does  not  use  the  implicit 
DEFxxx declaration. Strings in True
BASIC  can  be  64K  long. 

BetterBASIC  uses  a  Pascal-like  ap¬ 
proach,  declaring  the  data  types  of 
variables  and  supporting  new  data 
types.  You  declare  data  types  by  first 
stating  the  data  type  and  then  listing 
the  variable  name  as  in: 

INTEGER:  Count,  Size,  Height 
REAL:  Salary,  Interest 
STRING:  Name,  Message 

BetterBASIC  supports  three  types  of 
numeric  data  types:  BYTE,  INTEGER, 
and  REAL.  The  BYTE  type  uses  1  byte 
of  storage  and  offers  a  range  of  val¬ 
ues  between  0  and  255.  The  INTEGER 
type  requires  2  bytes  of  storage  and 
offers  the  traditional  integer  range. 
The REAL type in BetterBASIC ranges
from about 1E+254 to 1E-255, with a
user-assigned  accuracy. 
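
The integer ranges quoted in this column follow directly from storage size. As a rough illustration, written in Python rather than in any of the BASICs discussed, the hypothetical helper below computes the range for a given byte count. It assumes two's-complement signed values and a 4-byte long integer for Turbo BASIC, neither of which the column states outright.

```python
def integer_range(num_bytes, signed=True):
    """Return (lowest, highest) for an integer stored in num_bytes bytes.

    Assumes two's-complement encoding for signed values; that is an
    illustrative assumption, not something taken from the BASIC manuals.
    """
    bits = 8 * num_bytes
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


# BetterBASIC BYTE: 1 byte, unsigned
print(integer_range(1, signed=False))   # (0, 255)
# The "traditional" 2-byte INTEGER
print(integer_range(2))                 # (-32768, 32767)
# A 4-byte long integer (assumed width for Turbo BASIC's long type)
print(integer_range(4))                 # (-2147483648, 2147483647)
```

The last line is where "minus and plus two billion" comes from: a signed 32-bit value tops out a little above 2.1 billion.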

Strings  in  BetterBASIC  have  a  de¬ 
fault  size  of  16  characters.  They  can 
be  as  large  as  32K  and  can  be  allocat¬ 
ed  as  static  or  as  dynamic  variables. 
You  can  even  use  the  extended  mem¬ 
ory  space  to  store  strings.  Consider 
the  following  declaration: 


STRING: Name, ThisLine[80], Buffer/X[3000],
AnyString[]

It  declares  Name  as  a  string  of  default 
size,  ThisLine  as  a  string  of  80  charac¬ 
ters,  Buffer  as  a  string  of  3,000  charac¬ 
ters  (stored  in  the  extended  memory), 
and  AnyString  as  a  dynamic  string. 
BetterBASIC  also  supports  the  pointer 
data  type.  Although  its  use  with  fun¬ 
damental  types  is  somewhat  limited, 
pointers  shine  when  used  with  re¬ 
cord  structures,  also  implemented  in 
BetterBASIC. 

BetterBASIC  supports  named  con¬ 
stants  that  can  be  of  any  valid  data 
type.  You  declare  them  using  the 
keyword  CONSTANT  followed  by  one 
or  more  constant  definitions,  such  as: 

CONSTANT Actor$ = "Tom Selleck",
Series = "Magnum, P.I."
CONSTANT ThisNumber = 123,
ThatOne = 1.234

As  you  can  see  from  this  example, 
BetterBASIC  is  able  to  deduce  the 
type.  Although  the  constant  Actor$  is 
explicitly  typed  with  the  $  sign,  the 
constant  Series  is  not,  and  neither  are 
constants  ThisNumber  or  ThatOne. 
BetterBASIC  can  deduce  that  Series  is 
a  string  constant,  ThisNumber  is  an 
integer  constant,  and  ThatOne  is  a 
floating-point  constant.  If  you  assign 
a  number  to  a  named  constant  and 
the  number  contains  a  decimal  or  an 
exponential  or  is  outside  the  range  of 
integers,  the  constant  is  associated 
with a floating-point type. Otherwise,
BetterBASIC assumes that you
are declaring an integer named
constant.
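
The deduction rule just described is simple enough to sketch. The Python below is only an illustration of the rule as stated here, not BetterBASIC's actual implementation: the function name is hypothetical, string literals are assumed to be double-quoted, and "the range of integers" is assumed to be the usual 2-byte range.

```python
INT_MIN, INT_MAX = -32768, 32767  # assumed 2-byte integer range


def deduce_constant_type(literal):
    """Classify a constant's literal text as STRING, INTEGER, or REAL."""
    text = literal.strip()
    if text.startswith('"'):
        return "STRING"
    # A decimal point or an exponent marker forces floating point.
    if "." in text or "E" in text.upper():
        float(text)  # raises ValueError if the text is not a number
        return "REAL"
    value = int(text)  # raises ValueError if the text is not a number
    if INT_MIN <= value <= INT_MAX:
        return "INTEGER"
    return "REAL"  # whole number, but too large for an integer


for sample in ['"Magnum, P.I."', "123", "1.234", "40000"]:
    print(sample, "->", deduce_constant_type(sample))
```

Run against the constants in the example above, the sketch reports Series as a string, ThisNumber as an integer, and ThatOne as a real; a whole number such as 40000 also comes out as a real because it falls outside the assumed integer range.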

The new BASICS offer more diversity
in data typing than did the first
generation of microcomputer BASICS.
At  the  two  ends  of  the  spectrum  are 
BetterBASIC  and  True  BASIC,  which 
offer  sophisticated  and  simple  data 
typing,  respectively.  Microsoft  has 
elected  to  make  QuickBASIC  keep  the 
traditional  types  of  BASICA  and  MS- 
BASIC.  Turbo  BASIC  has  extended 
some  of  the  BASICA  data  types. 

DDJ 

Vote  for  your  favorite  feature/article. 

Circle  Reader  Service  No.  10. 




PROGRAMMER'S  SERVICES 


OF  INTEREST 


Miscellaneous 

Computer  Professionals  for  So¬ 
cial  Responsibility  is  sponsoring  a 
symposium  entitled  Directions  and 
Implications  of  Advanced  Comput¬ 
ing,  in  Seattle,  Washington,  on  July 
12.  The  aim  of  the  symposium  is  to 
consider  the  directions  and  implica¬ 
tions  of  advanced  computing  in  a  so¬ 
cial  and  political  context  as  well  as  a 
technical  one.  Symposium  topics  will 
include  computing  research  fund¬ 
ing,  defense  applications,  computing 
in  a  democratic  society,  and  comput¬ 
ers  in  the  public  interest.  Keynote 
speakers  will  be  Robert  Kahn,  for¬ 
merly  director  of  the  Information 
Processing  Techniques  Office  at  the 
Defense  Advanced  Research  Projects 
Agency,  and  Terry  Winograd,  an  as¬ 
sociate  professor  of  computer  sci¬ 
ence  at  Stanford  and  an  AI  maven. 
Proceedings  will  be  distributed  at  the 
symposium  and  will  be  on  sale  dur¬ 
ing  the  1987  AAAI  conference.  Reader 
Service  No.  29. 

Computer  Professionals  for  Social  Re¬ 
sponsibility 
P.O.  Box  85481 
Seattle,  WA  98105 
(206)  783-0145 
(206)  548-4117 

Flambeaux  Software  has  an¬ 
nounced  the  availability  of  TECH 
Help! — The  Electronic  Manual,  an 
on-line,  pop-up,  technical  reference 
manual  of  the  most  commonly  need¬ 
ed  information  for  system-level  pro¬ 
grammers.  It  includes  comprehen¬ 
sive  coverage  of  the  DOS  and  ROM 
BIOS  services;  system  variables;  I/O 
ports;  installable  device  drivers;  dis¬ 
play  usage  (including  the  EGA);  and 
the  layouts  and  structures  of  dozens 
of  data  tables,  bit  flags,  and  switch 
settings. It is up to date, covering topics
through DOS 3.2 and the latest PC/AT
BIOS. It also describes the Lotus/
Intel/Microsoft  Expanded  Memory 
Specification.  TECH  Help!  pops  up 
from  within  your  program  editor  or 
debugger  to  give  you  instant  access.  It 
sells for $69.95. Reader Service No. 30.

Flambeaux Software
1147  E.  Broadway,  Ste.  56 
Glendale,  CA  91205 
(818)  500-0044 

Connections  is  a  bimonthly  news¬ 
letter  for  networked  Macintoshes, 
designed  to  provide  answers  about 
network  products  and  planning  for 
both  novice  and  experienced  net¬ 
work  users.  It  covers  topics  of  inter¬ 
est  to  Macintosh  users  who  wish  to 
exchange  data  and  information 
among  themselves  as  well  as  with  us¬ 
ers  of  other  kinds  of  computers.  Fu¬ 
ture  issues  will  include  articles  such 
as  IBM  3270  connectivity,  Unix  con¬ 
nectivity,  file  servers,  star  control¬ 
lers,  file  transfers  and  conversions, 
AppleTalk  utilities  and  diagnostics, 
and  the  use  of  other  file  transfer  pro¬ 
tocols  on  the  Macintosh.  A  one-year 
subscription  to  Connections  costs  $60 
($70  for  overseas  subscribers).  Reader 
Service  No.  31. 

David  R.  Kosiur 
Connections 
P.O.  Box  5894 
Fullerton,  CA  92635 
(714)  738-1492 

The  Visible  Computer  (TVC):8088, 
from  Software  Masters,  is  a  book 
and  disk  combination  for  mastering 
8088  assembly  language.  It  consists  of 
a  350-page,  tutorial-style  manual,  a 
program  that  graphically  simulates 
the  inner  workings  of  the  8088  chip, 
and  dozens  of  demonstration  pro¬ 
grams.  TVC  is  designed  for  people 
with  no  prior  exposure  to  assembly 
language  and  includes  preliminary 
chapters on hex and binary numbering
systems. TVC:8088 for PC-DOS
machines requires 128K RAM, is not
copy-protected,  and  sells  for  $79.95. 
Reader  Service  No.  32. 

Software  Masters 
P.O.  Box  3638 
Bryan,  TX  77805 
(409)  822-9490 

Novation has introduced a 300/
1,200-baud, AT (Hayes)-compatible
modem  called  the  Parrot  1200.  The 
modem  is  approximately  the  size  of 
an  audio  cassette  (414  X  2?A  X  %  inch¬ 
es)  and  weighs  three  ounces.  A  mi¬ 
croprocessor-controlled  power-man¬ 
agement  system  enables  the  Parrot 
1200  to  function  at  high  levels  of  reli¬ 
ability  using  only  the  power  avail¬ 
able  from  the  host  computer’s  RS-232 
serial  interface;  neither  batteries  nor 
external AC power is required.
Features include transmission speeds of
0-300  or  1,200  bps;  Bell  103/212A 
hardware  compatibility;  an  asyn¬ 
chronous  data  format;  full-duplex 
operation;  built-in  auto  self-test,  ana¬ 
log  loop-back,  local  digital  loop-back, 
and  remote  digital  loop-back  testing; 
a  speaker  with  volume  control;  four 
LED  indicators;  and  an  AT-standard 
(Hayes)  command  format.  The  Parrot 
1200  sells  for  $119.  Reader  Service  No. 
33. 

Novation Inc.
21345 Lassen St.
Chatsworth, CA 91311
(818) 988-5060

A  six-volume  set  of  books  that  serves 
as  a  software  management  tool  for 
establishing  a  company’s  internal 
programming  and  documentation 
practices  is  available  from  ATC  Soft¬ 
ware.  Five  of  the  volumes  describe 
standard  methods  for  programming 
in  COBOL,  FORTRAN,  C,  BASIC,  and 
dBASE.  The  sixth  volume  describes 
uniform  software  documentation 
standards  for  the  five  languages.  The 
set  sells  for  $72;  individual  volumes 
cost $15 each. Reader Service No. 34.

ATC Software
Rte.  2  Box  448 
Estill  Springs,  TN  37330 
(615)  967-9159 


DDJ




FORUM 


SWAINE'S  FLAMES 


One  company  worth  watching 
just  now  is  Phoenix  Technolo¬ 
gies,  developer  of  an  operating  sys¬ 
tem  extension  called  VP/ix,  a  virtual 
PC  environment  that  may  have  a  lot 
to  say  about  the  popularity  of  Unix 
on  386  machines.  Phoenix  is  working 
with  both  Microsoft  and  Interactive 
Systems  to  make  Xenix  and  Unix  sup¬ 
port  DOS  applications  on  386  systems. 
This  puts  Phoenix  in  the  heart  of 
Unix  development  for  the  386  be¬ 
cause  AT&T-Intel-Interactive  Systems 
Unix  and  Microsoft-Santa  Cruz  Oper¬ 
ation  Xenix  are  the  most  important 
strands  in  Unix  development  for  the 
386.  Phoenix  is  especially  worth 
watching  because  these  two  strands 
are  converging  as  a  result  of  this 
spring's  agreement  between  AT&T 
and  Microsoft.  Under  that  agree¬ 
ment,  Microsoft  will  develop  the 
next  version  of  Unix  for  AT&T,  a  ver¬ 
sion  that  will  be  designed  for  the  386 
and that will be upwardly compatible
with AT&T's and Microsoft's existing
Unix or Xenix products.

Some  text  is  meant  only  for  human 
processing.  What  you  say  in  on-line 
conferences,  for  example,  is  often  of 
no  lasting  import  and  does  not  need 
to  be  processed  in  any  other  way. 
ROAM,  my  cousin  Corbett  calls  it: 
Read  Once,  At  Most.  Also,  you  may 
prefer  not  to  have  your  words  count¬ 
ed,  indexed,  or  munged  without 
your  approval. 

Corbett  has  come  up  with  two 
data-encoding  encryption  tech¬ 
niques  for  text  that  is  only  meant  for 
human  perusal.  One  nice  feature  of 
such  techniques  is  that  only  the  send¬ 
er  needs  any  special  software;  the 
encrypted  message  is  displayed  to  its 
intended  recipient  and  the  decryp¬ 
tion  process  is  performed  in  his 
head. 

The  simplest  is  the  etaoin  encod¬ 
ing.  This  encoding  is  trivial  to  imple¬ 
ment,  can  encode  in  real  time  during 
high-speed  transmission,  reduces 
data by up to 50 percent (making a
1200-bps transmission effectively
2400  bps),  and  produces  ciphertext 
more  appropriate  for  human  de¬ 
cryption  than  for  machine  decryp¬ 
tion.  Although  a  program  with  ac¬ 
cess  to  an  English  dictionary  could 
crack  the  messages,  even  this  is  ques¬ 
tionable  if  the  sender  constructs  the 
messages  to  make  maximum  use  of 
context  dependencies  and  uses  un¬ 
common  words. 

Here's how it works. It maps
uppercase and lowercase e's into
lowercase, and similarly for t, a, o, i, n, s, h,
r, d, l, and u. (These are the most
common letters in written English.)
Lowercase is used because the ascenders
help  distinguish  the  letters  at  a 
glance.  All  other  letters  are  mapped 
into  underlines,  all  punctuation  into 
periods,  and  space  into  space.  Num¬ 
bers  are  spelled  out.  Here  is  a  mes¬ 
sage  in  etaoin  encoding: 

i need to _et a _aster _ode_. at least
t_el_e hundred _aud.

Not  too  hard  for  human  decryp¬ 
tion.  But  note  that  two  things  make 
the  machine  decryption  of  this  hard¬ 
er  than  you  might  at  first  expect:  let¬ 
ter-frequency  information  is  not  a 
useful  tool  for  extracting  the  remain¬ 
ing,  low-frequency  letters,  and,  in 
general,  structural  words  appear  in¬ 
tact,  but  words  crucial  to  the  mean¬ 
ing  of  the  message  are  ravaged.  The 
simplest  way  to  recover  these  words 
is  by  using  semantic  context  and  real- 
world  knowledge,  exactly  the  things 
that  people  do  naturally  and  pro¬ 
grams  don't  do.  The  word  modem  in 
the message, for example, would be
hard to recover without reference to
the  meaning  of  the  entire  message. 
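
The etaoin mapping described above takes only a few lines to try out. What follows is a minimal sketch in Python, not Corbett's software: the function name is made up for illustration, the sample sentence is a guess at the plaintext behind the column's example, and digits are simply passed through on the assumption that the sender has already spelled numbers out.

```python
import string

KEPT = set("etaoinshrdlu")  # the twelve letters the encoding preserves


def etaoin_encode(message):
    """Encode text following the etaoin scheme described in the column."""
    out = []
    for ch in message:
        low = ch.lower()
        if low in KEPT:
            out.append(low)      # common letters survive, lowercased
        elif ch.isalpha():
            out.append("_")      # every other letter becomes an underline
        elif ch in string.punctuation:
            out.append(".")      # all punctuation becomes a period
        else:
            out.append(ch)       # spaces (and, in this sketch, digits) pass through
    return "".join(out)


print(etaoin_encode("I need to get a faster modem, at least twelve hundred baud."))
# i need to _et a _aster _ode_. at least t_el_e hundred _aud.
```

Note that the output has the same number of characters as the input, so any data reduction has to come from a later step (packing the 15-symbol alphabet into 4 bits per character, for instance), which this sketch does not attempt.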

The  more  ambitious  of  the  two  en¬ 
codings  is  3-Bit  English  (3BE).  This  en¬ 
coding,  which  has  a  greater  data- 
reduction  efficiency  than  etaoin,  is 
based on context-dependent confusibility
studies and studies of redundancy
reduction in English prose
conducted  at  Matrix  Labs  in  Re¬ 
search  Triangle  Park  in  North  Caroli¬ 
na.  These  studies  show  how  to  map 
letters  into  an  eight-character  alpha¬ 
bet  so  as  to  lose  the  minimum  infor¬ 
mation  at  the  lexical  and  phonetic 
levels.  It  turns  out  that  phonetically 
confusible  letters  do  not  often  appear 
in  identical  contexts  for  the  simple 
reason  that  this  would  lead  to  audi¬ 
tory  confusions. 

Corbett's  encryption  technique 
maps,  for  example,  phonetically  sim¬ 
ilar  letters  such  as  d  and  t  together  in 
such  a  way  as  to  minimize  the  possi¬ 
bility  of  the  reader  mistaking  the 
whole  word.  Vowels,  which  are 
high-frequency  letters,  carry  little  in¬ 
formation  and  can  all  be  mapped 
into  a  single  symbol.  In  addition,  Cor¬ 
bett  hopes  to  develop  a  custom  font 
for  the  recipient  that  will  make  it 
possible  to  see  the  character  either  as 
a t or a d, for example. The visual
system, accustomed to resolving
ambiguities at the letter level using word-
level  information,  will  see  the 
character  appropriately.  This  encod¬ 
ing,  depending  as  it  does  on  sublimi¬ 
nal  cues,  language  use,  and  idiom, 
should  be  extremely  difficult  to 
crack  via  computer,  yet  should  be 
readable  by  any  English-speaking 
person  who  can  squint. 


Michael  Swaine 
editor-in-chief 




#130  AUGUST  1987 


Dr. Dobb's Journal of Software Tools

$2.95 ($3.95 CANADA)

FOR  THE  PROFESSIONAL  PROGRAMMER 


Unveiling  ANSI  C 


New  Tools  For  C: 

Optimizing  Technology 

Backtracking  Techniques 

Functions  with  a  Variable 
Number  of  Arguments 




Ray  Duncan  on 
DOS  3.3 


AI: Programming
in  LOOPS 




AUGUST  1987 


CONTENTS 


VOLUME  12,  ISSUE  8 


Cover story ►
Tools for C ►
Faster code ►
Arguments and curses ►
DOS 3.3 ►
LOOPS ►


C PROGRAMMING: Preparing for ANSI C  16

by  Richard  Relph 

The  final  draft  of  the  ANSI  standard  for  C  has  appeared  at  last. 
Richard  takes  a  close  look  at  the  standard  and  how  it  differs  from 
previous  C  compilers  and  practices. 

C  PROGRAMMING:  Backtracking  34 

by  Charles  Bowman 

Backtracking  is  a  common  technique  for  AI  programmers  using 
languages  such  as  LISP.  In  this  article  Charles  shows  how 
backtracking  can  be  a  useful  tool  for  C  programmers  as  well. 

UTILITIES:  What’s  the  DIFF?  30 

by  Don  Krantz 

No,  this  isn’t  a  reprint  of  the  August  1984  article  of  the  same 
name.  That  one  was  a  file  differencer  for  CP/M  Plus  in  Pascal; 
this  one  is  for  MS-DOS  and  it’s  in  C.  That’s  the  diff. 

C  COMPILERS:  Optimizing  Compilers  for  C  42 

by  Richard  Relph 

The  latest  compilers  from  Datalight  and  Microsoft  feature 
substantial  improvements  in  code  optimization.  Richard  explains 
the  various  techniques  used  and  gives  examples  of  the  resulting 
code  improvements. 


C CHEST  100

by  Allen  Holub 

Allen  explores  several  techniques  for  handling  a  variable 
number  of  arguments,  and  updates  several  topics,  including 
curses  for  MS-DOS. 

16-BIT SOFTWARE TOOLBOX  112

by  Ray  Duncan 

OS/2  isn't  the  only  news  in  the  microcomputer  world — Ray 
takes  a  look  at  some  of  the  new  features  of  MS-DOS  3.3  as  well  as 
providing the usual book commentaries and flames.

STRUCTURED PROGRAMMING  122

by  Namir  Clement  Shammas 

Continuing  the  exploration  of  language  translation,  Namir 
discusses translating code from MS-BASIC to C.

ARTIFICIAL  INTELLIGENCE  130 

by  Ernest  R.  Tello 

Last  month's  column  explored  the  hardware  aspects  of  the  Xerox 
1186  AI  workstation.  This  month  continues  with  a  discussion  of 
LOOPS,  the  machine’s  object-oriented  programming  language. 




About  the  Cover 

The  arrival  of  ANSI  C  is  something 
many  of  us  are  looking  forward 
to,  but  it  may  be  that  not  all  the 
surprises  are  pleasant  ones. 


This  Issue 

Our  focus  on  the  C  language 
ranges  from  the  cutting  edge  of 
compiler  technology  and  lan¬ 
guage  extensions  to  traditional 
programming  utilities. 


Next  Issue 

So  you  think  you’ve  got  it  all 
wired  just  because  you've 
memorized  the  first  three 
volumes  of  Knuth’s  Art  of 
Computer  Programming?  Be  sure 
to  check  out  next  month's  issue, 
which  will  explore  algorithms 
that  even  Knuth  hasn’t  heard  of. 


C  watershed  ► 


EDITORIAL  6
by Michael Swaine

RUNNING LIGHT  8
by Tyler Sperry

ARCHIVES  8

LETTERS  10
by you

SWAINE'S FLAMES
by Michael Swaine

ADVERTISER INDEX:  113
Where to find those products

THE STATE OF BASIC:  144
CASE statements and other constructs of the new BASICS

OF INTEREST:  146
Products for programmers


Dr.  Dobb's  Journal,  August  1987 



FORUM 


EDITORIAL 


DDJ has been
interested — make
that  involved — in  the 
spread  of  the  C  pro¬ 
gramming  language 
since  we  published 
the  source  code  for 
Ron  Cain's  Small  C 
compiler  back  in  1980. 

At  that  time  there  was 
no  commercial  C  com¬ 
piler  for  a  personal  computer.  In  the 
intervening  years,  C  has  grown  to  its 
present  position  as  the  preeminent 
language  for  software  development 
on  and  for  personal  computers. 

Critics  of  the  language  can  say  that, 
like  BASIC  before  it,  C  is  not  well 
structured,  is  hard  to  read,  and  is 
hard  to  maintain.  They  can’t  say  it  is 
unpopular.  C’s  myriad  supporters 
will  tell  you  that  it  provides  all  the 
low-level  access  they  normally  need; 
that  C  code,  properly  written,  is  well 
structured  and  is  not  hard  for  a  C 
programmer  to  read  or  to  maintain; 
that  the  language  allows  both  struc¬ 
ture  and  freedom;  that  it  is  an  em¬ 
powering  tool.  The  supporters'  view 
has  prevailed  and  has  led  to  C's  cur¬ 
rently  being  available  in  any  envi¬ 
ronment  you  care  to  name,  with 
hordes  of  compiler  vendors  riding 
the  wave. 

Today  an  ANSI  standard  for  C  has 
all  but  been  established;  you'll  find  a 
discussion  of  the  draft  standard  in 
this  issue. 

So  the  C  programming  language  is 
uncommonly  popular  among  serious 
developers,  is  available  in  almost  ev¬ 
ery  conceivable  programming  envi¬ 
ronment,  and  is  on  the  verge  of  stan¬ 
dardization.  But  the  waters  never 
settle  except  to  be  stirred  anew. 

Up  through  this  common  ground 
of  popularity,  ubiquity,  and  stan¬ 
dardization,  a  watershed  is  arising, 
with  C  product  designs  flowing  in 
one  of  two  directions:  toward  deeper 
levels  of  optimization  on  the  one  side 
and  toward  ease  of  learning,  ease  of 
use,  and  speed  of  the  development 
cycle  on  the  other.  Thus  we  see  a 


growing  pool  of  com¬ 
pilers  from  Datalight 
and  Microsoft,  for  ex¬ 
ample,  that  promise 
optimization technology
heretofore only
available  on  minicom¬ 
puters  and  main¬ 
frames;  and  we  also 
see  products  like  Bor¬ 
land's  Turbo  C,  Micro¬ 
soft’s  Quick  C,  and  Mark  Williams's 
Let's  C  that  promise  to  extend  the 
model  of  Turbo  Pascal  or  interpreted 
BASIC  to  C.  It  is  not  clear  whether  this 
parting  of  the  Cs  is  caused  by  the  Mo¬ 
ses  of  market  dynamics  or  by  some 
deep  tectonics  of  inherently  incom¬ 
patible  programmer  needs.  But  the 
decision  to  go  with  the  flow  and  to 
learn/buy/use  C  now  leads  to  anoth¬ 
er  decision  as  to  which  flow  with 
which  to  go. 

And  tomorrow?  The  C  program¬ 
ming  language  seems  so  widespread 
that  it  is  more  likely  to  adapt  than  to 
be  replaced.  It  seems  reasonable  that 
we  will  sooner,  rather  than  later,  see 
some  of  the  following:  4GLs  built  on 
top  of  C,  a  variety  of  preprocessors 
and  user  interfaces,  support  for  dif¬ 
ferent  compilation  models,  and  ex¬ 
tensions  of  the  C  packages  sold  in  the 
direction  of  rich  sets  of  version  con¬ 
trol  tools. 

In  the  future  C  will  change  into 
something  rich  and  strange. 


Michael  Swaine 
editor-in-chief 


Dr. Dobb's Journal of

Software Tools

FOR THE PROFESSIONAL PROGRAMMER


Editorial

Editor-in-Chief  Michael Swaine
Editor  Tyler Sperry
Managing Editor  Vince Leone
Assistant Editors  Sara Noah Ruddy, Levi Thomas
Technical Editors  Allen Holub, Richard Relph
Contributing Editors  Ray Duncan, Michael Ham, Namir Shammas, Ernest R. Tello
Copy Editor  Rhoda Simmons

Production

Production Manager  Bob Wynne
Art Director  Michael Hollister
Assoc. Art Director  Joe Sikoryak
Technical Illustrator  Frank Pollifrone
Cover Photographer  Michael Carr

Circulation

Circulation Director  Maureen Kaminski
Book Marketing Mgr.  Jane Shaminghouse
Circulation Coordinator  Kathleen Shay

Administration

Finance Director  Kate Wheat
Business Manager  Betty Trickett
Accounts Payable Supv.  Mayda Lopez-Quintana
Accts. Receivable Supv.  Laura Di Lazzaro

Advertising Director  Ferris Ferdon (415) 366-3600
Account Managers  see page 113
Promotions/Srvcs. Mgr.  Anna Kittleson
Advertising Coordinator  Charles Shively

Associate Publisher  Michael Swaine
Assistant  Sara Noah Ruddy

Dr. Dobb's Journal of Software Tools (USPS 307690)
is  published  monthly  by  M&T  Publishing  Inc.,  501  Gal¬ 
veston  Dr.,  Redwood  City,  CA  94063;  (415)  366-3600. 
Second-class  postage  paid  at  Redwood  City  and  at  ad¬ 
ditional  entry  points.  DDJ  is  published  under  license 
from  People’s  Computer  Company,  2682  Bishop  Dr., 
Suite  107,  San  Ramon,  CA  94583,  a  nonprofit 
corporation. 

Article  Submissions:  Send  manuscripts  and  disk 
(with  article  and  listings)  to  the  Editor. 

DDJ on CompuServe: Type GO DDJ

Address Correction Requested: Postmaster: Send
Form 3579 to Dr. Dobb's Journal, P.O. Box 27809, San
Diego,  CA  92128.  ISSN  0888-3076 

Customer  Service:  For  subscription  problems  call: 
outside  CA  (800)  321-3333;  in  CA  (619)  485-9623  or  566- 
6947.  For  book/software  order  problems  call  (415)  366- 
3600. 

Subscriptions:  $29.97  per  1  year;  $56.97  for  2  years. 
Canada  and  Mexico  add  $27  per  year  airmail  or  $10 
per  year  surface.  All  other  countries  add  $27  per  year 
airmail.  Foreign  subscriptions  must  be  prepaid  in  U.S. 
funds  drawn  on  a  U.S.  bank.  For  foreign  subscriptions, 
TELEX:  752-351. 

Foreign  Newsstand  Distributor:  Worldwide  Media 
Service  Inc.,  386  Park  Ave.  South,  New  York,  NY  10016; 
(212)  686-1520  TELEX:  620430  (WUI). 

Entire contents copyright © 1987 by M&T
Publishing  Inc.  unless  otherwise  noted 
on  specific  articles.  All  rights  reserved. 


M&T  Publishing  Inc. 

Chairman  of  the  Board  Otmar  Weber 
Director  C. F. von Quadt
President  and  Publisher  Laird  Foshay 




FORUM 


Last month I offered the imposter
theory to explain that fellow on the
right and his necktie. The fact that
neckties cut off the vital flow of
oxygen to the brain has always
encouraged me to stay as far away from
them as possible. Thus my doppelganger
theory for the photo of the new editor at
DDJ.

This  month,  however,  you'll  no¬ 
tice  we  have  virtually  the  same  pho¬ 
to  as  last  month.  Doesn't  this  seem 
odd  to  you?  I  mean,  this  guy's  been 
wearing  the  very  same  shirt  and  tie 
for  a  solid  month  now,  and  he’s  still 
smiling?  There’s  definitely  some¬ 
thing  strange  going  on  here. 

I  thought  about  it  quite  a  bit,  and  I 
have a possible explanation. My theory
is that the guy is a robot, an
audioanimatronic device like
Disneyland's Mr. Lincoln. Once a month
they  wheel  him  out,  plug  him  into 
the  wall  so  his  eyes  will  light  up,  take 
the  picture,  and  then  shove  him  back 
in  the  closet.  It’s  a  fantastic  theory,  of 
course,  and  there's  probably  nothing 
to  it.  Still ....  What  if  someone  was 
replacing  people  in  the  computer 
community  with  robots?  How  many 
replacements  would  it  take  before 
we  noticed  something  was  amiss? 

This  robot  hypothesis  goes  a  long 
way  toward  explaining  some  of  the 
strange  things  you  see  in  computer 
magazines  today.  For  example:  how 
is  it  that  someone  could  write  a  posi¬ 
tively  glowing  "preview”  of  Bor¬ 
land's  Turbo  C  months  before  the 
program  had  shipped  and  when  the 
only  information  available  was  a 
pitch from a company representative
and a short demonstration of the
program?  How  could  a  responsible 
journalist  do  something  like  that? 
With  my  theory,  the  answer  be¬ 
comes  obvious:  a  real  journalist 
wouldn’t  act  that  way,  but  a  robot 
could easily be programmed for the
job. Just set some global variables (such as
PRAISE = Max; CONTENT = Min) and
there you have it. No more troubles with
freelance writers, no more deadline hassles,
and no more unhappy advertisers. What
could be better for a magazine?

Well,  despite  all  those  advantages, 
the  folks  here  at  DDJ  are  still  operat¬ 
ing  with  the  traditional  organic  pro¬ 
cessors.  Bit-slice  editors  and  colum¬ 
nists  would  be  faster,  but  they’d  lack 
the  experience  and  judgement  that 
makes  a  magazine  worth  reading.  So 
though  this  is  our  annual  C  issue,  I 
have  to  confess  we  don’t  have  a  rush 
review  of  a  new  compiler.  In  the 
near  future,  we’ll  no  doubt  have 
some  interesting  things  to  say  about 
Turbo  C  (and  Microsoft’s  Quick  C)  but 
first  we  have  to  do  our  homework 
and  actually  spend  some  time  evalu¬ 
ating  the  products. 

One  last  thing  before  I  go.  If  you're 
a  hotshot  on  either  operating  systems 
or  68000  programming,  give  me  a  call 
at  (415)  366-3600  and  sell  me  on  a  bril¬ 
liant  article  idea.  The  December  and 
January  issues  are  coming  up  fast, 
and  I  still  haven’t  filled  all  the  pages. 


Tyler  Sperry 
editor 


Ten Years Ago in DDJ

"The  neologism  ’modem’  stands  for 
modulator-demodulator. The need for
modems arose when somebody wanted to
send  a  digital  signal  to  New  Medford  (for 
example)  and  noticed  that  there  was  a  tele¬ 
phone  line  going  to  New  Medford.  But  the 
telephone  lines  only  carried  audio.  So  this 
somebody  made  a  device  that  turned  a  0 
into  a  1070  Hz  tone  and  a  1  into  a  1270  Hz 
tone.  This  was  a  modulator.  At  the  other 
end  (probably  in  New  Medford)  a  device 
was  built  that  translated  a  1070  Hz  tone 
into  a  0  and  a  1270  Hz  tone  into  a  1.  This 
was  a  demodulator.  Very  much  like  a  cas¬ 
sette  interface.” — "Sour  Notes  on  a  Penny- 
whistle,"  Jef  Raskin,  DDJ,  August  1977. 

"Dear Editor Persons: Do you think this is
too  drawn  out  to  have  the  desired  effect? 
(That  is,  wide  eyes  with  dilated  pupils,  soft 
blue  glow  around  the  head  (caused  by 
surge in the Force), disorientation,
repeated pronouncements such as 'wow' and
fa-far-faarrout.)

"On  the  other  hand,  if  it  were  shortened 
to  remove  the  fun  and  games,  would  it  be 
appreciated  by  people  who  hadn’t  thought 
of  the  idea  already? 

"By  the  way,  is  it  original?  I  wrote  this 
between  12  and  2  in  the  morning  . . .” — 
letter  that  accompanied  manuscript  "On  the 
Effects  of  Filling  Cavities  Within  the  Fillings 
of Cavities Within . . .," Steve Whitham, DDJ,
August  1977. 

C  Notes 

"Alexander  Graham  Bell  invented  the 
tin  ear,  so  maybe  it’s  not  surprising  that 
Bell  Lab’s  best  software  should  have  names 
like  UNIX  and  C.” — "Binary  Trees  with  Tiny- 
C," Les Hancock, DDJ, June/July 1979.

"C is a disease. When I see people writing
spreadsheets in C, I think, 'They're out
of their minds.' It was designed to write
operating  systems.  Modula-2  is  good  for 
that  [writing  spreadsheets].  We’ll  do  a  C. 
We’ll  do  a  C  because  everyone  wants  a  C. 
But  in  Europe  C  is  seen  as  an  American  dis¬ 
ease,  and  here  people  are  trying  to  spread 
it." — Philippe  Kahn,  quoted  in  The  Software 
Designer,  DDJ,  October  1984. 

DDJ  in  a  Nutshell 

"Dr.  Dobb’s  Journal  began  as  a  forum  for 
sharing  information  and  ideas  about  pro¬ 
gramming  and  computers.  It  continues  to 
be  a  place  to  present  new  languages,  utili¬ 
ties,  tools,  applications,  algorithms,  discov¬ 
eries,  and  techniques  to  the  microcomput¬ 
ing  community.  Our  authors  primarily 
come  from  within  our  readership,  and  it  is 
this  reader  involvement  that  has  sustained 
and  guided  DDJ  throughout  the  years." — 
Editorial,  Reynold  Wiggins,  DDJ,  April  1984. 



Dr.  Dobb’s  Journal,  August  1987 


FORUM 


LETTERS 


Clear,  Compact  Code 

Dear  DDJ, 

While  reading  "Dimensional  Data 
Types"  by  Do-While  Jones  (May 
1987),  I  sat  stunned  in  disbelief.  He 
seems  to  miss  the  boat  on  virtually 
every  point  he  tries  to  make. 

“Compact  code  is  no  longer  neces¬ 
sary  or  desirable."  Several  responses 
to  this  absurd  statement  immediately 
come  to  mind.  But,  as  this  is  a  family 
publication,  just  ask  Microsoft  about 
its  Windows  project.  Just  about  every 
compiler  advertised  in  your  maga¬ 
zine  boasts  about  its  compact  code 
generation.  Compact  code  must  be 
important  to  somebody.  A  wise  engi¬ 
neer  once  told  me,  "If  software  ig¬ 
nores  the  hardware,  you  can  be  sure 
that  the  hardware  will  ignore 
the  software.” 

"The big money is now starting to go to people who
can write clear code." Mr. Jones seems to imply that
complex code and clear (that is, properly documented)
code are mutually exclusive. Again this is absurd. Some of
the prettiest and cleanest code I have ever seen was a
telecommunications system written in 8080 assembly language.
I can also just about guarantee that the assembly-language
code would be smaller and faster than the
equivalent written in Ada (if an Ada could be found that
ran on a small system). Ada, Pascal, and Modula-2 are not
synonymous with clean and easy to maintain code. Some
of the "dirtiest" production code I have worked on has
been in Pascal. Another wise programmer once said, "A real
programmer can write FORTRAN in any language."

"What makes this program difficult
to validate and maintain is the
cryptic number 335,300,000." Although
I agree with Mr. Jones' statement,
the way he made the program
“uncryptic”  was  to  add  two  pages  of 
unit  definitions  so  that  the  constant  c 
could  be  defined  and  then  divided  by 
2.  It  seems  to  me  that,  with  the  addi¬ 
tion  of  one  constant  declaration  and 
a  comment  line,  the  "bad”  program 
could  have  been  much  easier  to  read 
than  wading  through  the  "good” 
(that  is,  longer)  code. 

"The  overhead  isn’t  as  bad  as  it 
looks.”  I  guess  that,  in  a  large,  aca¬ 
demic,  mainframe  Ada  environment 
in  which  everything  is  slow,  com¬ 
pile/link/load/execution  times  are  a 
mere  nit.  I  wish  my  customers  were 
as  forgiving.  To  use  Mr.  Jones'  analo¬ 
gy  of  bank  computers  balancing 
checkbooks,  I  wonder  which  system 
bank  presidents  would  want — a  tight 
software  system  that  gives  them  an¬ 
swers  faster  or  an  inefficient  one  that 
takes  longer  but  has  "cleaner”  code? 
I  guarantee  that  they  don’t  care  what 
the code looks like but that they do
care how quickly the machines can
give them information.

"Programmers  should  no  longer 
waste  time  combining  constants  be¬ 
cause  the  compiler  should  do  it  any¬ 
way.”  Do  most  programmers  know 
what  code  their  compiler  actually 
generates?  I  doubt  it.  Do  all  compilers 
do  this  because  Mr.  Jones  thinks  they 
should?  I  doubt  it. 

Contrary  to  popular  belief,  per¬ 
formance  still  matters.  CPU  time  is 
not  infinite.  Memory  is  not  unlimited 
(virtual  memory  included).  In  the  ac¬ 
ademic  world  unpleasantries  such  as 
the  hardware  can  be  ignored.  In  the 
real  world,  they  cannot. 

Eric  Lundquist 

Centurion  Dealers  Computer  Corp. 

402  West  Bethany 

Allen,  TX  75002 

Programmer’s  Hell 

Dear  DDJ, 

"Factoring  in  Forth”  by  Michael  Ham 
(October  1986)  makes  almost  every 
mistake  in  the  book.  Although  the 
author  waxes  eloquent  about  factor¬ 
ing,  he  gives  listings  that  illustrate  the 
three  cardinal  sins  of  Forth. 

Although  it  is  common  to  recom¬ 
mend  short  Forth  definitions,  you 
should  not  lose  sight  of  two  of 
the  original  aims  of  the  lan¬ 
guage:  compactness  and 
speed.  Each  time  you  decom¬ 
pose  a  definition,  you  pay  a 
penalty  in  time  and  space. 
Therefore,  factoring  is  worth¬ 
while  only  if  one  of  the  fol¬ 
lowing  two  conditions  is  met: 
some  of  the  components  may 
be  useful  in  other  construc¬ 
tions,  or  the  components  con¬ 
tain  some  genuine  ideas  that 
may  well  be  separated  so  that 
you  can  analyze  their  imple¬ 
mentation.  Listing  Six  in  the 
article  provides  an  ideal  bad 
example — the  only  term  used 
later is @BIT, which is factored
using four new definitions.
You could, however,
define  it  more  efficiently  us¬ 
ing  only  standard  terms: 


: @BIT  SWAP 8 /MOD ROT +
  C@ SWAP 0 ?DO 2/ LOOP 1 AND
  NEG ;


FUNCTION  PROTOTYPES: 
Don’t  leave  home  without  them. 


(I use ?DO so that n n ?DO does nothing.)
Whether you want to introduce:

: AIM  SWAP 8 /MOD ROT + ;

depends  on  how  often  you  intend  to 
use  this,  but  the  other  three  new 
terms  are  useless;  one — S>B — is  a 
synonym  for  0<>.  Who  needs  it? 

Mr.  Ham  likes  to  make  definitions 
using  CREATE  and  then  storing.  He 
should use VARIABLE and ALLOT. This
is not a peccadillo. In debugging Forth
you  need  to  know  what  words  do. 
You  can  consider  the  dictionary  as 
being  divided  into  primitives  and 
high-level  terms.  Primitives  are  ma¬ 
chine-code  definitions;  there  has  to 
be  some  way  of  flagging  them.  Then 
you  need  a  disassembler  to  see  what 
is  there.  The  high-level  terms  are 
those  defined  in  terms  of  primitives 
and  other  high-level  terms.  On  the 
first  level,  you  have  two  species — de¬ 
fining  words  using  ;CODE  and  defin¬ 
ing  words  using  DOES>.  Those  that 
use  ;CODE  should  be  few  in  number 
so  they  can  be  traced.  Typical  exam¬ 
ples  are  :  and  VARIABLE.  You  recog¬ 
nize  a  colon  definition  by  looking  at 
the  contents  of  its  CFA.  For  DOES> 
there  is  no  problem  because  the  con¬ 
tents  of  the  CFA  signal  a  DOES>  defi¬ 
nition  and  the  first  word  of  the  pa¬ 
rameter  field  points  to  the  address 
where  the  procedure  is  given.  In¬ 
stead  of  the  definition  of  BITS  given 
in  the  listing,  Mr.  Ham  should  have 
used: 

VARIABLE  BITS  6  ALLOT 

This tells anyone that BITS is a storage
word  because  its  CFA  will  contain  the 
address  of  the  ;CODE  procedure  for 
VARIABLE. 

Finally,  Listing  Five  contains  a  di¬ 
saster — VECTOR:.  This  is  a  defining 
term  that  creates  words  requiring  an 
input,  and  wrong  input  here  could 
destroy  a  disk.  When  you  do  any  seri¬ 
ous  programming,  you  make  lots  of 
mistakes  before  you  get  it  right.  The 
most  serious  problem  with  Forth  is 
that  it  lets  you  do  anything  you  want, 
including making catastrophic mistakes.
The most common source of
such errors is storing data in the
wrong place because you may overwrite
code. At best you won't encounter
the code you have destroyed.
At second best you get a
crash. At worst you may have created
new code that makes sense but
writes  nonsense  to  a  disk  directory. 
I've  done  this  too  often. 

Mr.  Ham  goes  one  better  with  his 
execution  arrays — he  seems  to  think 
the  big  issue  is  a  matter  of  changing 
the  name  VECTOR:  to  EMPOWER.  He 
lists: 

4  DO-OPTION  (  unpredictable  results  ) 

where  DO-OPTION  is  defined  using 
VECTOR:. It would be nice if the results
were unpredictable, but I'll tell
you what will happen: the CPU will
start  reading  a  location  in  memory  as 
if  it  were  instruction  code.  If  you’re 
lucky,  it  reaches  nonsense  code  be¬ 
fore  much  has  happened  and  a  mere 
crash  results.  If  you  are  unlucky,  it 
will  start  executing  something  that  is 
indeed  unpredictable,  but  it  may  be  a 
disk-write  or  it  may  be  a  change  in 
DOS.  The  chances  of  wrecking  a  disk 
are  not  negligible,  even  if  small  (de¬ 
stroying  a  hard  disk  even  once  a  year 
is  not  worth  it). 

The  fact  is  that  a  definition  such  as 
VECTOR:  should  send  its  perpetrator 
to  Programmer’s  Hell.  (The  CASE 
statement  could  do  what  Mr.  Ham 
wants  with  complete  safety  and  no 
new  defining  terms.)  Regardless  of 
the  language  being  used,  the  first 
duty  of  any  programmer  designing 
an  interactive  program  is  to  antici¬ 
pate  wrong  input  from  users.  This  is 
the  hardest  part  of  programming.  At 
the  least,  users  pressing  the  wrong 
key  should  not  cause  a  catastrophe. 
What  are  called  "user  friendly”  pro¬ 
grams  generally  have  an  easy  time — 
they  are  genuinely  "programmer 
friendly”  because  the  users  have  lim¬ 
ited  options.  Command-driven  pro¬ 
grams,  especially  those  written  in 
Forth,  let  users  make  many  mistakes. 
Some  of  these  are  their  responsibil¬ 
ity:  if  they  have  to  type  ENDEDIT  to 
leave  a  word  processor,  it’s  not  the 
programmer’s  duty  to  worry  about 
whether  they  wanted  to  save  first.  In 
contrast,  if  typing  4  instead  of  3 
wrecks  a  disk  instead  of  taking  you  to 
DOS as you intended, you may be
tempted to enter the capital punishment
debate on the wrong side.

Carl  Herz 
McGill  University 
Mathematics  &  Statistics 
805  Sherbrooke  West 
Montreal,  QB 
Canada  H3A  2K6 
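
The safety argument in the letter above carries over to any language that dispatches through a table of routines. What follows is only a sketch in C rather than Forth, and it is neither Mr. Ham's VECTOR: nor Mr. Herz's CASE solution: the four handlers and the names do_option and last_option_run are hypothetical. The point it illustrates is the range check before the indirect call, so that an out-of-range number is refused instead of sending the CPU to an arbitrary address.

```c
#include <stddef.h>

static int last_action = 0;   /* records which handler ran last */

/* Hypothetical handlers standing in for real menu actions. */
static void do_edit(void)  { last_action = 1; }
static void do_print(void) { last_action = 2; }
static void do_save(void)  { last_action = 3; }
static void do_quit(void)  { last_action = 4; }

typedef void (*handler)(void);

static const handler options[] = { do_edit, do_print, do_save, do_quit };

/* Run option n (counting from 0).  Returns 0 on success, or -1,
   without calling anything, if n is outside the table. */
int do_option(int n)
{
    if (n < 0 || (size_t)n >= sizeof options / sizeof options[0])
        return -1;
    options[n]();
    return 0;
}

int last_option_run(void)
{
    return last_action;
}
```

With this table, do_option(2) runs do_save, while do_option(4) returns -1 and leaves everything untouched.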

Michael  Ham  replies: 

I  agree  with  Carl’s  two  conditions  re¬ 
garding  when  a  name  is  needed  but 
add  also  a  third:  factor  when  a  name 
will  make  the  code  more  readable. 
Compactness  and  speed  are,  of 
course,  desirable,  and  if  the  addition¬ 
al  definition  has  a  serious  negative 
impact  on  these,  by  all  means  re¬ 
think  the  factoring.  But  I  find  that  in 
looking  at  code  long  after  I  wrote  it,  I 
am  greatly  helped  by  names.  S>B 
(single  to  Boolean,  named  by  analogy 
with  S>D )  helps  me  understand 
what  is  happening  in  a  way  that 
0<>  does  not.  Another  example:  I 
typically  use  the  definition  0  CON¬ 
STANT  US>D  to  be  able  to  use  the  in¬ 
formative  label  US>D  (unsigned  sin¬ 
gle  converted  to  double)  in  my  source 
code  rather  than  a  mysterious  0.  Sim¬ 
ilarly,  AIM  was  easier  for  me  to  read 
and  remember  than  were  the  words 
in  its  definition.  In  the  application  I 
wrote,  the  speed  and  size  penalties 
were  negligible. 

Of  the  definitions  +BIT,  -BIT, 
@BIT,  and  ~BIT,  it  is  true,  as  Carl 
points  out,  that  I  used  only  @BIT  in 
the  column.  I  should  have  made  it 
clearer  that  the  other  words  were  of¬ 
fered  for  the  reader's  toolbox,  should 
he  or  she  sometime  want  to  set,  un¬ 
set,  or  toggle  a  bit  in  addition  to  fetch¬ 
ing  it. 

I  appreciate  Carl’s  suggestions  of 
using  VARIABLE  when  creating  ar¬ 
rays  of  constants.  The  values  within 
the variable can then be set with C!.
(Values  beyond  the  variable  can  still 
be  stored  with  C,  and  so  the  ALLOT  is 
not  needed  in  this  case.)  His  approach 
certainly  works,  and  I  recommend  it 
to  those  who  make  heavy  use  of 
disassemblers. 

I  also  agree  with  Carl’s  warnings 
about  the  disasters  that  can  ensue 
from  executing  random  locations  in 
memory.  I  have  never  had  the  mis¬ 
fortunes  he  describes,  but  I  can  un¬ 
derstand  that  they  are  possible.  I  am 
(continued  on  page  140) 




ARTICLES 


Preparing for ANSI C


As  most  of  you  know 
by  now,  there  is  an 
effort  afoot  by  the 
American  National  Stan¬ 
dards  Institute  (ANSI)  to 
standardize  the  C  lan¬ 
guage.  Some  articles  about 
the  standard,  most  of  them 
pretty  general  and  glossy, 
have  appeared  in  various  magazines.  This  article  is  in¬ 
tended  to  be  specific  and  useful.  I  hope  to  give  you  an 
idea  of  how  the  standard  may  be  different  from  the  C 
you're  used  to,  how  it  will  affect  the  code  you  have 
already  written,  and  how  you  can  minimize  your  ad¬ 
justments  to  the  new  compilers  by  taking  certain  actions 
now. 

Make  no  mistake,  ANSI  C  is  coming.  Microsoft,  AT&T, 
IBM,  and  many  other  companies  are  actively  participat¬ 
ing  in  the  standard  process  by  sending  representatives 
(usually  compiler  writers)  to  the  ANSI  C  committee  meet¬ 
ings,  which  are  held  once  every  three  months.  Many  of 
these  companies,  including  AT&T,  have  announced  that 
they  will  provide  ANSI-conforming  compilers.  Although 
the  standard  will  probably  not  be  formally  accepted  un¬ 
til  early  1988,  nearly  all  the  issues  have  been  settled.  The 
current working document (called the draft) will probably
be changed very little before it becomes "the"
standard.

I’ve  written  this  article  in  a  format  similar  to  the  orga¬ 
nization  of  a  normal  C  source  file.  After  discussing  a  few 
general  items  from  the  standard,  I'll  discuss  preproces¬ 
sor  directives,  module-scope  declarations,  function  defi¬ 
nitions,  function  bodies,  and  the  final  topic  will  be 
libraries. 


Richard  Relph,  846  Salt  Lake  Dr.,  San  Jose,  CA  95133.  Rich¬ 
ard  is  a  software  and  hardware  consultant.  He  has  written 
compilers  and  embedded  systems. 


General  Items 

To  get  around  limitations 
in  certain  standard  charac¬ 
ter  sets,  the  committee  has 
introduced  "trigraphs." 
Trigraphs  are  three-char¬ 
acter  sequences,  beginning 
with  ??,  that  the  compiler 
replaces  with  a  character 
that  cannot  be  entered  directly  into  a  given  machine — 
for example, ??= is used for #, ??! for |, and so on. I
doubt that trigraphs will see much use in the U.S., but in
certain countries they'll make the difference between
only admiring C and being able to use it. Note that trigraphs
are replaced even inside strings.
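
The substitution the compiler performs can be imitated at run time. The sketch below is not taken from the draft: replace_trigraphs is a hypothetical helper, and its table assumes the nine trigraphs of the standard as it was finally adopted. It shows that every ??x sequence from the table is replaced wherever it occurs, which is why trigraphs are replaced even inside strings.

```c
#include <stddef.h>
#include <string.h>

/* Third character of each trigraph, and the character the whole
   three-character sequence stands for (same position in each string). */
static const char trigraph_third[] = "=(/)'<!>-";
static const char trigraph_char[]  = "#[\\]^{|}~";

/* Hypothetical helper: copy src to dst, replacing every trigraph.
   dst needs room for strlen(src) + 1 bytes, since the result is
   never longer than the input. */
void replace_trigraphs(char *dst, const char *src)
{
    while (*src != '\0') {
        const char *hit = NULL;

        if (src[0] == '?' && src[1] == '?' && src[2] != '\0')
            hit = strchr(trigraph_third, src[2]);

        if (hit != NULL) {
            *dst++ = trigraph_char[hit - trigraph_third];
            src += 3;                 /* skip the whole trigraph */
        } else {
            *dst++ = *src++;          /* ordinary character */
        }
    }
    *dst = '\0';
}
```

When writing test input for such a routine, spell each question mark as \? so that a compiler that honors trigraphs does not convert the test string first.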

One  problem  area  the  standard  addresses  is  the  name 
space  of  programs.  Which  names  are  permissible  for 
programmers  to  use,  which  are  reserved  by  the  compil¬ 
er,  and  which  are  used  by  the  library  has  always  been 
an  open  issue. 

The  new  standard  includes  a  rule  that  says  all 
names  beginning  with  _  belong  to  the  implemen¬ 
tation  and  should  not  be  used  by  any  program.  Also, 
any  identifiers  specified  in  the  standard  as  keywords, 
functions,  or  macros  are  reserved,  regardless  of  wheth¬ 
er  or  not  the  header  file  that  defines  the  function  or 
macro  is  included.  All  other  names  belong  to  the 
programmer. 

In  order  to  allow  programmers  to  write  programs 
and  be  able  to  compile  them  on  any  machine  that 
has  an  ANSI-conforming  compiler,  several  compilation 
limits  have  been  given  lower  bounds.  These  are  de¬ 
scribed  in  Table  1,  page  18.  In  addition,  all  types  have 
been  given  minimum  ranges,  which  are  provided  to  a 
program  through  a  header  file.  Basically,  char  objects 
are  a  minimum  of  8  bits,  shorts  are  16  bits  or  more,  ints 
are  at  least  as  big  as  shorts,  and  longs  are  a  minimum  of 
32  bits. 
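
As a minimal sketch, the minimum ranges can be checked from a program through limits.h. The macro names CHAR_BIT, SHRT_MAX, INT_MAX, and LONG_MAX are those of the standard header; the function meets_minimum_ranges is a hypothetical name used only for illustration.

```c
#include <limits.h>

/* Returns 1 if this implementation meets the minimum ranges
   described above, 0 otherwise. */
int meets_minimum_ranges(void)
{
    return CHAR_BIT >= 8                 /* char: at least 8 bits   */
        && SHRT_MAX >= 32767             /* short: at least 16 bits */
        && INT_MAX  >= SHRT_MAX          /* int: no smaller than short */
        && LONG_MAX >= 2147483647L;      /* long: at least 32 bits  */
}
```

On a conforming compiler the function always returns 1; its value is as a reminder of what a portable program may and may not assume.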


by  Richard  Relph 


What  changes  will  you  need 
to  make  your  programs 
work  with 
an  ANSI  compiler? 




The  Preprocessor 

The  preprocessor  has  always  been  a  part  of  C  but  sepa¬ 
rate.  Because  of  this,  and  its  lack  of  certain  features,  the 
preprocessor  has  fostered  more  implementation  differ¬ 
ences  than  any  other  part  of  C  except  the  library.  Now 
the  preprocessor  is  a  part  of  the  language,  specified 
with  the  same  rigor  as  any  other  part. 

Simple #define directives are the same as before: you
can  still  define  an  identifier  to  mean  something  else. 
What  has  changed  is  the  range  of  possibilities.  The  big¬ 
gest  changes  are  "string-izing,”  token  pasting,  and  re¬ 
cursive  definitions. 

Because  it's  often  convenient  to  be  able  to  use  a  pa¬ 
rameter  both  inside  and  outside  a  string,  a  string-izing 
preprocessor  operator  has  been  defined.  Consider  the 
following  example: 

#define val(x) printf( "x = %d\n", x )

val( xyzzy ) is replaced either by:

printf( "xyzzy = %d\n", xyzzy )

or by:

printf( "x = %d\n", xyzzy )

In  the  past,  compilers  could  not  be  relied  upon  to  re¬ 
place  the  occurrence  of  a  formal  argument  within  a 
string  literal  with  the  actual  argument.  In  fact,  K  &  R 
specified  that  string  literals  were  just  that  and  they 
should  not  be  scanned  for  replacement.  The  facility  is 
useful,  however. 

So, # occurring before the name of a formal argument
is now replaced by the actual argument written as a string
literal, as in the following example:

#define val(x) printf( #x " = %d\n", x )

With this definition, val( xyzzy ) is always replaced by:

printf( "xyzzy" " = %d\n", xyzzy )

which is equivalent to:

printf( "xyzzy = %d\n", xyzzy )

This  example  also  shows  another  new  feature — adja¬ 
cent  string  concatenation.  The  standard  specifies  that 
adjacent  (in  the  token  sense,  disregarding  white  space) 
string  literals  shall  be  combined  into  a  single  string  liter¬ 
al.  This  means  that  "1"  "2"  "3"  is  identical  to  "123". 
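
Both features can be tried together. In this sketch the macro FORMAT_VAL and the function digits are hypothetical names; sprintf is used instead of printf only so that the result lands in a buffer where it can be inspected.

```c
#include <stdio.h>

/* #x becomes a string literal holding the argument's spelling; the
   compiler then joins it with the adjacent " = %d\n" literal. */
#define FORMAT_VAL(buf, x) sprintf(buf, #x " = %d\n", x)

/* Adjacent string literals are combined into a single literal. */
const char *digits(void)
{
    return "1" "2" "3";
}
```

Given int xyzzy = 42 and a large enough character array buf, FORMAT_VAL(buf, xyzzy) stores the text "xyzzy = 42" followed by a newline.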

Another  useful  but  nonstandard  feature  of  some  im¬ 
plementations  is  token  pasting — the  creation  of  a  single 
token  from  two  separate  ones.  Many  C  compilers  pro¬ 
vide  this  facility  with  the  following  mechanism: 

#define x1 23
#define m(z) z/* */1

Here m( x ) is replaced by x1, which in some compilers is
replaced by 23. The new, approved way to do the replacement
all the way to 23 is:

#define x1 23
#define m(z) z##1

so that m( x ) produces 23.
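
A compilable sketch of the same idea follows. The macro name PASTE1 and the function pasted_value are hypothetical; the ## operator is the one the draft defines.

```c
#define x1 23
#define PASTE1(z) z##1

int pasted_value(void)
{
    /* PASTE1(x) becomes the single token x1, which is then
       rescanned and replaced by 23. */
    return PASTE1(x);
}
```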

In  the  past  the  single  most  reliable  way  to  break  a  C 
compiler  (and  use  up  inordinate  amounts  of  CPU  time 
and  memory  space)  was  to  write  a  sequence  such  as: 

#define x x
x 

This  should  no  longer  wreak  havoc.  The  standard 
says  that  a  macro,  once  expanded,  turns  itself  off  for  the 
duration  of  rescanning.  This  has  potential  usefulness  in 
statements  such  as  the  following: 

#define sizeof (int) sizeof
#define char unsigned char

These  fragments  depend  on  another  new  feature  of  the 
draft — that  of  defining  keywords. 
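
The "turns itself off" rule can also wrap an ordinary function in a macro of the same name. This is a sketch with hypothetical names (square, square_calls, counted_square, square_call_count); it deliberately avoids redefining a keyword, which can collide with the standard headers.

```c
static int square_calls = 0;

static int square(int n)
{
    return n * n;
}

/* While this macro is being expanded, the name square is not
   expanded again, so the inner square(n) calls the real function. */
#define square(n) (square_calls++, square(n))

int counted_square(int n)
{
    return square(n);          /* expands once, then stops */
}

int square_call_count(void)
{
    return square_calls;
}
```

The function must be defined before the macro; after the #define, every use of square(...) in the file is counted.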

Include  files  are  related  to  both  the  preprocessor  and 
library  issues.  The  standard  now  specifies  which  include 
files  must  exist  and  their  contents.  The  list  of  include  files 
that  must  exist  is  provided  in  Table  2,  below.  Note  that  the 
files  from  this  list  are  the  only  required  files  and  should 
be  used  when  referring  to  a  standard  function  or  macro.  I 
cannot  recommend  strongly  enough  that  you  should  in¬ 
clude  these  files  when  a  definition  is  needed. 

Including  these  files  has  several  advantages.  First,  a 
prototype  is  provided  for  each  of  the  related  functions, 
which  aids  in  diagnosing  improper  library  usage.  (I’ll  dis¬ 
cuss prototypes later in this article.) Second, some functions,
such as printf, require a prototype before use because
they take a variable number of arguments. Such
functions  cannot  be  called  with  maximum  efficiency  on 
some  machines  (including  the  8086),  and  therefore  the 
compiler  is  allowed  to  require  you  to  specify  when  it 
must  use  suboptimal  code.  Third,  many  compilers  al¬ 
ready  (and  more  will)  define  "internal”  functions  for  im¬ 
portant  library  functions. 

Such  functions  as  memcpy  and  strcpy  can  often  be  per¬ 
formed  in-line  (without  a  function  call)  with  a  great  in¬ 
crease  in  speed  and  very  little  increase  in  code  space. 
(MetaWare,  for  example,  already  uses  in-line  function  ex¬ 
pansion  for  some  functions.)  Fourth,  many  library  func¬ 
tions  do  not  modify  one  or  more  arguments.  In  such  situa¬ 
tions,  the  compiler  need  not  worry  about  such  functions 
disrupting  an  optimization  because  the  library  declara¬ 
tions  in  the  header  files  provide  the  compiler  with  the 
needed  information.  Without  this  information,  the  com¬ 
piler  must  assume  that  the  function  modifies  anything  it 
can. 

Many of these advantages may not be present in compilers
for some time, but rest assured that they will be
eventually. Many vendors that build Pascal and C compilers
could easily take advantage of the more optimal calling
convention provided by Pascal. And many recently
released compilers support type-checking features to aid
library usage diagnosis.

   15  "levels" of statements
    6  nesting levels of conditional compilation directives
   12  type modifiers per declarator
  127  levels of parentheses
   31  characters in 2 cases of significance in internal names
    6  characters in 1 case of significance in external names
  511  external names in one source file
  127  local variables per block
 1024  macros per source file at any time
   31  parameters to either a function or macro
  509  characters in a "logical" source line
  509  characters per string constant
32767  bytes in a single object
    8  nesting levels of include files
  257  cases in a switch statement

Table 1: Lower bounds of several compilation limits

Module-Scope  Declarations 

The next major section of your program, after the #includes
and #defines, is usually module-level declarations
and definitions, which is where variables manipulated
by the source file are declared or defined. The standard's
principal changes here concern types.

Characters  still  have  implementation-defined  signed¬ 
ness,  although  all  types  from  char  through  long  may  have 
specified  signedness  via  the  unsigned  and  signed  type 
modifiers.  Signed  is  required  only  for  chars  and  bit  fields 
as  all  other  types  default  to  signed. 

There are now three floating-point types: float, double,
and long double (note that long float is no longer
acceptable).

There is also a new type called enum, which allows
programmers to associate named values (not unlike #defined
values) to variables of enum type. No new functionality
is created here, only an aid to documentation. The
classic  example  is: 

enum color { red, blue, green, white };
enum  color  pixel; 

Here,  enum  color  is  a  type  (just  as  struct  tag  is)  to  which  the 
values  red,  blue,  green,  and  white  apply,  and  pixel  is  an 
instance  of  an  enum  color  variable.  Certain  features  of 
enum  are  important  to  keep  in  mind.  First,  red  can  appear 
anywhere  an  integer  constant  can  appear,  even  in  ex¬ 
pressions  that  do  not  involve  the  enum  color  type.  As  a 
result,  only  one  enum  can  declare  the  name  red.  Further, 
no  variables  can  be  so  named.  A  variable  with  enum  type 
behaves  identically  to  an  integer  in  any  expression.  If  the 
compiler  determines  that  a  particular  set  of  enum  values 
fits  in  a  smaller  type,  it  is  free  to  allocate  less  space  to  any 
objects  of  that  enum  type.  Last,  you  can  specify  values  for 
any  or  all  of  the  names  in  enum. 
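
The points above can be seen in a short sketch. The explicit value given to green, the histogram array, and the functions color_name and count_pixel are assumptions added for illustration; only the enum color declaration comes from the example in the text.

```c
/* Explicit values may be given to any or all names; the others
   continue counting from the previous one: 0, 1, 10, 11. */
enum color { red, blue, green = 10, white };

/* An enumeration constant is an integer constant, so it can be
   used to size an array. */
static int histogram[white + 1];

const char *color_name(enum color c)
{
    switch (c) {                 /* ...and to label a case */
    case red:   return "red";
    case blue:  return "blue";
    case green: return "green";
    case white: return "white";
    }
    return "unknown";
}

/* A variable of enum type behaves like an integer, here as an
   array subscript.  Returns the updated count for that color. */
int count_pixel(enum color pixel)
{
    return ++histogram[pixel];
}
```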

The New Void

The ANSI standard also gives C a void type, which is useful in three
contexts: function return types, pointed-to types, and prototypes.

Many existing C compilers implement void-type functions. These
"functions" are really procedures that do things but don't return a
value. Examples include exit and longjmp (which don't return at all)
and free.

float.h
limits.h
stddef.h
assert.h
ctype.h
locale.h
math.h
setjmp.h
signal.h
stdarg.h
stdio.h
stdlib.h
string.h
time.h

Table 2: Include files that must exist

Void pointers are pointers that do not point at anything in particular but
that  can  point  to  anything.  Malloc  returns  a  void  pointer, 
and  free  accepts  one.  The  neat  thing  about  void  pointers  is 
that  they  can  be  assigned  to  any  pointer  and  can  be  as¬ 
signed  from  any  pointer.  So  all  those  malloc,  calloc,  and 
realloc  calls  no  longer  need  to  have  casts  in  front  of  them. 
And  now  you  finally  have  a  way  to  store  a  pointer  with¬ 
out  having  to  invent  some  fake  type  for  it. 
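
A minimal sketch of both conveniences follows; duplicate is a hypothetical helper, not a library function. The void pointer returned by malloc is assigned without a cast, and the helper accepts and returns pointers to any object type.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated copy of the n
   bytes at src, or NULL if no memory is available. */
void *duplicate(const void *src, size_t n)
{
    void *copy = malloc(n);      /* no cast needed */

    if (copy != NULL)
        memcpy(copy, src, n);
    return copy;
}
```

A caller can write int *p = duplicate(values, sizeof values); with no cast on either side, and later pass p straight to free.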

The  last  use  of  void  is  in  prototypes,  which  I'll  discuss 
later  in  the  section  on  function  definitions. 

Const and volatile are new keywords that are used in
variable  declarations  and  definitions.  Const  tells  the  com¬ 
piler  that  the  associated  object  cannot  be  modified  (by  the 
compiler’s  generated  code).  It  serves  mostly  to  aid  the 
compiler  in  determining  where  objects  can  be  placed  and 
to  help  the  compiler  warn  of  an  attempt  to  modify  the 
object — although  a  clever  compiler  realizes  that  once  the 
value  has  been  fetched,  it  need  never  be  fetched  again. 
Volatile  is  a  directive  to  the  compiler  that  every  pro¬ 
grammed  read  and  write  of  the  object  must  take  place  as 
specified.  A  compiler  cannot  optimize  out  reads  or  writes 
to such objects. Usually, volatile objects are either I/O objects
(such as UART registers) or semaphores of some sort.
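
The two keywords might be used as in the sketch below. Everything named here is hypothetical: days_in_month is a stand-in for any read-only table, and ready_flag plays the part of a status register or a flag set by an interrupt routine, with set_ready imitating that routine.

```c
/* const: the program never writes this table, so the compiler is
   free to place it in read-only memory. */
static const int days_in_month[12] = {
    31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
};

/* volatile: every read of this object in the source must really
   be performed; the loop below may not cache its value. */
static volatile int ready_flag = 0;

void set_ready(void)             /* imagine an interrupt doing this */
{
    ready_flag = 1;
}

/* Poll the flag up to max_polls times.  Returns how many polls
   had been made when it was seen set, or -1 if it never was. */
int wait_for_ready(int max_polls)
{
    int polls;

    for (polls = 0; polls < max_polls; polls++)
        if (ready_flag)
            return polls;
    return -1;
}

int days(int month)              /* month runs from 1 to 12 */
{
    return days_in_month[month - 1];
}
```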

External  objects  have  also  changed  in  a  couple  of  ways. 
First,  each  such  object  must  be  "defined"  exactly  once.  A 
definition is a declaration outside the scope of any function
that does not include the keyword extern. Unlike under
Unix compilers, you cannot (portably) define an object more than
once. Second, each such object must (unenforceably) be
unique  in  the  first  six  (yes,  6)  characters  without  regard  to 
case  distinctions.  Note  that  this  restriction  does  not  apply 
to  any  "internal"  names. 

Function  Definitions 

The  use  of  prototypes  is  probably  the  most  significant 
change  to  the  language  that  the  committee  has  made.  Us¬ 
ing  them  will  improve  documentation,  help  the  compiler 
detect  stupid  mistakes,  generate  better  code,  enhance 
portability,  and  assure  compatibility  with  future  C  stan¬ 
dards.  I  alluded  to  many  of  these  features  when  I  dis¬ 
cussed  include  files. 

A  prototype  improves  documentation  because  it  de¬ 
clares  the  type  (and  optionally  the  name)  of  the  argu¬ 
ments  to  a  function.  For  example,  the  standard  function 
memcpy  could  have  the  following  prototype  declaration: 

void  *memcpy(  void  *dest,  const  void  *src,  size_t  n  ); 

This  declaration  tells  the  compiler  that  memcpy  is  a  func¬ 
tion  that  returns  a  pointer  to  an  unspecified  object.  It 
takes  exactly  three  parameters — the  first  has  the  name 
dest, the second src, and the third n. The first two parameters
are pointers to unspecified objects, and the last is an
integral type large enough to hold the size, in bytes, of any
object. Finally, the second parameter is a pointer to
an area in memory not modified by the function.
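
A prototype does more than document. With one in scope, the compiler converts each argument to the declared parameter type. In this sketch the functions half and half_of_three are hypothetical.

```c
/* Prototype: one double parameter, double result. */
double half(double value);

double half_of_three(void)
{
    /* Because the prototype is in scope, the int argument 3 is
       converted to double before the call is made. */
    return half(3);
}

double half(double value)
{
    return value / 2.0;
}
```

Without the prototype, the 3 would be passed as an int to a function expecting a double, an error that older compilers accepted silently.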

Not  including  a  prototype  for  a  function  that  takes  a 
variable  number  of  arguments  could  be  fatal.  The  reason 
is  this:  C  is  almost  unique  in  its  ability  to  deal  with  func¬ 


tions  that  take  a  variable  number  of  arguments.  Pascal, 
Modula-2,  Ada,  and  FORTRAN  cannot  deal  with  user-pro¬ 
vided  functions  that  take  a  variable  number  of  argu¬ 
ments.  Because  of  this,  the  callee  (the  function  being 
called)  can't  use  instructions  designed  to  clean  up  the 
stack  on  function  exit.  In  the  8086,  for  example,  the  RET 
instruction  has  an  optional  operand  that  specifies  the 
number  of  bytes  to  be  added  to  the  stack  pointer  after  the 
return  address  has  been  fetched.  Instead  of  this,  the  func¬ 
tion  has  to  use  a  simple  RET,  and  the  calling  function  (say, 
main)  has  to  clean  up  the  stack  when  it  gets  control  back, 
usually  with  an  ADD  SP,n  instruction.  Not  only  is  this  se¬ 
quence  slower  but  it  is  also  larger  because  the  ADD  ap¬ 
pears  everywhere  a  call  to  printf  occurs  rather  than  once 
at  the  end  of  printf.  Of  course,  this  requirement  does 
mean  that  if  the  caller  and  the  callee  disagree  about  the 
number  of  arguments,  at  least  the  stack  doesn’t  get  mis¬ 
aligned. 

To  let  C  use  the  faster  return  mechanism,  ANSI  says,  in 
essence,  that  you  must  tell  the  compiler  the  names  of  all 
functions  that  take  a  variable  number  of  arguments.  The 
way you do this is by supplying a prototype. If the last
argument in a prototype is ..., then the compiler knows
that  the  function  takes  a  variable  number  of  arguments 
and  therefore  that  it  must  use  the  larger,  slower  return 
mechanism.  It  is  important  to  note  that  the  standard  spec¬ 
ifies  that  the  compiler,  upon  encountering  a  call  to  a  func¬ 
tion  that  it  does  not  have  a  prototype  for,  can  assume  that 
the  function  takes  a  fixed  number  of  arguments  of  the 
supplied  types  (after  promotion). 
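
As a minimal sketch of what such a prototype looks like in practice, the fragment below declares and defines a function whose last parameter is the ellipsis. The name sum_ints and its calling convention (a count followed by that many ints) are my own assumptions for illustration, not something the standard supplies; the va_start, va_arg, and va_end macros come from stdarg.h.

```c
#include <stdarg.h>

/* The trailing "..." in the prototype tells the compiler that
   sum_ints takes a variable number of arguments. */
int sum_ints(int count, ...);

/* Hypothetical example function: adds up `count` int arguments. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);              /* start after the last named parameter */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);     /* fetch the next argument as an int */
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 10, 20, 30) should be compiled with the prototype in scope, so the compiler knows which calling sequence to use.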

The  header  files  supplied  with  an  ANSI-conforming 
compiler  include  prototype  information  for  each  func¬ 
tion  in  the  library. 

The  last  problem  with  prototypes  is  how  you  specify 
that a function takes no arguments because void foo(); is
already  legal.  This  is  where  the  keyword  void  shows  up 
again.  To  declare  a  function  that  takes  no  arguments,  you 
place  void  between  the  parens. 
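
A short sketch of the difference, under the ANSI rules described here. The function names are invented for the example.

```c
/* Old-style declaration: under the rules described in the text it
   says nothing about the arguments, so calls cannot be checked. */
int old_style();

/* Prototype: void between the parentheses means "no arguments". */
int no_args(void);

int old_style() { return 1; }
int no_args(void) { return 2; }
```

With the second form in scope, a call such as no_args(5) can be diagnosed by the compiler.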

Function  Bodies 

The  only  changes  to  the  language  that  affect  actual  code 
are  all  enhancements  except  for  one,  which  is  a 
clarification. 

Switch  expressions  can  be  any  integral  type,  up  to  and 
including  unsigned  long.  Floating-point  expressions  can 
be  evaluated  in  any  of  the  floating-point  types.  Structures 
(and  unions)  can  be  passed  to  and  returned  from  func¬ 
tions  or  assigned.  Arrays  can  have  their  addresses  taken, 
and  the  resultant  value  is  of  type  pointer  to  array,  which 
is  decidedly  different  from  pointer  to  first  element  of 
array. 
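
The distinction is easiest to see in pointer arithmetic. In this sketch (the array table and the helper names are assumptions made for the example), adding 1 to a pointer to the array steps over the whole array, while adding 1 to a pointer to the first element steps over a single int.

```c
#include <stddef.h>

static int table[5];

/* Pointer to the whole array: its type is int (*)[5]. */
static int (*whole)[5] = &table;

/* Pointer to the first element: its type is int *. */
static int *first = table;

/* Number of bytes skipped when 1 is added to each kind of pointer. */
size_t step_whole(void)
{
    return (size_t)((char *)(whole + 1) - (char *)whole);
}

size_t step_first(void)
{
    return (size_t)((char *)(first + 1) - (char *)first);
}
```

Both pointers hold the same address; only their types, and therefore their arithmetic, differ.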

Two  subtle  changes  are  of  great  importance  to  8086 
programmers. First, sizeof does not return an int (it is
size_t, defined in stddef.h), and malloc and related functions
expect size_t-type quantities when dealing with
numbers of objects (usually chars). The other change is
that  pointers  to  functions  are  different  from  pointers  to 
data. 
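
A sketch of the habit this implies: keep object counts and byte counts in size_t from end to end rather than passing them through an int. The helper make_ints is hypothetical, and the overflow check is my own addition.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocate room for `count` ints.  The byte count is computed in
   size_t, the type that sizeof yields and malloc expects. */
int *make_ints(size_t count)
{
    size_t bytes;

    if (count > (size_t)-1 / sizeof(int))
        return NULL;                  /* count * sizeof(int) would wrap */
    bytes = count * sizeof(int);
    return (int *)malloc(bytes);
}
```

On a machine where int is 16 bits but objects can be larger, an int byte count can silently truncate; size_t avoids that.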

The  only  new  operator  is  unary  +.  With  it  you  can 
force  the  compiler  to  evaluate  subexpressions  by  them¬ 
selves  instead  of  in  the  context  of  some  larger  expression. 
For example, a*b/c could be evaluated either by multiplying
a and b then dividing by c, or by dividing b by c
and  then  multiplying  by  a.  Now  you  can  specify  the  or¬ 
der,  if  you  think  it  is  important.  The  expression  a  *  +(b/c) 
(note  the  unary  plus)  forces  the  compiler  to  do  the  divide, 
whereas  +(a  *  b)/c  forces  the  compiler  to  do  the  multi¬ 
ply.  This  construct  does  not  force  the  compiler  to  do  ei¬ 
ther  one  first  (although  it  is  hard  to  imagine  how  these 
examples  could  be  done  otherwise),  it  merely  forces  the 
compiler  to  fully  evaluate  the  subexpression  by  itself 
without  combining  it  with  any  other  components  of  the 
expression. 
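
To see why the grouping can matter, here is a small sketch using integer operands, where division truncates. The function names and the sample values in the note below are assumptions chosen for the illustration.

```c
/* The multiply is evaluated as a unit, then divided. */
int multiply_first(int a, int b, int c)
{
    return +(a * b) / c;
}

/* The divide is evaluated as a unit, then multiplied. */
int divide_first(int a, int b, int c)
{
    return a * +(b / c);
}
```

With a = 7, b = 5, and c = 2, the first form gives 17 (35 / 2) and the second gives 14 (7 * 2).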

The  last  item  I’ll  mention  here  is  unsigned  vs.  value- 
preserving  conversions.  The  following  code  has  two  pos¬ 
sible  answers,  depending  on  which  of  the  two  conversion 
rules  you  apply: 

unsigned  char  uc; 
int  i,  x; 

uc  =  2; 
i  =  -1; 

x  =  ((uc  *  i)  /  2); 

The  only  place  in  which  this  is  a  problem  is  when  an 
unsigned  quantity  is  expanded  in  size  in  the  course  of  one 
subexpression  and  this  subexpression  is  expanded  yet 
again; shifted right; or is an operand of /, %, <, <=, >, or
>= as part of a larger expression.

In the unsigned-preserving case, (uc * i) produces an
unsigned int (value 0xfffe with 16-bit ints), which when divided by 2
yields 0x7fff and gives (signed) x the value 32,767. In the
value-preserving case, (uc * i) produces a signed int (if all
possible unsigned chars are representable in a signed int)
with value -2 (hex 0xfffe), which when divided by 2
yields -1 (hex 0xffff). This example shows a case in which
value  preserving  is  useful,  but  I  could  show  other  exam¬ 
ples  in  which  unsigned  preserving  is  better  behaved.  All 
this  is  irrelevant  if  you  use  casts.  In  my  example,  the 
placement  of  an  int  cast  before  either  the  uc  or  the  (uc  *  i) 
would  have  caused  the  same  result  to  be  produced  in 
either  case.  Most  code  works  equally  well  in  either  envi¬ 
ronment,  so  don't  lose  any  sleep  about  this  issue.  In  fact, 
in  all  the  source  code  for  Unix,  I  can't  think  of  a  single 
instance  in  which  this  matters. 
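
The effect of the casts can be sketched as follows. The two helper functions are hypothetical; each one pins down the arithmetic with casts so that the result no longer depends on which conversion rule the compiler applies.

```c
/* Force signed arithmetic: uc is converted to int before the multiply. */
int signed_result(unsigned char uc, int i)
{
    return ((int)uc * i) / 2;
}

/* Force unsigned arithmetic: both operands become unsigned int. */
unsigned int unsigned_result(unsigned char uc, int i)
{
    return ((unsigned int)uc * (unsigned int)i) / 2u;
}
```

For uc = 2 and i = -1, the signed version yields -1 on any machine, and the unsigned version yields the largest unsigned int divided by 2, which is 32,767 when ints are 16 bits wide.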

Libraries 

An  important  change  to  the  C  language  is  that  a  library  is 
now  specified.  All  hosted  implementations  must  provide 
all  the  functions  specified.  Table  3,  right,  lists  all  the  func¬ 
tions  to  be  included  in  prototype  form.  Note  that  the  open, 
read,  write,  close,  creat,  and  unlink  functions  are  all  miss¬ 
ing  from  this  set.  These  were  deemed  the  domain  of  the 
operating  system  and  somewhat  redundant,  given  fopen 
and  so  on. 

Library  function  names  and  macros  are  reserved.  The 
reason  is  so  that  function  a  in  the  library  can  reliably  call 
function  b  without  worrying  that  the  programmer  has 
replaced b with a function of his or her own. It also allows
the compiler to recognize these functions and to generate
special  code  for  them  if  it  wants  to. 

The  name-space  issue  I  mentioned  under  “General 
Items”  will  create  some  problems  in  the  near  term  as 
compiler  vendors  try  to  decide  what  to  do  about  the  now 
“nonstandard”  functions  in  their  libraries  that  have 
names  that  do  not  belong  to  the  implementation.  Two 
possible  solutions  exist.  The  first  is  to  deliver  these  func¬ 
tions  as  an  add-on  library,  preserving  their  names  in  this 
library  separate  from  the  standard  library.  The  other  is  to 
change  the  names  of  these  functions  in  the  standard  li¬ 
brary  to  have  leading  underscores  and  then  provide 
header  files  that  users  can  include  that  define  each  of  the 

assert.h   void assert( int expression )
ctype.h    int isalnum( int c )
           int isalpha( int c )
           int iscntrl( int c )
           int isdigit( int c )
           int isgraph( int c )
           int islower( int c )
           int isprint( int c )
           int ispunct( int c )
           int isspace( int c )
           int isupper( int c )
           int isxdigit( int c )
           int tolower( int c )
           int toupper( int c )
locale.h   char *setlocale( int category, char *locale )
math.h     double acos( double x )
           double asin( double x )
           double atan( double x )
           double atan2( double y, double x )
           double cos( double x )
           double sin( double x )
           double tan( double x )
           double cosh( double x )
           double sinh( double x )
           double tanh( double x )
           double exp( double x )
           double frexp( double value, int *exp )
           double ldexp( double x, int exp )
           double log( double x )
           double log10( double x )
           double modf( double value, double *iptr )
           double pow( double x, double y )
           double sqrt( double x )
           double ceil( double x )
           double fabs( double x )
           double floor( double x )
           double fmod( double x, double y )
setjmp.h   int setjmp( jmp_buf env )
           void longjmp( jmp_buf env, int val )
signal.h   void ( *signal( int sig, void ( *func )( int )))( int )
           int raise( int sig )
stdarg.h   void va_start( va_list ap, parmN )
           type va_arg( va_list ap, type )
           void va_end( va_list ap )
stdio.h    int remove( const char *filename )
           int rename( const char *old, const char *new )
           FILE *tmpfile( void )
           char *tmpnam( char *s )
           int fclose( FILE *stream )
           int fflush( FILE *stream )

Table  3:  All  functions  that  must  be  included 




old function names in terms of the new names. I prefer
the latter approach as it gives me the ability to edit the
"old" names I wish to be visible without editing the library
or my source files, and I can also use the functions
(nonportably) by using their _ names. It also lets implementors
continue to rely on the existence of these functions
by using the _ versions.

Conclusion 

ANSI  C  is  coming,  and  it  is  good.  Unlike  many  (dare  I  say  all) 
previous  language  standards,  this  effort  looks  as  though  it 
will  genuinely  help  portability  of  C  programs  without 
harming most existing programs. I believe that you can
write important programs that will run reliably on any
computer  that  supports  an  ANSI  C  environment  without 
changing  even  a  single  line  of  code. 

One last warning: there can be no truly ANSI-conforming
compiler until the standard is adopted. Any compiler
vendor  claiming  conformance  prior  to  that  isn't  telling 
the  whole  truth.  Further,  unless  such  compilers  address 
the  name-space  issue,  they  never  will  be  conforming. 

DDJ 



           FILE *fopen( const char *filename, const char *mode )
           FILE *freopen( const char *filename, const char *mode, FILE *stream )
           void setbuf( FILE *stream, char *buf )
           int setvbuf( FILE *stream, char *buf, int mode, size_t size )
           int fprintf( FILE *stream, const char *format, ... )
           int fscanf( FILE *stream, const char *format, ... )
           int printf( const char *format, ... )
           int scanf( const char *format, ... )
           int sprintf( char *s, const char *format, ... )
           int sscanf( const char *s, const char *format, ... )
           int vfprintf( FILE *stream, const char *format, va_list arg )
           int vprintf( const char *format, va_list arg )
           int vsprintf( char *s, const char *format, va_list arg )
           int fgetc( FILE *stream )
           char *fgets( char *s, int n, FILE *stream )
           int fputc( int c, FILE *stream )
           int fputs( const char *s, FILE *stream )
           int getc( FILE *stream )
           int getchar( void )
           char *gets( char *s )
           int putc( int c, FILE *stream )
           int putchar( int c )
           int puts( char *s )
           int ungetc( int c, FILE *stream )
           size_t fread( void *ptr, size_t size, size_t nmemb, FILE *stream )
           size_t fwrite( const void *ptr, size_t size, size_t nmemb, FILE *stream )
           int fgetpos( FILE *stream, fpos_t *pos )
           int fseek( FILE *stream, long int offset, int whence )
           int fsetpos( FILE *stream, const fpos_t *pos )
           long int ftell( FILE *stream )
           void rewind( FILE *stream )
           void clearerr( FILE *stream )
           int feof( FILE *stream )
           int ferror( FILE *stream )
           void perror( const char *s )
stdlib.h   double atof( const char *nptr )
           int atoi( const char *nptr )
           long int atol( const char *nptr )
           double strtod( const char *nptr, char **endptr )
           long int strtol( const char *nptr, char **endptr, int base )
           unsigned long int strtoul( const char *nptr, char **endptr, int base )
           int rand( void )
           void srand( unsigned int seed )
           void *calloc( size_t nmemb, size_t size )
           void free( void *ptr )
           void *malloc( size_t size )
           void *realloc( void *ptr, size_t size )
           void abort( void )
           int atexit( void ( *func )( void ))
           void exit( int status )
           char *getenv( const char *name )
           int system( const char *string )
           void *bsearch( const void *key, const void *base, size_t nmemb, size_t size,
                          int ( *compar )( const void *, const void * ))
           void qsort( void *base, size_t nmemb, size_t size,
                       int ( *compar )( const void *, const void * ))
           int abs( int j )
           div_t div( int numer, int denom )
           long int labs( long int j )
           ldiv_t ldiv( long int numer, long int denom )
string.h   void *memcpy( void *s1, const void *s2, size_t n )
           void *memmove( void *s1, const void *s2, size_t n )
           char *strcpy( char *s1, const char *s2 )
           char *strncpy( char *s1, const char *s2, size_t n )
           char *strcat( char *s1, const char *s2 )
           char *strncat( char *s1, const char *s2, size_t n )
           int memcmp( const void *s1, const void *s2, size_t n )
           int strcmp( const char *s1, const char *s2 )
           int strncmp( const char *s1, const char *s2, size_t n )
           size_t strcoll( char *to, size_t maxsize, const char *from )
           void *memchr( const void *s, int c, size_t n )
           char *strchr( const char *s, int c )
           size_t strcspn( const char *s1, const char *s2 )
           char *strpbrk( const char *s1, const char *s2 )
           char *strrchr( const char *s, int c )
           size_t strspn( const char *s1, const char *s2 )
           char *strstr( const char *s1, const char *s2 )
           char *strtok( char *s1, const char *s2 )
           void *memset( void *s, int c, size_t n )
           char *strerror( int errnum )
           size_t strlen( const char *s )
time.h     clock_t clock( void )
           double difftime( time_t time1, time_t time0 )
           time_t mktime( struct tm *timeptr )
           time_t time( time_t *timer )
           char *asctime( const struct tm *timeptr )
           char *ctime( const time_t *timer )
           struct tm *gmtime( const time_t *timer )
           struct tm *localtime( const time_t *timer )
           size_t strftime( char *s, size_t maxsize, const char *format,
                            const struct tm *timeptr )




ARTICLES 


Backtracking 


by  Charles  F.  Bowman 


Most  of  the  time,  program¬ 
ming  is  a  straightforward 
matter.  You  analyze  the 
problem,  select  the  appropriate  data 
structures  and  algorithm,  and — after 
a  certain  amount  of  work — you've 
finished.  Granted,  the  first  solution 
might  not  be  as  fast  as  you'd  like,  or 
as  elegant,  but  at  least  you  have  the 
advantage  of  knowing  the  problem  is 
solvable.  But  what  about  those  occa¬ 
sions  when  the  path  to  a  solution 
isn’t  so  clear?  This  article  is  about  a 
programming  method — called  back¬ 
tracking — that  is  commonly  used  in 
AI  programming.  In  contrast  to  nor¬ 
mal  methods,  in  which  you  program 
all  the  steps  required  to  attain  your 
goal,  you  can  use  this  approach 
when  even  the  existence  of  a  solu¬ 
tion  can't  be  guaranteed. 

Backtracking  belongs  to  a  general 
class  of  programming  methods 
termed  nondeterministic  program¬ 
ming  (NDP).  In  NDP  you  don’t  code  the 
solution  to  your  problem — you  pro¬ 
gram  a  method  that  will  lead  to  a  so¬ 
lution.  The  program  literally  makes 
guesses  until  it  either  finds  a  solution 
or  exhausts  the  available  alterna¬ 
tives.  Moreover,  there  can  be  more 
than  one  solution  for  a  given  prob¬ 
lem.  This  method  has  obvious  bene¬ 
fits  in  AI  theory  and  expert  systems 
development. 

Backtracking is a programming method in which you proceed along
a given "path" searching for a solution. At each fork in the road, you
make a guess as to which path you should follow to continue your
search. If this choice should prove unsuccessful (that is, if you
encounter a dead end), you back up and try a different path. The
execution continues in this manner until you either reach a solution
or exhaust all the possible choices. The latter condition signifies
that no solution exists, and the program should exit with an
indicative status. If you think that this sounds similar to a
depth-first traversal, you are correct. The only significant
difference is that with backtracking the decision tree is implicit
rather than explicit.

Use this approach when the existence of a solution isn't guaranteed.

Charles F. Bowman, 24 Jacques Ave., Staten Island, NY 10306. Charles
is a consultant and is currently writing a textbook on data
structures. He holds an M.S. degree from New York University.


 1: bktkfind( node )
 2: begin
 3:    if( node == SUCCESS )
 4:    then
 5:       return( I_FOUND_IT )
 6:    endif
 7:    for( each_choice_at_this_node )
 8:    do
 9:       ret_stat = bktkfind( child_node )
10:       if( ret_stat == SUCCESS )
11:       then
12:          return( ret_stat )
13:       endif
14:    done
15:    return( FAIL )
16: endproc

Example 1: Pseudocode for a chronological backtracking function


There  are  two  types  of  backtrack¬ 
ing:  chronological  backtracking  (CBT) 
and  dependency-directed  backtrack¬ 
ing  (DDB). 

Chronological  Backtracking 

CBT  is  effectively  an  exhaustive 
search,  similar  to  that  discussed  earli¬ 
er.  Each  solution  path  is  attempted, 
in  what  is  tantamount  to  a  random 
order,  until  one  of  two  outcomes  is 
determined. 

Consider the pseudocode in Example 1.
If at any
time  a  solution  is  found  (lines  3  -  6,  9  - 
13),  the  function  returns  a  value  indi¬ 
cating  success.  If  not,  it  must  try  an 
alternate  choice  (lines  7  -  14).  If  all  the 
alternatives  have  been  exhausted 
(line  7),  a  value  indicating  failure  is 
returned,  forcing  the  function  to 
back  up  to  a  previous  path  (line  15) 
before  continuing  the  search. 
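
The same shape can be sketched in C. The problem below, finding a route through a small maze, is my own choice of illustration and not part of the article's listing; the names ROWS, COLS, FOUND, and the maze encoding are assumptions. Each square is a node, and the four neighboring squares are the choices at that node.

```c
#define ROWS 4
#define COLS 4
#define FOUND 1
#define FAIL  0

/* '.' is open floor, '#' is a wall, 'G' is the goal, and '+' marks
   the path currently being tried. */
int bktkfind(char maze[ROWS][COLS + 1], int r, int c)
{
    static const int dr[4] = { 0, 1, 0, -1 };
    static const int dc[4] = { 1, 0, -1, 0 };
    int k;

    if (r < 0 || r >= ROWS || c < 0 || c >= COLS)
        return FAIL;                  /* off the board: dead end */
    if (maze[r][c] == 'G')
        return FOUND;                 /* this node is a solution */
    if (maze[r][c] != '.')
        return FAIL;                  /* wall, or already on the path */

    maze[r][c] = '+';                 /* commit to this square */
    for (k = 0; k < 4; k++)           /* each choice at this node */
        if (bktkfind(maze, r + dr[k], c + dc[k]) == FOUND)
            return FOUND;

    maze[r][c] = '.';                 /* restore the state, then back up */
    return FAIL;
}
```

On success the '+' marks trace the route that was found; on failure the maze is left exactly as it was, because every tentative mark is undone on the way back up.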

There  are  two  important  points  to 
consider  here.  First,  whenever  you 
perform  a  backup,  you  must  restore 
the  previous  environment  before 
trying  the  next  path.  Obviously,  this 
can become very expensive. Second,
backtracking is typically implemented
using a recursive procedure,
which yields an algorithm whose
running time is of exponential order
(also costly). The following
paragraphs discuss methods of
improving the basic algorithm.

Dependency-Directed Backtracking

Dependency-directed  backtracking 
(DDB)  works  essentially  as  described 
earlier  but  tries  to  eliminate  some  un¬ 
necessary  searching  in  two  ways. 

First,  as  the  name  implies,  you  can 
backtrack  to  choices  that  are  depen¬ 
dent  on  the  dead  end.  That  is,  you 
back up until you reach a point at
which a dependency was created
and  continue  searching  from  there. 
As  an  example,  consider  a  case  in 
which  you  are  searching  for  a  solu¬ 
tion  that  requires  four  conditions  (A, 
B,  C,  and  D)  to  be  satisfied.  Further, 
assume  you  reach  a  state  in  your  pro¬ 
cessing  at  which  conditions  A  and  B 
are  satisfied  but  C  and  D  are  not.  In 
lieu  of  just  automatically  backtrack¬ 
ing  to  the  previous  fork,  you  contin¬ 
ue  on  to  a  point  at  which  A  and  B  are 
still  true  and  resume  the  search  from 
there.  You  can  skip  all  the  interven¬ 
ing  paths. 

The  second  way  to  eliminate  un¬ 
necessary  searching  is  called  prun¬ 
ing.  If  you  reach  a  point  in  the  search 
at  which  it  becomes  obvious  that  any 
further  effort  on  a  given  path  is  fruit¬ 
less,  you  can  eliminate  the  remain¬ 
der  of  the  subtree  from  that  point  on¬ 
ward  (that  is,  force  a  backtrack  to 
occur).  Pruning  is  a  straightforward 
approach  and  is  often  implemented 
in  game-playing  simulations.  You 
could,  for  example,  write  a  chess 
program  that  could  determine  its 
next  move  by  assigning  a  quantum 
value  to  each  board  position  it  exam¬ 
ines.  At  any  given  point,  it  would  se¬ 
lect  the  move  that  yields  the  most  ad¬ 
vantageous  (highest)  value.  If  the 
algorithm  were  to  traverse  a  path 
representing  the  loss  of  a  player's 
queen,  it  could  elect  to  eliminate  any 
further  searching  along  that  trail. 
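
Pruning is easy to demonstrate on a smaller problem than chess. The sketch below is an illustration of my own, not the author's code: it asks whether some subset of a sorted list of positive numbers adds up to a target, and it counts the nodes visited so the effect of the pruning test can be measured. The names subset_sum, nodes_visited, and the prune flag are assumptions.

```c
/* Number of nodes examined by the most recent search. */
static long visited;

/* Can some subset of vals[idx..n-1] add up to exactly `remaining`?
   vals must be positive and sorted in ascending order. */
int subset_sum(const int *vals, int n, int idx, int remaining, int prune)
{
    int j;

    visited++;
    if (remaining == 0)
        return 1;                     /* solution found */
    for (j = idx; j < n; j++) {
        if (prune && vals[j] > remaining)
            break;                    /* every later value is too big as well */
        if (subset_sum(vals, n, j + 1, remaining - vals[j], prune))
            return 1;
    }
    return 0;                         /* dead end: backtrack */
}

/* Run a search from the root and report how many nodes it visited. */
long nodes_visited(const int *vals, int n, int target, int prune)
{
    visited = 0;
    subset_sum(vals, n, 0, target, prune);
    return visited;
}
```

For the values 3, 5, 8, 13, 21 and the unreachable target 4, the unpruned search visits all 32 subsets, while the pruned search gives up after 2 nodes.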

For  the  sake  of  completeness,  I 
should  also  mention  a  third  method 
of  improving  a  backtracking  proce¬ 
dure:  explicitly  managing  a  stack.  Re¬ 
cursive  procedures  are  costly  be¬ 
cause  of  the  considerable  amount  of 
overhead  required  for  each  succes¬ 
sive  call.  Your  program  must  save 
registers,  store  a  return  address,  allo¬ 
cate  local  storage,  and  so  on.  Most  of 
this  information  is  not  directly  relat¬ 
ed  to  the  problem  at  hand  and,  there¬ 
fore,  having  to  save  and  restore  it 
only  wastes  CPU  cycles.  You  can  save 
time  and  space  (at  the  expense  of  pro¬ 
grammer  effort — there  really  is  no 
such  thing  as  a  free  lunch!)  if  you 
code  the  stack  explicitly.  You  can  ac¬ 
complish  this  easily  by  transforming 
the  algorithm  from  its  recursive 
form  to  an  iterative  one  and  main¬ 
taining  the  to-do  list  in  an  applica¬ 
tion-controlled  data  structure.  (Note 
that recursion is really just a form
of  iteration.) 
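
As a sketch of the explicit-stack idea, here is an iterative search for a placement of N mutually nonattacking queens, one per row. It is my own illustration under stated assumptions: the array col doubles as the stack (col[r] is the choice made at depth r) and row acts as the stack pointer. No registers, return addresses, or locals are saved per level; only the one integer that matters.

```c
#define N 8

/* Is a queen at (row, c) safe from the queens in rows 0..row-1? */
static int safe(const int *col, int row, int c)
{
    int k;

    for (k = 0; k < row; k++)
        if (col[k] == c || col[k] - c == row - k || c - col[k] == row - k)
            return 0;
    return 1;
}

/* Returns 1 and leaves a solution in col[0..N-1], or 0 if none exists. */
int queens_iterative(int *col)
{
    int row = 0;

    col[0] = -1;                      /* no choice tried yet at depth 0 */
    while (row >= 0) {
        int c = col[row] + 1;         /* next untried choice at this depth */

        while (c < N && !safe(col, row, c))
            c++;
        if (c == N) {
            row--;                    /* choices exhausted: pop (backtrack) */
        } else {
            col[row] = c;             /* record the choice */
            if (row == N - 1)
                return 1;             /* every row is filled */
            row++;                    /* push a new level */
            col[row] = -1;
        }
    }
    return 0;
}
```

Popping the stack is just row--; the next pass through the loop resumes with the choice after the one recorded at that depth.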


SETL 

My  first  encounter  with  backtrack¬ 
ing  occurred  when  I  attended  gradu¬ 
ate  classes  at  New  York  University. 
The  students  and  faculty  had  devel¬ 
oped  a  language  called  SETL  (Set  Lan¬ 
guage),  which  featured  a  pair  of 
built-in  primitives  (OK /FAIL)  that  sup¬ 
ported  backtracking.  As  an  example, 
consider  the  classic  eight-queens 
puzzle.  The  challenge  is  to  place  all 
eight  queens  on  a  chess  board  such 
that  no  two  queens  are  attacking 
each  other.  Example  2,  below,  pre¬ 
sents  an  SETL  solution  to  the  problem. 

B E S T
# R # O
T A M P
O # # S

Figure 1: Sample puzzle (# marks a black square)


 1: posn := [];
 2: (forall j in [ 1..8 ])
 3:    { j : j in [ 1..8 ] | OK }
 4:    unattacked := { 1..8 } - { posn(k) + (j-k)*slope
 5:                      : k in [ 1..j-1 ],
 6:                      slope in [ -1..+1 ] }
 7:    if( unattacked = {} )
 8:    then
 9:       FAIL;
10:    else
11:       posn(j) := arb unattacked;
12:    endif;
13: end forall;
14: print( posn );

Example 2: Eight-queens problem


Some of the constructs might appear strange, but what's important to
note is how the two primitives, OK and FAIL, can free the programmer
from dealing with some of the low-level details that would otherwise be
required. Line 3 of the example initializes a language-controlled stack
so that each time the statement OK is executed, a snapshot of the
execution state is saved. (Actually, each variable that you want saved
must be identified at declaration time.) Lines 4, 5, and 6 compute all
the currently unattacked (that is, available) positions and store them
in the variable unattacked. If none exist for a given board
configuration (line 7), the function executes the FAIL statement on
line 9 and backtracks; if unattacked is nonempty, one position is
selected randomly (line 11) and stored in posn. When the forall loop
terminates, the board positions are printed.
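
For readers without SETL at hand, the following C sketch shows roughly what OK and FAIL are doing for the programmer in Example 2. It is an approximation under my own assumptions, not a translation the author supplied: the set of unattacked columns is held as a bit mask, a return value of 0 plays the role of FAIL, and the loop over candidate columns plays the role of OK by retrying the next alternative. Where the text says a position is selected randomly, this sketch simply takes the lowest-numbered one.

```c
#define QUEENS 8

/* Bit c (1..8) of the result is set when column c of row j is not
   attacked by the queens already placed in rows 1..j-1.  This is the
   set { 1..8 } - { posn(k) + (j-k)*slope } of Example 2. */
static unsigned unattacked(const int *posn, int j)
{
    unsigned set = 0;
    int c, k, slope;

    for (c = 1; c <= QUEENS; c++)
        set |= 1u << c;
    for (k = 1; k < j; k++)
        for (slope = -1; slope <= 1; slope++) {
            int hit = posn[k] + (j - k) * slope;

            if (hit >= 1 && hit <= QUEENS)
                set &= ~(1u << hit);
        }
    return set;
}

/* Place queens in rows j..8; posn[1..8] holds the chosen columns
   (element 0 is unused).  Returns 1 on success, 0 to signal FAIL. */
int place(int *posn, int j)
{
    unsigned set;
    int c;

    if (j > QUEENS)
        return 1;                     /* all eight rows are filled */
    set = unattacked(posn, j);
    for (c = 1; c <= QUEENS; c++)
        if (set & (1u << c)) {
            posn[j] = c;              /* tentative choice for row j */
            if (place(posn, j + 1))
                return 1;
        }
    return 0;                         /* nothing left to try: FAIL */
}
```

Calling place(posn, 1) with an array of nine ints starts the search. The bookkeeping that SETL hides (which alternative to try next, and what to undo) is here spread across the loop variable and the call stack.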

An  Acrostic  Example 

As  is  customary  in  DDJ  articles,  I've 
included  the  source  code  for  a  dem¬ 
onstration  program  (see  Listing  One, 
page  50  ).  This  program  is,  however, 
slightly  less  serious  than  those  you 
usually  find  in  DDJ.  The  program, 
called  kross,  solves  acrostic  puzzles 
using  a  backtracking  algorithm. 
You're  probably  familiar  with  this 
type  of  puzzle — examples  can  be 
found  in  just  about  every  newsstand 
puzzle  magazine.  If  you're  not,  an 
acrostic  puzzle  is  simply  a  crossword 
puzzle  without  the  clues:  you  are 
supplied  with  the  words  and  the  dia¬ 
gram  and,  through  trial  and  error, 
you  must  enter  all  the  words  into 
their  appropriate  slots  (see  Figure  1, 
below). 

The  program  requires  one  argu¬ 
ment — the  name  of  the  file  contain¬ 
ing  a  description  of  the  puzzle  (yes, 
you  have  to  type  the  puzzle  in!).  The 
file  is  divided  into  two  sections.  The 
first  is  a  description  of  the  puzzle  dia¬ 
gram:  a  series  of  lines — one  for  each 
row — containing  either  blanks  or  mi¬ 
nus  signs  (this  can  be  modified  by  the 
user,  if  desired).  These  characters 
represent  the  black  boxes  and  the  ac¬ 
tual  character  locations,  respective¬ 
ly.  The  second  section  is  just  the  list 
of  words,  one  per  line,  that  the  pro¬ 
gram  will  use  to  solve  the  puzzle; 
they  can  be  entered  in  any  order. 

A  couple  of  notes:  all  puzzle-de¬ 
scription  lines  must  be  of  equal 
length  (the  program  checks  for  this). 
Also,  take  the  time  to  ensure  that  all 
the  words  are  spelled  correctly.  The 
program,  as  you  would  expect,  is  un¬ 
forgiving  in  this  regard. 

Example  3,  page  26,  contains  a 
sample  input  file  for  the  puzzle  of 
Figure  1,  and  Example  4,  page  26, 
contains  the  output  generated  by  the 
program. 




The overall operation of the program is straightforward: read the
puzzle and word list into internal data structures; search for a
solution; and if there is a solution, print it. The actual backtracking
logic is in the function solve(). This is a recursive procedure that:

• Chooses and determines the size of the next puzzle slot to fill
(horizontal or vertical). This processing is performed by the function
next() and is by necessity a rather messy bit of code.

• Selects at random (that is, sequentially) an appropriately sized word
from the available list. The function itfits() is called to ensure that
a given word fits into the slot (in typical crossword puzzle fashion).

• If the word fits, enters it into the puzzle. At this point, with the
aid of the function enter(), a snapshot of the current state (puzzle)
is saved.

• Recursively calls itself to continue toward a solution.

• If at any point a solution is found (that is, there are no more slots
to fill), returns the value SOLVED.

• If a recursive call fails to find a solution, the puzzle is restored
to its previous state (restore()); the word is returned to the free
list; the next available word is selected; and if none remain, the
function returns the value FAIL to its caller.
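
The snapshot-and-restore steps in the list above can be sketched as follows. These are not the routines from the listing: the explicit grid parameter, the fixed 4-by-4 size, the ACROSS and DOWN codes, and the use of '-' for an empty square are all assumptions made so that the fragment stands on its own.

```c
#include <string.h>

#define SIZE   4
#define ACROSS 0
#define DOWN   1
#define EMPTY  '-'

/* Does word fit starting at (row, col)?  Each square must be empty or
   already hold the same letter (left there by a crossing word). */
int itfits(char grid[SIZE][SIZE], int row, int col, const char *word, int type)
{
    size_t i, len = strlen(word);

    for (i = 0; i < len; i++) {
        int r = row + (type == DOWN ? (int)i : 0);
        int c = col + (type == ACROSS ? (int)i : 0);

        if (r >= SIZE || c >= SIZE)
            return 0;
        if (grid[r][c] != EMPTY && grid[r][c] != word[i])
            return 0;
    }
    return 1;
}

/* Write word into the grid, saving what it overwrites in old. */
void enter(char grid[SIZE][SIZE], char *old, int row, int col,
           const char *word, int type)
{
    size_t i, len = strlen(word);

    for (i = 0; i < len; i++) {
        int r = row + (type == DOWN ? (int)i : 0);
        int c = col + (type == ACROSS ? (int)i : 0);

        old[i] = grid[r][c];
        grid[r][c] = word[i];
    }
    old[len] = '\0';
}

/* Put back the squares saved by enter(). */
void restore(char grid[SIZE][SIZE], const char *old, int row, int col, int type)
{
    size_t i, len = strlen(old);

    for (i = 0; i < len; i++) {
        int r = row + (type == DOWN ? (int)i : 0);
        int c = col + (type == ACROSS ? (int)i : 0);

        grid[r][c] = old[i];
    }
}
```

Saving only the squares a word overwrites, rather than the whole puzzle, keeps each snapshot small. Letters that belong to crossing words survive a restore because they were part of the saved contents.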


Spuzzle
----
$-$-
----
-$$-
Swords
best
tamp
tops
era
to

('$' = Blank)

Example 3: Sample input


best
$r$o
tamp
o$$s

('$' = Blank)

Example 4: Sample output


Let's trace the execution of the function solve() as it begins to
solve a sample puzzle. The line numbers that follow refer to Example 5,
below. Also, the random selection of the words is in the order in which
they appear in Figure 1.

First,  a  four-letter  word  is  needed 
for  the  1-across  position.  The  func¬ 
tion  randomly  selects  best  (line  14), 
marks  it  as  USED  (line  16),  and  inserts 
it  into  the  puzzle  (line  17).  A  recursive 
call  is  then  made  to  continue  the  pro¬ 
cessing  (line  19).  Next,  for  the  2-down 
position,  a  three-letter  word  is  need¬ 
ed  and  era  is  similarly  inserted  into 
the  puzzle. 

The  function  now  moves  to  fill  the 
4-down  position.  It  selects  the  next 
available  four-letter  word,  tamp  (line 
13);  checks  to  see  that  it  fits  (line  14); 
and  inserts  it  also  into  the  puzzle  (line 
17). 

The  next  slot  to  fill  is  5  across,  and 
the  program,  as  usual,  selects  the 
next  available  four-letter  word — in 
this case tops. This time, however,
the itfits() test (line 15) fails. Recognizing
that the list of four-letter
words  has  been  exhausted  (line  13), 
the  function  performs  a  backtrack 
(line  27). 

The  program  now  resumes  pro¬ 
cessing  at  the  point  at  which  it,  again, 
needs  to  solve  the  4-down  position.  It 
discards  its  current  choice,  tamp 
(lines 22 and 23), and selects the next
available word, tops (line 14). From
this  point  on,  the  program  solves  the 
puzzle  without  any  additional 
difficulties. 

Summary 

Backtracking  can  be  an  extremely 
powerful  tool  for  programmers,  al¬ 
though  it  is  always  an  expensive  so¬ 
lution.  Nonetheless,  it  can  be  a  tool 
that  enables  you  to  solve  problems 
that  would  be  otherwise  unsolvable. 

Availability 

All  the  source  code  for  articles  in  this 
issue  is  available  on  a  single  disk.  To 
order,  send  $14.95  to  Dr.  Dobb's  Jour¬ 
nal,  501  Galveston  Dr.,  Redwood  City, 
CA  94063  or  call  (415)  366-3600  ext. 
216.  Please  specify  the  issue  number 
and  format  (MS-DOS,  Macintosh,  Kay- 
pro).  If  you  would  rather  not  have  to 
retype  the  entire  program  and/or 
you  would  like  some  sample  puzzles 
to  work  with,  send  me  a  check  for  $6 
and  I  will  mail  you  an  MS-DOS  floppy 
(360K  format)  containing  the  pro¬ 
gram,  source  code,  and  several  sam¬ 
ple  puzzles. 

DDJ 

(Listing  begins  on  page  52.) 



 1  solve( length, width )
 2  int length, width;
 3  {
 4      int l, w, i, len, tmp, type;
 5      char old[ WORDLEN - MINWORD + 1 ];
 6
 7      w = width;
 8      l = length;
 9      len = next( &l, &w, &type );      /* find the next slot to fill */
10      if( len == 0 )
11          return( SOLVED );             /* no slots left */
12
13      for( i = 0; i < MAXWORDS && WORD(len, i)[0] != NIL; i++ ) {
14          if( FLAG(len, i) == FREE
15           && itfits( l, w, WORD(len, i), type ) ) {
16              FLAG(len, i) = USED;
17              enter( old, l, w, WORD(len, i), type );
18              prev = type;
19              tmp = solve( l, w );
20              if( tmp == SOLVED )
21                  return( SOLVED );
22              restore( old, l, w, type );   /* undo the choice */
23              FLAG(len, i) = FREE;
24          }
25      }
26
27      return( 0 );                      /* words exhausted: backtrack */
28  }

Example 5: The function solve()




Dr.  Dobb's  Journal,  August  1987 



ARTICLES 


What’s  the  DIFF? 

by  Don  Krantz 


File  comparison  programs 
have  been  around  for  a  long 
time,  so  you  might  reason¬ 
ably  ask,  "Do  we  really  need  anoth¬ 
er?"  If  your  work  involves  only 
small  changes  in  a  program’s  source 
code,  the  answer  is  probably  "No” — 
a  generic  file  comparison  utility  is 
just  fine.  In  contrast,  for  those  who 
write  and  update  the  documentation 
for  that  program,  it  can  be  an  entire¬ 
ly  different  story. 

This  article  got  its  start  when  we 
were  faced  with  issuing  revision  doc¬ 
uments  for  a  large  software  project 
at  work.  The  project  had  literally 
hundreds  of  associated  documents 
(performance  specs,  interface  specs, 
design  specs,  detailed  design  specs, 
design  documents,  development 
plans,  and  so  on),  and  our  customer 
requested  that  we  mark  the  docu¬ 
ments  with  change  bars.  (Change 
bars  are  vertical  lines,  usually  in  the 
right-hand  margins,  that  mark  the 
sections  of  text  that  have  changed 
since  the  last  release.)  The  request 
made  sense:  change  bars  save  read¬ 
ers  from  having  to  make  a  laborious 
line-by-line  comparison  of  the  docu¬ 
mentation  looking  for  changed  text. 
Still, it meant someone would have to
make all those line-by-line comparisons
to mark the text. Not really anxious
to mark thousands of pages of
documentation manually, I started
looking for a good tool to change-bar
the text automatically, or at least to
locate the changes for me.

A text file differencer and change-bar tool in C

Don Krantz, 2845 42nd Ave. S, Minneapolis, MN 55406. Don is a principal
computer applications engineer at Honeywell's Defense Systems Division.
He is a coauthor of Ada: A Programmer's Guide with Microcomputer
Examples and principal author of 68000 Assembly Language Programming,
both published by Addison-Wesley.

I started my search by trying the DOS COMP command. It fired up,
looked at the two versions of a file I gave it, and reported "FILE
SIZES ARE DIFFERENT." With that, it exited, leaving me groping around
my work area for the hammer I use on particularly annoying software.

The next place I looked was the BeeB (the TCOG bulletin board),
reasoning that any system with 42 different directory programs was
bound to have at least one file comparison program that would fit my
needs. I located a promising file, downloaded it, and gave it a try.
It did a little better than COMP did, telling me that the files
differed at line 64, before it too exited. Still not quite good
enough.

So I sat down and made a list of the things a suitable document
comparison program should be capable of:

1. The program should know about the concept of pages and be able to
recognize different sizes of pages by line count and form feeds.

2. It should be smart enough not to mark the header and footer lines
just because the run date has changed.

3. It should be able to cope with repagination because of insertions
and deletions.

4. It should be able to stay locked on identical text that is
paginated differently, even with intervening headers and footers.

5. It should be able to deal intelligently with differing amounts of
white space between paragraphs because of conditional paging.

6. It would be nice if you could turn case sensitivity on and off so
that you could slide by corrections to trivial capitalization typos
without drawing everyone's attention to them.

7. It would also be good to be able to ignore the table of contents
because nobody expects to find change bars there.

8. The program should be capable of inserting change bars (or other
marks) directly into an output file, without any manual massaging.

9. A change summary listing would also be desirable.

After looking at this long list of features, and at the software
available, I decided that I'd have to write the program myself. The
rest of this article, as you must have guessed, describes the program
I came up with: DIFF.

Listing One, page 66, contains the program listing for DIFF. The
source code provided has been tested on several large and small
programs and documents under both MS-DOS and VAX/VMS, Version 4.3. It
compiles without change under both Microsoft C, Version 4.00, and
VAX-11 C because the VAX C compiler defines the macro VAX11C
automatically, and I use this to key the differences between the two
environments.




Commands  and  Options 

In its simplest invocation, DIFF takes two files as arguments. One
file is the "baseline," or original version of a document, and the
other is the "revision," or new version of the document (I am using
document for both program source files and formatted documents).

DIFF's command line takes the following form:

DIFF [option{option}] newfile oldfile [barfile]

where newfile is the name of the new release version of the document
or source code. It must be a unique file name (that is, it cannot have
wildcard characters), and it can include a path name. Oldfile is the
name of the baseline version of the document or source code. It is
subject to the same rules as newfile is. Barfile is optional, and if
supplied, it indicates that you want an output file with change bars.
The output file is a copy of newfile with change bars on new or
modified lines. If barfile is not specified, no change-bar output will
be created, but you'll still get the change summary output.

Some example command lines are:

DIFF FILE1.DOC FILE1.BAK
DIFF /BLANKS /LOOKAHEAD=20 FILE1.DOC FILE1.BAK
DIFF /BLANKS FILE1.DOC FILE1.BAK FILE1.PBN

Options take the form of VAX/VMS options, which are similar to MS-DOS
command-line switches. An option begins with a slash (/), followed by
a series of characters. If the option requires a numeric parameter
(for example, the number of lines in a page), the series of characters
is followed by an equal sign (=) and a number, without spaces.

Spaces between options are optional, and the options may appear
anywhere on the command line. Uppercase/lowercase in an option is not
significant. Typical options look like this:

/BLANKS

/LOOKAHEAD=200

In addition, options can be abbreviated, as long as enough characters
are used to make the option name unique. For instance, the previous
examples could be abbreviated in the following ways:

/BL /B /BLA /BLAN ...

/LO=200 /LOOK=200 ...

Unrecognized options cause the program to print an error message and
then halt. The entire command line is parsed before the program exits
if errors occur.

DIFF  options  along  with  the  syntax 
and  default  value  of  each  option  are 
listed  in  Table  1,  below.  DIFF  error 
messages  along  with  the  cause  of  the 
error  are  listed  in  Table  2,  page  34. 

/BAR_COL

Selects the column in which the change bar will be placed. The default
is column 78. If column 0 is selected, the change bar will be placed
at the left edge of the document or source code, moving text to the
right if necessary. If a space or a tab is the first character on the
line, the alignment of the text should not be affected. If a nonzero
column is selected, the change bar will be placed in the specified
column if possible. If text extends over the specified column, the
change bar will be moved as far to the right as is necessary not to
overwrite text.

A "feature" of the change-bar algorithm is that it can recognize and
account for underlined text, at least the way Runoff and WordStar
underline for "generic" printers, but not tabs. Tabs are counted as
one column each. This produces amusing results on C source code.

For best results on document files, choose a column to the right of
your text. On program source code with embedded tabs, choose column 0.
This is admittedly an area in which the program could be improved. I
needed the underline capability, and if the tab expansion is added,
the code in change_bar() gets uglier than it already is.

Example: /BAR_COL=0.

/TOP_SKIP

Used for processing formatted documents. Its primary reason for being
is to allow you to skip over the header line(s) in a document. Header
lines confuse DIFF if they are left in place because DIFF doesn't know
from chopped liver about headers unless you tell it, and if the
pagination changes between the old and the new files, DIFF will
cheerfully change-bar every header. Be sure to account for blank lines
at the top of the page that precede the header line(s). The default
value of /TOP_SKIP is 0.

Example: /TOP_SKIP=3.

/BOT_SKIP

Similar in use and purpose to the /TOP_SKIP option. It specifies how
many lines at the bottom of the page should be skipped. Count lines up
from the bottom, not down from the top. The default value of /BOT_SKIP
is 0.

Example: /BOT_SKIP=8.

/PAGE_LEN

Sets the length of a page in lines. The default value is 66 lines. A
form-feed character will override the /PAGE_LEN value and cause a new
page to be started. DIFF needs to know the page length in order to
skip headers and footers if /TOP_SKIP and /BOT_SKIP are specified.
Also, the change summary lists changes by page and line number. For
nonpaginated text, such as program source code, you should specify a
value of /PAGE_LEN greater than the number of lines in newfile so that
the change summary line numbers will correspond to file line numbers.

Example: /PAGE_LEN=2000.

/UP_CASE, /NOUP_CASE

Controls whether or not the case of alphabetic characters is
significant when deciding if a line has changed. The default is
/NOUP_CASE, which means that the case of a letter is significant. This
is slightly faster than /UP_CASE.

Examples: /UP_CASE and /NOUP_CASE.

Table 1: Options


Output

DIFF has two forms of output: one standard and one optional. The
standard output is a summary list showing deletions from the baseline
and additions to the revision. Optionally, it can create an additional
third file that is a copy of the revision file with change bars on
lines that have changed or been added. No notation is added to the
created file to indicate the lines in the baseline file that don't
appear in the revision file.

The summary output has this form:

+pp:ll -> text

or:

-pp:ll -> text

Lines that are in newfile but not oldfile are preceded by a plus sign
(+) (for "added," get it?). Lines that are in oldfile but not newfile
are preceded by a minus sign (-) (for "taken away").

The rest of the summary output is the same for added or deleted lines.
The pp:ll portion of the output is the page number followed by the
line number where the difference occurs. Added lines show page/line in
newfile. Deleted lines show page/line in oldfile. Text is the text of
the changed line. For example, if one word in one line is changed
between the baseline and the new release, two summary outputs will be
shown: the new line from newfile and the old line from oldfile. This
feature makes it convenient to compare the differences without having
to hunt for change bars. Each group of differences is separated by a
blank line in the change summary output.
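As a rough illustration of the summary format just described, the sketch below builds one summary line. It is not taken from Listing One: the function name format_summary(), the two-digit field widths, and the caller-supplied buffer are all assumptions made for the example.

```c
#include <stdio.h>

/* Format one change-summary line as "+pp:ll -> text" or "-pp:ll -> text".
 * Hypothetical helper: the name and the field widths are assumptions,
 * not the code in Listing One. Returns what snprintf() returns. */
int format_summary(char *buf, size_t size, int added,
                   int page, int line, const char *text)
{
    /* '+' marks a line found only in newfile, '-' one found only in oldfile */
    return snprintf(buf, size, "%c%2d:%2d -> %s",
                    added ? '+' : '-', page, line, text);
}
```

A caller would pass the page and line from newfile for added lines and from oldfile for deleted ones, then write the buffer to the summary stream.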

Operation 

The program's basic operation is pretty simple. It compares the two
files line by line until it finds a difference. When this happens, it
drops a marker in both files at the point of difference and scans
ahead through both files to find where the text matches again.

When the files are resynced, the text between the point of difference
and the resync point in both files is change-barred and output to the
difference summary. The basic resynchronization algorithm takes only
about 80 lines of code. The rest of the program is accounted for by a
command-line parser and option handlers.

The major data structure used is struct LINE. This structure is
allocated dynamically and contains a line from one of the input files
in both original and uppercase form, the line and page from which the
line was taken, and a link pointer used to string lines together. As
the files are searched for resync, the text from the two files is
chained into a linked list for each file. If blanks and
headers/footers are being excluded, they are also held in the linked
list while the program compares the next significant lines.
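The record described above might be declared along the following lines. This is a minimal sketch rather than the declaration from Listing One: the field names, the MAXLINE limit, and the helper new_line() are assumptions; only the contents of the record (two copies of the text, line and page numbers, and a link) come from the description.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAXLINE 256                 /* assumed maximum line length */

/* One input line, held in memory while the program looks ahead.
 * Field names are assumptions; the contents follow the article. */
struct LINE {
    int          line;              /* line number within the page   */
    int          page;              /* page number within the file   */
    struct LINE *link;              /* next line in this file's list */
    char         text[MAXLINE];     /* the line as read              */
    char         uc[MAXLINE];       /* uppercased copy for /UP_CASE  */
};

/* Allocate a LINE, fill it in, and chain it after tail (tail may be NULL).
 * Returns the new node, or NULL if memory is exhausted. */
struct LINE *new_line(struct LINE *tail, const char *s, int page, int line)
{
    struct LINE *p = malloc(sizeof *p);
    size_t i;

    if (p == NULL)
        return NULL;
    strncpy(p->text, s, MAXLINE - 1);
    p->text[MAXLINE - 1] = '\0';
    for (i = 0; p->text[i] != '\0'; i++)
        p->uc[i] = (char)toupper((unsigned char)p->text[i]);
    p->uc[i] = '\0';
    p->page = page;
    p->line = line;
    p->link = NULL;
    if (tail != NULL)
        tail->link = p;
    return p;
}
```

Carrying the uppercased copy trades memory for speed: each node is roughly twice the size it would otherwise be, which is why the amount of look-ahead affects memory use.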

[...] reason I never get them right the first time. There's just
something about me and linked lists that causes me to leave subtle
bugs in the code I write. By now, all the bugs are (hopefully) out of
this code, and the code for handling the linked lists is about half
its initial size. Exorcising the problems with linked lists is the
reason for the trace, ret, and ret_val macros. If the debugging tools
are compiled in, the program can display function entries and exits
and the call stack on demand. (The demand call stack display only
works under MS-DOS.)


/RE_SYNC

Controls how many lines must match between the two files after a
difference has been found before the two files are considered to be
back in sync. The default is five lines. Using a larger number will
make DIFF smarter when considering files that have a lot of identical
lines (such as BEGIN or END statements in Pascal). Using a smaller
number will make DIFF smarter when considering a file that has a lot
of small changes spaced closely together. For text, a value of 2 or 3
is good. For source code, a value of 5 is pretty good.

Example: /RE_SYNC=2.

/OUTPUT

Allows you to specify an output file for the change summary listing.
In MS-DOS, this is exactly equivalent to redirecting standard output
with the greater-than command-line option, and you can use either way
in MS-DOS. In VMS, this matches the VMS standard redirection syntax.
The default for /OUTPUT is SYS$OUTPUT on VMS and the console on
MS-DOS, but this can be redirected as a command option.

Example: /OUTPUT=FILE1.SUM.

/BLANKS, /NOBLANKS

Lets you make blank lines (for my purposes, a blank line contains only
spaces or tabs) either significant or insignificant. /NOBLANKS is the
default and means that blank lines are not considered to be
significant. This is the most useful as it accounts for conditional
paging and trivial source code prettying.

Examples: /BLANKS and /NOBLANKS.

/LOOKAHEAD

Controls how far DIFF will look forward in both files to find a
rematch after it finds a difference. The default is 200 lines. A
larger value lets you process files in which several pages are added
or deleted between revisions. A smaller value runs much faster and
uses less memory. Resynchronization time (in the general case) is
proportional to the square of /LOOKAHEAD. This value also affects the
amount of memory the program uses.

Example: /LOOKAHEAD=50.

/SKIP1

Allows you to specify a number of pages to skip in the two files
before starting the compare. This is most useful when skipping tables
of contents, in which the page numbers may change but nobody cares.
The default for /SKIP1 is 0 pages. The /SKIP1 option sets the
page-skip values for both newfile and oldfile.

Example: /SKIP1=3.

/SKIP2

Same as /SKIP1, except that it only affects the page-skip value for
oldfile. Because /SKIP1 affects both files, the /SKIP2 option must
appear to the right of /SKIP1 on the command line to have any effect.
The two options are provided because the tables of contents may be of
different lengths. There are probably other reasons why /SKIP2 needs
to be here, but I can't think of any right now.

Example: /SKIP2=4.

/TRACE

Conditionally compiled and turns on function tracing (see the main
text for more on debugging options).

Table 1: Continued

A  Functional  Description 

The function main() processes the command line, opens the files, and
checks for command-line errors. If no errors are found, it runs the
difference check.

Function dont_look() decides if a line is significant or not. If a
null pointer is given to it, it returns FALSE, indicating that the
line is significant. (This lets you factor some end-of-file logic out
of later loops.) If the line comes from the header or footer area, or
if the line doesn't contain printing characters and blank lines are
being treated as insignificant (the /NOBLANKS default described in
Table 1), it returns TRUE; otherwise, it returns FALSE.
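The significance test might look something like the sketch below. It is an illustration under stated assumptions, not the routine from Listing One: the global names and the argument list are invented for the example, and blank lines are treated as insignificant unless /BLANKS was given, following the description of /BLANKS and /NOBLANKS in Table 1.

```c
#include <ctype.h>
#include <stddef.h>

/* Settings the option parser would fill in (names are assumptions). */
int top_skip = 0;     /* /TOP_SKIP: header lines at the top of a page    */
int bot_skip = 0;     /* /BOT_SKIP: footer lines at the bottom of a page */
int page_len = 66;    /* /PAGE_LEN: lines per page                       */
int blanks   = 0;     /* nonzero if /BLANKS made blank lines significant */

/* Return 1 if the line should be ignored when comparing, 0 if it counts.
 * text is the line; line_no is its 1-based position on its page. */
int dont_look(const char *text, int line_no)
{
    const char *p;

    if (text == NULL)                     /* end of file: treat as significant */
        return 0;
    if (line_no <= top_skip)              /* inside the header area */
        return 1;
    if (line_no > page_len - bot_skip)    /* inside the footer area */
        return 1;
    if (!blanks) {
        for (p = text; *p != '\0'; p++)
            if (!isspace((unsigned char)*p))
                return 0;                 /* found a printing character */
        return 1;                         /* blank line, not significant */
    }
    return 0;
}
```

Returning 0 for a null pointer means a caller looping "while dont_look(line)" stops at end of file without a separate test.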

Function equal() decides if two lines are identical. If the /UP_CASE
option was used, the uppercased lines are compared; otherwise, the
original lines are compared. Equal() returns TRUE if the lines are
identical. On a small-memory system, this function could be modified
to perform the uppercase conversion each time instead of carrying an
uppercased copy in struct LINE, which would enable you to specify
/LOOKAHEAD values that were about 80 percent larger than if you left
the program unchanged.
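The small-memory variation mentioned above could be sketched as follows. This is one possible shape for it, not code from Listing One; passing the case-folding choice as a third argument is an assumption made to keep the example self-contained.

```c
#include <ctype.h>

/* Compare two lines. If ignore_case is nonzero, fold each character to
 * uppercase on the fly instead of relying on a stored uppercased copy.
 * Returns 1 if the lines are identical, 0 otherwise. */
int equal(const char *a, const char *b, int ignore_case)
{
    for (;; a++, b++) {
        int ca = (unsigned char)*a;
        int cb = (unsigned char)*b;

        if (ignore_case) {
            ca = toupper(ca);
            cb = toupper(cb);
        }
        if (ca != cb)
            return 0;          /* first mismatch decides it       */
        if (ca == '\0')
            return 1;          /* both strings ended together     */
    }
}
```

The cost is that every comparison during a resync search repeats the conversion, so the saving in memory is paid for in time.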

Function position() is used to position the linked-list pointer to a
given line in the linked list. This is used when resyncing files after
a difference is found.

Function fix() is included for the VAX/VMS version. The VAX C version
of fgets() returns a carriage return/line feed pair at the end of a
line, which caused me problems detecting an end of line when inserting
the change bar. The carriage return alone can't be used because of
embedded carriage returns in lines that have underlined portions.
Fix() converts the end of line to a single line feed (new line), but
only on the VAX.
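A line-ending fix-up of this kind can be sketched in a few lines. The version below is an assumption about how such a routine might be written, not the one in Listing One; in the real program it would presumably be compiled only when VAX11C is defined, a guard left out here so the example runs anywhere.

```c
#include <string.h>

/* If the line ends in CR LF, rewrite the ending as a single LF.
 * Carriage returns elsewhere in the line (used for underlining by
 * overprinting) are left alone. Returns its argument. */
char *fix(char *s)
{
    size_t n = strlen(s);

    if (n >= 2 && s[n - 2] == '\r' && s[n - 1] == '\n') {
        s[n - 2] = '\n';
        s[n - 1] = '\0';
    }
    return s;
}
```

Only the final two characters are examined, which is what lets an embedded carriage return in an underlined line pass through untouched.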

Function index() searches a string for a specified character. I wrote
this because of nonstandard standard library names for this function
in the different compilers I use.

Function next_line() reads a line from one of the input files and
links it into the linked list of lines in that file. Space for struct
LINE is allocated dynamically. It slows operation, but a conditionally
compiled switch lets you look for form feeds within a line if your
text might contain a form feed in any character position other than
the first in a line. Next_line() also keeps track of page and line
numbers for the file. If the program has "looked ahead" at the file
from which the next line is requested, next_line() will return a
pointer to the memory copy rather than reading a line from disk.

Function discard() deallocates lines (allocated by next_line()) that
are no longer needed.

Function vfputs() outputs lines to a data file. In MS-DOS, it is
simply a call to fputs(). In VAX/VMS, it replaces the line terminator
with a carriage return/line feed pair.

Function put() writes matching lines from the input file to the
change-bar output file.

Function change_bar() inserts a change bar into its input string. Two
different algorithms are used, depending on whether the change bar is
to appear to the left or the right of the text.
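The two placement cases can be illustrated with a simplified sketch. It is not the routine from Listing One: it ignores underlining, counts a tab as one column, writes into a separate output buffer, and treats the column as 1-based. Those choices, along with the BAR character and the argument list, are assumptions made for the example.

```c
#include <string.h>

#define BAR '|'

/* Write a change-barred copy of line (which ends in '\n') into out.
 * bar_col == 0 puts the bar at the left edge; otherwise the bar goes in
 * 1-based column bar_col, or just past the text if the text is longer.
 * out must hold at least strlen(line) + bar_col + 3 characters. */
void change_bar(char *out, const char *line, int bar_col)
{
    size_t len = strcspn(line, "\n");      /* text length without newline */
    size_t i;

    if (bar_col <= 0) {                    /* left-edge algorithm */
        out[0] = BAR;
        if (len > 0 && line[0] == ' ') {   /* bar takes the place of the  */
            line++;                        /* first space, so the text    */
            len--;                         /* keeps its alignment         */
        }
        memcpy(out + 1, line, len);
        out[len + 1] = '\n';
        out[len + 2] = '\0';
        return;
    }
    memcpy(out, line, len);                /* right-margin algorithm */
    for (i = len; i + 1 < (size_t)bar_col; i++)
        out[i] = ' ';                      /* pad out to the bar column */
    out[i] = BAR;                          /* lands past long text instead */
    out[i + 1] = '\n';
    out[i + 2] = '\0';
}
```

With a leading tab the left-edge case simply inserts the bar in front of it; the tab still advances to the same tab stop, so the text does not move.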

Function added() handles lines that appear in the revision file but
not in the baseline file. They are output to the change summary and,
if enabled, to the change-bar file.


Error: Must specify two files

This occurs if the command line does not contain at least two file
names.

Out of Memory

This error occurs if a large look-ahead is specified and/or if huge
sections of text differ between the two files.

ERROR - lost sync in file <name> at page <n> line <n>

After a difference was located, DIFF could not find where the files
became synchronized. To correct, increase the value for /LOOKAHEAD.
The page/line reported is the start of the point in newfile at which
the difference was first detected.

Note: When this error occurs, the program closes any output files and
exits.

Help

Usage information is printed when most command-line errors are
detected.

Error: Can't open <filename>

DIFF was unable to open (for reading) one of the input files. Be sure
that the file and path name are correct and that you have read
privilege for that file. It may be caused by a forgotten slash (/) on
an option that made DIFF interpret it as a file name.

Error: Can't create <filename>

DIFF was unable to create the optional output change-bar file. Be sure
that the name is a legal one and that you have write privilege for the
directory in which the name is to be placed. Could be a forgotten
slash, too.

ERROR in option <option>

This error occurs when an /OUTPUT option has a malformed or missing
file name.

ERROR creating <name>

This error occurs when DIFF is unable to create an output file for the
change summary. Be sure that the name is a legal one and that you have
write privilege for the directory in which the name is to be placed.

Unrecognized Option: <option>

An option name is misspelled or illegal.

Error: Too many files at <name>

More than three file names appear on the command line.

Table 2: Error messages



Function deleted() handles lines that appear in the baseline file but
not in the revision file. They are output to the change summary.

Function resync() is the guts of the program: it resynchronizes the
two files after they get out of sync. The input arguments are pointers
to the lines that are found to differ in the two files. It works in
the following manner.

The first line in the revision file that doesn't match the baseline
file is compared to lines in the baseline file until a matching line
is found or you have looked at /LOOKAHEAD significant lines, whichever
comes first. If you find a matching line in the baseline file, you
compare the next /RE_SYNC lines to ensure that they too match. If so,
you consider yourself synced, print the change information, and exit
with the file input pointers at the first lines that match again. If
/RE_SYNC lines don't match, you continue the search.

If you look at /LOOKAHEAD lines in the baseline file without matching
the revision file from the point of difference, you move ahead one
line in the revision file and then repeat. Eventually, you'll either
find a match or you'll have moved ahead /LOOKAHEAD lines in the
revision file. At this point, you give up and exit from the program.
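The search just described can be sketched over two in-memory arrays of lines. This is a simplified illustration, not the resync() in Listing One: it works on arrays rather than linked lists, does not skip insignificant lines, and reports the resync point through two output arguments. The helper run_matches() and every parameter name are assumptions.

```c
#include <string.h>

/* Return 1 if `need` consecutive lines match starting at a[i] and b[j].
 * Reaching the end of both files at the same time also counts. */
static int run_matches(const char **a, int na, int i,
                       const char **b, int nb, int j, int need)
{
    int k;

    for (k = 0; k < need; k++) {
        if (i + k >= na || j + k >= nb)
            return i + k >= na && j + k >= nb;
        if (strcmp(a[i + k], b[j + k]) != 0)
            return 0;
    }
    return 1;
}

/* rev[ri] and base[bi] are the first lines found to differ. For each
 * revision line in turn, scan up to `lookahead` baseline lines for a run
 * of `re_sync` matching lines (re_sync is assumed to be at least 1).
 * On success store the resync positions in *rs and *bs and return 1;
 * return 0 if no match is found within the look-ahead (sync is lost). */
int resync(const char **rev, int nrev, int ri,
           const char **base, int nbase, int bi,
           int lookahead, int re_sync, int *rs, int *bs)
{
    int r, b;

    for (r = ri; r <= nrev && r < ri + lookahead; r++) {
        for (b = bi; b <= nbase && b < bi + lookahead; b++) {
            if (run_matches(rev, nrev, r, base, nbase, b, re_sync)) {
                *rs = r;    /* rev[ri..r-1] are the added lines    */
                *bs = b;    /* base[bi..b-1] are the deleted lines */
                return 1;
            }
        }
    }
    return 0;
}
```

The nested loops make the worst case proportional to the square of the look-ahead, which matches the remark about /LOOKAHEAD in Table 1.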

This is a fairly brute-force method. A file that contains one or two
large difference sections and a large number of small differences will
perform poorly, because /LOOKAHEAD needs to be large enough to
accommodate the big differences but will be inefficient on the small
differences. It should be relatively easy to make this adaptive by
starting with a small value for /LOOKAHEAD and, when a large
difference is encountered, at label nosy pushing the old value of
/LOOKAHEAD, setting a larger value (say, double the old value), and
calling resync() recursively. This way, given enough memory, the
program will always resynchronize.

Function diff() handles the printing of lines that match and calls
resync() when differences are found.

Function pageskip() skips the front ends of files when the /SKIP
options are used.

Function help() prints the usage summary when command-line errors are
detected.

Function open_files() opens the two input files and, if specified, the
change-bar output file.

Function redirect() redirects the VAX standard output. Because this
program has a variable number of arguments, it's easiest to use if
installed as a foreign command, and the standard redirection doesn't
work then. Incidentally, to install this, use the following command:

DIFF :== $diskname:[pathname]DIFF.EXE

Redirect() works under MS-DOS as well, but it's not required.

Function strip_opt() parses the command line. It is designed around
the VMS command syntax, which I like better than MS-DOS or Unix (at
least for options). I get annoyed when I can't abbreviate options I
use interactively or leave them spelled-out in command scripts. I also
get annoyed when options are case sensitive, especially if some of the
options are lowercase, some are uppercase, and some are mixed (as in a
certain C compiler made by a company known for PC operating systems
and mice). But then, we software types are known for our unreasonable
likes and dislikes (I still use WordStar), and this parser can easily
be made more Unix-like by changing the OPT_FLAG define and the literal
arguments to match().

Function upper() converts its string argument to all uppercase
letters.

Function match() checks for (possible) partial matches of command-line
option strings with a pattern string. To make this really bulletproof,
it ought to check for a minimum number of matching characters. Right
now it doesn't, but this hasn't caused any trouble so far.

Function num() retrieves the value parameter from command options.
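The abbreviation matching and value retrieval described above might be sketched like this. It is not the code from Listing One: the argument lists are assumptions, and the min_len check is the "bulletproofing" the text says the real match() lacks, added here as an illustration of how it could work.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if opt (up to any '=') is a case-insensitive prefix of name
 * and is at least min_len characters long; otherwise return 0. */
int match(const char *opt, const char *name, size_t min_len)
{
    size_t n = strcspn(opt, "=");      /* length of the option keyword */
    size_t i;

    if (n < min_len || n > strlen(name))
        return 0;
    for (i = 0; i < n; i++)
        if (toupper((unsigned char)opt[i]) != toupper((unsigned char)name[i]))
            return 0;
    return 1;
}

/* Return the number that follows '=' in opt, or dflt if there is none. */
int num(const char *opt, int dflt)
{
    const char *eq = strchr(opt, '=');

    if (eq != NULL && isdigit((unsigned char)eq[1]))
        return atoi(eq + 1);
    return dflt;
}
```

A parser built on these would test each argument that begins with a slash against every known option name in turn, for example match(arg, "/LOOKAHEAD", 2) followed by num(arg, 200).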

The rest of the program is conditionally compiled and is strictly
debugging support for making modifications. Whenever I find occasion
to tweak the program, it seems to die silently the first couple of
times I run it. With the debugging support in, if you run the program
with the /TRACE option, it will print a message each time it enters or
exits a function. Pressing T toggles tracing on and off. Pressing S
displays the current call stack. This is a great help in finding where
the program is hung in a loop. Be warned: it's also hours of fun to
watch DIFF crunch a 200-page document with /TRACE on.

Areas  for  Enhancement 

Because DIFF has solved my immediate change-bar concerns, it's
unlikely I'll be making any major enhancements in the future. I am
releasing DIFF into the public domain, however, and would appreciate
hearing from those of you who make modifications and improvements to
the program. And I have two suggestions to start you off.

The first major enhancement I can see would be useful in archiving
versions of source code. DIFF could be modified to emit line editor
script files that contain the commands to transform one version of a
file into another. This would let you keep only the first version of a
file in its complete form. Subsequent versions would be kept as
differences (the editor script file), so any version could be
recreated by transforming the original into the desired version with a
series of editing operations. Many configuration management/source
code control tools use this type of system to save disk space.

The second major enhancement is more difficult and possibly is useful
only to a small number of people. The current version of DIFF is
line-oriented, and thus it can be fooled by any changes in format: for
example, reparagraphing a document with different margin settings,
adding several words to a paragraph (and thus causing the paragraph to
be reformatted), and common alterations to source files such as
tabbing/detabbing or pretty-printing.

It's rare that reformatting effects are major problems, but if they
become a concern, DIFF could be modified to act on tokens rather than
on lines. The major headache in tokenizing is that the lexical rules
are different for program source and documents. A less serious problem
is relating the change information from the token stream back to the
line-oriented source text. Compilers seem to be able to do this, so
several elegant solutions probably exist.

Availability 

All the source code for articles in this issue is available on a
single disk. To order, send $14.95 to Dr. Dobb's Journal, 501
Galveston Dr., Redwood City, CA 94063, or call (415) 366-3600, ext.
216. Please specify the issue number and format (MS-DOS, Macintosh,
Kaypro).

DDJ 


(Listing  begins  on  page  66.) 



ARTICLES 


Optimizing 
Compilers  for 


You've seen the ads: Datalight challenges Microsoft. Our C compiler
expert Richard Relph saw the ads and sent for Datalight's compiler.
What he found when he began to test it must have given him mixed
feelings. For the past two years Richard has been involved in
developing the DDJ suite of benchmarks for C compilers. The Datalight
compiler flattened those benchmarks, making them worthless. What has
made our benchmarks obsolete and raised the stakes for C compiler
vendors is something called global optimization, common on mainframe
and minicomputers and now coming to the desktop. Because it was
Richard's work that was made obsolete, we gave him the assignment of
reporting on the optimization techniques that did the job. -- eds.

This article describes some of the optimizations that a C compiler can
perform to make the resultant code either smaller or faster. For many
languages, such as FORTRAN, optimization is necessary because the
language is poorly matched to the target processor. But C's close
match to underlying target instructions and its rich set of operators
have made optimization largely optional for C code. A well-written C
program, compiled with a nonoptimizing compiler, can perform favorably
compared to the same program written in FORTRAN or Pascal.


Richard  Relph,  846  Salt  Lake  Dr.,  San 
Jose,  CA  95133.  Richard  is  a  software 
and  hardware  consultant.  He  has  writ¬ 
ten  compilers  and  embedded 
systems. 


by  Richard  Relph 


Function range optimization is the biggest step forward.


Nevertheless, C code can be optimized, for some applications it may be
highly desirable for the compiler to do the optimization, and optimizing
C compilers have existed for some time for non-Intel processors. Tartan
Labs and Green Hills come to mind when thinking about optimizing C
compilers for VAXs or 680x0s.

With the apparent glut of C compiler suppliers vying for the MS-DOS
market, it was only a matter of time before one of them decided to step
above the crowd and provide a reliable optimizing C compiler. Datalight
beat all others to the punch by delivering such a compiler in February
of this year. Computer Innovations has also shown a product that it
claims is optimizing, and others will announce and deliver products
before the year's end.

Datalight provided me with its Optimum-C package in February, and it is
this compiler upon which this article is largely based.

Varieties  of  Optimizations 

I have used the term optimization rather freely, but so do compiler
vendors. There are, in fact, several kinds of optimizations, and I would
like to distinguish what I mean by optimization from what other people
(particularly compiler vendors) might mean.

Basically, optimization can occur at two places in the compilation
process. The simplest, so-called peephole, optimizations occur in the
final stages. A peephole optimizer can eliminate various dumb-looking
instruction sequences. It can get rid of redundant loads (loading a
register with a value that it already contains) and some jumps around
jumps (depending on the target instruction set and the knowledge the
compiler maintains about function size). Some peephole optimizers do
more advanced things.
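
As a rough illustration of the redundant-load case, the sketch below runs a peephole pass over a toy instruction list. The instruction format (struct insn with OP_LOAD, OP_STORE, and OP_ADD) and the function name peephole are invented for this example; they are assumptions, not the internals of any compiler discussed here.

```c
#include <stddef.h>

enum opcode { OP_LOAD, OP_STORE, OP_ADD };

/* A toy instruction of the form "op reg, var". */
struct insn {
    enum opcode op;
    int reg;        /* register number */
    int var;        /* variable (memory slot) number */
};

/*
 * Drop a LOAD that immediately follows a STORE or LOAD of the same
 * variable through the same register, because the register already
 * holds that value. The array is compacted in place and the new
 * instruction count is returned.
 */
size_t peephole(struct insn *code, size_t n)
{
    size_t in, out = 0;

    for (in = 0; in < n; in++) {
        if (out > 0 && code[in].op == OP_LOAD) {
            const struct insn *prev = &code[out - 1];

            if ((prev->op == OP_STORE || prev->op == OP_LOAD) &&
                prev->reg == code[in].reg && prev->var == code[in].var)
                continue;   /* redundant load: skip it */
        }
        code[out++] = code[in];
    }
    return out;
}
```

The pass looks only at the previous surviving instruction, which is what gives peephole optimization its name: the window is tiny, and no knowledge of the surrounding function is needed.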

The  other  place  in  the  compilation 
process  in  which  optimizations  can 
occur  is  in  the  middle  stages.  This  is 
after  the  program  has  been  "read" 
by  the  compiler  and  converted  to 
some  intermediate  form.  In  many 
compilers  this  intermediate  form  is 
both  host  and  target  independent, 
thereby  making  optimizations  at  this 
stage  very  useful  if  the  compiler 
must  support  multiple  targets. 

These intermediate optimizations, which are more interesting for my
present purposes, can occur over five ranges, which I refer to here as
statement, block, function, module, and program. These range
designations are pretty self-explanatory. A statement is a statement. A
block is a sequence of instructions in which there are no jumps or
labels used as entry points. A function is a function. A module is a
module or source file. Optimizations over the first four ranges can be
performed using the current compilation model of edit, compile, link,
and debug. Optimizations over the last range, program, cannot. In this
case the compile and link phases must be combined.

Optimization from peephole through statement range is nothing new. These
optimizations have been done for a long time by almost all the C
compilers I am familiar with (that's a lot). Some examples of statement
range optimization are constant folding, strength reduction, and dead
statement elimination. These are basic, and I will not discuss them
here.
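
For readers who want a reminder of what those three terms mean, here is a small hand-worked sketch. The function names scale_before and scale_after are hypothetical, and the "after" version only approximates in C what a compiler would do at the instruction level (the shift assumes a nonnegative argument).

```c
/* As the programmer wrote it. */
int scale_before(int n)
{
    (void)(n + 1);              /* dead statement: the value is never used */
    return n * 8 + 60 * 60;     /* multiply by a power of two, constant operands */
}

/* Roughly what a statement range optimizer leaves behind. */
int scale_after(int n)
{
    /* strength reduction: n * 8 becomes a shift (n assumed nonnegative);
     * constant folding:   60 * 60 becomes 3600;
     * the dead statement is gone. */
    return (n << 3) + 3600;
}
```

Each rewrite needs nothing beyond the single statement it appears in, which is why these optimizations are so common.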

Block range optimization is where the subject begins to get interesting.
Block range optimizations are done by some of today's better PC-based
compilers. The most common and important block range optimization is
common subexpression elimination, and you can see it in practice later
in this article.

But it is function range optimization (sometimes called "global
optimization") that represents the biggest step forward in C code
optimization today. Function range optimization is really what this
article is about, so I'll defer examples and descriptions for a few
lines while I dismiss module and program range optimization.

Module range optimization is rare. The only commercial compilers I know
of that do this are those from Tartan Labs. An optimizer dealing with
the module range turns some small functions into in-line code or
automatically passes arguments to static functions in registers. This is
called "interprocedure analysis." In fact, all the simpler kinds of
optimizations can be performed across several functions, and when this
is done, it is module range optimization. As I say, though, this is
rare.
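
To picture the in-lining half of this, consider the hand-worked sketch below. The function names are made up for the illustration, and the "after" version shows the effect in C source terms only; an actual module range optimizer would make the substitution in its intermediate form.

```c
/* A small static function, visible only within this module. */
static int square(int v)
{
    return v * v;
}

/* Before: every iteration pays for a call and a return. */
int sum_squares_before(int n)
{
    int i, total = 0;

    for (i = 1; i <= n; i++)
        total += square(i);
    return total;
}

/* After in-lining: the body of square() replaces the call. */
int sum_squares_after(int n)
{
    int i, total = 0;

    for (i = 1; i <= n; i++)
        total += i * i;
    return total;
}
```

Because square() is static, the optimizer can see every caller in the module, which is what makes the substitution safe without help from the linker.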

The last range for optimization is program range optimization. This
takes you beyond rare and to nonexistent. When some compiler (and
built-in linker) implements program range optimization, you can begin to
say things such as "this is optimal" or "as good as the best assembly
language" because all the information about the program is available to
the compiler at one time. This is still a dream, but some companies are
discussing this kind of compiler. If you want to be prepared, get a
machine with lots of memory, a superfast processor, and, perhaps most
important, an uninterruptible power supply, because this compiler's
compilation time will be measured in hours to weeks instead of seconds
to hours.

Table 1 summarizes the availability of these different optimizations in
present-day compilers. The optimization ranges are not as firm as they
appear to be from the preceding outline. Compiler vendors may choose to
implement only one or two of the simpler function range optimizations,
not the full deck. Companies currently doing this are Wizard/Borland and
MetaWare, which implement the simplest automatic register allocation
scheme, and MetaWare and Microsoft, which implement cross-jump
optimizations. Many compiler vendors implement some sort of switch
statement optimizer, which is hard to categorize (I believe it is a
block range optimization). But none of the vendors, until Datalight, has
attempted a reasonably complete function range optimizer.

Modern  Optimization 
Techniques 

The rest of this article provides a tour of the major function range
optimizations. I give each a name (note that I did not say "the" name),
a brief description of the optimization, and a sample code fragment to
which the optimization applies. I'll start with some of the more basic
and simpler optimizations and work toward the more advanced.

Constant  Propagation 

Constant propagation is used when a variable has a constant value over a
portion of the function. Although constant propagation is not very
useful by itself, in conjunction with common subexpression elimination
and invariant code motion (discussed later), it becomes important.


Table 1: Use of various optimizations in present-day compilers

Stage         Range      Rarity               Example
------------  ---------  -------------------  --------------------------------
final         -          common               "peephole" optimization
intermediate  statement  common               dead statement elimination
intermediate  block      some PC compilers    common subexpression elimination
intermediate  function   new to PC compilers  code hoisting
intermediate  module     rare                 interprocedure analysis
intermediate  program    nonexistent          -

For the next several optimizations, I will refer to the following simple
code segment:

    func( p )
    int p;
    {
        int i;
        int j;

        i = 5;
        for (j = 0; j < i; j++)
            ;
    }

which, after constant propagation, becomes:

    func( p )
    int p;
    {
        int i;
        int j;

        i = 5;
        for (j = 0; j < 5; j++)
            ;
    }

As this example shows, constant propagation may create dead assignments.
The assignment i = 5 becomes pointless unless i is used offstage
somewhere.

Copy Propagation

Copy propagation is like constant propagation, except that the compiler
keeps track of which variables hold the same values rather than noting
that a certain variable holds a constant value. This results in
substitution of one variable for another when they have the same value.
The possible advantage this gives you is that one of the variables may
be faster to get to (because it is in a register) than the other.



In the preceding example, if i were assigned the value x instead of the
constant 5, then copy propagation would apply and you would see the
following:

    i = x;
    for (j = 0; j < x; j++)
        ;

As is the case with constant propagation, copy propagation may create
some dead assignments; here, if i is not used subsequently, the
assignment i = x is dead.

Dead  Assignment 
Elimination 

Dead assignments are assignments to variables that are not used before
the variables are assigned again. In the constant propagation example,
after the constant has been propagated, the assignment to i is useless,
or dead. Such an assignment can be eliminated. After dead assignment
elimination in the example you have:

    func( p )
    int p;
    {
        int i;
        int j;

        /* i = 5; */
        for (j = 0; j < 5; j++)
            ;
    }

Dead  assignment  elimination  may  in 
turn  result  in  a  dead  variable,  as  has 
happened  in  this  example.  Variable  i 
is  now  dead  and  can  be  eliminated, 
which  leads  me  to  the  next  subject. 

Dead  Variable  Elimination 

A dead variable is a variable that is never referenced. Looking again at
the constant propagation example, after constant propagation and dead
assignment elimination, the variable i may no longer be needed, so its
space on the stack or in a register can be freed for other use.

Eliminating it, you get:

    {
        /* int i; */
        int j;

        /* i = 5; */
        for (j = 0; j < 5; j++)
            ;
    }

The  variable  p  is  also  dead,  but  being 
an  argument  to  the  function,  it  is  not 
removable. 
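
Because the running example has an empty loop body, the effect of these four steps is hard to observe. The variation below, a sketch with hypothetical names sum_before and sum_after, returns a value so that the original and the transformed function can be compared directly.

```c
/* Before: i holds a constant and k is a copy of the argument p. */
int sum_before(int p)
{
    int i, j, k, total = 0;

    i = 5;
    k = p;
    for (j = 0; j < i; j++)
        total += k;
    return total;
}

/*
 * After constant propagation (i becomes 5), copy propagation (k becomes
 * p), dead assignment elimination (i = 5 and k = p are dropped), and
 * dead variable elimination (i and k disappear).
 */
int sum_after(int p)
{
    int j, total = 0;

    for (j = 0; j < 5; j++)
        total += p;
    return total;
}
```

None of the steps is dramatic on its own; the gain comes from each one exposing work for the next.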

Dead  Code  Elimination 

As you can eliminate dead assignments and dead variables, so too can you
eliminate dead code. Dead code is any code that can never be reached.
Although dead code is rare in practice, this optimization is fairly easy
to do, so why not? Many compilers implement simple forms of this
optimization even though they do not implement other function range
optimizations. For the remaining optimizations, we will use the code
segment in Example 1.
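
Before moving on to Example 1, the definition of dead code can be made concrete with a short sketch. The names DEBUG, clamp_before, and clamp_after are invented for this illustration.

```c
#define DEBUG 0

/* Before: two pieces of code here can never execute. */
int clamp_before(int v)
{
    if (DEBUG) {
        v = -1;         /* never reached: the condition is the constant 0 */
    }
    if (v > 100)
        return 100;
    return v;
    v = 0;              /* never reached: follows an unconditional return */
}

/* After dead code elimination. */
int clamp_after(int v)
{
    if (v > 100)
        return 100;
    return v;
}
```

Code like the DEBUG block is the usual source of dead code in practice: it is written deliberately and switched off with a constant.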

Global  Common 
Subexpression  Elimination 

Many functions use and reuse subexpressions in the computation of
complete expressions. Such subexpressions are said to be common. If a
compiler can detect such subexpressions, compute them once, save the
result, and simply refer to the saved result, recomputation can be
avoided. This is particularly important with floating-point and other
compute-intensive data types. Note that the compiler may create a
variable in the process.

Example 1: C code to be optimized

    struct x {
        int i;
        char c;
    } d[10][10], s[10][10];

    copy()
    {
        int i, j;

        for (i = 0; i < 10; i++)
            for (j = 0; j < 10; j++)
                d[i][j] = s[i][j];
    }

The following shows the body of the inner loop in Example 1 after common
subexpression elimination. The index computation i * 10 + j, which is
implied by both d[i][j] and s[i][j], is done once and saved in t0:

    {
        t0 = i * 10 + j;
        d[0][t0] = s[0][t0];
    }
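
The same idea is easier to see with an explicit floating-point subexpression. The sketch below uses made-up function names, and t0 stands in for the temporary a compiler might create.

```c
/* Before: (x - y) is computed three times. */
double f_before(double x, double y)
{
    double a = (x - y) * 2.0;
    double b = (x - y) * (x - y);

    return a + b;
}

/* After: the shared value is computed once and kept in t0. */
double f_after(double x, double y)
{
    double t0 = x - y;
    double a = t0 * 2.0;
    double b = t0 * t0;

    return a + b;
}
```

On a machine without floating-point hardware, each subtraction saved is a library call saved, which is why the article singles out compute-intensive data types.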

Lifetime  Analysis 

Lifetime analysis is the first of the hard optimizations. What lifetime
analysis attempts to do is determine which variables have meaningful
values over what range of the function.

Variables that have nonoverlapping ranges may share processor resources,
especially registers. For straight code it is easy to see how to do this
analysis, but loops and gotos make it much harder.
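
For the easy straight-code case, the analysis can be sketched in a few lines. Everything here is an assumption made for illustration: the statement format (struct stmt), the fixed variable count NVARS, and the simplification that a variable's lifetime runs from the first statement that mentions it to the last. Loops and gotos would require real data-flow analysis.

```c
#define NVARS 3

/* One straight-line statement: it defines variable `def` and reads
 * variables `use1` and `use2` (-1 means "none"). */
struct stmt {
    int def;
    int use1;
    int use2;
};

struct range {
    int first;  /* index of first statement mentioning the variable, or -1 */
    int last;   /* index of last statement mentioning the variable */
};

static void touch(struct range *r, int var, int pos)
{
    if (var < 0)
        return;
    if (r[var].first < 0)
        r[var].first = pos;
    r[var].last = pos;
}

/* Record, for each variable, the span of statements over which it lives. */
void live_ranges(const struct stmt *code, int n, struct range *r)
{
    int i;

    for (i = 0; i < NVARS; i++)
        r[i].first = r[i].last = -1;
    for (i = 0; i < n; i++) {
        touch(r, code[i].use1, i);
        touch(r, code[i].use2, i);
        touch(r, code[i].def, i);
    }
}

/* Two variables may share a register if their spans do not overlap.
 * The test is conservative: spans that meet at one statement count
 * as overlapping. */
int can_share(const struct range *a, const struct range *b)
{
    return a->last < b->first || b->last < a->first;
}
```

For the sequence a = ...; b = a + a; c = b + b; the sketch finds that a and c never live at the same time, so one register could serve both.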

Register  Allocation 

After the lifetime of each variable is determined, important variables
can be identified. Important variables are those that are referred to
often, either because they are named frequently or because they occur
inside loops. There is usually a multiplier applied to the "reference
count" obtained for variables in loops. So, once the compiler has ranked
the variables in importance, its goal is to use the processor's
resources well. A well-known technique for doing this (used by
Datalight) is called "coloring" because of its similarity to the
map-coloring problem, except your "map" is a variable usage graph.

The map-coloring problem is this: Given a fixed number of colors (CPU
registers), color the map (the variable usage graph) so that the fewest
(preferably 0) states (variables) are left uncolored (not in registers),
assuming that no two adjoining states (variables with overlapping
lifetimes) have the same color (register).

A much simpler allocation strategy is used in some existing compilers.
These compilers merely take the most important variables and dedicate
them to registers throughout the function.
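
A minimal sketch of the coloring idea follows. It uses a simple greedy heuristic, which is an assumption made for brevity and not a description of Datalight's allocator; the names color, interferes, and MAXVARS are likewise invented here.

```c
#define MAXVARS 8

/*
 * Greedy coloring. interferes[a][b] is nonzero when variables a and b
 * have overlapping lifetimes. Variables are visited in index order, so
 * the caller should number them from most to least important.
 * On return, reg[v] is a register number in 0..nregs-1, or -1 if v was
 * left in memory. The return value is the number of variables left
 * uncolored.
 */
int color(int nvars, int nregs,
          int interferes[MAXVARS][MAXVARS], int reg[MAXVARS])
{
    int v, u, r, spilled = 0;

    for (v = 0; v < nvars; v++) {
        reg[v] = -1;
        for (r = 0; r < nregs && reg[v] < 0; r++) {
            int taken = 0;

            for (u = 0; u < v; u++)
                if (interferes[v][u] && reg[u] == r)
                    taken = 1;
            if (!taken)
                reg[v] = r;
        }
        if (reg[v] < 0)
            spilled++;
    }
    return spilled;
}
```

With two registers, three variables whose lifetimes overlap only in a chain (0 with 1, 1 with 2) all fit, because the first and third can share. If all three overlap one another, one of them has to stay in memory.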

Loop  Invariant  Code  Motion 

Similar to common subexpression elimination, loop invariant code motion
(sometimes called "code hoisting") notices that some subexpressions are
not affected by the execution of the loop. Because most loops are
executed more than once, such subexpressions are logically common (refer
to the definition under common subexpression elimination). By computing
them once before the loop is entered, the compiler can save a lot of
run-time recomputations.

After one level of code motion, our example looks like this:

    {
        t0 = i * 10;
        for (j = 0; j < 10; j++)
        {
            t1 = t0 + j;
            d[0][t1] = s[0][t1];
        }
    }
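
A self-contained version of the same transformation, with hypothetical function names, is sketched below. The product a * b does not depend on the loop index, so it can be computed once before the loop.

```c
/* Before: a * b never changes inside the loop but is recomputed
 * on every iteration. */
void bump_before(int *v, int n, int a, int b)
{
    int i;

    for (i = 0; i < n; i++)
        v[i] = v[i] + a * b;
}

/* After code motion: the invariant product is computed once, up front. */
void bump_after(int *v, int n, int a, int b)
{
    int i;
    int t0 = a * b;

    for (i = 0; i < n; i++)
        v[i] = v[i] + t0;
}
```

The saving is one multiply per iteration, which matters most on processors where a multiply costs many cycles.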


Loop  Induction 

What loop induction is, conceptually, is strength reduction on loops. If
a loop has a subexpression in which one part is loop index sensitive and
the operator is multiply, it is possible to replace the subexpression by
a variable that gets added to for each change in the loop index. The
remaining examples point out the usefulness of this technique
(particularly when such constructs may be present but not obvious) and
give some further sense of how function range optimizations can be used.

Applied to the outer loop of the example, induction replaces the
multiply i * 10 with a variable t0 that steps by 10:

    for (t0 = 0; t0 < 100; t0 += 10)
        for (j = 0; j < 10; j++)
        {
            t1 = t0 + j;
            d[0][t1] = s[0][t1];
        }

This code can be optimized further, though. Here it is after loop
induction on t1:

    for (t1 = t0; t1 < t0 + 10; t1++)
        d[0][t1] = s[0][t1];

Here it is after common subexpression elimination on the implied
multiply in the loop. (The assignment is shorthand for copying the
structure found at that byte offset; it is not legal C as written.)

    for (t1 = t0; t1 < t0 + 10; t1++)
    {
        t2 = t1 * sizeof( struct x );
        ((char *)&d) + t2 = ((char *)&s) + t2;
    }

The following code fragment shows what the effect of reinduction on t1
and t2 will be:

    for (t1 = t0 * sizeof( struct x );
         t1 < (t0 + 10) * sizeof( struct x );
         t1 += sizeof( struct x ))
    {
        ((char *)&d) + t1 = ((char *)&s) + t1;
    }

Here it is after more invariant code motion:

    t3 = (t0 + 10) * sizeof( struct x );
    for (t1 = t0 * sizeof( struct x );
         t1 < t3;
         t1 += sizeof( struct x ))
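
Since the fragments above use shorthand that a C compiler would reject, here is one way the starting point and the end result might be written as legal, comparable C. This is a sketch under stated assumptions: the function names copy_indexed and copy_induced are invented, and the structure copy through a cast pointer is my rendering of the article's byte-offset notation.

```c
#include <stddef.h>

struct x {
    int i;
    char c;
};

struct x d[10][10], s[10][10];

/* Example 1 as written: every d[i][j] and s[i][j] implies multiplies. */
void copy_indexed(void)
{
    int i, j;

    for (i = 0; i < 10; i++)
        for (j = 0; j < 10; j++)
            d[i][j] = s[i][j];
}

/*
 * Roughly where the transformations lead: t1 is a byte offset that
 * advances by sizeof(struct x), and t3 is the hoisted loop limit.
 * No multiply remains inside the inner loop.
 */
void copy_induced(void)
{
    size_t t0, t1, t3;

    for (t0 = 0; t0 < 100; t0 += 10) {
        t3 = (t0 + 10) * sizeof(struct x);
        for (t1 = t0 * sizeof(struct x); t1 < t3; t1 += sizeof(struct x))
            *(struct x *)((char *)d + t1) = *(struct x *)((char *)s + t1);
    }
}
```

Both functions copy the same hundred structures; the second simply does its address arithmetic with additions.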


The machinations I have just been through can be expected to yield space
and time benefits in the neighborhood of 30 percent. Datalight has
improved its dhrystone performance from 1,084 to 1,284 dhrystones per
second through these kinds of optimizations.

Summary 

Although the discussion in this article has been based largely on work
with one optimizing compiler, more optimizing compilers will be coming
out soon. One problem they will present to software developers has to do
with naming. There is no agreed-upon name for many of these optimization
techniques. Just because some vendor says it has "xyz" optimization
doesn't mean nobody else does; it may just mean that nobody else calls
it xyz. I hope this article has provided you with some means to
understand vendors' claims and counterclaims and to make an informed
choice.

I also hope I have given you a sense of the importance of this
development in personal computer compiler technology. The optimizations
I have discussed here are considered basic by minicomputer and mainframe
compiler standards, and it is precisely because of the lack of such
"basic tools" that many computer professionals consider personal
computers to be toys. Well, these particular basic tools have arrived. I
think we can now safely put the "toy" complaint to rest, and I firmly
believe that, when program range optimizers arrive, personal computers
will be among the first machines of any size to employ them.

DDJ 



BACKTRACKING 

Listing  One  (Text  begins  on  page  24.) 

/*
 *
 *  KROSS.C
 *
 *  COPYRIGHT (C) 1987 by Charles F. Bowman
 *
 *  ALL RIGHTS RESERVED.
 *
 */
#include    "stdio.h"

#define     NIL         '\000'

#define     ALL         1
#define     PUZ         2
#define     DOWN        1
#define     ACROSS      2

#define     MINWORD     3
#define     MAXPUZ      25
#define     MAXWORD     50
#define     WORDLEN     15

#define     EMPTY       0
#define     FREE        1
#define     USED        2
#define     SOLVED      3

#define     BLANK       ' '
#define     PADCHAR
#define     WORDS       "&words"
#define     PUZZLE      "&puzzle"

#define     FLAG(x, y)  list[ x - MINWORD ].w[ y ].flg
#define     WORD(x, y)  list[ x - MINWORD ].w[ y ].word

FILE    *fp;
int     length, width;
char    puzzle[ MAXPUZ ][ MAXPUZ ];

struct  words   {
    char    word[ WORDLEN ];
    int     flg;
};

struct  {
    struct  words   w[ MAXWORD ];
} list[ WORDLEN - MINWORD ];

main( ac, av )
int     ac;
char    *av[];
{

    if( ac != 2 ){
        fprintf( stderr, "usage: kross puzzlefile\n" );
        exit( 1 );
    }
    if( (fp = fopen( av[1], "r" )) == NULL ){
        fprintf( stderr, "Cannot open '%s' to read!\n", av[1] );
        exit( 2 );
    }

    readpuz( fp );
    if( solve(0, -1) ){
        pprint( PUZ );
    } else {
        printf( "No Solution!!\n" );
    }
    exit( 0 );
}


/*
 *
 *  READPUZ(): read puzzle into memory from file
 *
 */
readpuz()
{
    int     i;
    char    buf[ 85 ];


    length = 0;
    /*
     * Puzzle Section
     */
    if( fgets( buf, sizeof buf, fp ) == NULL ){
        fprintf( stderr, "%s: Premature EOF!\n", PUZZLE );
        exit( 4 );
    }
    if( strncmp( buf, PUZZLE, strlen( PUZZLE ) ) ){
        fprintf( stderr, "%s: BAD FORMAT!\n", PUZZLE );
        exit( 5 );
    }

    if( fgets(buf, sizeof buf, fp) == NULL
        || !strncmp(buf, WORDS, strlen(WORDS)) ){
        fprintf( stderr, "%s: Premature EOF!\n", PUZZLE );
        exit( 4 );
    }
    width = strlen( buf ) - 1;

    do {
        if( (strlen( buf ) - 1) != width ){
            fprintf(stderr, "Line %d: bad width!\n", width);
            exit( 5 );
        }
        for( i = 0; i < width; i++ ){
            if( buf[ i ] == BLANK ){
                puzzle[ length ][ i ] = NIL;
            } else if( buf[i] == PADCHAR ){
                puzzle[ length ][ i ] = buf[ i ];
            } else {
                fprintf(stderr, "BAD CHAR '%d' L# %d\n",
                    buf[i], length );
                exit( 88 );
            }
        }
        puzzle[ length ][ width ] = NIL;
        length += 1;
    } while( fgets(buf, sizeof buf, fp) != NULL &&
        strncmp( WORDS, buf, strlen(WORDS) ) != 0 );

    /*
     * Words Section
     */
    while( fgets( buf, sizeof buf, fp ) != NULL ){
        for( i = 0; i < MAXWORD; i++ ){
            if( FLAG( strlen(buf) - 1, i ) == EMPTY ){
                strncpy( WORD( strlen(buf) - 1, i ),
                    buf, strlen(buf) - 1 );
                FLAG( strlen(buf) - 1, i ) = FREE;
                break;      /* stop at the first empty slot */
            }
        }
        if( i >= MAXWORD ){
            fprintf( stderr, "Out of space %d %s\n",
                strlen(buf) - 1, buf );
            exit( 6 );
        }
    }
    return;

}

/*
 *
 *  PPRINT(): display solved puzzle
 *
 */
pprint( t )
int     t;
{

    int     i, j;
    switch( t ){
    case ALL:
        /*
         * Debug only!
         */
        for( i = MINWORD; i < WORDLEN; i++ ){
            j = 0;
            while( WORD(i, j)[0] != NIL ){
                printf( "%s\n", WORD(i, j) );
                j++;
            }
        }

    case PUZ:
        for( i = 0; i < length; i++ ){
            for( j = 0; j < width; j++ ){
                if( puzzle[ i ][ j ] ){
                    putchar( puzzle[ i ][ j ] );
                } else {
                    putchar( BLANK );
                }
            }
            putchar( '\n' );
        }

    }
    return;

}

/*
 *
 *  SOLVE(): function that searches for a solution
 *
 */
static int s = 0;

static int prev = -1;
solve( length, width )
int     length, width;
{
    int     l, w, i, len, tmp, type;

    char    old[ WORDLEN - MINWORD + 1 ];
    w = width;
    l = length;
    len = next( &l, &w, &type );
    if( len == 0 )

        return( SOLVED );
    for( i = 0; i < MAXWORD && WORD(len, i)[0] != NIL; i++ ){
        if( FLAG(len, i) == FREE
            && itfits(l, w, WORD(len, i), type) ){
            FLAG(len, i) = USED;
            enter( old, l, w, WORD(len, i), type );
            prev = type;
            tmp = solve( l, w );
            if( tmp == SOLVED )
                return( SOLVED );
            restore( old, l, w, type );
            FLAG(len, i) = FREE;
        }
    }

    return( 0 );
}


/*
 *
 *  NEXT(): locate next slot to fill
 *
 */
next( len, wht, t )
int     *len, *wht, *t;
{
    /*
     * Return the next slot in the puzzle to attempt
     * to be solved. DOWN has precedence.
     *
     * The new values for len & wht will be updated.
     * The returned value for the 'w' coordinate for
     * an across 'hit' will have to be the value + 1.
     */
    int     l, w, tmp;

    l = *len;
    w = *wht;

    /*
     * Check current position for across: down would
     * have been done already.
     */
    if( w != -1 && ( (w - 1) < 0 || puzzle[l][w-1] == NIL )
        && puzzle[l][w] && (w + 1) < width && puzzle[l][w+1] ){
        /*
         * Across!
         */
        *t = ACROSS;

        /*
         * Necessary Evil!
         */
        *wht = w + 1;

        tmp = 0;
        while( puzzle[l][w] != NIL && w < width ){
            w += 1;
            tmp += 1;
        }
        return( tmp );

    } else if( prev == DOWN || w == -1 ){
        w += 1;
    }

    /*
     * Check for next possible position
     */
    for(; l < length; l += 1 ){
        for(; w < width; w += 1 ){
            if( ( (l-1) < 0 || puzzle[l-1][w] == NIL )
                && puzzle[l][w] != NIL && (l + 1) < length
                && puzzle[l+1][w] != NIL ){
                /*
                 * Down!
                 */
                *t = DOWN;
                prev = DOWN;
                *wht = w;
                *len = l;
                tmp = 0;
                while( puzzle[l][w] != NIL && l < length ){
                    l += 1;
                    tmp += 1;
                }
                return( tmp );
            }
            if( ((w - 1) < 0 || puzzle[l][w-1] == NIL)
                && puzzle[l][w] && (w+1) < width
                && puzzle[l][w+1] ){
                /*
                 * Across!
                 */
                *t = ACROSS;
                prev = ACROSS;
                *len = l;
                *wht = w + 1;

                tmp = 0;
                if( w == -1 ) w = 0;
                while( puzzle[l][w] != NIL && w < width ){
                    w += 1;
                    tmp += 1;
                }
                return( tmp );
            }
        }
        w = 0;
    }

    /*
     * Puzzle Completed!
     */
    return( 0 );
}


/*
 *
 *  ITFITS(): determine if a word fits into a slot
 *
 */
itfits( l, w, word, t )
char    *word;
int     t;
{
    char    *cp;

    if( t == ACROSS && w != -1 )
        w -= 1;

    cp = word;
    while( *cp ){
        if( *cp != puzzle[l][w] && puzzle[l][w] != PADCHAR )
            return( 0 );
        if( t == ACROSS )
            w += 1;
        else
            l += 1;
        cp++;
    }
    return( 1 );
}


/*
 *
 *  ENTER(): enter word into puzzle
 *
 */
enter( old, l, w, word, t )
char    *old;
int     l, w;
char    *word;
int     t;
{
    char    *cp;

    if( t == ACROSS )
        w -= 1;

    cp = word;
    while( *cp ){
        *old++ = puzzle[l][w];
        puzzle[l][w] = *cp;
        if( t == ACROSS )
            w += 1;
        else
            l += 1;
        cp++;
    }
    *old = NIL;

    return;
}



/*
 *
 *  RESTORE(): restore puzzle to prev state
 *
 */
restore( old, l, w, t )
char    *old;
int     l, w, t;
{
    char    *cp;

    if( t == ACROSS )
        w -= 1;

    cp = old;
    while( *cp ){
        puzzle[l][w] = *cp;
        if( t == ACROSS )
            w += 1;
        else
            l += 1;
        cp++;
    }

    return;
}

End Listing




WHAT'S THE DIFF?

Listing One (Text begins on page 30.)


— 

Name: 

DIFF.C 

— 

Processor: 

VAX  |  MS-DOS 

— 

Class: 

C  Program 

— 

Creation  Date: 

1/8/87 

— 

Revision: 

— 

Author: 

D.  Krantz 

— 

Description: 

File  compare  and  change-bar  for  text  files. 

/*  File  Difference  Utility  */ 

# include  <ctype.h> 

♦include  <stdio.h> 

♦define  OPT_FLAG  1 /'  /*  command  line  option  switch  recognizer  */ 

#ifdef  VAX11C 

♦define  MAXLINE  16  /*  maximum  characters  in  input  line  */ 

♦  else 

♦define  MAXLINE  85 
♦endif 

♦define  FORMFEED  'L'-'e* 

struct  LINE  { 

int  linenum; 
int  pagenum; 
struct  LINE  *link; 
char  text [  MAXLINE  ]  ; 
char  dup[  MAXLINE  ]; 

} ; 


/*  structure  defining  a  line  internally  */ 
/*  what  line  on  page  */ 
/*  what  page  line  is  from  */ 
/*  linked  list  pointer  */ 
/*  text  of  line  */ 
/*  uppercase  copy  of  line  text  */ 


typedef  struct  LINE  *linejptr; 
typedef  char  *char_ptr; 
typedef  FILE  *FILE  PTR; 


struct LINE root[ 3 ];              /* root of internal linked lists */
FILE_PTR msg;                       /* differences summary file pointer */
int line_count[ 3 ] = { 1, 1, 1 };  /* file's line counter */
int page_count[ 3 ] = { 1, 1, 1 };  /* file's page counter */
int command_errors = 0;             /* how many command line errors */
char xx1[ 132 ], xx2[ 132 ];        /* space to retain file names */
int files = 0;                      /* how many files specified on command line */
char_ptr infile_name[ 3 ] = { NULL, xx1, xx2 };
char outfile_name[ 132 ];           /* changebarred output filename */
FILE_PTR infile[ 3 ];               /* input file pointers */
FILE *outfile;                      /* changebarred output file pointer */
static line_ptr at[ 3 ] = { NULL, &(root[ 1 ]), &(root[ 2 ]) };

int debug = 0;          /* trace switch */
int trace_enabled = 0;  /* keyboard tracing switch */
int bar_col = 78;       /* column where change bar is to appear */
int top_skip = 0;       /* lines to skip at top of page */
int bot_skip = 0;       /* lines to skip at bottom of page */
int page_len = 66;      /* length of a page */
int up_case = 0;        /* boolean, is upper/lower case significant? */
int re_sync = 5;        /* lines that must match for resynchronization */
int output = 0;         /* boolean, is change-barred output file on? */
int blanks = 0;         /* boolean, are blank lines significant? */
int lookahead = 200;    /* how many lines to look ahead before giving up */
int skip1 = 0;          /* how many pages of first file to skip */
int skip2 = 0;          /* how many pages of second file to skip */


#if 0   /* tracing and other debug functions turned off */

#define trace( x )      callstack( x )
#define ret             { callpop(); return; }
#define ret_val( x )    { callpop(); return( x ); }
#define TRACER_FUNCTIONS

#else

#define trace( x )      /** nothing **/
#define ret             { return; }
#define ret_val( x )    { return( x ); }

#endif

main( argc, argv )
int argc;
char *argv[];
{
    int i;

    trace( "main" );
    if( argc == 1 )
        help();
    msg = stdout;
    for( i = 1; i < argc; i++ )
        strip_opt( argv[ i ] );
    if( files < 2 )
    {
        printf( "\nError: Must specify two files" );
        exit( 2 );
    }
    open_files();
    if( command_errors )
        exit( 2 );
    page_skip();
    diff();
    ret;
}


/*---------------------------------------------------------------------
DONT_LOOK - Tells us whether or not this line should be considered for
comparison or is a filler (e.g. header, blank) line.
---------------------------------------------------------------------*/
dont_look( line )
line_ptr line;
{
    int i;

    trace( "dont_look" );
    if( line == NULL )
        ret_val( 0 );
    if( line->linenum <= top_skip )
        ret_val( 1 );
    if( line->linenum > page_len - bot_skip )
        ret_val( 1 );
    if( !blanks )
    {
        for( i = 0; i < MAXLINE; i++ )
            switch( line->text[ i ] )
            {
                case '\0':
                case '\n':
                    ret_val( 1 );
                case '\t':
                case ' ':
                    break;
                default:
                    ret_val( 0 );
            }
    }
    ret_val( 0 );
}


/*---------------------------------------------------------------------
EQUAL - tells us if the pointers 'a' and 'b' point to line buffers
containing equivalent text or not.
---------------------------------------------------------------------*/
equal( a, b )
line_ptr a, b;
{
    trace( "equal" );
    if( (a == NULL) || (b == NULL) )
        ret_val( 0 );
    if( up_case )
        ret_val( !strcmp( a->dup, b->dup ) )
    else
        ret_val( !strcmp( a->text, b->text ) )
}




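/*---------------------------------------------------------------------
POSITION - moves the input pointer for file 'f' such that the next line
to be read will be 'where'.
---------------------------------------------------------------------*/
position( f, where )
int f;
line_ptr where;
{
    trace( "position" );
    at[ f ] = &root[ f ];
    while( at[ f ]->link != where )
        at[ f ] = at[ f ]->link;
    ret;
}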


/*---------------------------------------------------------------------
FIX - fixes the end-of-line sequence on a VAX to be just a newline
instead of a carriage-return/newline.
---------------------------------------------------------------------*/
char *fix( str )
char *str;
{
    char *strsave;

    trace( "fix" );
    strsave = str;
    if( str == NULL )
        ret_val( NULL )
#ifdef VAX11C
    while( *str != '\0' )
    {
        if( match( str, "\r\n" ) )
        {
            *str = '\n';
            *(str + 1) = '\0';
        }
        str++;
    }
#endif
    ret_val( strsave );
}


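/*---------------------------------------------------------------------
INDEX - returns a pointer to the first occurrence of 'c' in the string
pointed to by 'str', or NULL if 'str' does not contain 'c'.
---------------------------------------------------------------------*/
char *index( str, c )
char *str, c;
{
    trace( "index" );
    while( (*str != c) && *(str++) );
    if( *str == c )
        ret_val( str )
    ret_val( NULL );
}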
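
As printed, INDEX increments the pointer in the same expression that tests for the end of the string, so when 'c' is absent the final comparison looks at the byte after the terminator. The following is a minimal sketch of the same lookup in ANSI C that stops on the terminator instead. The name find_char is hypothetical and is not part of DIFF.C; the standard library function strchr provides similar behavior.

```c
#include <stddef.h>

/* find_char is a hypothetical name, not part of DIFF.C.
   Returns a pointer to the first 'c' in 'str', or NULL if 'str'
   does not contain 'c'. The scan never moves past the '\0'. */
char *find_char(char *str, char c)
{
    for (; *str != '\0'; str++)
        if (*str == c)
            return str;
    return NULL;    /* reached the terminator without a match */
}
```
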


/*---------------------------------------------------------------------
NEXT_LINE - allocates, links, and returns the next line from file 'f' if
no lines are buffered, otherwise returns the next buffered line from
file 'f' and updates the link pointer to the next buffered line.
---------------------------------------------------------------------*/
line_ptr next_line( f )
int f;
{
    char *malloc();
    line_ptr temp, place_hold;

    trace( "next_line" );
    if( at[ f ]->link != NULL )
    {
        at[ f ] = at[ f ]->link;
        ret_val( at[ f ] );
    }
    at[ f ]->link = (line_ptr)malloc( sizeof( struct LINE ) );
    if( at[ f ]->link == NULL )
    {
        printf( "\nOut of Memory" );
        exit( 2 );
    }



    place_hold = at[ f ];
    at[ f ] = at[ f ]->link;
    at[ f ]->link = NULL;
    if( fix( fgets( at[ f ]->text, MAXLINE, infile[ f ] ) ) == NULL )
    {
        free( at[ f ] );
        at[ f ] = place_hold;
        at[ f ]->link = NULL;
        ret_val( NULL )
    }
#ifdef EMBEDDED_FORMFEEDS
    if( (index( at[ f ]->text, FORMFEED ) != NULL) ||
        (line_count[ f ] > page_len ) )
#else
    if( (*(at[ f ]->text) == FORMFEED) ||
        (line_count[ f ] > page_len ) )
#endif
    {
        page_count[ f ]++;
        line_count[ f ] = 1;
    }
    at[ f ]->linenum = line_count[ f ]++;
    at[ f ]->pagenum = page_count[ f ];
    if( up_case )
    {
        strcpy( at[ f ]->dup, at[ f ]->text );
        upper( at[ f ]->dup );
    }
    ret_val( at[ f ] );
}


/*---------------------------------------------------------------------
DISCARD - deallocates all buffered lines from the root up to and
including 'to' for file 'f'.
---------------------------------------------------------------------*/
discard( f, to )
int f;
line_ptr to;
{
    line_ptr temp;

    trace( "discard" );
    for(;;)
    {
        if( root[ f ].link == NULL )
            break;
        temp = root[ f ].link;
        root[ f ].link = root[ f ].link->link;
        free( temp );
        if( temp == to )
            break;
    }
    at[ f ] = &root[ f ];
    ret;
}


/*---------------------------------------------------------------------
VFPUTS - for VAX, un-fixes newline at end of line to be
carriage-return/newline.
---------------------------------------------------------------------*/
vfputs( str, file )
char *str;
FILE *file;
{
    int i;

    trace( "vfputs" );
#ifdef VAX11C
    for( i = 0; i < MAXLINE; i++ )
    {
        if( str[ i ] == '\n' )
        {
            strcpy( str + i, "\r\n" );
            break;
        }
    }
    fputs( str, file );
#else
    fputs( str, file );
#endif
    ret;
}


/*---------------------------------------------------------------------
PUT - If change-barred output file is turned on, prints all lines from
the root of file 1 up to and including 'line'. This is called only if a
match exists for each significant line in file 2.
---------------------------------------------------------------------*/
put( line )
line_ptr line;
{
    line_ptr temp;

    trace( "put" );
    if( output )
        for( temp = root[ 1 ].link; ; )
        {
            if( temp == NULL )
                ret
            vfputs( temp->text, outfile );
            if( temp == line )
                ret
            temp = temp->link;
        }
    ret;
}


/*---------------------------------------------------------------------
CHANGE_BAR - inserts a change-bar into the text pointed to by 'str' and
returns a pointer to 'str'.
---------------------------------------------------------------------*/
char *change_bar( str )
char *str;
{
    int i;
    char temp[ MAXLINE + 1 ], *dest, *base;

    trace( "change_bar" );
    base = str;
    dest = temp;
    i = 0;
    if( bar_col != 0 )
    {
        for( i = 0; *str != '\n'; i++ )
        {
            if( (*str == '\r') && (*(str + 1) != '\n') )
                i = 0;
            *(dest++) = *(str++);
        }
        while( i++ < bar_col )
            *(str)++ = ' ';
        strcpy( str, "|\n" );
    }
    else
    {
        if( str[ 0 ] != ' ' )
        {
            strcpy( temp, str );
            strcpy( str + 1, temp );
        }
        str[ 0 ] = '|';
    }
    ret_val( base );
}
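
CHANGE_BAR pads a line with blanks out to the change-bar column and then appends the bar and a newline. The sketch below shows only that padding step in ANSI C, writing into a separate buffer rather than in place. The name add_bar and its parameters are hypothetical. Unlike the listing, it does not reset the column count on a bare carriage return, and it has no special case for a bar column of 0.

```c
#include <string.h>

/* add_bar is a hypothetical helper, not part of DIFF.C.
   Copies 'line' (without its trailing newline) into 'out', pads with
   blanks up to column 'bar_col', then appends "|\n".
   'out' must have room for max(strlen(line), bar_col) + 3 bytes. */
void add_bar(const char *line, int bar_col, char *out)
{
    size_t len = strcspn(line, "\n");               /* text before newline */
    size_t col = bar_col > 0 ? (size_t)bar_col : 0; /* ignore negative columns */
    size_t i;

    memcpy(out, line, len);
    for (i = len; i < col; i++)     /* no padding if the text is already wider */
        out[i] = ' ';
    strcpy(out + i, "|\n");
}
```

A line that already extends past the bar column gets the bar immediately after its last character, which matches the behavior of the padding loop in the listing.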


/*---------------------------------------------------------------------
ADDED - Prints a change summary for all significant lines from the root
of file 1 up to and including 'line'. If output is enabled, adds a
change bar to the text and outputs the line to the output file.
---------------------------------------------------------------------*/
added( line )
line_ptr line;
{
    line_ptr temp;

    trace( "added" );
    for( temp = root[ 1 ].link; ; )
    {
        if( temp == NULL )
            ret
        if( !dont_look( temp ) )
            fprintf( msg, "+%d:%d -> %s", temp->pagenum,
                     temp->linenum, temp->text );
        if( output )
        {
            if( dont_look( temp ) )
                vfputs( temp->text, outfile );
            else
                vfputs( change_bar( temp->text ), outfile );
        }
        if( temp == line )
            ret
        temp = temp->link;
    }
}


/*---------------------------------------------------------------------
DELETED - outputs a change summary for all lines in file 2 from the root
up to and including 'line'.
---------------------------------------------------------------------*/
deleted( line )
line_ptr line;
{
    line_ptr temp;

    trace( "deleted" );
    for( temp = root[ 2 ].link; ; )
    {
        if( temp == NULL )
            ret
        if( !dont_look( temp ) )
            fprintf( msg, "-%d:%d -> %s", temp->pagenum,
                     temp->linenum, temp->text );
        if( temp == line )
            ret
        temp = temp->link;
    }
    ret;
}


/*---------------------------------------------------------------------
RESYNC - resynchronizes file 1 and file 2 after a difference is
detected, and outputs changed lines and change summaries via added() and
deleted(). Exits with the file inputs pointing at the next two lines
that match, unless it is impossible to sync up again, in which case all
lines in file 1 are printed via added(). Deallocates all lines printed
by this function.
---------------------------------------------------------------------*/
resync( first, second )
line_ptr first, second;
{
    line_ptr file1_start, file2_start, last_bad1, last_bad2, t1, t2;
    int i, j, k, moved1, moved2;

    trace( "resync" );
    moved1 = 0;
    file1_start = first;
    position( 1, first );
    for( k = 0; k < lookahead; k++ )
    {
        while( dont_look( file1_start = next_line( 1 ) ) );
        if( file1_start == NULL ) goto no_sy;

        moved2 = 0;
        file2_start = second;
        position( 2, second );
        for( j = 0; j < lookahead; j++ )
        {
            while( dont_look( file2_start = next_line( 2 ) ) );
            if( file2_start == NULL ) goto eof2;

            t1 = file1_start;
            t2 = file2_start;
            for( i = 0; (i < re_sync) && equal( t1, t2 ); i++ )
            {
                while( dont_look( t1 = next_line( 1 ) ) );
                while( dont_look( t2 = next_line( 2 ) ) );
                if( (t1 == NULL) || (t2 == NULL) )
                    break;
            }
            if( i == re_sync ) goto synced;
            last_bad2 = file2_start;


            position( 2, file2_start );
            while( dont_look( file2_start = next_line( 2 ) ) );
            moved2++;
        }
eof2:
        last_bad1 = file1_start;
        position( 1, file1_start );
        while( dont_look( file1_start = next_line( 1 ) ) );
        moved1++;
    }
    printf( "\n*** ERROR - lost sync in file %s at page %d line %d",
            infile_name[ 1 ], first->pagenum, first->linenum );
    fclose( outfile );
    exit( 2 );

no_sy:
    position( 1, first );
    while( (first = next_line( 1 )) != NULL )
    {
        added( first );
        discard( 1, first );
    }
    ret;

synced:
    if( moved1 )
    {
        added( last_bad1 );
        discard( 1, last_bad1 );
    }
    position( 1, file1_start );
    if( moved2 )
    {
        deleted( last_bad2 );
        discard( 2, last_bad2 );
    }
    position( 2, file2_start );
    fprintf( msg, "\n" );
    ret;
}


/*---------------------------------------------------------------------
DIFF - differencing executive. Prints and deallocates all lines up to
where a difference is detected, at which point resync() is called.
Exits on end of file 1.
---------------------------------------------------------------------*/
diff()
{
    line_ptr first, second;

    trace( "diff" );
    for( ; ; )
    {
        while( dont_look( first = next_line( 1 ) ) );
        if( first == NULL )
        {
            put( first );
            ret;
        }
        while( dont_look( second = next_line( 2 ) ) );
        if( equal( first, second ) )
        {
            put( first );
            discard( 1, first );
            discard( 2, second );
        }
        else
            resync( first, second );
        if( second == NULL )
            ret
    }
}


/*---------------------------------------------------------------------
PAGE_SKIP - skips the first 'skip1' pages of file 1, and then the first
'skip2' pages of file 2. This is useful to jump over tables of
contents, etc.
---------------------------------------------------------------------*/
page_skip()
{
    line_ptr first, second;

    trace( "page_skip" );
    for( ; ; )
    {
        first = next_line( 1 );
        if( (first == NULL) || (first->pagenum > skip1) )
            break;
        put( first );
        discard( 1, first );
    }
    if( first != NULL )
        position( 1, first );
    for( ; ; )
    {
        second = next_line( 2 );
        if( (second == NULL) || (second->pagenum > skip2) )
            break;
        discard( 2, second );
    }
    if( second != NULL )
        position( 2, second );
    ret;
}


/*---------------------------------------------------------------------
HELP - outputs usage information.
---------------------------------------------------------------------*/
help()
{
    printf( "\nDIFF" );
    printf( "\nText File Differencer and Change Barrer" );
    printf( "\n" );
    printf( "\nFormat:" );
    printf( "\nDIFF [option{option}] newfile oldfile [barfile]" );
    printf( "\n" );
    printf( "\n    newfile = latest revision of text file" );
    printf( "\n    oldfile = baseline to compare against" );
    printf( "\n    barfile = output file if changebars are desired" );
    printf( "\n" );
    printf( "\nOptions:" );
#ifdef TRACER_FUNCTIONS
    printf( "\n    /TRACE         Makes a mess of the display and runs real slow" );
    printf( "\n                   default = trace off" );
    printf( "\n" );
#endif
    printf( "\n    /BAR_COL=n     Column of output file in which change bar will appear" );
    printf( "\n                   default = 78" );
    printf( "\n" );
    printf( "\n    /TOP_SKIP=n    Lines at top of page to skip for running heads & page nos." );
    printf( "\n                   default = 0" );
    printf( "\n" );
    printf( "\n    /BOT_SKIP=n    Lines at bottom of page to skip for running foots and page nos." );
    printf( "\n                   default = 0" );
    printf( "\n" );
    printf( "\n    /PAGE_LEN=n    Lines per page (embedded formfeeds override)" );
    printf( "\n                   default = 66" );
    printf( "\n" );
    printf( "\n    /UP_CASE       Upper/Lower case is significant/is not significant" );
    printf( "\n    /NOUP_CASE     default" );
    printf( "\n" );
    printf( "\n    /RE_SYNC=n     Lines that must match before files are considered synced" );
    printf( "\n                   after differences are found - default = 5" );
    printf( "\n" );
    printf( "\n    /OUTPUT=file   File to redirect differences summary to." );
    printf( "\n                   default = SYS$OUTPUT or console." );
    printf( "\n" );
    printf( "\n    /BLANKS        Blank lines are considered significant" );



    printf( "\n    /NOBLANKS      default" );
    printf( "\n" );
    printf( "\n    /LOOKAHEAD=n   Lines to look ahead in each file to resync after difference" );
    printf( "\n                   default = 200" );
    printf( "\n" );
    printf( "\n    /SKIP1=n       Pages in NEWFILE to skip before compare. Also sets /SKIP2" );
    printf( "\n                   default = 0" );
    printf( "\n" );
    printf( "\n    /SKIP2=n       Pages in OLDFILE to skip before compare. Must be after /SKIP1" );
    printf( "\n                   default = 0" );
    printf( "\n" );
}


/*---------------------------------------------------------------------
OPEN_FILES - opens the input and output files.
---------------------------------------------------------------------*/
open_files()
{
    int i;

    trace( "open_files" );
    for( i = 1; i < 3; i++ )
        if( (infile[ i ] = fopen( infile_name[ i ], "r" )) == NULL )
        {
            printf( "\nError: Can't open %s", infile_name[ i ] );
            command_errors++;
        }
    if( files > 2 )
        if( (outfile = fopen( outfile_name, "w" )) == NULL )
        {
            printf( "\nError: Can't create %s", outfile_name );
            command_errors++;
        }
    ret;
}


/*---------------------------------------------------------------------
REDIRECT - performs output redirection under VAX 11 VMS.
---------------------------------------------------------------------*/
redirect( str )
char *str;
{
    char filename[ 132 ], *ptr, *dest;

    trace( "redirect" );
    dest = filename;
    if( (ptr = index( str, '=' ) + 1) == (char *)(NULL + 1) )
    {
        printf( "\nERROR in option %s", str );
        command_errors++;
    }
    while( (*ptr != OPT_FLAG) && ((*(dest++) = *(ptr++)) != '\0') );
    *dest = '\0';
    if( (msg = fopen( filename, "w" )) == NULL )
    {
        printf( "\nERROR creating %s", filename );
        command_errors++;
    }
    ret;
}


/*---------------------------------------------------------------------
STRIP_OPT - processes each command line option.
---------------------------------------------------------------------*/
strip_opt( str )
char *str;
{
    trace( "strip_opt" );
    upper( str );
    if( str[ 0 ] == OPT_FLAG )
    {
        if( match( str + 1, "BAR_COL" ) )
            bar_col = num( str );
        else if( match( str + 1, "TOP_SKIP" ) )
            top_skip = num( str );
        else if( match( str + 1, "BOT_SKIP" ) )
            bot_skip = num( str );




        else if( match( str + 1, "PAGE_LEN" ) )
            page_len = num( str );
        else if( match( str + 1, "UP_CASE" ) )
            up_case = 1;
        else if( match( str + 1, "NOUP_CASE" ) )
            up_case = 0;
        else if( match( str + 1, "RE_SYNC" ) )
            re_sync = num( str );
        else if( match( str + 1, "BLANKS" ) )
            blanks = 1;
        else if( match( str + 1, "NOBLANKS" ) )
            blanks = 0;
        else if( match( str + 1, "LOOKAHEAD" ) )
            lookahead = num( str );
        else if( match( str + 1, "SKIP1" ) )
            skip1 = skip2 = num( str );
        else if( match( str + 1, "SKIP2" ) )
            skip2 = num( str );
#ifdef TRACER_FUNCTIONS
        else if( match( str + 1, "TRACE" ) )
            trace_enabled = debug = 1;
#endif
        else if( match( str + 1, "OUTPUT" ) )
            redirect( str );
        else
        {
            printf( "\nUnrecognized Option: %s", str );
            command_errors++;
        }
    }
    else
    {
        switch( files )
        {
            case 0:
                strcpy( infile_name[ 1 ], str );
                break;
            case 1:
                strcpy( infile_name[ 2 ], str );
                break;
            case 2:
                strcpy( outfile_name, str );
                output = 1;
                break;
            default:
                printf( "\nError: Too many files at %s", str );
                command_errors++;
                break;
        }
        files++;
    }
    if( index( str + 1, OPT_FLAG ) != NULL )
        strip_opt( index( str + 1, OPT_FLAG ) );
    ret;
}


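/*---------------------------------------------------------------------
UPPER - converts the string 'str' to upper case.
---------------------------------------------------------------------*/
upper( str )
char *str;
{
    trace( "upper" );
    for( ; ; )
    {
        if( *str == '\0' )
            ret
        *str = toupper( *str );
        str++;
    }
}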


/*---------------------------------------------------------------------
MATCH - looks for a match of 'str' with the first (strlen( str ))
characters of 'pattern'. Returns 0 for no match, nonzero on match.
---------------------------------------------------------------------*/
int match( str, pattern )
char *str, *pattern;
{
    trace( "match" );
    for( ; ; )
    {
        if( *str != *pattern )
            ret_val( 0 )
        str++;
        pattern++;
        if( *pattern == '\0' )
            ret_val( 1 )
        if( *str == '\0' )
            ret_val( 1 )
        if( *str == '=' )
            ret_val( 1 )
    }
}
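
Because MATCH returns success as soon as either string ends or an '=' is reached in 'str', an option may be abbreviated to any leading part of its name. The sketch below restates that rule in ANSI C under the hypothetical name opt_match. It assumes both arguments are non-empty, as they are when called on an option that follows the '/' switch character.

```c
/* opt_match is a hypothetical ANSI C rendering of MATCH, not part of
   DIFF.C. 'str' matches when it agrees with 'pattern' character by
   character until 'pattern' ends, 'str' ends, or 'str' reaches '='.
   Both strings are assumed to be non-empty. */
int opt_match(const char *str, const char *pattern)
{
    for (;;) {
        if (*str != *pattern)
            return 0;           /* first differing character: no match */
        str++;
        pattern++;
        if (*pattern == '\0' || *str == '\0' || *str == '=')
            return 1;           /* everything compared so far agreed */
    }
}
```

One consequence of this rule is that a short abbreviation matches the first option tested in the if/else chain that begins with the same letters, so the order of the tests in STRIP_OPT decides which option an ambiguous abbreviation selects.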


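/*---------------------------------------------------------------------
NUM - returns the integer associated with a command line option. An
equal sign must appear in the option.
---------------------------------------------------------------------*/
int num( str )
char *str;
{
    trace( "num" );
    if( index( str, '=' ) == NULL )
        ret_val( 0 )
    else
        ret_val( atoi( index( str, '=' ) + 1 ) )
}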
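
NUM finds the equal sign in an option and converts the digits that follow it. A minimal sketch of the same step with the standard library, using the hypothetical name opt_value, might look like this. Since atoi stops at the first non-digit, a second option run together with the first (for example "/RE_SYNC=5/UP_CASE") does not disturb the value.

```c
#include <stdlib.h>
#include <string.h>

/* opt_value is a hypothetical name, not part of DIFF.C.
   Returns the integer after the first '=' in 'opt', or 0 when the
   option contains no '='. */
int opt_value(const char *opt)
{
    const char *eq = strchr(opt, '=');

    return eq == NULL ? 0 : atoi(eq + 1);
}
```
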


#ifdef TRACER_FUNCTIONS

char_ptr names[ 20 ];
int stack = 0;

callstack( str )
char *str;
{
    int i;
    char c;

    names[ stack++ ] = str;
    if( debug )
    {
        for( i = 0; i < stack; i++ )
            printf( " " );
        printf( "Entering %s\n", str );
    }
#ifndef VAX11C
    if( trace_enabled && kbhit() )
    {
        switch( getch() )
        {
            case 't':
            case 'T':
                debug = !debug;
                break;
            case 's':
            case 'S':
                printf( "\n--------------------" );
                for( i = stack - 1; i >= 0; i-- )
                    printf( "\n%s", names[ i ] );
                printf( "\n--------------------\n" );
                break;
            default:
                break;
        }
    }
#endif
}

callpop()
{
    int i;

    if( debug )
    {
        for( i = 0; i < stack; i++ )
            printf( " " );
        printf( "Exiting %s\n", names[ stack ] );
    }
    stack--;
}

#endif


End  Listing 




STRUCTURED  PROGRAMMING 


Listing One (Text begins on page 122.)

Listing 1: BASIC source code for the Sieve benchmark

1000 ' Sieve Benchmark Test
1001 ' Version 1.0 5/30/86 Namir C. Shammas
1010 DEFINT A-Z
1020 SIZE = 7000
1030 MAXITER = 10
1040 TRUE = 1: FALSE = 0
1050 DIM FLAGS(SIZE)
1060 PRINT "START ";MAXITER;" ITERATION"
1065 TIME$ = "00:00:00.00"
1070 FOR ITER = 1 TO MAXITER
1080 COUNT = 0
1090 FOR I = 0 TO SIZE
1100 FLAGS(I) = TRUE
1110 NEXT I
1120 FOR I = 0 TO SIZE
1130 IF FLAGS(I) <> TRUE THEN 1210
1140 PRIME = I+I+3
1150 K = I+PRIME
1160 WHILE K <= SIZE
1170 FLAGS(K) = FALSE
1180 K = K + PRIME
1190 WEND
1200 COUNT = COUNT + 1
1210 NEXT I
1220 NEXT ITER
1225 PRINT "Time is ";TIME$
1230 PRINT COUNT;" PRIMES"
1240 END

End Listing One

Listing  Two 

Listing 2: Translated C source code for the Sieve benchmark

char *TIME_(), *balloc();

static int *FLAGS, count, false, i, iter, k, maxiter, prime, size, true;
static int it_1, it_2, it_3;
static char *st_1;
static int ml_1;
main(argc, argv)
int argc;
char *argv[];
{
    bio_init(argc, argv, 1);
    /* Sieve Benchmark Test */
    /* Version 1.0 5/30/86 Namir C. Shammas */
    size = 7000;
    maxiter = 10;
    true = 1;
    false = 0;
    ml_1 = size+1;
    bfree(FLAGS);
    FLAGS = (int*)balloc((long)sizeof(int) * (size+1));
    BPRINT("s;i;s", "\006START ", maxiter, "\012 ITERATION");
    STIME_("\01300:00:00.00");
    it_1 = maxiter;
    for (iter = 1; iter <= it_1; ++iter)
    {
        count = 0;
        it_2 = size;
        for (i = 0; i <= it_2; ++i)
        {
            FLAGS[i] = true;
        }
        it_3 = size;
        for (i = 0; i <= it_3; ++i)
        {
            if (FLAGS[i] != true)
                goto l_1210;
            prime = i + i + 3;
            k = i + prime;
            while (k <= size)
            {
                FLAGS[k] = false;
                k = k + prime;
            }
            count = count + 1;
l_1210:;
        }
    }
    BPRINT("s;s", "\010Time is ", TIME_(&st_1));
    BPRINT("i;s", count, "\007 PRIMES");
    bexit(0);
    bexit(0);
}

End Listing Two
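
For comparison with the translator's output, here is a hand-written sketch of the same sieve in ANSI C with no runtime support library. The function name sieve_count is hypothetical, and the timing and printing of the benchmark are left out. As in the BASIC program, flag i stands for the odd number 2*i + 3, so the count is the number of primes from 3 through 2*size + 3.

```c
#include <stdlib.h>

/* sieve_count is a hypothetical name. It runs one pass of the sieve
   over flags 0..size (size >= 0 assumed) and returns how many flags
   remain set, or -1 if the flag array cannot be allocated. */
int sieve_count(int size)
{
    char *flags = malloc((size_t)size + 1);
    int i, k, prime, count = 0;

    if (flags == NULL)
        return -1;
    for (i = 0; i <= size; i++)
        flags[i] = 1;
    for (i = 0; i <= size; i++) {
        if (!flags[i])
            continue;               /* 2*i + 3 was struck out earlier */
        prime = i + i + 3;
        for (k = i + prime; k <= size; k += prime)
            flags[k] = 0;           /* strike the odd multiples of prime */
        count++;
    }
    free(flags);
    return count;
}
```

With a size of 3 the flags stand for 3, 5, 7 and 9, and the function returns 3 because 9 is struck out as a multiple of 3.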

Listing  Three 

Listing 3: BASIC source code for a root-seeking program

1010 DEFDBL A-H,P-Z : DEFINT I-O : CLS
1040 INPUT "Enter function number [1..3] ";N : PRINT
1050 IF (N < 1) OR (N > 3) THEN 1040
1060 INPUT "Enter guess ";X : PRINT : READ Accuracy, MAX.ITER
1070 DATA 1.0E-07, 50
1075 Iter = 0 : Diverge% = 1 : Diff = 2 * Accuracy
1080 WHILE ABS(Diff) > Accuracy ' Start root seeking method
1100 H = .01 : IF ABS(X) > 1 THEN H = H * X
1110 X2 = X : GOSUB 1200 : F0 = F
1120 X2 = X + H : GOSUB 1200 : F1 = F
1130 X2 = X - H : GOSUB 1200 : F2 = F
1140 Diff = 2 * H * F0 / (F1 - F2) : X = X - Diff : Iter = Iter + 1
1170 IF (Iter > MAX.ITER) THEN Diverge% = 0
1180 WEND
1190 IF (Diverge% = 0) THEN Accuracy = 10 * Accuracy : GOTO 1075
1192 PRINT USING "Root = +#.#######^^^^";X : PRINT
1194 PRINT USING "Number of iterations = ##";Iter : PRINT
1196 PRINT USING "Accuracy = #.###^^^^";Accuracy : PRINT
1198 END
1200 'Subroutine to handle function catalogue
1210 ON N GOSUB 2100,2200,2300 : RETURN
2100 F = EXP(X2) - 3 * X2^2 : RETURN
2200 F = X2^2 - 5 * X2 + 6 : RETURN
2300 F = X2^3 - 5 * X2 + 10 : RETURN

End Listing Three

Listing  Four 

Listing 4: Translated C source code for a root-seeking program

typedef struct data
{
    unsigned d_line;
    char *d_text;
} DATA;

static DATA da_1[] = {1060, "1.0E-08, 50\n"};
double ABS(), EXP(), f_raise();
static int divergel, iter, max_iter, n;
static double accuracy, diff, f, f0, f1, f2, h, x, x2;
DATA *data_stmts[] =
{
    da_1, 0
};

main(argc, argv)
int argc;
char *argv[];
{
    bio_init(argc, argv, 1);
    CLS();
l_1040:;
    INPUT("P ;i", "\035Enter function number [1..3] ", &n);
    BPRINT("");
    if (-((n < 1)) | -((n > 3)))
        goto l_1040;
    INPUT("P ;d", "\016Enter guess : ", &x);
    BPRINT("");
    BREAD(" d,i", &accuracy, &max_iter);


l_1075:;
    iter = 0;
    divergel = 1;
    diff = 2 * accuracy;
    while (ABS(diff) > accuracy)
    {
        /* Start root seeking method */
        h = 0.01;
        if (ABS(x) > 1)
            h = h * x;
        x2 = x;
        pr_1200();
        f0 = f;
        x2 = x + h;
        pr_1200();
        f1 = f;
        x2 = x - h;
        pr_1200();
        f2 = f;
        diff = 2 * h * f0 / (f1 - f2);
        x = x - diff;
        iter = iter + 1;
        if (iter > max_iter)
            divergel = 0;
    }
    if ((divergel == 0))
    {
        accuracy = 10 * accuracy;
        goto l_1075;
    }
    UPRINT("\025Root = +#.#######^^^^", "d", x);
    BPRINT("");
    UPRINT("\032Number of iterations = ###", "i", iter);
    BPRINT("");
    UPRINT("\024Accuracy = #.###^^^^", "d", accuracy);
    BPRINT("");
    bexit(0);
}

pr_1200()
{
    /* Subroutine to handle function catalogue */
    switch (n)
    {
    case 1:
        pr_2100();
        break;
    case 2:
        pr_2200();
        break;
    case 3:
        pr_2300();
        break;
    }
    return;
    /* Subroutine number 1 */
}

pr_2100()
{
    f = EXP(x2) - 3 * f_raise(x2, (double) 2);
    return;
    /* Subroutine # 2 */
}

pr_2200()
{
    f = f_raise(x2, (double) 2) - 5 * x2 + 6;
    return;
    /* Subroutine # 3 */
}

pr_2300()
{
    f = f_raise(x2, (double) 3) - 5 * x2 + 10;
    return;
    bexit(0);
}

End Listing Four
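
The loop in Listings Three and Four is Newton's method with the slope estimated from a central difference: Diff = 2*H*F0/(F1 - F2) is F0 divided by (F1 - F2)/(2*H). The sketch below restates that iteration in ANSI C with the function passed as a pointer instead of being selected through a global. All names (find_root, quad, dabs) are hypothetical. Unlike the listings, which relax the accuracy by a factor of 10 and restart when the iteration limit is exceeded, this version simply reports failure.

```c
/* All names in this sketch are hypothetical. */

/* Second catalogue function from the listings: x^2 - 5x + 6. */
double quad(double x) { return x * x - 5.0 * x + 6.0; }

/* Absolute value without needing the math library. */
double dabs(double v) { return v < 0.0 ? -v : v; }

/* Newton iteration with a central-difference slope.
   Returns 1 and stores the result in *root once a step is no larger
   than 'accuracy'; returns 0 if that does not happen within
   'max_iter' steps or the slope estimate is zero. */
int find_root(double (*fn)(double), double x, double accuracy,
              int max_iter, double *root)
{
    int iter;

    for (iter = 0; iter < max_iter; iter++) {
        double h = 0.01;
        double slope, diff;

        if (dabs(x) > 1.0)
            h *= x;                     /* scale the step as the listings do */
        slope = (fn(x + h) - fn(x - h)) / (2.0 * h);
        if (slope == 0.0)
            return 0;                   /* cannot take a Newton step */
        diff = fn(x) / slope;
        x -= diff;
        if (dabs(diff) <= accuracy) {
            *root = x;
            return 1;
        }
    }
    return 0;
}
```

Starting from a guess of 1 the quadratic converges toward its root at 2, and from a guess of 4 toward its root at 3.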


Listing  Five 


Listing 5: BASIC source code for Find/Replace utility.

1000 ' Batch Find/Replace Utility Version 1.1 2/7/86
1010 ' IBM PC BASICA version 2.0
1020 ' Copyright (c) 1987 Namir Clement Shammas
1030 '------------------------------------------------------------
1040 OPTION BASE 1
1050 DEFINT A-Z
1060 DIM FILENAME$(20),FIND.STR$(30)
1070 DIM REPLACE.STR$(30),REPLACE.FLAG(30),TEXT.LINE$(500)
1080 '
1090 TRUE = 1 : FALSE = 0 'Set true/false
1100 MAX.LINES = 500 ' Current maximum number of lines read from a file
1110 MAX.STRINGS = 30 ' Number of find/replace strings
1120 MAX.FILES = 20 ' Maximum number of files
1140 CLS
1150 '
1160 T$ = "BATCH FILE FIND/REPLACE PROGRAM" : GOSUB 2290
1170 T$ = "VERSION 1.0" : GOSUB 2290 : PRINT : PRINT
1180 GOSUB 1560 'GET.FILENAMES : Get filenames
1190 GOSUB 1820 'GET.STRINGS : Get search/replace strings
1200 FOR K = 1 TO NUM.FILES
1210 GOSUB 2060 ' READ.LINES: Read text lines from a file
1220 CHANGED = FALSE
1230 FOR I = 1 TO NUM.STRINGS
1240 FOUND = FALSE
1250 FOR J = 1 TO NUM.LINES
1260 PTR = INSTR(TEXT.LINE$(J),FIND.STR$(I))
1270 WHILE PTR > 0
1280 IF (FOUND = TRUE) THEN 1330
1290 FOUND = TRUE
1300 LPRINT
1310 LPRINT "KEYWORD : ";FIND.STR$(I)
1330 LPRINT J;":";TEXT.LINE$(J)
1340 IF (REPLACE.FLAG(I) = FALSE) THEN 1440
1350 CHANGED = TRUE
1360 FIRST$ = ""
1370 IF PTR > 1 THEN FIRST$ = MID$(TEXT.LINE$(J),1,(PTR-1))
1380 LAST$ = ""
1390 IF (PTR+LEN(FIND.STR$(I))) <= LEN(TEXT.LINE$(J)) THEN 1420
1400 LAST$ = MID$(TEXT.LINE$(J),(PTR+LEN(FIND.STR$(I))))
1420 TEXT.LINE$(J) = FIRST$ + REPLACE.STR$(I) + LAST$
1440 PTR = INSTR(PTR+1,TEXT.LINE$(J),FIND.STR$(I))
1450 WEND
1460 NEXT J
1470 NEXT I
1480 IF (CHANGED = TRUE) THEN GOSUB 2190 ' WRITE.LINES
1490 NEXT K
1500 '
1510 LPRINT CHR$(140) ' form feed
1520 '
1530 END
1540 '
1550 '------------------------------------------------------------
1560 ' GET.FILENAMES: Subroutine to input filenames from the keyboard
1570 NUM.FILES = 0
1580 WHILE (NUM.FILES <= 0) OR (NUM.FILES > MAX.FILES)
1590 INPUT "Enter number of files ";NUM.FILES
1600 PRINT
1610 WEND
1620 FOR I = 1 TO NUM.FILES
1630 'REPEAT.LOOP1:
1640 PRINT "Enter filename # ";I;" ";
1650 INPUT FILENAME$(I) : PRINT
1660 ON ERROR GOTO 1750
1670 OPEN "I",1,FILENAME$(I)
1680 CLOSE #1
1690 ON ERROR GOTO 0
1700 IF FILENAME$(I) = "" THEN 1630
1710 NEXT I
1720 RETURN
1740 '------------------------------------------------------------
1750 'HANDLE: Error handler for bad filenames
1760 PRINT "File ";FILENAME$(I);" was not found"
1770 PRINT
1780 FILENAME$(I) = ""
1790 RESUME NEXT
1800 '
1810 '------------------------------------------------------------
1820 ' GET.STRINGS: Subroutines to input search/replace strings
1830 NUM.STRINGS = 0
1840 WHILE (NUM.STRINGS <= 0) OR (NUM.STRINGS > MAX.STRINGS)
1850 INPUT "Enter number of search/replace strings ";NUM.STRINGS
1860 PRINT
1870 WEND
1880 FOR I = 1 TO NUM.STRINGS
1890 REPLACE.STR$(I) = ""
1900 PRINT : PRINT "For string # ";I




1910 INPUT " Enter string ";FIND.STR$(I)
1920 INPUT " R)eplace F)ind ";A$ : PRINT
1930 IF (INSTR("Rr",MID$(A$,1,1)) = 0) THEN REPLACE.FLAG(I) = FALSE ELSE REPLACE.FLAG(I) = TRUE
1980 IF REPLACE.FLAG(I) = FALSE THEN 2020
1990 INPUT "Enter replacement string ";REPLACE.STR$(I)
2000 PRINT
2020 NEXT I
2030 RETURN
2040 '
2050 '------------------------------------------------------------
2060 ' READ.LINES: Subroutines to read text lines
2070 LPRINT
2080 LPRINT "PROCESSING FILE : ";FILENAME$(K)
2090 OPEN "I",1,FILENAME$(K)
2100 NUM.LINES = 0
2110 WHILE (NOT EOF(1)) AND (NUM.LINES <= MAX.LINES)
2120 NUM.LINES = NUM.LINES + 1
2130 LINE INPUT #1,TEXT.LINE$(NUM.LINES)
2140 WEND
2150 CLOSE #1
2160 RETURN
2180 '------------------------------------------------------------
2190 ' WRITE.LINES: Subroutines to write text lines
2200 OPEN "O",1,FILENAME$(K)
2210 FOR I = 1 TO NUM.LINES
2220 PRINT #1,TEXT.LINE$(I)
2230 NEXT I
2240 CLOSE #1
2250 RETURN
2270 '------------------------------------------------------------
2280 ' Subroutine to center a message
2290 PRINT SPC(40 - LEN(T$)\2);T$
2300 RETURN

End Listing Five
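
The core of Listing Five is the loop at lines 1260 to 1450, which splits a line around each occurrence of the search string and rebuilds it with the replacement in the middle. The sketch below shows one way to write that substitution step by hand in C. The name replace_all and its buffer-size convention are assumptions, not taken from the article. It also differs from the BASIC in one respect: it resumes the search after the inserted replacement, whereas the BASIC resumes at PTR+1 in the modified line and can therefore rescan replacement text.

```c
#include <string.h>

/* replace_all is a hypothetical helper, not taken from the article.
   Copies 'line' into 'out', substituting 'repl' for every occurrence
   of 'find'. Returns the number of substitutions, or -1 if 'find' is
   empty or the result does not fit in 'outsize' bytes. */
int replace_all(const char *line, const char *find, const char *repl,
                char *out, size_t outsize)
{
    size_t flen = strlen(find), rlen = strlen(repl), used = 0;
    const char *hit;
    int count = 0;

    if (flen == 0)
        return -1;
    while ((hit = strstr(line, find)) != NULL) {
        size_t head = (size_t)(hit - line);     /* text before the match */

        if (used + head + rlen >= outsize)
            return -1;
        memcpy(out + used, line, head);
        memcpy(out + used + head, repl, rlen);
        used += head + rlen;
        line = hit + flen;                      /* continue after the match */
        count++;
    }
    if (used + strlen(line) >= outsize)
        return -1;
    strcpy(out + used, line);                   /* remainder of the line */
    return count;
}
```
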

Listing  Six 

Listing 6: Translated C source code for Find/Replace utility.

extern int on_error, err_code, err_stmt, trap_line, trap_err;
char *CHR_(), *MID_(), *s_asgn(), *a_cat();
int EOF(), INSTR(), LEN();

static int REPLACE_FLAG[31], changed, false, found, i, j, k, max_files;
static int max_lines, max_strings, num_files, num_lines, num_strings, ptr;
static int true;
static char *FILENAME_[21], *FIND_STR_[31], *REPLACE_STR_[31];
static char *TEXT_LINE_[501], *a_, *first_, *last_, *t_;
static int it_1, it_2, it_3, it_4, it_5, it_6;
static char *st_1, *st_2;

main(argc, argv)
int argc;
char *argv[];
{
    bio_init(argc, argv, 1);
    /* Batch Find/Replace Utility Version 1.1 2/7/86 */
    /* IBM PC BASICA version 2.0 */
    /* Copyright (c) 1987 Namir Clement Shammas */
    /*------------------------------------------------------------*/
    free_sp(FILENAME_, 21, 'S');
    free_sp(FIND_STR_, 31, 'S');
    free_sp(REPLACE_STR_, 31, 'S');
    free_sp(TEXT_LINE_, 501, 'S');
    true = 1;
    false = 0; /* Set true/false */
    max_lines = 500; /* Current maximum number of lines read from a file */
    max_strings = 30; /* Number of find/replace strings */
    max_files = 20; /* Maximum number of files */
    CLS();
    t_ = s_asgn(t_, "\037BATCH FILE FIND/REPLACE PROGRAM");
    sub_push(1);
    goto l_2290;
g_1:;
    t_ = s_asgn(t_, "\013VERSION 1.0");
    sub_push(2);
    goto l_2290;
g_2:;
E_0:;
    BPRINT("");
    if (err_code) {err_stmt=0; goto err_trap;}
E_1:;
    BPRINT("");
    if (err_code) {err_stmt=1; goto err_trap;}
E_2:;
    sub_push(3); /* GET.FILENAMES : Get filenames */
    goto l_1560;
g_3:;
    sub_push(4); /* GET.STRINGS : Get search/replace strings */
    goto l_1820;
g_4:;
    it_1 = num_files;


    for (k = 1; k <= it_1; ++k)
    {
        sub_push(5);    /* READ.LINES: Read text lines from a file */
        goto l_2060;
g_5:;
        changed = false;
        it_2 = num_strings;
        for (i = 1; i <= it_2; ++i)
        {
            found = false;
            it_3 = num_lines;
            for (j = 1; j <= it_3; ++j)
            {
E_3:;
                ptr = INSTR(-1, TEXT_LINE_[j], FIND_STR_[i]);
                if (err_code) {err_stmt=3; goto err_trap;}
E_4:;
                while (ptr > 0)
                {
                    if ((found == true))
                        goto l_1330;
                    found = true;
E_5:;
                    BLPRINT("");
                    if (err_code) {err_stmt=5; goto err_trap;}
E_6:;
                    BLPRINT("s;s", "\012KEYWORD : ", FIND_STR_[i]);
                    if (err_code) {err_stmt=6; goto err_trap;}
E_7:;
l_1330:;
                    BLPRINT("i;s;s", j, "\001:", TEXT_LINE_[j]);
                    if (err_code) {err_stmt=7; goto err_trap;}
E_8:;
                    if ((AREPLACE_FLAG[i] == false))
                        goto l_1440;
                    changed = true;
                    first_ = s_asgn(first_, "\000");
                    if (ptr > 1)
                    {
E_9:;
                        first_ = s_asgn(first_, MID_(&st_1, TEXT_LINE_[j],
                                        1, (ptr - 1)));
                        if (err_code) {err_stmt=9; goto err_trap;}
E_10:;
                    }
                    last_ = s_asgn(last_, "\000");
                    if ((ptr + LEN(FIND_STR_[i])) <= LEN(TEXT_LINE_[j]))
                        goto l_1420;
E_11:;
                    last_ = s_asgn(last_, MID_(&st_1, TEXT_LINE_[j],
                                   (ptr + LEN(FIND_STR_[i])), -1));
                    if (err_code) {err_stmt=11; goto err_trap;}
E_12:;
l_1420:;
                    TEXT_LINE_[j] = s_asgn(TEXT_LINE_[j],
                                    s_cat(&st_2, s_cat(&st_1, first_,
                                    REPLACE_STR_[i]), last_));
                    if (err_code) {err_stmt=12; goto err_trap;}
E_13:;
l_1440:;
                    ptr = INSTR(ptr + 1, TEXT_LINE_[j], FIND_STR_[i]);
                    if (err_code) {err_stmt=13; goto err_trap;}
E_14:;
                }
            }
        }
        if ((changed == true))
        {
            sub_push(6);    /* WRITE.LINES */
            goto l_2190;
g_6:;
        }
    }
E_15:;
    BLPRINT("s", CHR_(&st_1, 12));  /* form feed */
    if (err_code) {err_stmt=15; goto err_trap;}
E_16:;
    bexit(0);

/*------------------------------------------------------*/

l_1560:;
    /* GET.FILENAMES: Subroutine to input filenames from the keyboard */
    num_files = 0;
    while (-((num_files <= 0)) | -((num_files > max_files)))
    {
E_17:;
        INPUT("P ;i", "\026Enter number of files ", &num_files);
        if (err_code) {err_stmt=17; goto err_trap;}




E_18:;
        BPRINT("");
        if (err_code) {err_stmt=18; goto err_trap;}
E_19:;
    }
    it_4 = num_files;
    for (i = 1; i <= it_4; ++i)
    {
l_1630:;
        /* REPEAT.LOOP1: */
E_20:;
        BPRINT("s;i;s;", "\021Enter filename # ", i, "\001 ");
        if (err_code) {err_stmt=20; goto err_trap;}
E_21:;
        INPUT(" s", &FILENAME_[i]);
        if (err_code) {err_stmt=21; goto err_trap;}
E_22:;
        BPRINT("");
        if (err_code) {err_stmt=22; goto err_trap;}
E_23:;
        on_error = 1;
        err_code = 0;
        trap_line = 1;
E_24:;
        BOPEN("\001I", 1, FILENAME_[i], -1);
        if (err_code) {err_stmt=24; goto err_trap;}
E_25:;
        BCLOSE(1, 0);
        if (trap_err)
        {
            xer_msg(trap_err);
            bexit(1);
        }
        on_error = 0;
        err_code = 0;

        if (s_comp(FILENAME_[i], "\000") == 0)
            goto l_1630;
    }
    goto sub_ret;

/*------------------------------------------------------*/

l_1750:;
    /* HANDLE: Error handler for bad filenames */
E_26:;
    BPRINT("s;s;s", "\005File ", FILENAME_[i], "\016 was not found");
    if (err_code) {err_stmt=26; goto err_trap;}
E_27:;
    BPRINT("");
    if (err_code) {err_stmt=27; goto err_trap;}
E_28:;
    FILENAME_[i] = s_asgn(FILENAME_[i], "\000");
    ++err_stmt;
    goto un_trap;

/*------------------------------------------------------*/

l_1820:;
    /* GET.STRINGS: Subroutines to input search/replace strings */
    num_strings = 0;

    while (-((num_strings <= 0)) | -((num_strings > max_strings)))
    {
E_29:;
        INPUT("P ;i", "\047Enter number of search/replace strings ",
              &num_strings);
        if (err_code) {err_stmt=29; goto err_trap;}
E_30:;
        BPRINT("");
        if (err_code) {err_stmt=30; goto err_trap;}
E_31:;
    }
    it_5 = num_strings;
    for (i = 1; i <= it_5; ++i)
    {
        REPLACE_STR_[i] = s_asgn(REPLACE_STR_[i], "\000");
E_32:;
        BPRINT("");
        if (err_code) {err_stmt=32; goto err_trap;}
E_33:;
        BPRINT("s;i", "\015For string # ", i);
        if (err_code) {err_stmt=33; goto err_trap;}
E_34:;
        INPUT("P ;s", "\021    Enter string ", &FIND_STR_[i]);
        if (err_code) {err_stmt=34; goto err_trap;}
E_35:;
        INPUT("P ;s", "\023    R)eplace F)ind ", &a_);
        if (err_code) {err_stmt=35; goto err_trap;}



E_36:;
        BPRINT("");
        if (err_code) {err_stmt=36; goto err_trap;}
E_37:;
        if ((INSTR(-1, "\002Rr", MID_(&st_1, a_, 1, 1)) == 0))
        {
            AREPLACE_FLAG[i] = false;
        }
        else
        {
            AREPLACE_FLAG[i] = true;
        }
        if (AREPLACE_FLAG[i] == false)
            goto l_2020;
E_38:;
        INPUT("P ;s", "\031Enter replacement string ", &REPLACE_STR_[i]);
        if (err_code) {err_stmt=38; goto err_trap;}
E_39:;
        BPRINT("");
        if (err_code) {err_stmt=39; goto err_trap;}
E_40:;
l_2020:;
    }
    goto sub_ret;

l_2060:;
    /* READ.LINES: Subroutines to read text lines */
E_41:;
    BLPRINT("");
    if (err_code) {err_stmt=41; goto err_trap;}
E_42:;
    BLPRINT("s;s", "\022PROCESSING FILE : ", FILENAME_[k]);
    if (err_code) {err_stmt=42; goto err_trap;}
E_43:;
    BOPEN("\001I", 1, FILENAME_[k], -1);
    if (err_code) {err_stmt=43; goto err_trap;}
E_44:;
    num_lines = 0;
    while ((~(EOF(1))) & -((num_lines <= max_lines)))
    {
        num_lines = num_lines + 1;
E_45:;
        INPUT("FL1", 1, &TEXT_LINE_[num_lines]);
        if (err_code) {err_stmt=45; goto err_trap;}
E_46:;
    }
    BCLOSE(1, 0);
    goto sub_ret;

/*------------------------------------------------------*/

l_2190:;
    /* WRITE.LINES: Subroutines to write text lines */
E_47:;
    BOPEN("\001O", 1, FILENAME_[k], -1);
    if (err_code) {err_stmt=47; goto err_trap;}
E_48:;
    it_6 = num_lines;
    for (i = 1; i <= it_6; ++i)
    {
E_49:;
        BFPRINT(1, "s", TEXT_LINE_[i]);
        if (err_code) {err_stmt=49; goto err_trap;}
E_50:;
    }
    BCLOSE(1, 0);
    goto sub_ret;

/*------------------------------------------------------*/

    /* Subroutine to center a message */
E_51:;
l_2290:;
    BPRINT("b;s", 40 - LEN(t_) / 2, t_);
    if (err_code) {err_stmt=51; goto err_trap;}
E_52:;
    goto sub_ret;

    bexit(0);

sub_ret:
    switch (sub_pop())
    {
    case 1: goto g_1;
    case 2: goto g_2;
    case 3: goto g_3;
    case 4: goto g_4;
    case 5: goto g_5;
    case 6: goto g_6;
    }


err_trap:
    if (trap_err)
    {
        xer_msg(err_code);
        bexit(1);
    }
    trap_err = err_code;
    err_code = 0;
    goto l_1750;

un_trap:
    if (!trap_err)
    {
        xer_msg(-99);
        bexit(1);
    }
    trap_err = err_code = 0;
    switch (err_stmt)
    {
    case  0: goto E_0;     case  1: goto E_1;     case  2: goto E_2;
    case  3: goto E_3;     case  4: goto E_4;     case  5: goto E_5;
    case  6: goto E_6;     case  7: goto E_7;     case  8: goto E_8;
    case  9: goto E_9;     case 10: goto E_10;    case 11: goto E_11;
    case 12: goto E_12;    case 13: goto E_13;    case 14: goto E_14;
    case 15: goto E_15;    case 16: goto E_16;    case 17: goto E_17;
    case 18: goto E_18;    case 19: goto E_19;    case 20: goto E_20;
    case 21: goto E_21;    case 22: goto E_22;    case 23: goto E_23;
    case 24: goto E_24;    case 25: goto E_25;    case 26: goto E_26;
    case 27: goto E_27;    case 28: goto E_28;    case 29: goto E_29;
    case 30: goto E_30;    case 31: goto E_31;    case 32: goto E_32;
    case 33: goto E_33;    case 34: goto E_34;    case 35: goto E_35;
    case 36: goto E_36;    case 37: goto E_37;    case 38: goto E_38;
    case 39: goto E_39;    case 40: goto E_40;    case 41: goto E_41;
    case 42: goto E_42;    case 43: goto E_43;    case 44: goto E_44;
    case 45: goto E_45;    case 46: goto E_46;    case 47: goto E_47;
    case 48: goto E_48;    case 49: goto E_49;    case 50: goto E_50;
    case 51: goto E_51;    case 52: goto E_52;
    }
}

End Listings



Dr.  Dobb's  Journal,  August  1987 


COLUMNS 


C CHEST

Subroutines with a Variable Number of Arguments

by Allen Holub

I had intended this month to carry on with the adaptive Huffman tree stuff I started two months ago. After spending about 60 hours working on the code, I've finally given up. The paper I was working from didn't really provide enough information to implement the algorithm fully, and I got tired of having to decipher the thing. Maybe I'll pick up the project again at some future date, but for now I quit.

So, this month's C Chest is going to deal with a different topic entirely: subroutines with a variable number of arguments. I'm not going to present a specific program; rather, I'll discuss various techniques that you can use to write such subroutines and the sorts of problems you're likely to encounter. I'll discuss the ANSI and Unix methods for variable-argument passing at the end of the column. First, however, I'll look at what's actually going on.

Variable Arguments

At run time, all C subroutines use an area of the stack called a "stack frame" to hold arguments, local variables, and so forth. Here we're interested in the portion of the stack frame used for argument passing. Subroutine arguments are always passed on the stack, and the arguments are always pushed in reverse order (the rightmost argument is pushed first). So, for example, in call(of, the, wild) the variables wild, the, and of are pushed on the stack in that order. The number of bytes that are actually pushed depends on both the declared type of the variables and the automatic-type-conversion rules: both signed and unsigned variables of type char or short int are converted to int before being pushed. Similarly, variables of type float are converted to double. (This last automatic conversion is often compiler dependent, however, and can be suppressed with a function prototype.) For the sake of the following examples, let's assume an 8-bit char, a 16-bit int, 32-bit longs and floats, and a 64-bit double.
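The conversion rules can be seen from inside a variable-argument function. What follows is only a minimal sketch in standard C, using the <stdarg.h> macros rather than raw stack access; the name sum_promoted and its count parameter are made up for illustration. The caller passes a char, a short, and a float, but the function has to fetch them as int, int, and double.

```c
#include <stdarg.h>

/* Sum three variable arguments that the caller passes as char, short,
 * and float. They arrive as int, int, and double, so those are the
 * types that must be named when fetching them.
 * (sum_promoted and count are hypothetical names for this sketch.) */
double sum_promoted(int count, ...)
{
    va_list ap;
    double total = 0.0;

    (void)count;                    /* only anchors the argument list */
    va_start(ap, count);
    total += va_arg(ap, int);       /* passed as char,  arrives as int    */
    total += va_arg(ap, int);       /* passed as short, arrives as int    */
    total += va_arg(ap, double);    /* passed as float, arrives as double */
    va_end(ap);
    return total;
}
```

A call such as sum_promoted(3, (char)1, (short)2, 3.5f) exercises all three conversions. Asking for char, short, or float in the fetch would not match what was actually pushed.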


Let's start with the following simple example; its stack is illustrated in Figure 1, below.

    int  of, the, wild;
    void call( int, ... );

    of   = 10;
    the  = 20;
    wild = 30;
    call( of, the, wild, 0 );

    stack:               address:
    return address         100
    of:      10            102
    the:     20            104
    wild:    30            106
              0            108
                           110

Figure 1: Stack for the subroutine call call( of, the, wild, 0 );

The function prototype says that call( ) requires at least one int-size argument, followed by any number of additional arguments of indeterminate type.

The  important  thing  to  notice  here 
is  that,  because  all  four  arguments 
are  of  the  same  type  and  because 
they  are  at  four  contiguous  memory 
locations,  you  can  treat  them  as  an 
array. 

The call( ) subroutine, shown in Example 1, page 102, prints any number of arguments, terminating when it sees a 0 argument. Only the first argument is declared because that's the only one you know will be present. The argp variable is a pointer to the additional arguments, which are treated as if they were an array of ints. Argp is initialized on line 6 to point at the first argument in the list; that is, &first is a pointer to the first variable. It is of type pointer-to-int.

In this example, &first is the number 102 (which is its address). From here on, you can treat the additional arguments as array elements, using the 0 to detect the last one. This same mechanism is used by several I/O library routines, such as execl( ), which takes a variable number of arguments all of the same type.
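For comparison, here is a hedged sketch of the same zero-terminated idea written with the standard <stdarg.h> macros instead of a pointer into the stack. It is not the column's code; the name print_until_zero and the returned count are assumptions added so the behavior can be checked.

```c
#include <stdarg.h>
#include <stdio.h>

/* Print each int argument, one per line, until a 0 sentinel is seen.
 * Returns the number of values printed (the sentinel is not counted).
 * (print_until_zero is a hypothetical name for this sketch.) */
int print_until_zero(int first, ...)
{
    va_list ap;
    int value = first;
    int count = 0;

    va_start(ap, first);
    while (value != 0) {
        printf("%d\n", value);
        ++count;
        value = va_arg(ap, int);    /* fetch the next int argument */
    }
    va_end(ap);
    return count;
}
```

A call such as print_until_zero(10, 20, 30, 0) prints three lines. As with the stack-walking version, forgetting the terminating 0 sends the loop off the end of the argument list.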

    1:  call( first )
    2:  int     first;
    3:  {
    4:
    5:      int  *argp;
    6:      argp = &first;
    7:      while( *argp != 0 )
    8:          printf("%d\n", *argp++ );
    9:  }

Example 1: The call( ) subroutine

Of course, it's often necessary to pass arguments of different types. Consider the code in Example 2, page 103. The stacks resulting from the calls to nannuck( ), on lines 35 and 36, are shown in Figures 2 and 3, below, where:

    int     intvar;
    float   floatvar;
    double  doublevar;

                     stack:            address:
                     return address      100
    struct_type:     1                   102
    struct_itself:   10                  104
                     20.0                106
                                         108
                                         110
                                         112
                                         114

Figure 2: Stack resulting from the call to nannuck(1, 10, 20.0 );

                     stack:            address:
                     return address      100
    struct_type:     2                   102
    struct_itself:   30.0                104
                                         106
                                         108
                                         110
                     40.0                112
                                         114
                                         116
                                         118
                                         120

Figure 3: Stack resulting from the call to nannuck(2, 30.0, 40.0 );

Here you're pushing the arguments on the stack in the normal way, but you're treating the resulting stack as a structure rather than as an array. The first argument is used to select which of the two possible structures is being passed. Note that you can't use a char or short in either structure because the corresponding argument will be converted to int as part of the subroutine call. By the same token, you can't use a float in the structure because all floats are converted to doubles as part of the call. Also note that this method won't necessarily be portable because you aren't guaranteed that the fields in a structure are contiguous. Nevertheless, it works with most compilers. Finally, note that you have to use a cast in the assignments on lines 21 and 27 because the second argument is not declared as a structure.

The main problem with the previous example is that the number and types of the arguments have to be determined in advance, at compile time. In order to write a subroutine such as printf( ), which doesn't know the number or types of its arguments until run time, you have to come up with a more sophisticated strategy. A printf( )-like subroutine that does just this is shown in Example 3, page 103. Figure 4, page 103, shows the stack resulting from the following call to fang( ):

    fang("%c, %d, %s, %f\n", '1', 2, "3", 4.5 );

The heart of this subroutine is obviously the va_arg macro defined on line 2. A va_arg( argp, int ) call expands to:

    ((int *)(argp += sizeof(int)))[-1]

Note that argp is a character pointer, so pointer arithmetic is just arithmetic. That is, because the size of a character is 1, incrementing a character pointer actually adds the number 1 to the former contents. Argp starts out initialized to 102 (by the assignment on line 7). Because sizeof(int) is 2, the += sets it to 104. Now you cast the resulting number into a pointer to int and index backward from it; that is, the expression:

    ((int *)(argp += sizeof(int)))[-1]

can be treated like this:

    int *rvalue;

    argp  += sizeof( int );
    rvalue = (int *) argp;
    rvalue[-1];



The code advances the pointer past the argument on the stack and then backs up (with the -1) to fetch the value. The cast to int * forces the compiler to fetch an int-size argument from the stack. The += advances argp to point past this int-size argument. So, using this macro you can fetch any sort of argument from the stack, provided that you can tell the subroutine what the correct type is, information available to both fang( ) and printf( ) in the format string.

You could also break up the macro into two statements:

    char *argp;
    int  x;

    x = *((int *) argp);
    argp += sizeof( int );

Here you cast argp into a pointer to the correct type and then fetch the object pointed to by argp, finally advancing argp past the object.

Figure 4: The stack resulting from a fang( ) call

    01  struct the_first
    02  {
    03      int     one;
    04      double  two;
    05  };
    06
    07  struct the_second
    08  {
    09      double  one;
    10      double  two;
    11  };
    12
    13  nannuck( struct_type, struct_itself )
    14  int struct_type;
    15  {
    16      struct the_first   *firstp;
    17      struct the_second  *secondp;
    18
    19      if( struct_type == 1 )
    20      {
    21          firstp = (struct the_first *) &struct_itself;
    22          printf("%d, %f\n", firstp->one,
    23                             firstp->two );
    24      }
    25      else
    26      {
    27          secondp = (struct the_second *) &struct_itself;
    28          printf("%f, %f\n", secondp->one,
    29                             secondp->two );
    30      }
    31  }
    32
    33  main()
    34  {
    35      nannuck(1, 10, 20.0 );
    36      nannuck(2, 30.0, 40.0 );
    37  }

Example 2: Using structures to access subroutine arguments

    01: #include <stdio.h>
    02: #define va_arg(argp,type) ((type *)(argp += sizeof(type)))[-1]
    03:
    04: fang( format, args )
    05: char    *format;
    06: {
    07:     char    *argp = (char *) &args;
    08:
    09:     for( ; *format ; format++ )
    10:     {
    11:         if( *format != '%' )
    12:             putchar( *format );
    13:         else
    14:         {
    15:             switch( *++format )
    16:             {
    17:             case 'c': printf("%c", va_arg(argp,int   ) ); break;
    18:             case 'd': printf("%d", va_arg(argp,int   ) ); break;
    19:             case 's': printf("%s", va_arg(argp,char *) ); break;
    20:             case 'f': printf("%f", va_arg(argp,double) ); break;
    21:             }
    22:         }
    23:     }
    24: }
    25:
    26: main()
    27: {
    28:     fang("%c, %d, %s, %f\n", '1', 2, "3", 4.5 );
    29: }

Example 3: A printf( )-like subroutine
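The fetch-and-advance idea can be tried without touching the real stack, which no portable program can inspect. The sketch below is an illustration under stated assumptions, not the column's macro: it packs an int and a double back to back into an ordinary byte array standing in for the argument area, then reads them with a hypothetical helper called fetch( ). It copies with memcpy( ) rather than casting the pointer, because an object in the middle of a byte array may not be suitably aligned for a direct access.

```c
#include <string.h>

/* Hypothetical helper: copy `size` bytes found at *cursor into `out`,
 * then advance the cursor past them (fetch first, then advance, as in
 * the two-statement form of the macro). */
static void fetch(const unsigned char **cursor, void *out, size_t size)
{
    memcpy(out, *cursor, size);
    *cursor += size;
}

/* Pack an int and a double into a byte array that stands in for the
 * argument area, read them back in order, and return their sum. */
double fetch_demo(int a, double b)
{
    unsigned char frame[sizeof(int) + sizeof(double)];
    const unsigned char *argp = frame;
    int got_a;
    double got_b;

    memcpy(frame, &a, sizeof a);
    memcpy(frame + sizeof a, &b, sizeof b);

    fetch(&argp, &got_a, sizeof got_a);   /* argp now points at the double */
    fetch(&argp, &got_b, sizeof got_b);   /* argp now points past the end  */
    return got_a + got_b;
}
```

As in fang( ), the reader has to know each type in advance; the byte array itself carries no type information.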

ANSI (and Unix V) have formalized the procedures I've just discussed into a set of macros. Examples 4 and 5, below, show fang( ) rewritten in both ANSI and Unix forms. The ANSI form is, to my mind, more readable than is the Unix form. For one thing, I don't much like the Unix va_dcl macro, an invocation of which cannot be followed by a semicolon. The two sets of macros are more similar than not, however. In fact, va_arg( ) is identical in both systems, and it is identical to the one I defined earlier. In the ANSI system, va_list is usually defined as char *, and the invocation:

    va_start( argp, format )

usually expands to:

    argp = (char *) &format + sizeof(format);

That is, it initializes argp to point at the format argument and then advances argp past this argument to the next one on the stack. The Unix macros function in a similar manner.
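For readers working with a compiler that follows the finished C standard, here is a hedged sketch of how fang( ) might look today. It differs from the column's listings in three ways that are my own additions: the function is declared with a prototype ending in "...", va_end( ) is called before returning, and the function returns the number of characters written so that it can be checked. Conversions other than %c, %d, %s, and %f are silently skipped.

```c
#include <stdarg.h>
#include <stdio.h>

/* A printf-like routine handling %c, %d, %s and %f only.
 * Returns the number of characters written (an addition for this
 * sketch; the column's fang() returns nothing useful). */
int fang(const char *format, ...)
{
    va_list argp;
    int written = 0;

    va_start(argp, format);
    for (; *format; format++) {
        if (*format != '%') {
            putchar(*format);
            written++;
            continue;
        }
        switch (*++format) {
        case 'c': written += printf("%c", va_arg(argp, int));    break;
        case 'd': written += printf("%d", va_arg(argp, int));    break;
        case 's': written += printf("%s", va_arg(argp, char *)); break;
        case 'f': written += printf("%f", va_arg(argp, double)); break;
        default:  break;            /* unknown conversion: skip it */
        }
        if (*format == '\0')
            break;                  /* string ended right after a '%' */
    }
    va_end(argp);
    return written;
}
```

The call fang("%c, %d, %s, %f\n", '1', 2, "3", 4.5) prints "1, 2, 3, 4.500000" followed by a newline. The check after the switch guards against a format string that ends in a lone '%', a case the loop would otherwise step past.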

Nifty Stuff

Curses

Last month's C Chest looked at a curses-subset package for the IBM PC. My implementation, however, lacked several useful features, such as overlapping windows and the ability to move or delete windows. It also couldn't handle various screen attributes and so forth.

I've just come across a very nice implementation of the full curses package done by Aspen Scientific. The package implements all of the Unix functions plus a few more that only work in the IBM environment (a hundred-odd functions in all). These extra functions give you more control over clearing the screen and let you work with screen attributes, change from the monochrome adapter to the CGA or EGA, scroll regions of a window, and so forth. The package also handles the IBM function keys in an intelligent manner. It can use direct memory mapping, the BIOS, and the ANSI.SYS driver for its output. A Unix-style manual is provided that's a considerable improvement on the real Unix documentation. There's a page for every function in the package, and each manual page contains a C example of how to use the function. Most important, the source code for the whole library is available. I don't use store-bought subroutine libraries unless I can get the sources because without them my programs cannot be ported outside the MS-DOS environment.

The product also comes with the source for a nifty screen generator called FAST that both provides an example of what the package can do and is a pretty useful program in its own right. FAST is a visual editor that lets you make data-entry screens interactively. It lets you define fixed text, the positions of various fields into which users can enter information, and the sequence of data entry. FAST generates a form-description file that is used by an interface-subroutine package, also provided, that lays on top of curses at run time.


    #include <stdarg.h>     /* ANSI */

    ansi_fang( format )
    char    *format;
    {
        va_list argp;
        va_start( argp, format );

        for( ; *format ; format++ )
        {
            if( *format != '%' )
                putchar( *format );
            else
            {
                switch( *++format )
                {
                case 'c': printf("%c", va_arg(argp,int   ) ); break;
                case 'd': printf("%d", va_arg(argp,int   ) ); break;
                case 's': printf("%s", va_arg(argp,char *) ); break;
                case 'f': printf("%f", va_arg(argp,double) ); break;
                }
            }
        }
    }

Example 4: ANSI variable-argument conventions


    #include <varargs.h>    /* UNIX */

    unix_fang( va_alist )
    va_dcl                  /* NOTE: NO SEMICOLON PERMITTED HERE */
    {
        char    *format;
        va_list argp;

        va_start( argp );
        for( format = va_arg(argp,char*); *format ; format++ )
        {
            if( *format != '%' )
                putchar( *format );
            else
            {
                switch( *++format )
                {
                case 'c': printf("%c", va_arg(argp,int   ) ); break;
                case 'd': printf("%d", va_arg(argp,int   ) ); break;
                case 's': printf("%s", va_arg(argp,char *) ); break;
                case 'f': printf("%f", va_arg(argp,double) ); break;
                }
            }
        }
    }

Example 5: Unix variable-argument conventions




These routines let you get data from specific fields, validate data, and so forth.

Prices are $119 for an object-code-only version; $289 gets you the source code. A very stripped-down but adequate version of the Unix make utility is also provided. This is a nice product. If you need Unix-compatible screen output in your programs, or if you just want a nice clean window-management package, I'd recommend it. You can contact Aspen Scientific at P.O. Box 72, Wheat Ridge, CO 80034-0072; (303) 423-8088.

MiniProbe 

It's no secret that one of the main strengths of Microsoft's C compiler is the CodeView debugger, and one of the main strengths of CodeView is the "tracepoint" (break when any of the following memory locations have changed) mechanism. For example, the command TPB 0 52 forces a break when any of the bottom 52 memory locations in the data segment are modified, thereby finding where a "Null pointer assignment" actually happened in your program. A problem with tracepoints, however, is the amount of time it takes to process them. The region of memory is inspected in software after every instruction, and this inspection obviously takes a lot of time.

A solution to the problem is Atron's MiniProbe. The MiniProbe is a short-slot board that plugs into your computer. It provides you with four things: a hardware reset button, a "stop" button that generates an NMI (it can be used to break out of a loop when CodeView is ignoring Ctrl-Break), one hardware breakpoint, and one hardware tracepoint (that lets CodeView trace at full speed). At $395, the product is on the expensive side, but it really does work and is a godsend if you use tracepoints a lot. It's also a lot cheaper than a full hardware debugger. Contact Atron at 20665 Fourth St., Saratoga, CA 95070; (408) 741-5900.




Bug  City 

Gordon Arbuthnot found four bugs in the expression analyzer printed in the February C Chest (Listing Six, page 64):

1. Line 96: The declaration for constant( ) is never used and can be deleted.

2. Line 112: A test for a blank space (c == ' ') should be included here in case there's leading white space.

3. Line 231: The variable tmp should be declared as type VTYPE to avoid truncation of the intermediate results.

4. Lines 301-305: I forgot about engineering notation when I wrote this code, so the analyzer doesn't skip past stuff such as 100.5e+9 correctly. The corrected code is shown in Example 6, below. You should insert it in place of the code on lines 300 to 306 of the original listing (page 67). The line numbers in Example 6 reference the original listing.

I've also found two bugs in the priority-queue routines printed in the June 1987 issue. In pq_ins( ) (Listing One, line 215) change:

    memcpy( p->bottom += p->itemsize, &item, p->itemsize );

to:

    memcpy( p->bottom += p->itemsize, item, p->itemsize );

and in main( ) (Listing One, line 386) change:

    i = pq_ins( queue, strsave(buf + 1) );

to:

    p = strsave( buf + 1 );
    i = pq_ins( queue, &p );
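The June listing isn't reproduced here, so the following is only a guess at the convention the two fixes imply: the insert routine receives the address of the item and copies itemsize bytes from that address. The slot type and the slot_store( ) and slot_load( ) names are hypothetical stand-ins, not the priority-queue code itself.

```c
#include <string.h>

/* Hypothetical stand-in for one fixed-size slot in the queue; here it
 * is just big enough to hold a string pointer. */
typedef struct {
    unsigned char bytes[sizeof(char *)];
} slot;

/* Copy `itemsize` bytes from the address held in `item`. Passing
 * `item` itself, not `&item`, to memcpy() is the point of the first
 * correction: `item` already is the address of the data. */
void slot_store(slot *s, const void *item, size_t itemsize)
{
    memcpy(s->bytes, item, itemsize);
}

/* Read the stored string pointer back out of the slot. */
const char *slot_load(const slot *s)
{
    const char *p;

    memcpy(&p, s->bytes, sizeof p);
    return p;
}
```

Under this convention a caller that wants to queue a string pointer p has to pass &p along with sizeof p, which matches the second correction. Passing the string itself would copy the first few characters of the text into the slot rather than the pointer.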

Bibliography 

Comer, Douglas. Operating System Design, the Xinu Approach. Englewood Cliffs, N.J.: Prentice-Hall, 1984. Pages 349-360 of this book present a version of printf( ).

Holub, Allen I. The C Companion. Englewood Cliffs, N.J.: Prentice-Hall, 1987. Stack frames, subroutine-linkage conventions, and the innards of printf( ) are all discussed in depth in this book.




    298:    else
    299:    {
                if( sizeof(VTYPE) != sizeof(double) )
                    rval = (VTYPE) atol( Str );
                else
                {
                    rval = atof( Str );

                    while( isdigit(*Str) || *Str == '.' )
                        Str++;

                    if( *Str == 'E' || *Str == 'e' )    /* 12.34E+03 */
                        Str += 2 ;
                }

    307:        while( isdigit(*Str) )
                    Str++;

Example 6: Corrections to code in C Chest, Listing Six, February 1987
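Where the standard library's strtod( ) is available, the skipping can be left to the library, because strtod( ) reports where the number ended. This is a sketch of an alternative, not the analyzer's code; skip_number is a made-up name.

```c
#include <stdlib.h>

/* Convert the number at the start of `str`, store it in *value, and
 * return a pointer to the first character after it. strtod() accepts
 * forms such as 100.5e+9 and 12.34E+03.
 * (skip_number is a hypothetical name for this sketch.) */
const char *skip_number(const char *str, double *value)
{
    char *end;

    *value = strtod(str, &end);
    return end;
}
```

Given "100.5e+9)", the returned pointer is left at the closing parenthesis. If no number is found, strtod( ) returns 0 and the returned pointer equals str, so a caller can detect that case. Note that strtod( ) also skips leading white space and, in later versions of the standard, accepts inputs such as "inf" and hexadecimal forms, which an expression analyzer may not want.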




16-BIT SOFTWARE TOOLBOX

MS-DOS 3.30

by Ray Duncan

Microsoft's OS/2 multitasking protected mode operating system and the IBM Personal System/2 computers captured most of the headlines on April 2 and afterward, but the upgrade to PC-DOS/MS-DOS 3.30 that was announced at the same time will have a more significant short-term impact on users. MS-DOS 3.30 is an evolutionary update from 3.20, upward compatible with previous versions, that incorporates bug fixes and some important enhancements. These enhancements fall into four general categories: new or improved user commands, configuration options, system functions, and device support. In addition, there is expanded "internationalization support" at many points throughout the system. The following information has been gleaned from the PC-DOS 3.30 users' manual, the technical reference, and a limited amount of experimentation.

New User Commands

A terminate-and-stay-resident MS-DOS extension, NLSFUNC, is a new element of MS-DOS internationalization support and allows new code pages to be selected for languages other than American English. Code pages are IBM's term for resident tables that define the mapping of character codes for the keyboard, list device, and video display.

Another TSR extension, called FASTOPEN, can be loaded that
significantly improves performance for programs that repeatedly open
and close a relatively small working set of files. FASTOPEN apparently
caches the directory information for the most recently opened files and
allows actual reads of the disk directory to be bypassed for files that
are in the cache. FASTOPEN is active only for fixed disks and can
support up to four drives; the default number of files cached per drive
is 10, but a command-line option allows as many as 100 per drive to be
cached.

The most interesting thing about NLSFUNC and FASTOPEN, when viewed in
the light of the TSR-like SHARE, GRAFTABL, GRAPHICS, KEYBXX, APPEND,
and PRINT commands, which were added in previous versions of MS-DOS, is
the progressive trend toward decomposition of operating system
functionality into independent, selectively loadable TSR modules. I
hope we will see this trend continue and even accelerate in future
versions of real mode MS-DOS; it helps minimize the squeeze on memory
in 8086/88 machines and provides a welcome degree of flexibility.

Augmented  User  Commands 

The APPEND command, which is a passive TSR that defines a search path
for open operations on data files analogous to the PATH= command for
executable and batch files, has been souped up slightly. New switches
cause the APPEND path string to be stored in the environment block and
allow the APPEND path to be searched on certain additional DOS function
calls (11h, 4eh, and 4bh). APPEND was present in "generic" MS-DOS 3.20
but was previously distributed only with networking software in the IBM
versions.

A batch file directive (CALL) has been added to allow one batch file to
invoke another and then regain control without the intermediary of a
secondary command processor, and the ability to use the name of an
environment variable as a parameter inside a batch file (by framing it
with % characters) has been documented at last.

A /S switch has been added for ATTRIB, which allows the command to also
be applied to matching files in all subdirectories of the named or
default directory. BACKUP and RESTORE have new switches allowing
selection of files by their date or time, and the new BACKUP can format
disks on the fly (still can't begin to compare with FASTBACK, though).
Other changes to FORMAT and GRAPHICS are too minor to discuss here.
Finally, the DATE and TIME commands have been spiffed up so that they
can reset the CMOS clock on PC/ATs (no more rooting around for your
diagnostics disk with the SETUP program just to "spring forward" or
"fall back").

New Configuration Features

The default value for BUFFERS= has been made a little "smarter." In
previous versions, BUFFERS= always defaulted to 2. In MS-DOS 3.30, it
defaults to a more appropriate value (in the range 2-15), depending on
the type of disk the system is booted from and the amount of RAM
installed.



The COUNTRY= directive for CONFIG.SYS has been augmented with a code
page option and with the ability to load internationalization
information from a disk file. This capability works in concert with
similar changes to the KEYBXX user command and will allow support for
new date, time, and currency formats and collating sequences to be
added much more easily. In the past, the internationalization support
tables for various country codes were embedded inside the operating
system; adding a new country code meant that the OEM had to rebuild the
system files.

The STACKS= directive, which specifies the number of stack frames in
the system pool for use by the interrupt handler, now behaves more
sensibly. The default for the PC, PC/XT, and IBM Portable has been
changed so that no stack switching occurs (the default for PC/ATs is
still 9 stacks of 128 bytes each). Furthermore, users can always
disable stack switching if desired by placing STACKS=0 in the
CONFIG.SYS file.

New System Functions

Two new system services are available for use by application programs,
and the definition of IOCTL has been expanded slightly. Int 21h,
function 67h (Set Handle Count) allows the file table for a process to
be expanded so that the process can have more than 20 files open at
once. This subject was nearly beaten to death in this column about a
year ago. Most of you will no doubt remember that in MS-DOS, Versions
3.0-3.2, there is a 20-byte table corresponding to file handle numbers
in a reserved area of the PSP along with a double-word pointer to the
table and an additional word that gives the length of the table. Plenty
of DDJ readers have already hit on the fact that you can get around the
20-file limit simply by building a new, larger table somewhere and
modifying the PSP words containing the pointer and length accordingly.
Apparently, the new function call works in about the same way. It seems
to allocate a block of memory outside the process itself that is large
enough to hold the expanded table, copies the old file table to the new
one and initializes the as-yet-unused positions, and then twiddles the
PSP to point to the new table.
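
The sequence described above can be sketched in portable C. This is a
simulation only, not the DOS implementation: the struct psp_sketch, the
functions psp_init and expand_handle_table, and the 0xFF free-slot
marker are names and values assumed for illustration, and malloc stands
in for whatever allocator DOS actually uses.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_HANDLES 20
#define UNUSED_SLOT 0xFF   /* assumed marker for a free table entry */

/* Hypothetical stand-in for the three PSP fields the text describes. */
struct psp_sketch {
    unsigned char  default_table[DEFAULT_HANDLES]; /* 20-byte table in the PSP */
    unsigned char *table;                          /* pointer to the active table */
    unsigned       length;                         /* number of entries in it */
};

void psp_init(struct psp_sketch *p)
{
    memset(p->default_table, UNUSED_SLOT, DEFAULT_HANDLES);
    p->table  = p->default_table;
    p->length = DEFAULT_HANDLES;
}

/* Returns 0 on success, -1 on failure (the old table is left in place). */
int expand_handle_table(struct psp_sketch *p, unsigned wanted)
{
    unsigned char *bigger;

    if (wanted <= p->length)
        return 0;                        /* nothing to do in this sketch */

    bigger = malloc(wanted);             /* block outside the "process" */
    if (bigger == NULL)
        return -1;

    memcpy(bigger, p->table, p->length);                 /* copy old entries */
    memset(bigger + p->length, UNUSED_SLOT,
           wanted - p->length);                          /* mark the rest free */

    if (p->table != p->default_table)
        free(p->table);                  /* release an earlier expansion */
    p->table  = bigger;                  /* repoint the "PSP" */
    p->length = wanted;
    return 0;
}
```

A caller would run psp_init once and then request, say, 40 entries with
expand_handle_table(&p, 40); entries already in use keep their values
and positions.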

Int 21h, function 68h (Commit File) forces all the internal disk
buffers associated with a file handle to be written to disk and the
directory information for the file to be updated. This is effectively
the same as DUPing the handle for a file and then closing the new
handle, except that the DUP method can fail if the system is out of
handles.

The calling sequences for the new functions 67h and 68h are summarized
in Table 1, below.

Int 21h, function 44h (IOCTL) has a new subfunction (0ch) that allows
an application program to select a different code page for a peripheral
device. This is simply another facet of the expanded
internationalization support.

Device  Support 

MS-DOS 3.30 supports double-sided, double-density, 1.44-megabyte,
3.5-inch disk drives and the new higher-resolution video adapters found
on the IBM Personal System/2 models. The built-in asynchronous
communications driver has been expanded to support up to four serial
ports and presumably is now interrupt-driven because the documentation
claims it can handle data rates as high as 19,200 baud.

Set Handle Count:
Int 21h, function 67h

Call with:  ah = 67h
            bx = desired number of handles

Returns:    Carry clear if function succeeded
            or
            Carry set if function failed and
            ax = error code

Commit File:
Int 21h, function 68h

Call with:  ah = 68h
            bx = file handle

Returns:    Carry clear if function succeeded
            or
            Carry set if function failed and
            ax = error code

Table 1: New function calls in PC-DOS/MS-DOS 3.30

Prices 

PC-DOS 3.30 costs $120 new or $75 as an update (you have to send the
cover page from the users' manual of a previous version of PC-DOS along
with the payment to qualify for the update). Both a 5¼-inch and a
3½-inch disk are included in the package. The DOS 3.30 Technical
Reference costs $85. The executable files for the linker, DEBUG, and
EXE2BIN along with the source file for the VDISK driver are no longer
supplied on the PC-DOS distribution disks but are on a disk that comes
with the technical reference instead.

32-Bit Book Nook

Back when I started dabbling in 8086 programming, the Intel manuals
were poorly written and indexed and even more difficult to use than
they are now. I eventually discovered Rector and Alexy's The 8086 Book,
to my immense relief, and have well-worn copies of it stashed
everywhere I work. Although The 8086 Book has a few weaknesses, they
are far outweighed by the accuracy of the book and the organization and
presentation of the information about the 8086 instruction set.

Rector and Alexy's book is looking pretty dated these days, though,
what with the many new instructions, exceptions, and protected mode
addressing considerations of the 80286 and 80386. I keep hoping that
someone will publish an equally useful book that experienced
programmers can use as a reference to the entire family of Intel 80x86
processors, but thus far no new trade book has filled the bill. The
Intel manuals have improved by leaps and bounds, and in their latest
incarnations are real treasure troves (though still somewhat dense:
every word is significant). But the authors of most assembly-language
books seem content to provide rehashed and weakened versions of the
Intel manuals; they seem to have lost the concept of added value
altogether.


Programming the Intel 80386, by Smith and Johnson (see Note 1), at
first glance looked like a possible successor to Rector and Alexy's
book. The obligatory 20 pages of explanation of bits and bytes are
followed by a brief look at the 80x86 processor line; a review of 80386
registers and addressing modes; an overview of the 80386 instruction
set by functional group; and then the body of the book: a 180-page,
alphabetically organized reference to the 80386's instructions,
including opcodes, clocks, flags affected, notes, and examples, with
each instruction beginning on a new page. The last 65 pages of the book
contain a rather sketchy overview of protected mode segmentation,
virtual memory, paging, caching, and even a few words about 80386 bus
signals.

I really wanted to like this book, but it is just too uneven. The
organization (other than the main reference section) needs improvement,
there are literally no programming examples, and some crucial subjects
(such as the distinctions between segmented virtual memory, paged
virtual memory, and segmented paged virtual memory) just aren't covered
in adequate depth. But the main problem with the book is that the
authors are clearly paraphrasing their material from other sources
instead of writing from a solid base of 80386 programming experience.
Without this experience, they simply do not have a sense of which
material is useful and how it should be presented and which material
(such as the 80386 bus signals) is fluff and should be omitted.

We are now 0 for 2 on 80386 assembly-language books reviewed in this
column (80386/80286 Assembly Language Programming by Murray and Pappas,
reviewed in the March 1987 issue of DDJ, didn't cut the mustard
either). At present, if you are interested in the 80386, your best
value is still the Intel 80386 Programmer's Reference, which is far
more authoritative, readable, and thorough than any of its trade book
competitors, and cheaper besides.

Word  Meets  Its  Match 

Fred Heutte, of Portland, Oregon, writes: "I'm turning in the
extra-credit question for my final exam in Algorithm Design 101
(responding to Larry Heberlein in the March DDJ).

"I am familiar with the search-and-replace problem in Microsoft Word.
The particular example Larry Heberlein uses is a good one: it's
virtually a worst case for Word, which is paragraph-oriented. When you
remove the paragraph marks, all kinds of contortions result to keep
track of the conversion. Other than this particular case, Word's
search-and-replace is very nice, and the Bellevue gang deserves
applause for this as well as many other Word features.

"I find it hard to believe that anyone can beat my entry, however. I am
running INFORMIX-SQL, Version 2.0.0 on an AT&T 6300-Plus. Informix has
a program called SFORMBLD that compiles screen entry forms to be run
under the data entry form module SPERFORM. When compiling a form file
of no more than 11K, an 'out of memory' error is returned. With DOS 3.1
taking about 40K RAM and SFORMBLD about 110K, this leaves about 490K,
an approximate ratio of 44.5:1.

"No wonder I'm pushing the organization I work with to convert to Rbase
System V, even though it's not SQL!"

Another  Reply  to  Mr.  Lyall 

Mr. Davidson Corry, of Seattle, Washington, writes: "I read with some
interest Mr. Charles Lyall's letter to your column in the March issue
on the continuing thrash between proponents of assembly and high-level
languages.

"As it seems to be customary to establish one's pedigree, let me offer
you mine. I got my start on a Singer process-control machine (vintage
early 50s) with 4K of magnetic drum memory in 1965. Assembly language
at its best: bootstrapping hex codes off paper tape! Since then I've
used a dozen-odd other languages on two-dozen other machines, finally
curling comfortably up beside an AT clone with MASM and C, consulting
on systems programming for companies in the Seattle area.

"Mr. Lyall computes that a 10:1 execution speed ratio is compensated
for by the 1:8 productivity increase of writing in a high-level
language; that he would have to 'run the little turkey 700 times' to
break even. I picture him sitting calmly at his terminal, waiting for a
compile to complete, serene in the knowledge that the author of the
compiler has worked productively. This is a level of rationality I have
not reached and to which I do not aspire. Not me, babe: I want that
sucker linked and up! Now!

"I have equally cursed the nameless sculptors of silicon who (for
perhaps excellent engineering reasons) hobbled the 8086 with
short-range conditional jumps and saddled programmers with the jump
around a jump. However much it simplified their job, it has made mine
that much more difficult.

"This is, I think, the central issue in the language-level debate: the
ultimate judgment on software is on its effectiveness (speed, power,
features, and so on) as a tool for end-users. The difficulties of
engineering the tool are, in the end, irrelevant. The only
justification, and it is a weak one, for taking 'shortcuts' is that a
good tool today is more useful than a superb tool next year.

"To my mind, there are only two values of execution time: 'fast enough
not to notice' and 'slow enough to get impatient.' No one has
reasonably suggested that a well-coded, high-level program can beat a
well-coded, assembly-language one for time, and my experience suggests
that assembly-language code is much more likely to fall on the good
side of the 'come on!' threshold. So I often write in assembly
language, for speed, compactness, and close control of the hardware.
It's not portable, you say? Yes, but a program that works with my PC
keyboard and screen probably isn't going to fit in the CICS mindset
anyway.

"Having said that, I will tell you that I much prefer high-level
languages and use them whenever I can get away with it, that is,
whenever the hardware penalty is not too dear. This does not contradict
my argument; it confirms it. Let me explain.

"Assembly language, however efficiently it tickles the chip, is a pain
to write and debug. It is cryptic, tedious, verbose; even COBOL is
better! By its very flexibility it encourages all sorts of 'gimmick'
coding ('Fly now, crash later'). And the typical assembly-language
source listing almost totally obscures the underlying algorithm.

"In contrast, my favorite language these days is ICON. It is no speed
demon, and makes no claims to be, but it is a superb notation for
recording an algorithm. The most difficult part of learning ICON seems
to be unlearning coding practices that got you around limitations in
other languages so that you can see the problem fresh and use the power
of ICON expressions.

"C, the 'portable assembly language,' was designed to let programmers
access a pervasive architecture efficiently: linear/rectilinear arrays
of machine objects (bytes/words/longs/floats/doubles/pointers) or
agglomerations of them. The 'overhead' of compiled C varies as the CPU
architecture drifts from this ideal, but C may be near ultimate in
letting programmers talk the chip's language comfortably.

"Does anyone remember UCSD Pascal, running on a p-machine? No, not the
software emulator; I mean the real Western Digital p-machine in
silicon. It screamed, or so they said. How about Modula-2 on Wirth's
Lilith? And whatever happened to the Forth chip set that executed Forth
primitives directly? These are attempts to make the chip talk the
programmer's language.

"And that brings us full circle. Energy spent engineering hardware
solutions is repaid a thousandfold at the next higher level, because
that many more people use it. If a thousand people run Mr. Lyall's
'little turkey,' every one of them loses ground on the very first run.

"The 'debate' over high-level-language efficiency is an artifact of
adapting human language to machine architecture, a historical accident.
I suggest that a better solution is to design notations in which humans
can clearly and conveniently express algorithms and then adapt the
'hardware' (RISC chips with a microprogrammed icing, better compilers
and operating systems, and so on) to execute these improved notations
efficiently."


Mahalo  and  Aloha 

Every good thing must come to an end, and although I have enjoyed
writing this column and have learned far more from DDJ's readers than
they have learned from me, five years of a Good Thing is definitely
enough. The siren song of the fabulous new class of 32-bit personal
computers, like the Mac II and the PS/2 Model 80, is becoming
irresistible, and I've got a lot of delving to do before I can write
about those machines intelligently. In the meantime, I wish you all
continued health, prosperity, happiness, and a machine with a BIG fixed
disk, tape backup, and no-wait-state RAM!

Note 

1. Bud E. Smith and Mark T. Johnson, Programming the Intel 80386
(Glenview, Ill.: Scott, Foresman & Company, 1987). 346 pages including
index. $22.95. ISBN 0-673-18568-0.



COLUMNS 


STRUCTURED  PROGRAMMING 

Translating from MS-BASIC to C

by Namir Clement Shammas

In this issue I discuss translating programs from BASIC to C. This
article is the third in a series that looks at translating programs
between different languages or dialects of the same language. As in the
previous articles in the series, I have used a language translator. I
will also discuss the translation performed by BASTOC (Version 2.1), a
package from JMI Software (P.O. Box 481, Spring House, PA 19477). The
version I used converts MS-BASIC source code to C source code
compatible with Microsoft C, Version 4.0. Versions of BASTOC are
available to translate from several other BASIC dialects, such as
CBASIC, too.

To guide the discussion, I have selected three programs: the Sieve
benchmark, a root-seeking program, and a multifile find/replace
utility. I have actually translated many other BASIC programs to test
how well BASTOC works, and the results have shown that the package's
source code conversion is well thought out and comes close to being a
complete translation. The converted programs kept much of their feel
but ran much faster.

The Sieve Benchmark

Listing One, page 86, shows the BASIC source code for the Sieve
benchmark. The program demonstrates the conversion of arrays, FOR and
WHILE loops, and IF statements. Listing Two, page 86, shows the C
version of the BASIC source code. The first BASIC statements are OPTION
BASE and DEFxxx. BASTOC ignores OPTION BASE 1 declarations because the
C language implicitly uses 0 as the lower array bound. Thus, OPTION
BASE 1 creates a bit of wasteful space when translated. BASTOC uses
DEFxxx statements well to assign a data type to the BASIC variables.
Because the BASIC Sieve program makes the DEFINT A-Z declaration, the C
version declares all the inherited variables as int. Notice that the
array FLAGS is replaced with a pointer to integers, *FLAGS, because the
dimensioning of the array FLAGS uses a variable and not an integer
constant. This dictates the use of dynamic allocation and pointers. The
BASTOC translator declares additional identifiers that it uses in
controlling loops. The simple BASIC assignments are converted in a
straightforward fashion. DIM FLAGS(SIZE) yields a dynamic allocation of
the array using the balloc function.
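
The effect of carrying a 1-based, variable-sized BASIC array over to C
can be sketched as follows. This is not BASTOC's output: count_primes
is a hypothetical function, the sieve itself is a plain textbook
version rather than the benchmark in the listings, and the standard
calloc stands in for balloc, whose calling sequence is not given here.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts the primes <= size. Returns -1 if the allocation fails. */
int count_primes(int size)
{
    int *flags;      /* plays the role of the translated *FLAGS pointer */
    int i, k, count = 0;

    if (size < 2)
        return 0;

    /* size + 1 elements: slot 0 is allocated but never used, which is
       the "wasteful space" left behind when OPTION BASE 1 is ignored. */
    flags = calloc((size_t)size + 1, sizeof *flags);
    if (flags == NULL)
        return -1;

    for (i = 2; i <= size; i++)
        flags[i] = 1;                   /* assume prime until struck out */

    for (i = 2; i <= size; i++) {
        if (!flags[i])
            continue;
        count++;
        for (k = i + i; k <= size; k += i)
            flags[k] = 0;               /* strike out multiples of i */
    }

    free(flags);
    return count;
}
```

Because the size arrives at run time, a fixed-size C array declaration
is not an option, which is why a pointer and an allocation call appear
where BASIC had a DIM.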

The few BASIC PRINT statements are not converted into printf function
calls, as you might have expected. Instead, BASTOC uses its own
overloaded BPRINT( ) function (which I presume is more efficient than
printf). Examine the three BPRINT calls in the C code and notice the
first argument. It is a string that maps the number and type of data to
be displayed. The letters s and i indicate a string and an integer,
respectively. The function BPRINT is able to tackle a varying number of
arguments.
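
A function driven by such a type-map string can be written with the
standard <stdarg.h> facilities. The sketch below is an illustration of
the idea only; format_by_map is a hypothetical name, it writes to a
caller-supplied buffer instead of the screen, and nothing here is taken
from the BASTOC run-time library.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Formats each argument according to the letters in `types`
   ('s' = string, 'i' = int) and appends it to `out`.
   Returns the number of characters stored, or -1 on an unknown
   type letter or if `out` is too small. */
int format_by_map(char *out, size_t cap, const char *types, ...)
{
    va_list ap;
    size_t used = 0;
    int n;

    if (cap == 0)
        return -1;
    out[0] = '\0';

    va_start(ap, types);
    for (; *types != '\0'; types++) {
        if (*types == 's')
            n = snprintf(out + used, cap - used, "%s",
                         va_arg(ap, const char *));
        else if (*types == 'i')
            n = snprintf(out + used, cap - used, "%d", va_arg(ap, int));
        else
            n = -1;                      /* unknown type letter */

        if (n < 0 || (size_t)n >= cap - used) {
            va_end(ap);
            return -1;
        }
        used += (size_t)n;
    }
    va_end(ap);
    return (int)used;
}
```

A call such as format_by_map(buf, sizeof buf, "sis", "Found ", 1899,
" primes") walks the map one letter at a time, so the same function
serves any mix of strings and integers.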

The manipulation of time is carried out using a function that sets the
time, namely STIME_( ), and another, TIME_( ), to return it.

Because C has more options than MS-BASIC does, translating the FOR and
WHILE loops is easy. Notice that the C version uses additional
identifiers to define the iteration range of the FOR loops. The single
IF statement that uses a GOTO in its THEN clause is converted into a
similar set of statements. The BASTOC translator makes use of the
available gotos in C rather than trying to alter the program flow by
using other constructs. You can replace the use of the gotos with more
structured if . . . else clauses if you are willing to hand code the
changes with your editor.
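
The kind of hand edit meant here can be shown on a small made-up
example. Both functions below are hypothetical (they are not taken from
Listing Two), and the labels L_130 and L_140 simply imitate labels
derived from BASIC line numbers.

```c
#include <assert.h>

/* goto style, in the spirit of a line-by-line translation of
   IF X < 0 THEN GOTO 130 */
int clamp_goto(int x)
{
    int result;

    if (x < 0)
        goto L_130;
    result = x;
    goto L_140;
L_130:
    result = 0;
L_140:
    return result;
}

/* The same logic after hand editing into if ... else */
int clamp_structured(int x)
{
    int result;

    if (x < 0)
        result = 0;
    else
        result = x;
    return result;
}
```

The two versions return the same value for every input; the second
simply makes the two branches visible at a glance.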

The  Root-Finding  Program 

The second program, which finds the roots of a single nonlinear
function, is in Listing Three, page 87. It has the following features:

• Using the ON GOSUB statement, the program can look at multiple
functions.

• The accuracy and maximum number of iterations are preassigned. If the
number of iterations exceeds the maximum limit, the accuracy is
relaxed.

This program demonstrates the conversion of BASIC DATA, ON GOSUB,
GOSUB, and PRINT USING statements.

Listing Four, page 87, shows the translated C code. The first
declaration tackles a data structure that is employed in converting
BASIC DATA statements. The C structure is composed of an unsigned
integer and a pointer. The integer stores the original BASIC line
number, and the pointer refers to a string containing the information
contained in a BASIC DATA statement. The original BASIC line numbers
are maintained in the data structure to accommodate the RESTORE <line
number> statement in BASIC. The transformed program also declares the
absolute, exponential, and power functions that must be called from C
libraries.
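
A structure of that shape, and the way it lets a RESTORE to a given
line be honored, might look like the following. This is a sketch under
assumptions: the names data_line, data_table, restore_to, and
next_data_line are invented for illustration, the table contents are
made up, and the sketch requires an exact line-number match, which may
differ from what BASTOC's run-time support does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry per BASIC DATA statement. */
struct data_line {
    unsigned    line_no;   /* original BASIC line number */
    const char *text;      /* the items that followed DATA */
};

/* Hypothetical table for:
   2000 DATA 1,2,3 / 2010 DATA 4,5 / 2020 DATA 6 */
static const struct data_line data_table[] = {
    { 2000, "1,2,3" },
    { 2010, "4,5"   },
    { 2020, "6"     },
};
#define DATA_LINES (sizeof data_table / sizeof data_table[0])

static size_t read_cursor = 0;   /* index of the next DATA line to read */

/* RESTORE <line number>: move the cursor to the DATA line with that
   number. Returns 0 on success, -1 if no DATA line has that number. */
int restore_to(unsigned line_no)
{
    size_t i;

    for (i = 0; i < DATA_LINES; i++) {
        if (data_table[i].line_no == line_no) {
            read_cursor = i;
            return 0;
        }
    }
    return -1;
}

/* Returns the text of the next DATA line, or NULL when exhausted. */
const char *next_data_line(void)
{
    if (read_cursor >= DATA_LINES)
        return NULL;
    return data_table[read_cursor++].text;
}
```

Without the stored line numbers there would be nothing to search on,
which is the reason the text gives for keeping them in the structure.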

Notice that the first two lines of the BASIC program are the DEFDBL and
DEFINT declarations. Line 1010 was originally DEFDBL A-Z, but the
translator objected to the redefinition of the I-O range in line 1020.
All the BASIC variables except the integer flag Diverge% adhere to the
default type-by-name association. The BASTOC translator handles the
data typing of variables correctly. Notice that the BASIC Diverge%
variable is converted into divergeI in C: the uppercase I replaces the
% character in BASIC.

The BASIC INPUT statement is converted into a call to an overloaded C
function, INPUT( ). Like the BPRINT( ) function discussed earlier, the
first argument of INPUT( ) is a string that seems to determine if a
prompt is used as well as to indicate the data type of the input. Like
the familiar scanf( ) function, INPUT( ) uses the address of the
variable to store the input data. The BASIC READ statements are
translated into function calls that are similar to the BPRINT( ) and
INPUT( ) functions.

Translating the BASIC lines that make up the iteration loops takes
place without snags. Looping with the WHILE-WEND and the IF statements
is correctly converted. The GOSUB 1200 statements are replaced with
calls to function pr_1200( ). Because GOSUBs take no parameters, their
counterparts in C are always parameterless functions. If you use BASTOC
to translate your own BASIC programs into C, you may want to edit your
program to take advantage of using argument lists in C functions.
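
The difference between the two styles can be sketched briefly. The
function body (x squared minus 2) and the names x, f, and func are
hypothetical; only the name pr_1200 comes from the text.

```c
#include <assert.h>

/* Style 1: what a GOSUB maps to most directly. Input and output
   travel through globals because the function takes no parameters. */
double x, f;

void pr_1200(void)
{
    f = x * x - 2.0;      /* hypothetical function being solved */
}

/* Style 2: a hand edit that uses an argument list and a return value. */
double func(double x_arg)
{
    return x_arg * x_arg - 2.0;
}
```

With the second form, a call such as func(3.0) states its input at the
call site instead of relying on whatever x happens to hold.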

PRINT USING statements are translated into UPRINT( ) function calls.
These C functions resemble their BPRINT( ) cousins, except the first
argument is the output format string and the second argument contains
the number and data types for the output variables.

The MS-BASIC subroutine that starts at line 1200 uses the ON GOSUB
statement to call other subroutines. The ON GOSUB is translated into
the C switch-case decision-making construct. The variable used in
selecting the proper case is the global n identifier. The C function
pr_1200( ) ends with a return with no expression associated with it.
The same is true for the rest of the subroutines.
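
In outline, the mapping from ON GOSUB to switch-case looks like this.
The target line numbers 1300 and 1400 and the two function bodies are
invented for the sketch; only pr_1200 and the global selector n are
named in the text.

```c
#include <assert.h>

int    n;        /* global selector, as in ON N GOSUB ... */
double x, fx;    /* globals shared by the "subroutines" */

void pr_1300(void) { fx = x * x - 2.0; return; }       /* choice 1 */
void pr_1400(void) { fx = x * x * x - 8.0; return; }   /* choice 2 */

/* Stands in for: 1200 ON N GOSUB 1300, 1400 : RETURN */
void pr_1200(void)
{
    switch (n) {
    case 1: pr_1300(); break;
    case 2: pr_1400(); break;
    default: break;      /* selector out of range: no subroutine runs */
    }
    return;              /* a return with no expression */
}
```

Each case holds one call and a break, so adding another target to the
ON GOSUB list amounts to adding one more case.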

The  Find/Replace  Utility 

The third example program is shown in Listing Five, page 89, and
Listing Six, page 90, shows the C version. The BASIC program performs
find/replace operations on one or more files. This program demonstrates
the translation of sequential file I/O, string manipulation, and error
handling.

In comparing the BASIC declarations in lines 1040 to 1070 with those in
the C version, notice the following:

• The effect of OPTION BASE 1 is ignored, and BASIC dimensions of 20
and 30 elements are replaced with 21 and 31 elements. The first array
elements, with an index of 0, are not used.

• DEFINT A-Z is used to tell BASTOC how to declare the data types of
scalar variables in the C program. The underscore character replaces
the dot used in the BASIC variable names.

• String-type scalars and arrays are declared as pointers.

The first set of statements inside the main( ) function allocates space
to the string arrays using the corresponding pointers. The assignments
to integer variables are straightforward. String assignment involves a
call to function s_asgn( ) instead of an ordinary assignment as in
BASIC, Turbo Pascal, or Modula-2.

What  is  even  more  unusual  is  the 
way  the  GOSUB  Z290  is  handled:  in¬ 
stead  of  yielding  a  call  to  function 
pr—ZZ90(  ),  two  statements  are  used. 
The  first  is  sub^iushfl)  and  the  sec¬ 
ond  is  a  goto.  Has  the  BASTOC  transla¬ 
tor  finally  collapsed  under  pressure 
and  continuous  use?  No,  there  is  no 
need  to  panic!  This  sort  of  code  seems 
to  be  generated  when  the  BASIC  ON 
ERROR  is  used.  If  you  look  at  the  C  la¬ 


bel  L—2Z90,  you  see  a  goto  sub— ret, 
which  directs  the  program  flow  to 
the  correct  label.  Although  this  may 
remind  you  a  bit  of  spaghetti  BASIC 
code,  no  one  said  that  translating  er¬ 
ror  handling  from  one  language  to 
another  was  easy!  The  large  number 
of  labels  and  gotos  is  the  result  of  sup¬ 
porting  BASIC's  error  handling.  The 
BASTOC  translator  resorts  to  defen¬ 
sive  programming  to  cover  any  er¬ 
rors  generated  from  numerous  lines. 
Three additional code segments are inserted by BASTOC to handle errors, each starting with a label. The first is sub_ret, mentioned earlier, which is responsible for simulating GOSUB returns when error handling is used. Label err_trap is where the program flow first resumes after an error occurs. From label err_trap the program flows to the error-handling routine inherited from the BASIC source code. In this example, the BASIC error handler starts at line 1750, and consequently, the C version resumes at label L_1750. Notice that the transformed BASIC error-handling code ends with a goto un_trap to resume program execution. The code segment following the un_trap label contains a switch-case with a long list of case clauses to direct the program resumption.
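The push-then-goto pattern is easier to follow in miniature. The Python sketch below is not BASTOC output; it only imitates the idea of pushing a return tag, jumping to a label, and dispatching on the popped tag at a shared return point. The function run(), the label names other than L_2290 and sub_ret, and the tag value are assumptions made for illustration.

```python
def run():
    """Imitate GOSUB/RETURN with a return-tag stack and a label dispatcher."""
    trace = []
    return_stack = []            # plays the role of the stack filled by sub_push()
    label = "main"
    while label != "end":
        if label == "main":
            trace.append("main: before GOSUB")
            return_stack.append(1)   # remember which call site to resume at
            label = "L_2290"         # stands in for: goto L_2290
        elif label == "L_2290":
            trace.append("subroutine body")
            label = "sub_ret"        # stands in for BASIC's RETURN
        elif label == "sub_ret":
            tag = return_stack.pop()
            # Dispatch on the popped tag to find the resume label.
            label = {1: "after_gosub_1"}[tag]
        elif label == "after_gosub_1":
            trace.append("main: after GOSUB")
            label = "end"
    return trace
```

Because the return address is data on a stack rather than a real call frame, an error handler can jump anywhere in the program and the simulated RETURN still finds its way back, which is presumably why this form is chosen when ON ERROR is present.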

File I/O operations in the C version resort to calling several functions that emulate their corresponding BASIC statements. These include the functions BOPEN(), BCLOSE(), INPUT(), and BPRINT(). The use of the last two functions has been extended to include file I/O. The first two arguments of INPUT() are a string-type file I/O indicator and the buffer number. The BPRINT() function call performing file output differs from the one involved in displaying a variable by having the buffer number as the first argument. The rest of the arguments are the same, as you might expect. The BASIC LPRINT statements are replaced with calls to the C function BLPRINT(). The BLPRINT() function is to the BPRINT() function as BASIC's LPRINT is to PRINT.

String manipulation involves calls to functions that clone BASIC string functions, such as MID$(), LEN(), and INSTR(). String concatenation employs the function s_asgn(), following typical methods used in C for string management.




Dr.  Dobb's  Journal,  August  1987 


STRUCTURED  PROGRAMMING 



Summary 

Converting interpreted MS-BASIC programs to run as compiled C programs has limitations. Many of these limitations exist because of the compiled nature of the functioning C versions, for example:

•  CHAIN is translated by BASTOC into a program execution that doesn't return to the original program.

•  COMMON declarations must be identical in the chained BASIC programs.

•  PEEKs, POKEs, VARPTR, and USR are not supported for the C and L memory models.

Translating BASIC programs into C enables your programs to gain the speed of compiled programs. The alternative is to use a BASIC compiler to attain the desired speed. Why go the C route instead of the QuickBASIC or Turbo BASIC way? Transporting BASIC programs to C for the sole purpose of gaining speed may be overkill. The advantage of translating BASIC programs is that C programs can be modified to use more popular libraries, and more important, they can be enhanced by linking them with third-party C libraries. This enables you, for example, to incorporate more elegant windowing routines in your programs. Also, your programs will be able to make use of C's efficient pointer manipulations, data structures, enumerated types, unsigned integers, long integers, and more.

Availability 

All the source code for articles in this issue is available on a single disk. To order, send $14.95 to Dr. Dobb's Journal, 501 Galveston Dr., Redwood City, CA 94063, or call (415) 366-3600, ext. 216. Please specify the issue number and format (MS-DOS, Macintosh, Kaypro).

DDJ 

(Listings  begin  on  page  86.) 



COLUMNS 


ARTIFICIAL  INTELLIGENCE 


LOOPS

by Ernest R. Tello

This month I'll continue my evaluation of the Xerox 1186 LISP machine with an in-depth discussion of LOOPS, the object-oriented AI programming environment that runs on this machine. LOOPS was developed at Xerox PARC, and the first version was released in 1983. The principal designers of the original LOOPS system were Dan Bobrow and Mark Stefik. LOOPS has just recently been released as a commercial product and is an important addition to available expert system tools and AI development environments.

Before going into some of the details of LOOPS, I would like first to describe just what sort of an AI programming environment this is and what its overall significance might be. But first, I should remove one source of confusion: LOOPS is not the same thing as CommonLOOPS. The latter is a low-level, object-oriented extension to Common LISP, whereas LOOPS is a high-level AI language that already has most of the facilities needed for developing advanced AI applications.

As with most AI systems, LOOPS supports rule-based programming. What makes LOOPS unique, however, is its complete implementation of an object-oriented programming environment. The language contains just about all that was valuable and important in Smalltalk as well as much else besides. LOOPS was the tool used to create the PRIDE expert system developed by Sanjay Mittal, which I described in my first column (see DDJ, February 1987). It is already apparent that some new paradigms for expert system development have emerged as a result of various projects using LOOPS.

One of the central ideas in the design of the LOOPS environment was to provide an AI programming system that would support a multiple-paradigm framework. The current system supports four main programming paradigms: the object-oriented paradigm, the rule-based paradigm, the access-oriented paradigm, and the normal procedural paradigm.

Classes  and  Instances 

As with all object-oriented programming systems, LOOPS provides for building hierarchies of classes and instances of those classes. Let's first look at the simple syntax used for accessing objects. The way you would reference a user-defined class called Partnership, or any other class, would be:

($  Partnership) 

The  dollar  sign  means  that  the  object 
pointer  to  the  Partnership  structure 
is  to  be  referenced.  All  references  to 
objects  in  LOOPS  use  this  convention 
of  preceding  the  name  of  the  object 
with  the  dollar-sign  character. 

Another syntax convention used in LOOPS is the back-arrow character, which I represent as <-. This character is accessed on the standard keyboard of the 1186 with the underscore key. The <- character translates roughly as "send the message" and corresponds to message in Flavors or send in SCOOPS (the object-oriented extension to PC Scheme). So, the LOOPS expression:

(<- ($ Partnership) New 'OurVenture)

would send the message New to the Partnership class to create a new instance of itself called OurVenture.
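For readers more at home outside LISP, the same idea can be sketched in Python. This is only an analogy, not how LOOPS is implemented: the LoopsClass type, the send() helper, and the registry dictionary (standing in for the $ lookup) are all hypothetical names introduced for the example.

```python
class LoopsClass:
    """A toy class object that answers the New message."""

    def __init__(self, name):
        self.name = name
        self.instances = {}

    def New(self, instance_name):
        # Create a named instance and remember it on the class.
        instance = {"class": self.name, "name": instance_name}
        self.instances[instance_name] = instance
        return instance


def send(receiver, selector, *args):
    """Rough analogue of (<- receiver selector args...)."""
    return getattr(receiver, selector)(*args)


# The registry plays the part of the ($ name) lookup.
registry = {"Partnership": LoopsClass("Partnership")}
venture = send(registry["Partnership"], "New", "OurVenture")
```

The point of the sketch is that the selector is looked up on the receiver at run time, so the caller needs to know only the message name, not which class implements it.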

Much  of  the  activity  in  developing 
LOOPS  applications  involves  the  use 
of  its  rich  variety  of  window-  and 
menu-based  tools,  such  as  browsers 
and  editors. 

LOOPS  Browsers 

Software, as most programmers realize, is developed in layers, or shells, of functionality. All the major advances in software engineering coexist in some form, like different layers of an onion or rings of the trunk of a tree. LOOPS and the 1186 are both fine examples of this organic evolution in the AI field. The high-level tools that are provided with LOOPS offer an AI development environment that, in effect, takes software development to its next level.

One of the most useful and spectacular facilities in LOOPS is the visually oriented graphics class browser called the Lattice Browser. This facility has a main window that displays the class hierarchy with graphics lines depicting the lines of inheritance between classes. Many facilities for editing objects are available for use by interacting directly with this display. The main menu for the Lattice Browser facility looks like that in Example 1.




The PrintSummary command prints a full description of the selected class, including all its local variables and methods, in the Exec window. For example, selecting ActiveValue and using the PrintSummary command gives the display shown in Example 2, below. The PrintSummary operation has the convenient feature that the custom methods for classes are shown in bold type whereas the inherited methods are shown in normal type.

The WhereIs command is also convenient. If you need to know the class in which a particular method is first defined, all you have to do is choose this option, wait for a window with a list of all the methods in the system, and select one. Almost immediately, the name of the class involved on the Lattice Browser network display will blink on and off.

Developing applications in LOOPS involves a combination of writing code in the editor and accessing a large number of convenient facilities in the mouse-oriented window and menu environment. One convenient way of developing object classes is simply to enter an empty class in the Exec window and use the interactive facilities to flesh out the class definition. For example:

(DefineClass 'Partnership)

Once you have entered a class in this way, you can then access it with the class browser. To do so, you call up the main menu, select the Browse Class command, and then type in the name of the root class at the prompt. Another way you can tell LOOPS that you want to edit the methods of a class is by accessing them through the general Lattice Browser. The same menus are available there as under an individual class browser window. For me, the Lattice Browser and related classes form the heart of the LOOPS user interface.

PrintSummary >
Doc (ClassDoc) >
WhereIs (WhereIsMethod) >
DeleteFromBrowser >
SubBrowser
TypeInName

Example 1: The Lattice Browser's main menu

Methods 

In object-oriented systems, methods are the private procedures or functions known only to objects of a given class and its descendants. LOOPS has six different categories of methods: the class methods and object methods found in Smalltalk and other object-oriented systems and the Internal, Public, Masterscope, and Any method categories. Internal methods are the low-level system methods that implement LOOPS itself. They can be used by programmers who know what they are doing but are not intended for use as library methods to be specialized. Public methods are all those either provided with the system or developed by the user that are intended to be specialized for various purposes. Besides these, there are also special Masterscope methods that are local only to a particular application and can be used only when it has been invoked. Any methods are all those that have not been declared to be one of the other types.

Methods  in  LOOPS  follow  the 
syntax: 

(METHOD ((ClassName Selector) self ARG1 ... ARGn) ... body)

Selector is the name of the method that, when sent to the appropriate object, succeeds in invoking it. The self argument is a dummy term that stands for the object to which the message will be sent. For example, the Destroy method, which is implemented for the Object class, is written:


(Method ((Object Destroy) self)
   (<- (Class self) DestroyInstance self))

It takes no arguments other than self but is written so that it can be inherited by subsequent classes but still always destroy an instance through its proper class when called. The expression (Class self) assures that the message will be sent to the class of the object. From there, it simply calls on the DestroyInstance method. This may seem metaphysical, but practically speaking it is important to be able to uncreate objects to provide memory for creating other new ones.
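The delegation from an instance to its own class can be imitated in Python, where type(self) plays roughly the role of (Class self). This is a sketch by analogy only; the class Obj, the per-class live registries, and the method names destroy and destroy_instance are assumptions made for the example, not LOOPS names.

```python
class Obj:
    live = set()                      # instances tracked on the root class

    def __init__(self):
        type(self).live.add(self)

    @classmethod
    def destroy_instance(cls, instance):
        # The class, not the instance, does the bookkeeping.
        cls.live.discard(instance)
        return cls.__name__

    def destroy(self):
        # Like (<- (Class self) DestroyInstance self): ask self's own class.
        return type(self).destroy_instance(self)


class Partnership(Obj):
    live = set()                      # separate registry for this subclass


venture = Partnership()
handled_by = venture.destroy()
```

Although destroy is defined only once, on the root class, the call ends up in the registry of whichever subclass the receiver belongs to, which is the behavior the inherited Destroy method is meant to guarantee.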

Methods are created in LOOPS using DefineMethod. This function has the form:

(DefineMethod class selector argsOrFn expr file methodType)

The way you would usually go about defining a new method is to click the middle mouse button on the class in the Lattice Browser and then select the Add command from the menu and AddMethod from its submenu. At that point a prompt panel opens with the message:

Type the selector for the new method: >

You then enter the method's calling name. For example, let's say you enter the name NewMethod. At that point a window of SEdit opens with the following template already loaded in it:

(Method ((Object NewMethod) self)
   (SubclassResponsibility))


#.($ ActiveValue)
Supers
    Object
IVs
CVs
Methods
    AVPrintSource AddActiveValue CopyActiveValue DeleteActiveValue
    DeleteNestedActiveValue GetWrappedValue GetWrappedValueOnly HasAV?
    NestActiveValue PutWrappedValue PutWrappedValueOnly ReplaceActiveValue
    WrapOutside? WrappingPrecedence

Example 2: Display resulting from selecting ActiveValue and using PrintSummary in the Lattice Browser



This template is purely for convenience when it applies. If you like, you can delete any part of it or even all of it and begin with a clear editing window.

On the whole, the template is usually useful. To say that this is a huge workspace with vast resources that it takes substantial time to master would run the risk of understating the case.

Active  Values 

In LOOPS, an active value is an object that sends messages as a side effect of attempts to read or write to the instance variable of another object. This facility is often useful in visually oriented interfaces, debugging, initializing variables, and defining dependency relationships between variables. The ActiveValue class is a direct subclass of the Object class. It is an abstract class, however. Instances are not made of it but of its subclasses.

An interesting example of a practical use of active values is in designing a window that always remains square even when resized by the user. The way you do this is to create an active value that tracks the width variable for an instance of the SquareWindow class and automatically sets the height variable equal to it.
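The square-window idea can be approximated in Python with a property, which runs code as a side effect of writing a variable. This is only a loose analogy: in LOOPS the active value is a separate object wrapped around the instance variable, whereas here the behavior is built into the class. The attribute names are assumptions made for the example.

```python
class SquareWindow:
    def __init__(self, side):
        self._width = side
        self.height = side

    @property
    def width(self):
        return self._width

    @width.setter
    def width(self, value):
        # Side effect of every write to width: keep the window square.
        self._width = value
        self.height = value


win = SquareWindow(100)
win.width = 250
```

Code that resizes the window never has to mention height; the dependency between the two variables is stated once, at the point where width is stored.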

One of the built-in uses of active values in LOOPS is for dynamic monitoring of the state of objects. Gauges is a LOOPS library application that contains a variety of display classes that allow you to attach the values of critical variables to graphic displays. These displays depict various types of gauges and meters that provide for visual inspection of the instance variables of instantiated objects. Whenever the value of an attached variable changes, the gauge or meter is immediately modified to indicate the new value.

The Gauges class has two main subclasses: LCD and Instrument. LCD in turn has two main subclasses: Digiscale and Digimeter. Instrument, on the other hand, has three main branches: VerticalScale, RoundScale, and HorizontalScale. Meters and Dials are specializations of the RoundScale class. All in all, these Gauge subclasses provide for just about any style of visual gauge or meter that might be needed, ranging from needle gauges to thermometers.

To use Gauges, all that is really involved is to create an instance of one of the Gauge classes and provide it with values for the necessary parameters by sending it the appropriate messages. To get a gauge to be visible on the screen, you would first create an instance by saying:

(<- $Dial New 'DialOne)

which creates an instance of the Dial class called DialOne. You would then send it the Update message:

(<- $DialOne Update)

If you need to set the value of the dial to an initial value, you can send the Set message:

(<- $DialOne Set 100)

The message that assigns a meter or dial to a given variable is the Attach message, which is also simple to execute. If you wanted to assign your dial as an indicator of the amount of fuel left in a rocket in a simulation of a space vehicle mission, you could do so easily by saying:

(<- $DialOne Attach $TitanIV_547 'FuelRemaining)

This Gauges application in LOOPS is similar to the ActiveImages facility in the KEE tool from IntelliCorp.
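What Attach accomplishes can be sketched as a small observer arrangement in Python. Nothing here reproduces the Gauges classes themselves: Rocket, Dial, the _watchers table, and the lowercase method names are hypothetical, and no drawing is attempted. The sketch shows only how a display can be kept in step with a variable it has been attached to.

```python
class Rocket:
    def __init__(self, fuel):
        self._watchers = {}           # variable name -> attached gauges
        self.fuel_remaining = fuel

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        # Notify every gauge attached to the variable just written.
        for gauge in self.__dict__.get("_watchers", {}).get(name, []):
            gauge.set(value)


class Dial:
    def __init__(self):
        self.reading = None

    def set(self, value):
        self.reading = value

    def attach(self, obj, var_name):
        obj._watchers.setdefault(var_name, []).append(self)
        self.set(getattr(obj, var_name))   # show the current value at once


titan = Rocket(fuel=100)
dial = Dial()
dial.attach(titan, "fuel_remaining")
titan.fuel_remaining = 80
```

The simulation code simply assigns to fuel_remaining; it does not know that a dial is watching, which is the essence of the access-oriented style.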

Mixins and Multiple Inheritance

LOOPS provides full support for multiple inheritance, which means that a class can be defined as a subclass of more than one superclass. Another way of saying this is that multiple superclasses can be selected as mixins for a new class.

The basic rule that inheritance follows in LOOPS can be stated succinctly as "left to right, up to joins." What this means is that, if a message M is sent to class Z and that method is not directly implemented in Z, then a search takes place up the class lattice for method M among the immediate superclasses of Z, their superclasses, and so on. The order of search is left to right and "breadth first" in the sense that all the immediate superclasses are searched first before any of their superclasses and so forth.
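The search order as worded in this paragraph can be written out as a short Python function. The sketch follows the description given here (immediate superclasses left to right, then their superclasses); the LOOPS manual remains the authority on the exact order the system uses. The function name, the dictionary layout, and the class names X, Y, and W are assumptions made for the example.

```python
from collections import deque


def find_method(lattice, methods, start, selector):
    """Return the first class, in the order described, that implements selector."""
    queue = deque([start])
    seen = set()
    while queue:
        cls = queue.popleft()
        if cls in seen:
            continue                  # a join: this class was already visited
        seen.add(cls)
        if selector in methods.get(cls, ()):
            return cls
        queue.extend(lattice.get(cls, ()))   # superclasses, left to right
    return None


# Z has superclasses X and Y (in that order); X has superclass W.
lattice = {"Z": ["X", "Y"], "X": ["W"], "Y": [], "W": []}
methods = {"W": {"M"}, "Y": {"M"}}
found = find_method(lattice, methods, "Z", "M")
```

In this lattice both W and Y implement M. Under the order described, Y is consulted before W because Y is an immediate superclass of Z, so the search stops at Y.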

Rules 

LOOPS has an original approach to using rules. Rules are always organized into definite rulesets, which can have various different kinds of control structure to evaluate them. A ruleset is always associated with some particular LOOPS object that provides the workspace for the rules. You can invoke rulesets in several different ways. In the object-oriented paradigm, you invoke them by sending a message to the object that contains them. In the access-oriented paradigm, you invoke them by using active values as a side effect of either reading or writing data in object properties. You can even write individual rules that invoke other rulesets, and you can also invoke rulesets from any LISP program.

There are six main control structures for rule processing in LOOPS: Do1, DoAll, While1, WhileAll, For1, and ForAll. If you use the DoAll control structure, rule processing begins with the first rule of the ruleset and executes each and every rule that is satisfied. With the Do1 control structure, only the first rule whose conditions are satisfied is executed. If no rule fires, the ruleset returns a value of NIL.

The While1 control structure is a cyclic version of Do1. With this control regime, a while condition is specified. If the condition is satisfied, the first rule whose condition is satisfied is executed, as with the Do1 construct. The difference is that if the while condition is still satisfied after that, the process is repeated until the condition no longer holds or until a Stop instruction is encountered. Similarly, the WhileAll construct is the cyclic version of DoAll. If the condition is satisfied, all the rules are tried and as many executed as can fire, and this is repeated until either the while condition fails or Stop is encountered.

The For1 construct is another cyclic version of Do1. Instead of a while condition, this type of control structure has an iteration condition. The processing of rules occurs as with Do1, but the process reiterates over a range of values until the limit value is reached. A similar control regime occurs with the use of the ForAll construct, except that here the behavior resembles DoAll: as many rules as can be satisfied are executed.
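The difference between these regimes is easiest to see side by side. The Python sketch below treats a rule as a (condition, action) pair over a workspace dictionary and implements three of the six structures as described above. It is an illustration, not the LOOPS rule interpreter: Stop is not modeled, the return value chosen for do_all (a list of results) is an assumption, and max_cycles is a safety guard added for the example.

```python
def do1(rules, ws):
    """Fire only the first rule whose condition holds; None if no rule fires."""
    for condition, action in rules:
        if condition(ws):
            return action(ws)
    return None


def do_all(rules, ws):
    """Try every rule in order and fire each one whose condition holds."""
    return [action(ws) for condition, action in rules if condition(ws)]


def while_all(rules, ws, while_cond, max_cycles=1000):
    """Repeat do_all for as long as the while condition holds."""
    cycles = 0
    while while_cond(ws) and cycles < max_cycles:
        do_all(rules, ws)
        cycles += 1
    return cycles


def log_level(ws):
    ws["log"].append(ws["level"])
    return "logged"


def raise_level(ws):
    ws["level"] += 1
    return "raised"


RULES = [
    (lambda ws: ws["level"] % 2 == 0, log_level),   # fires on even levels
    (lambda ws: ws["level"] < 3, raise_level),      # fires below the limit
]
```

Starting from a workspace of {"level": 0, "log": []}, do1 fires only the logging rule, do_all fires both rules once, and while_all with the condition level < 3 runs three cycles and leaves the log holding the even levels it passed through. While1 would be the same loop built around do1.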

One of the main ideas behind the design of the LOOPS rule-oriented programming approach was to allow control information to be factored out as much as possible. This is, of course, a worthwhile idea because it means that the knowledge is kept separate from the control structure mechanisms. One of the advantages of rule-based programming is just this separation of content from control. It allows the modular addition of rules so that a production system keeps running from the time the first rules are entered until it is completed without rewriting the inferencing code. Some AI languages, such as OPS5, encourage the writing of numerous rules whose function is control of knowledge processing, which tends to neutralize the advantages of rule-based systems in separating knowledge and control. The LOOPS control structure declarations I have just outlined attempt to cope with this.

Another useful LOOPS rule construct is that of first/last rules, which are rules that can fire either before or after the main part of a ruleset is invoked. They are implemented by inserting an {F} or {L} in the MetaDescription field just prior to the rule proper. LOOPS also has an audit trail capability in its implementation of rules.

The rule syntax in LOOPS can best be illustrated by an example. Example 3, below, is an illustration from the LOOPS manual. In this example, the brace indicator {1!} indicates that the rules involved are "one-shot bang rules," or "try once" rules. The rules are tried only once, whether they pass or fail. Any declaration in curly braces before rules is called a metadescription in LOOPS. Another use of such metadescriptions is in the meta-assignment statements used for describing audit trails and rules. Audit trails provide a thorough facility for debugging and explaining why things happened the way they did.

Calls to custom InterLISP or Common LISP functions can be included in LOOPS rules in both premises and conclusions simply by enclosing them in parentheses. Similarly, LOOPS message-sending expressions can be nested in rules by enclosing them in parentheses and observing the back-arrow and dollar-sign conventions. Access to LOOPS instance variables in rules is done by using a colon (:) operator. So, for example:

$YourPartnership:industry <- 'Law

is a rule declaration that assigns the value Law to the industry variable of the YourPartnership object. Similarly, access to class variables is provided with the double colon (::) operator.

Virtual  Copies 

One of the more interesting things in the LOOPS library is the provision for virtual copies of networks of instances. This is based on the insight that it can be useful to treat a group of instances as a unit that can be duplicated and tracked efficiently. The copies are virtual in two different ways. Only those properties of the instances that are modified are actually copied. Those that remain identical to the originals just "share" the values of the prototype. The copies are also virtual in the sense that only the specific instances that will be needed in processing are actually copied.

Any object that is to have a virtual copy must have a special class variable called VirtualVS. The value of this variable specifies which instance variables of the original object will be copied as opposed to being shared. The implementation of virtual copies is accomplished by two classes: VirtualCopyMixin and VirtualCopyContent. Virtual copies represent a kind of hybrid between classes and instances. They provide a medium-level mechanism whereby constructions such as perspectives and hypothetical reasoning can be implemented.
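The first sense of "virtual" described in this section, sharing a value until it is modified, can be sketched in a few lines of Python. This covers only that one idea: the lazy copying of instances within a network and the role of the VirtualVS variable are not modeled, and the VirtualCopy class with its get and put methods is a hypothetical stand-in rather than the LOOPS library class.

```python
class VirtualCopy:
    """Shares the prototype's values until a variable is written."""

    def __init__(self, prototype):
        self._prototype = prototype   # dict of the original's instance variables
        self._local = {}              # only the variables actually modified

    def get(self, name):
        if name in self._local:
            return self._local[name]
        return self._prototype[name]  # still shared with the original

    def put(self, name, value):
        self._local[name] = value     # copy on write; the original is untouched


prototype = {"industry": "Law", "partners": 3}
copy = VirtualCopy(prototype)
copy.put("partners", 5)
prototype["industry"] = "Medicine"
```

After these steps the copy reports 5 partners while the prototype still has 3, and the copy sees the prototype's new industry value because that variable was never written locally. Storage is spent only on what differs, which is what makes it practical to keep many hypothetical variants of one design.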

LOOPS  Applications 

With LOOPS it is possible to develop a wide variety of different AI applications. It is not simply a shell for the development of expert systems. Even in the case of expert systems, different paradigms for them have been developed using LOOPS that depart dramatically from the usual rule-based systems. The facilities I have been describing make it possible to develop knowledge-based systems that make little or no use of the rule-based paradigm. How, then, are such systems designed?

The PRIDE expert system that I discussed in my first column is one of the best examples of such a system to date. There has been much talk at Xerox about building an entirely new type of expert system shell paradigm based on the PRIDE application, just as the EMYCIN shell was derived from the MYCIN expert system application.


RuleSetName: FillTub;
Workspace Class: WashingMachine;
Control Structure: WhileAll;
Temp Vars: waterLimit;
While Cond: T;

{1!} IF loadsetting = 'Small THEN waterLimit <- 10;
{1!} IF loadsetting = 'Medium THEN waterLimit <- 13.5;
{1!} IF loadsetting = 'Large THEN waterLimit <- 17;
{1!} IF loadsetting = 'ExtraLarge THEN waterLimit <- 20;

IF temperatureSetting = 'Hot
THEN HotWaterValve.Open ColdWaterValve.Close;

IF temperatureSetting = 'Warm
THEN HotWaterValve.Open ColdWaterValve.Open;

IF temperatureSetting = 'Cold
THEN ColdWaterValve.Open HotWaterValve.Close;

IF waterLevelSensor.Test >= waterLimit
THEN HotWaterValve.Close ColdWaterValve.Close
(Stop T);

Example 3: An example of the LOOPS rule syntax from the Xerox LOOPS Manual




A few years ago, some interesting research on hierarchical planning in the LOOPS environment was conducted at Xerox PARC by the late Danny Berlin.

Some important work with the LOOPS system has also been conducted at Ohio State University under the direction of Professor B. Chandrasekaran. Professor Chandrasekaran is an advocate of what he calls "generic tasks" that operate as high-level building blocks in the development of knowledge-based AI applications. At this point, he feels that there are primarily six such generic tasks: hierarchical classification, hypothesis matching or assessment, knowledge-directed information passing, abductive assembly, hierarchical design by plan selection and assembly, and state abstraction. I will attempt to give only a brief explanation of these generic tasks here.

Hierarchical classification is perhaps the best-known type of problem in the expert systems category, a simple example of which is the well-known Animal game. It turns out that this problem of classification is at the heart of many diagnosis problems. Hypothesis matching is the process of determining the degree of fit of a collection of data points to a hypothesis, such as by estimating the probability or certainty that the hypothesis is true. Knowledge-directed information passing refers to the use of rules or frames to encode knowledge that directs a knowledge processing system to seek certain values under various conditions. Abductive assembly is another form of reasoning that assembles the best hypotheses for a given set of data by a method similar to the means-ends analysis used in the Dendral expert system. Hierarchical design by plan selection refers to a new type of task in expert systems technology: that of routine design. This new category of application is typified by two mechanical engineering expert systems, Aircyl and PRIDE. The last generic task is state abstraction, which involves a mechanism for predicting the consequences of actions by the use of qualitative simulation.

Amazing as AI workstations such as the Xerox 1186 may seem, some of this technology has already started rubbing off on powerful micros. Next month I will review a surprisingly powerful and cost-effective, object-oriented programming tool: Smalltalk/V for IBM PCs and compatibles.

Bibliography 

Bylander, T.; and Mittal, S. "CSRL: A Language for Classificatory Problem Solving and Uncertainty Handling." AI Magazine, vol. 7, no. 3 (Summer 1986).

Chandrasekaran, B. "Generic Tasks in Knowledge-Based Reasoning: High-Level Building Blocks for Expert System Design." IEEE Expert (Fall 1986).

Mittal, S.; et al. "PRIDE: An Expert System for the Design of Paper Handling Systems." Computer (July 1986).

Mittal, S.; Bobrow, D.; and Kahn, K. "Virtual Copies, Between Classes and Instances." ACM OOPSLA-86 Conference Proceedings.

Xerox LOOPS Manual. Pasadena, Calif.: Xerox Corp., 1987.

DDJ 



FORUM

LETTERS 

(continued  from  page  12) 


at  fault  for  not  making  it  clearer  that 
the  words  defined  by  VECTOR:  and 
the  word  DO-OPTION  were  conceived 
as  getting  their  arguments  from 
some  routine  (such  as  the  menu  rou¬ 
tine  mentioned  in  the  text),  not  from 
the  user  entering  a  number  and  the 
command.  I  must  admit  that  it  never 
crossed  my  mind  that  anyone  would 
consider  giving  the  user  unedited  ac¬ 
cess  to  such  words  as  these.  The 
phrase  Carl  quotes,  4  DO-OPTION,  was 
intended  as  a  warning,  not  as  an  ex¬ 
ample.  (Although  Carl  points  out  that 
you  can  predict  in  general  an  unsatis¬ 
fying  result  from  using  the  phrase,  I 
think  he  will  agree  that  the  specific 
effects  are  unpredictable.) 

Carl  and  I  again  agree  when  he 
comments  that  programs  should  be 
written  to  protect  both  the  program 
and  the  users.  Carl's  experience 
seems  to  be  with  command-driven 
programs;  in  those  the  command 
word  must  include  the  requisite  edits 
and  filters  to  qualify  the  input — and 
as  he  points  out,  a  CASE  statement 
normally  includes  implicit  edits.  (He 
refers  to  "the”  CASE  statement,  but 
Forth  provides  no  standard  imple¬ 
mentation.  CASE  statements  are  ven¬ 
dor  dependent,  which  is  why  I  did 
not  use  them  in  my  article.) 

My  own  programs  are  usually 
menu-based,  with  the  edits  built  into 
the  routine  that  presents  the  menu 
and  collects  the  user’s  choice.  Either 
approach — menu-driven  or  com¬ 
mand-driven — can  comfortably  and 
reliably use execution vectors, provided the programmer makes
sure that the argument used to index into the vector is within
range. Execution vectors are such a standard and useful tool in
Forth programming that I think condemning their writers to
Programmer's Hell is too severe. But, as Carl helpfully reminds us,
execution vectors must be used with caution.
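
The range check described above can be sketched outside Forth. The following Python fragment is only an illustration of the idea, not the code from the article: the handler names and the "invalid option" result are invented for the example, and a list of functions stands in for the execution vector.

```python
def make_dispatcher(handlers):
    """Return a function that runs handlers[index] only when index is in range."""
    def dispatch(index):
        # Validate before indexing: the analogue of checking the argument
        # before it is used to index into the execution vector. Python would
        # otherwise accept -1 silently and run the last handler.
        if not isinstance(index, int) or not 0 <= index < len(handlers):
            return "invalid option"
        return handlers[index]()
    return dispatch


# Hypothetical menu actions, invented for this illustration.
def open_file():
    return "open"


def save_file():
    return "save"


def quit_program():
    return "quit"


do_option = make_dispatcher([open_file, save_file, quit_program])
```

With three handlers installed, an argument of 4 is rejected before any handler is looked up, which is the protection the menu routine is expected to provide.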

More  Feedback 

Dear  DDJ, 

With  respect  to  Dan  Farnsworth's  let¬ 
ter  and  code  example  on  page  12  of 
the  April  issue,  I  have  the  following 
observations: 

1. Mr. Farnsworth's timings ignore the fact that DBF is faster
when it branches than when it falls through. The correct times
for his 68000 and 68008 loops are 5,672 and 3,784 cycles,
respectively.

2.  If  you  arrange  the  hardware  so 
that  each  device  register  occupies  ei¬ 
ther  four  successive  even  addresses, 
or  four  successive  odd  addresses,  you 
can  take  advantage  of  the  MOVEP  in¬ 
struction  to  produce  a  routine  [see 
Example  1,  below]  that  is  4  bytes 
longer  than  Mr.  Farnsworth's  68008 
loop  routine  but  that  takes  2,980  cy¬ 
cles  on  the  68000,  compared  to  3,784 
for  his  on  the  68008.  The  correspond¬ 
ing  straight-line  routine  is  256  bytes 
longer  than  its  68008  counterpart, 
but  it  takes  2,332  cycles,  compared  to 
2,616  for  the  68008. 
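
The 2,980-cycle figure for the MOVEP loop can be checked by hand from the per-instruction counts printed with Example 1. The Python sketch below simply repeats that arithmetic; the cycle counts are the ones given in the letter (no wait states assumed), while the function and variable names are invented here.

```python
# Per-instruction cycle counts as printed in Example 1 (68000, no wait states).
# Each entry: (mnemonic, cycles per execution, times executed).
ITERATIONS = 64

SETUP = [("MOVEQ", 4, 1), ("LEA", 12, 1)]
BODY = [("MOVEP.L", 24, ITERATIONS), ("MOVE.L", 12, ITERATIONS)]
RTS_CYCLES = 16


def dbf_cycles(iterations, taken=10, fall_through=14):
    """DBF branches on every pass except the last, where it falls through."""
    return taken * (iterations - 1) + fall_through


def total_cycles():
    total = sum(cycles * count for _, cycles, count in SETUP + BODY)
    total += dbf_cycles(ITERATIONS)
    total += RTS_CYCLES
    return total
```

The DBF term is where the two timings differ: 63 taken branches at 10 cycles plus one fall-through at 14 gives 644, not 64 times a single figure.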

As  an  attempt  to  demonstrate  that 
the  68008  can  outrun  the  68000,  Mr. 
Farnsworth’s  example  fails.  I  had, 
however,  been  using  a  rule  of  thumb 
that  an  8-MHz  68008  has  the  through¬ 
put  of  a  4-MHz  68000  (except  for  mul¬ 
tiplication  and  division,  of  course).  It 
is  clear  that,  at  least  in  some  special 
cases,  a  68008  at  8  MHz  can  keep  up 
with  a  68000  at  6  MHz  or  better. 

(Of  course,  in  most  cases,  wait 
states  would  slow  down  any  of  those 
routines.  To  run  with  no  wait  states 
in  an  8-MHz  system,  the  peripheral 
controller  would  have  to  deliver  a 
byte  every  500  ns — an  instantaneous 
rate  of  2  megabytes  per  second.  SCSI 
with  handshake  cannot  do  better 


than  1  byte  every  667  ns,  or  1.5  mega¬ 
bytes/second.  Anyone  who  is  pre¬ 
pared  to  spend  the  bucks  for  a  syn¬ 
chronous  SCSI  channel  with  2 
megabytes/second  or  better 
throughput  is  likely  to  use  a  68020 
rather  than  a  68008  or  68000.) 
Christopher  T.  Jewell 
3900  Moorpark  Ave. 

San  Jose,  CA  95117 

Buttons  and  Gadgets 

Dear  DDJ, 

Thank  you  for  publishing  Jan  L.  Har¬ 
rington’s  article  on  Macintosh  and 
Amiga  interface  programming  (Janu¬ 
ary  1987).  It  provided  me  with  a  good 
starting  point  for  assembly-language 
programming  on  the  Amiga. 

Please note, however, that some typographical errors appeared
in the Amiga program listing (Listing Three) [see Table 1]. The
first four typos generate assembler errors, and the last prevents
exiting when you select "Quit" on the Project menu.

James  P.  Hastey,  Jr. 
c/o  PPCoN,  Dept.  90 
P.O.  Box  220 
N-4056  Tananger 
Norway 

Dear  DDJ, 

This  is  a  follow-up  to  my  previous 
letter  on  typographical  errors  in  the 
Amiga  program  listing  accompany¬ 
ing  Jan  Harrington’s  article. 

My exuberance in actually getting the program to run caused me
to overlook two shortcomings: the Workbench memory counter
reported 176 fewer bytes of free memory (cumulatively) each time
the program ran, and the system crashed after five to eight
successive runs.

Closer  examination  revealed  that 
the  program  needed  to  perform  a  lit¬ 
tle  more  housecleaning  before  exit¬ 
ing.  Perhaps  the  omission  was  inten¬ 
tional,  with  the  listings 
accompanying  the  article  intended 
only  as  examples  of  the  difference  in 
the  two  user  interfaces.  As  such,  they 
served  the  purpose  well.  For  the 
Amiga  listing  (Listing  Three)  to  be  a 
more  practical  example,  a  few  addi¬ 
tions  need  to  be  made. 

This is not a complaint. I thoroughly enjoyed reading Jan
Harrington's article and learning a bit about the Amiga Intuition
interface in assembly language. I would like to point out that I am
not a programmer by profession but have worked with 6502 and 8088
assembly language as a hobby.

        MOVEQ    #63,D1         ;  4
        LEA      DCA,A0         ; 12
LOOP    MOVEP.L  0(A0),D0       ; 24 * 64      = 1536
        MOVE.L   D0,(A1)+       ; 12 * 64      =  768
        DBF      D1,LOOP        ; 10 * 63 + 14 =  644
        RTS                     ; 16
*
* Total time                      2980 cycles

Example 1: Alternative to Dan Farnsworth's 68008 loop routine

The additions I propose are based solely upon recommendations
found in the Amiga ROM Kernel Reference Manual: Exec, the Amiga
ROM Kernel Reference Manual: Libraries and Devices, and the Amiga
Intuition Reference Manual. On the surface they appear to cure the
problems I've described. Please be aware that as a relative novice
I may have overlooked something.

Add the following to the list of xlibs in lines 23-41:

    xlib FreeSignal
    xlib FreeMem
    xlib RemPort
    xlib CloseLibrary
    xlib ClearMenuStrip


Insert the code in Example 2 between lines 233 and 234 in the
subroutine CloseAndQuit, and insert the following between lines
239 and 240 of the same routine:

    move.l  IntBase,A1
    move.l  _AbsExecBase,A6
    callsys CloseLibrary        ;close the library

James  P.  Hastey,  Jr. 

(address  on  previous  letter) 

Jan  Harrington  replies: 

My  thanks  to  James  Hastey  for  find¬ 
ing  the  typographical  errors  in  the 
Amiga  listing.  They  arose  primarily 
because  the  Amiga  code,  after  being 
debugged  on  that  machine,  was 
keyed  into  the  Mac  from  printed  list¬ 
ings  to  prepare  copy  for  typesetting 
(I  suppose  the  smarter  way  would 
have  been  a  direct  machine-to-ma- 
chine  transfer).  His  housekeeping  ad¬ 
ditions  to  the  code  are  also  well  tak¬ 
en.  The  purpose  of  that  program, 
however,  was  simply  to  demonstrate 
the  Amiga  user  interface,  not  to  pro¬ 
duce  a  program  that  would  actually 
do  useful  work.  As  it  was  getting 
very  long,  even  with  just  the  user  in¬ 
terface  code,  I  decided  to  keep  it  as 
simple  as  possible. 

Macintosh  programs,  in  general, 
do  not  require  the  same  sort  of  clean¬ 
up  as  Amiga  programs  do.  Under  or¬ 
dinary  circumstances,  the  Mac’s  op¬ 
erating  system  performs  the  cleanup 
on  its  own;  programmers  needn’t 
worry  about  it. 

The  Right  Tool 

Dear  DDJ, 

I  refer  to  Mike  Suman’s  Viewpoint  in 
your  February  1987  issue.  I  whole¬ 
heartedly  agree  with  Mr.  Suman's 
analysis  of  what  is  wrong  with  high- 
level  languages.  Though  I  am  a  high- 
level  language  user  and  have  never 
written  production  programs  in  as¬ 
sembly  language  (except  in  comput¬ 
er  school),  I  must  admit  that  at  times 
they  make  the  life  of  the  program¬ 
mer  difficult,  if  not  clerical. 

I differ with Mr. Suman's view with regard to the assembly-language
version of Mr. Anderson's Modula-2 program (Code Example 3),
however. I have never programmed in Modula-2, but between two
evils I still prefer Modula-2, for I find no thrill or excitement in
writing strings of 1s and 0s as shown in the example. I feel that
programmers who have been exposed to a wide variety of languages
would agree with me in this case. We must bear in mind that the
main aim of a high-level language is to unburden programmers from
dealing with trivial things so they can concentrate on the main
problem at hand.

Line      Appears in Listing as            Should Read
5         CALLIB _LVO\1                    CALLLIB _LVO\1
23 to 41  (This instruction is missing     xlib OpenScreen
          from the series of xlibs in
          lines 23 to 41.)
62        move.l #0,ns_Fonts(A0)           move.l #0,ns_Font(A0)
122       lea ProjItem,A1                  lea ProjItem1,A1
226       and #%0000000000111111,D0        and #%0000000000111111,D1

Table 1: Corrections to Listing Three of Jan Harrington's article (January 1987)

    move.l  ReadMsg,A1
    move    #IOSTD_SIZE,D0
    move.l  _AbsExecBase,A6
    callsys FreeMem             ;free up read io block memory

    move.l  ReadPort,A1
    move.l  _AbsExecBase,A6
    callsys RemPort             ;remove read port from system

    move.l  ReadPort,A1
    move    #MP_SIZE,D0
    move.l  _AbsExecBase,A6
    callsys FreeMem             ;free up read port memory

    move.l  ReadPort,A4
    clr.l   D0
    move.b  MP_SIGBIT(A4),D0
    move.l  _AbsExecBase,A6
    callsys FreeSignal          ;free up read port signal bit

    move.l  WriteMsg,A1
    move    #IOSTD_SIZE,D0
    move.l  _AbsExecBase,A6
    callsys FreeMem             ;free up write io block memory

    move.l  WritePort,A1
    move    #MP_SIZE,D0
    move.l  _AbsExecBase,A6
    callsys FreeMem             ;free up write port memory

    move.l  WritePort,A4
    clr.l   D0
    move.b  MP_SIGBIT(A4),D0
    move.l  _AbsExecBase,A6
    callsys FreeSignal          ;free up write port signal bit

    move.l  WindowPtr,A0
    move.l  IntBase,A6
    callsys ClearMenuStrip      ;clear menu strip

Example 2: Insert for subroutine CloseAndQuit

This  is  why  I  feel  that  dBASE  III  lan¬ 
guages  are  best  when  it  comes  to 
handling  file  and  tabular  problems. 
They  provide  the  right  ingredients 
for  attacking  trivial  problems  such  as 
the  one  discussed.  To  dBASE  pro¬ 
grammers  the  benefit  is  obvious 
right  away — ease  of  maintenance.  I 
wish  facilities  such  as  this  (table  han¬ 
dling)  would  be  included  in  the  new 
languages  flooding  the  market.  Fur¬ 
thermore,  we  must  not  forget  that 
every  language  is  designed  to  suit  a 
particular  problem,  and  we  should 


therefore  use  the  right  tool  for  the 
job.  We  should  not  use  a  tool  for 
something  for  which  it  is  not  intend¬ 
ed.  We  border  on  the  unreasonable 
when  we  overstretch  the  validity  of 
a  thing. 

Lito  Cruz 
3  Spring  St. 

Thomastown,  Victoria 
Australia  3074 

Correction 

Dear  DDJ, 

As the author of "An Extended IBM PC COM Port Driver" (June 1987),
I wanted to get a head start on readers' complaints about bugs in
Listing One presented with the article. The seventh line of code
after the label b000 needs to have BUFSZ * 2 changed to
(BUFSZ * 2) - 2. This bug causes COM1 operation to mess up COM2's
buffer pointers and COM2 operation to destroy the ability to
restore int0B's vector upon termination. There's no problem at all
if you only use COM1. The seventh line (I thought 7 was a lucky
number) of code after the label b230, with the comment "indicate
receiver enabled," needs to be moved to be the fifth line after
b230, that is, after the jg . . . line. This bug causes DTR protocol
to work only if XON/XOFF protocol is used also.

I  apologize  for  any  inconvenience 
and  suggest  that  readers  try  excom, 
Version  2,  available  through  DDJ  or 
CompuServe.  It  has  these  bugs  fixed, 
some  substantial  enhancements,  and 
probably  has  some  new  and  more 
exciting  bugs. 

Also,  the  following  code  was  de¬ 
leted  from  the  bottom  of  page  76: 

        init |= thebit;
    }

    /* set port number */
    void

Tom  Zimniewicz 

2695  Pond  Rd. 

Lima,  NY  14485 

DDJ 




PROGRAMMER'S  SERVICES 


THE  STATE  OF  BASIC 


The  New  Internal 
Coding  Engines 

In this issue we look at the decision-making and loop constructs
implemented by the new BASICs. These constructs enable you to
write more structured code and use fewer GOTOs. Most of these
constructs are derived from features of other languages, notably
Pascal. Does that mean that BASIC is being Pascal-ized? No! It
simply indicates that a well-thought-out and more structured
program is superior to an unplanned, GOTO-riddled program.
Clarity and neatness of code really pay off when you come back
later to update your program.

The new decision-making constructs have much to offer. First, the
one-line IF . . . THEN . . . ELSE statement has been extended to
spread over multiple lines. Moreover, ELSEIF clauses are now
supported. This syntax for the IF statement is a much-needed
improvement. The ability of the THEN and ELSE clauses to contain
a series of statements enables you to phase out GOTOs painlessly.
Example 1 shows the general syntax of the extended IF statement.

The new BASICs have also added a CASE statement. QuickBASIC
(Version 3.0), Turbo BASIC, and True BASIC have implemented
SELECT CASE with features that go beyond the CASE statements of
Pascal and Modula-2. Example 2 demonstrates all the types of CASE
clauses in the new SELECT CASE. They are:

•  single  items  to  be  compared  with 
the  selected  variable 

•  a  list  of  items,  delimited  by  commas 

•  a  range  of  values  to  be  compared 
with  the  selected  variable 


•  partial  logical  expressions 

CASE  statements  can  also  contain  a 
combination  of  these. 

Looking  at  Example  2,  notice  that 
the  first  CASE  statement  compares 
the  selected  string  A$  with  a  single 
item.  The  second  CASE  contains  a  list 
of  selected  symbols.  The  following 
three  CASE  statements  use  the  value 
range  ( <first>  to  <last> )  to  detect 
if  the  input  character  is  lowercase, 
uppercase,  or  a  digit.  The  following 
two  CASE  statements  use  the  inequal¬ 
ity  operator  to  test  if  the  character  is 
a  control  character  or  an  extended 
ASCII  character.  Finally,  the  CASE 
ELSE  clause  is  an  important  catchall 
clause. 
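
For readers more at home in another language, the clause-by-clause behavior described above can be sketched as a chain of tests in Python. This is an illustration only: the function name is invented, the thresholds follow the values used in Example 2, and the `symbols` default is a placeholder, since the list of symbols in the second CASE is not reproduced here.

```python
def classify(text, symbols="-*/"):
    """Classify the first character of text, testing in CASE-clause order.

    `symbols` is a placeholder for the list in the second CASE clause.
    """
    # Mirror the BASIC preamble: empty input becomes a space,
    # then only the first character is kept.
    ch = (text or " ")[0]

    if ch == "+":                     # single item
        return "Plus sign"
    if ch in symbols:                 # list of items
        return "Special symbols"
    if "a" <= ch <= "z":              # ranges
        return "Lower case"
    if "A" <= ch <= "Z":
        return "Upper case"
    if "0" <= ch <= "9":
        return "Digits"
    if ch < chr(27):                  # partial logical expressions
        return "Control character"
    if ch > chr(127):
        return "Extended ASCII set"
    return "Not classified by this program"   # the catchall, like CASE ELSE
```

As in SELECT CASE, the first matching test wins, so the order of the clauses matters, and the final return plays the role of the CASE ELSE catchall.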

The new BASICs also include a new loop construct, namely the
DO . . . LOOP, a powerful and flexible loop that can use logical
testing. The standard FOR . . . NEXT loop has been extended with
an EXIT FOR statement to enable a clean exit from a FOR loop.
Turbo BASIC also supports EXIT statements for the WHILE loop and
IF and CASE statements.

    IF <expression> THEN
        <sequence of statements>
    ELSEIF <expression> THEN
        <sequence of statements>
    ELSE
        <sequence of statements>
    END IF

Example 1: General syntax of extended IF statement

    INPUT "Enter Character ";A$
    IF A$ = "" THEN A$ = " "
    A$ = LEFT$(A$, 1)
    SELECT CASE A$
        CASE "+"
            PRINT "Plus sign"
        CASE
            PRINT "Special symbols"
        CASE "a" to "z"
            PRINT "Lower case"
        CASE "A" to "Z"
            PRINT "Upper case"
        CASE "0" to "9"
            PRINT "Digits"
        CASE is < CHR$(27)
            PRINT "Control character"
        CASE is > CHR$(127)
            PRINT "Extended ASCII set"
        CASE ELSE
            PRINT "Not classified by this program"
    END SELECT
    END

Example 2: Short program to demonstrate SELECT CASE


The bare-bones DO . . . LOOP is an open loop that is exited with
an EXIT LOOP/DO statement embedded in the loop body. WHILE or
UNTIL clauses can be placed after the DO keyword, the LOOP
keyword, or both, as in:

    DO [[WHILE | UNTIL] <expression>]
        <sequence of statements>
    LOOP [[WHILE | UNTIL] <expression>]

This  creates  an  interesting  combina¬ 
tion  of  tests,  especially  because 
WHILE/UNTIL  clauses  can  occur  si¬ 
multaneously  after  the  DO  and  LOOP 
keywords.  The  powerful  DO 
.  .  .  LOOP  can  easily  offer  the  equiva¬ 
lent  of  WHILE  . . .  WEND  (in  BASIC  it¬ 
self),  REPEAT  .  .  .  UNTIL  (in  Pascal), 
and  DO  .  .  .  WHILE  (in  C). 

In addition, loops using UNTIL <expression> clauses can easily
replace the equivalent WHILE NOT <expression>. Thus, the familiar
file-reading loop:

    OPEN "I", 1, "DATA.DAT"
    WHILE NOT EOF(1)
        <file input operations>
    WEND
    CLOSE #1

can be replaced by:

    OPEN "I", 1, "DATA.DAT"
    DO UNTIL EOF(1)
        <file input operations>
    LOOP
    CLOSE #1
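
The difference between testing at the top and testing at the bottom of the loop can be seen in a short sketch. Python has no UNTIL keyword, so the fragment below only imitates the three loop shapes; `at_eof` is an invented stand-in for BASIC's EOF(), and an in-memory stream replaces DATA.DAT.

```python
import io


def at_eof(stream):
    """Rough stand-in for BASIC's EOF(): True when no more data remains."""
    pos = stream.tell()
    more = stream.read(1)
    stream.seek(pos)
    return more == ""


def read_while_not_eof(stream):
    # WHILE NOT EOF(1) ... WEND
    lines = []
    while not at_eof(stream):
        lines.append(stream.readline().rstrip("\n"))
    return lines


def read_do_until_eof(stream):
    # DO UNTIL EOF(1) ... LOOP: test at the top, leave when the test is true.
    lines = []
    while True:
        if at_eof(stream):
            break
        lines.append(stream.readline().rstrip("\n"))
    return lines


def read_do_loop_until_eof(stream):
    # DO ... LOOP UNTIL EOF(1): test at the bottom, so the body runs at least once.
    lines = []
    while True:
        lines.append(stream.readline().rstrip("\n"))
        if at_eof(stream):
            break
    return lines
```

The first two functions behave identically on any input, which is the substitution the column describes. The third differs only on an empty file, where the body still runs once; that is why the top-tested form is the safer replacement for a file-reading loop.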

This State of BASIC is the last in a series of introductory columns
on aspects of the new BASICs. Future issues of DDJ will contain
articles that discuss, in an integrated fashion, various aspects of
programming with BASIC.

DDJ 

Vote  for  your  favorite  feature/article. 

Circle  Reader  Service  No.  9. 




PROGRAMMER'S  SERVICES 


OF  INTEREST 


Spring  Comdex  in  Atlanta  is  usually  a 
software  show,  but  this  year,  despite  a 
conference  program  that  focused  on 
software  issues,  the  exhibit  floor  con¬ 
tained  a  lot  of  hardware  products.  In 
terms  of  delivery,  some  of  the  PS/2 
add-on  hardware  was  softer  than  the 
386  system  software,  of  which  there 
was  also  an  abundance. 

386  Computers 

Wyse  has  announced  a  line  of  286 
machines  and  a  386  machine,  all 
planned  to  track  IBM  and  OS/2  but 
with  some  features  that  make  the 
machines  interesting  to  technical  us¬ 
ers.  Each  machine  has  front-panel 
controls  and  an  LCD  system-status  dis¬ 
play,  disk  caching,  software  emula¬ 
tion  of  the  LIM  expanded  memory 
spec,  and  a  disk  reorganization  utility 
to  improve  hard-disk  access  time. 
The  386  machine  can  support  both 
the  80287  and  the  80387  math  co¬ 
processors;  it  costs  $4,999  for  a  40- 
megabyte  hard-disk  model. 

Wyse has cut prices on older models; for example, an
AT-compatible now costs $1,999. Wyse tube subsidiary Amdek has
also entered the personal computer market with its own line of
8088- through 80386-based machines. Reader Service No. 16.

Wyse  Technology 
3571  North  First  St. 

San  Jose,  CA  95134 
(408)  433-1000 

TeleVideo  has  introduced  four  386 
computers,  some  with  concurrent 
Unix  and  DOS,  with  prices  starting  at 
$3,995.  One,  TeleStar,  comes  with  an 
80387 coprocessor. Another, Telenix, uses Microport's V/386 Unix
and DOS-Merge software, allowing Unix and DOS applications to run
concurrently.
TeleVideo  thinks  Unix/DOS  integra¬ 
tion's  hour  is  nigh,  citing  32-bit  pro¬ 
cessor  performance,  convergence  on 
System  V  for  32-bit  systems,  and  good 
sales  in  key  markets  for  Unix  in  the 
past  few  months.  TeleVideo  bought 
into  Microport  Systems,  a  company 
with  a  pivotal  role  in  Unix/DOS  inte¬ 
gration,  in  March  of  this  year.  Reader 
Service  No.  17. 

TeleVideo 
1170  Morse  Ave. 

P.O.  Box  3568 
Sunnyvale,  CA  94088-3568 
(408)  745-7760 

In  the  continuing  saga  of  Unisys  reor¬ 
ganization,  the  company  has  re¬ 
grouped  a  number  of  former  Sperry 
and  Burroughs  Unix-based  comput¬ 
ers  into  one  line  running  System  V.2 
from  AT&T.  Prices  range  from 
$14,000  to  $500,000.  Unisys  has  also 
announced  a  386  workstation  called 
the  B38  for  $4,835,  or  $5,635  with  an 
80287  coprocessor.  Reader  Service 
No.  18. 

Unisys  Corp. 

P.O.  Box  500 

Blue  Bell,  PA  19424-0001 

(313)  972-9515 

Intelligent  Data  Systems,  with 
heavy  backing  from  Taiwan  clone 
maker  Copam,  has  entered  the  386 
market  with  a  16-MHz  386  machine 
with  1-megabyte  RAM,  1.2-megabyte 
floppy  drive,  two  serial  ports,  one 
parallel  port,  DOS  3.2,  a  40-megabyte 
hard  disk,  and  EGA  graphics  for 
$4,495.  Reader  Service  No.  19. 
Intelligent  Data  Systems 
6319  East  Alondra  Blvd. 

Paramount,  CA  90723 
(213)  633-5504 

Contrary  to  popular  belief,  the  lead¬ 
ing  mail-order  manufacturer  of  per¬ 
sonal  computers  is  not  an  Austin- 
based  company  named  PC’s  Limited. 
The  company  that  has  been  doing 
business  as  PC's  Limited  was  actually 
incorporated  as  Dell  Computer  Cor¬ 
poration  in  1984,  and  it  announced 
at  Comdex  that  it  would  henceforth 
be  using  its  real  name,  although  exist¬ 
ing  products  will  continue  to  carry 


the PC's Limited logo. The change undoubtedly stems from a desire
to get away from the low-budget mail-order image and from the
problems the name would cause as the company pushes into
international markets. The fact that 22-year-old founder Mike Dell
has acquired a degree of fame may have had something to do with
it, too. The company continues to challenge the more pin-striped
manufacturers on price and performance, has cut prices on its 286
machines, and is pushing price comparisons of its 386 machine with
Compaq's and IBM's. The 386-16 systems start at $4,499 for a
40-megabyte hard drive, monochrome system.
Reader Service No. 20.

PC's  Limited/Dell  Computer  Corp. 
1611  Headway  Cir.,  Bldg.  3 
Austin,  TX  78754 
(512)  339-6800 

SOTA  Technology  claims  that  you 
don’t  really  need  a  386  machine,  cit¬ 
ing  benchmarks  on  which  its  286 
Mothercard  5.0  outperforms  a  Com¬ 
paq  Deskpro  386.  The  PC  or  compati¬ 
ble  card  comes  with  up  to  4  mega¬ 
bytes  on-board  RAM  and  a  12.5-MHz 
286  processor;  a  daughtercard  can  in¬ 
crease  RAM  to  12  megabytes  and  uses 
only  one  bus  slot.  An  EPROM  and  bat¬ 
tery-backed  RAM  hold  the  BIOS,  mak¬ 
ing  it  eminently  reconfigurable,  so 
SOTA  can  pursue  OS/2  compatibility. 
Reader  Service  No.  21. 

SOTA  Technology 
657  North  Pastoria  Ave. 

Sunnyvale,  CA  94086 
(408)  245-3366 

386  System  Software 

The  THEOS  multiuser,  multitasking 
operating  system  has  been  enhanced 
for  the  386.  Version  2.2  addresses  up 
to  16  megabytes  of  real  memory  in 
protected  virtual  address  mode,  can 
read  and  write  DOS  files  from/to  a 
DOS  partition,  and  supports  the  80387 
coprocessor.  Reader  Service  No.  22. 
THEOS 

1777  Botelho  Dr.,  Ste.  360 
Walnut  Creek,  CA  94596-5022 
(415)  935-1118 

The Software Link is shipping PC-MOS/386, the multiuser,
multitasking operating system for 386 machines that we discussed
in July. Early reports are mixed; we'll say more when we know
more. Reader Service No. 23.

The  Software  Link 

3577  Parkway  Ln. 

Atlanta,  GA  30092 
(404)  448-5465 

Quarterdeck has announced Version 2.0 of DesqView, its
multitasking, multiwindow environment. Version 2.0 supports VGA
and MCGA graphics; Microsoft Windows-specific, Digital Research
GEM-specific, and IBM Topview-specific applications; and EGA's
43-line and VGA's 50- and 60-line text modes. There's a full API
capability so programmers can develop to the DesqView look and
feel, and a 386 memory manager a la Compaq's on the Deskpro 386.
Suggested retail price is $129.95. Reader Service No. 24.

Quarterdeck  Office  Systems 

150  Pico  Blvd. 

Santa  Monica,  CA  90405 
(213)  392-9851 

DesqView  looks  particularly  impres¬ 
sive  when  one  of  its  tasks  is  actually 
running  on  Definicon’s  DSI-780 
68020/68881  card.  The  780  line  is  just 
the  latest  of  Definicon’s  32-bit  co¬ 
processor  products,  which  the  Defin- 
icon  folks  have  described  in  the  Au¬ 
gust  and  September  1985  and  July 
and  August  1986  issues  of  Byte.  They 
now  claim  that  the  780  in  an  AT  runs 
AutoDesk’s  AutoCad  faster  than  any 
other  system  does,  including  the  best 
unannounced  386  machine  and  the 
top-of-the-line  Sun  workstation. 
Reader  Service  No.  25. 

Definicon  Systems  Inc. 

1100  Business  Center  Cir. 

Newbury  Park,  CA  91320 
(805)  499-0652 

OS/286  and  OS/386  from  AI  Archi¬ 
tects  are  extensions  to  DOS  3.x  that 
permit  programs  to  run  in  protected 
mode  and  address  several  megabytes 
of  memory.  They  do  not  replace  DOS; 
system  calls  still  work  and  the  operat¬ 
ing  system  still  looks  like  DOS  to  the 
user.  AI  Architects  is  the  developer  of 
Hummingboard,  a  PC  card  with  up  to 
24  megabytes  of  on-card  RAM  and  a 

16-MHz  or  20-MHz  386.  The  Hum¬ 
mingboard  does  not  replace  the  286 
or  8086  CPU  in  the  way  that  an  accel¬ 
erator  card  does  but  implements  a 
smooth  coprocessor  design.  Reader 
Service  No.  26. 

AI  Architects  Inc. 

One  Kendall  Sq.,  Ste.  2200 

Cambridge,  MA  02139 
(617)  577-8052 

For  people  using  386  systems  for  mul¬ 
tiuser  purposes,  Arnet  has  developed 
an  add-on  board  called  the  Virtual 
Terminal  Adapter  that  makes  the  386 
think  that  dumb  terminals  are  video 
RAM.  The  board  has  four  RS-232 
ports.  The  problems  the  board  solves 
have  to  do  with  running  DOS  applica¬ 
tions,  multitasking,  and  supporting 
local  printers.  Many  DOS  applications 
write  directly  to  video  RAM,  so  they 
don’t  work  with  dumb  terminals. 
Multiuser  operating  systems  general¬ 
ly  do  not  support  multitasking  on 
dumb  terminals  because  they  are  not 
memory-mapped, and multiuser systems usually cannot drive both
the dumb terminal and a printer attached to it. Mapping the dumb
terminals into video RAM and saving the 386 the associated
housekeeping chores solves all these problems, Arnet claims. The
board's price is $1,500-2,000. Reader Service No. 27.
Arnet  Corp. 

476  Woodycrest  Ave. 

Nashville,  TN  37210 
(615)  254-0646. 

PS/2  Add-Ons 

Many  board  companies  announced 
the  launching  of  PS/2  product  lines, 
but  reading  between  the  lines  showed 
that  often  the  only  product  near  re¬ 
lease  was  a  board  for  the  non-Micro 
Channel  Model  30. 

Orchid  claims  orchids  for  exhibiting 
the  first  memory  board  for  the  IBM 
PS/2  with  a  2-megabyte  Micro  Chan¬ 
nel  board  for  $995.  It’s  designed  for 
Model  50/60  machines  and  will  sup¬ 
port  both  LIM  Expanded  Memory 
Spec  and  extended  memory,  which 
later  will  matter  when  Microsoft  re¬ 
leases  OS/2. 

Orchid  has  also  announced  Jet- 
RAM,  a  32-bit  RAM  board  with  2  mega¬ 
bytes  of  extended  memory  for  DOS  or 

Unix  users,  and  a  graphics  card  com¬ 
patible  with  PGA,  CGA,  EGA,  MDA,  and 
Hercules  graphics.  Reader  Service 
No.  28. 

Orchid  Technology 

45365  Northport  Loop  West 

Fremont,  CA  94538 
(415)  683-0300 

Tecmar  is  heavily  committed  to  sup¬ 
porting  PS/2  machines.  Tecmar,  you 
may  recall,  was  quick  to  deliver  a 
third-party  hardware  product  for 
the  IBM  PC  back  in  1981,  and  it  would 
like  to  be  near  the  front  of  the  PS/2 
pack.  Announced  products  include  a 
memory/multifunction  board  with 
up  to  2  megabytes  and  two  serial 
ports.  Reader  Service  No.  29. 

Tecmar  Inc. 

6225  Cochran  Rd. 

Solon  (Cleveland),  OH  44139-3377 
(216)  349-0600 

Quadram announced several PS/2 line products, including a
2-megabyte memory board, a 2-megabyte multifunction board, and
I/O boards for Models 50 and 60. Only 512K and 1-megabyte
versions will be available at release this summer, using 256K
SIMMs (Single In-line Memory Modules); the 2-megabyte version
will use 1-megabyte SIMMs. Quadram also announced an 80287-based
graphics board, now shipping, offering thrice the clarity, 16 times
the color selection, and 25 times the speed of EGA; modes include
800 x 600 x 4. Reader Service No. 30.

Quadram 

One  Quad  Wy. 

Norcross,  GA  30093-2919 
(404)  923-6666 

DDJ 



FORUM 


SWAINE'S  FLAMES 


One  database-language  develop¬ 
er  at  the  spring  Comdex  threw 
up  his  hands  and  squawked,  "Whose 
idea  was  all  this  SQL  stuff,  anyway?" 
This  particular  developer  was  of  the 
opinion  that  an  intelligent  user  ought 
to  be  able  to  structure  his  own  que¬ 
ries  and  that  if  he  couldn't,  he  proba¬ 
bly  didn't  know  what  he  wanted  to 
know;  he  opined  that  SQL  must  be 
something  foisted  on  developers  by 
Esther  Dyson. 

Philippe  Kahn  wants  to  know  why 
price  has  ceased  to  be  an  issue  wor¬ 
thy  of  discussion  in  the  computer 
press.  He  thinks  that  users  would  be 
better  served  if  more  attention  were 
paid  to  what  they  get  for  their  dol¬ 
lars.  In  grinding  the  Borland  Ax,  Phi¬ 
lippe  makes  a  fine  point.  Why  in¬ 
deed  would  people  who  receive 
hordes  of  free  review  copies  of  soft¬ 
ware  be  insensitive  to  price  and 
value? 

Aussie  reader  Richard  Walding 
writes  to  ask  if  I'm  sure  about  the 
genesis  I  presented  in  Fire  in  the  Val¬ 
ley  of  the  Apple  Computer  bitten  ap¬ 
ple  logo.  "Alan  Turing,”  he  writes, 
"the  man  who  laid  the  theoretical  ba¬ 
sis  of  digital  computing. .  .ended  his 
life  by  eating  an  apple  covered  with 
cyanide."  There  seems  to  be  some 
question  about  how  the  cyanide  got 
into  Turing’s  system,  but  the  bitten 
apple  was  there  at  the  scene,  and  it  is 
the  sort  of  metaphor  that  would  ap¬ 
peal  to  Wozniak,  who  priced  his  Ap¬ 
ple  I  at  $666,  the  Biblical  Number  of 
the  Beast. 

Which  raises  the  question  of  what 
logo  Bill  Campbell  will  use  for  his 
new  Apple  software  company  when 
Apple  spins  off  its  in-house  software 
development  as  an  independent 
company  under  Bill's  management. 
A  candy-coated  apple? 

Then  there  is  the  odd  subscription 
inquiry: 

"Regarding  your  reply  to  our  letter 
of  April  1,  1987.  You  wrote  that  we 
should  indicate  which  publication 
we  are  requesting.  Please  note  that 


our  client  is  asking  for: 

Dr.  Dobb's  Journal  of  Computer  Tools 
Dr. Dobb's Journal of Modular Tools
Dr.  Dobb's  Journal  of  Computer 
Calisthenics" 

He  seems  to  think  that  in  addition 
to  Dr.  Dobb's  Journal  of  Software 
Tools  (to  which  he  is  already  sub¬ 
scribing)  there  are  three  other  Dr. 
Dobb's  Journals.  Do  you  know  any¬ 
thing  about  these  other  three  Dr. 
Dobb's? 

Some  people,  on  the  other  hand, 
are  all  answers,  like  my  entrepre¬ 
neurial  cousin  Corbett. . . . 

Cousin  Corbett's  Secrets  of  Soft¬ 
ware  Success,  Part  I:  The  Modern 
Software  Product  Life  Cycle. 

The  savvy  software  developer  of 
today  must  keep  pace  with  mercuri¬ 
al  market  conditions.  The  modern 
software  product  life  cycle,  or 
MSPLC,  consists  of  the  familiar 
phases  of  research,  development, 
maintenance,  and  maturity,  but  in  a 
streamlined  form. 

Research.  First,  you  must  identify 
an  appropriate  market.  Size  alone  is 
not  important:  lots  of  people  use 
computers  at  home,  yet  there  is  no 
home  market;  on  the  other  hand, 
there  are  only  a  few  hundred  com¬ 
puter-industry  journalists,  yet  they 
constitute  a  market  vital  to  your  suc¬ 
cess.  Industry  writers  found  laptop 
computers  extremely  useful  in  their 
work  and  ensured  their  success,  and 
a  program  that  simulates  flying  an 
airplane  has  been  successful  in  part 
because  of  its  value  to  reviewers  in 
evaluating  IBM  ROM  BIOS  compati¬ 
bility.  So  focus  on  products  that  bene¬ 
fit  journalists.  (N.B.:  your  time  is  pre¬ 


cious  and  should  not  be  squandered 
on  nonproducts,  so  write  no  code 
during  this  phase.  If  the  research 
phase  indicates  that  your  idea  has 
market  viability,  announce  a  product 
and  start  taking  orders.) 

Development.  Once  you  have  be¬ 
gun  cashing  checks  you  legally  have 
only  three  months  to  produce  some¬ 
thing.  Many  developers  make  the 
mistake  of  trying  to  do  too  much  in 
this  phase,  resulting  in  missed  deliv¬ 
ery  dates  and  buggy  software.  Recog¬ 
nize  this  truth:  three  months  is  not 
enough  time  to  write  a  useful 
application. 

The  solution?  Concentrate  your  ef¬ 
forts  on  the  user  interface.  Develop 
your  version  1.0  using  one  of  the 
many  programs  for  producing  de¬ 
mos  and  mockups.  This  will  give  the 
user  something  to  critique,  establish 
your  “look  and  feel"  claims,  and  al¬ 
low  you  to  concentrate  on  launching 
your  full-scale  user-funded  promo¬ 
tional  campaign. 

Maintenance.  In  this  stage,  hire 
some coders to flesh out the product
based  on  the  useful  feedback  users 
will  provide  on  the  features  they 
most  want.  Note  that  by  postponing 
actual  coding  until  this  phase  you 
have  avoided  the  waste  of  creating 
unwanted  features. 

Maturity.  The  maintenance  phase 
ultimately  produces  Release  1.1,  the 
first  working  version,  and  marks  the 
beginning  of  the  mature  phase  of  the 
product's  life.  Don't  think  that  you 
can  then  relax,  though — to  keep  the 
product  viable  you  should  produce  a 
completely  new  version  of  the  pack¬ 
aging  at  least  twice  a  year. 


Michael  Swaine 
editor-in-chief 




Dr.Dobb’sJournalof 


Software  Tools 

FOR  THE  PROFESSIONAL  PROGRAMMER 


Quest  for  Algorithms 




Writing  MS-DOS 
Device  Drivers 


Languages: 

C,  Pascal,  Smalltalk, 
and  V.I.P 




SEPTEMBER  1987 


CONTENTS 


VOLUME  12,  ISSUE  9 


Circle algorithms
Comparing files
Linking lists
DOS drivers
Keeping time
Quad trees
Smalltalk


ARTICLES 


How Many Ways Can You Draw a Circle? 18

by  James  F.  Blinn 

The  fine  art  of  circular  reasoning,  with  numerous  examples 

A  Survey  of  File  Comparison  Algorithms  as 

by  Tom  Steppe 

Tom  examines  various  alternative  approaches  in  making  a 
good  file  comparison  utility  and  follows  with  C  source  for  his 
own  offering. 

The XOR Chain Revisited 38

by  Bennette  R.  Harris 

Here’s  how  to  use  the  exclusive-OR  operation  to  compress  two 
pointers  into  a  single  link  field  of  a  doubly  linked  list. 

Writing MS-DOS Device Drivers in C 44

by  Andy  Klein 

Andy  uses  an  example  print  driver  written  in  C  to  show  how 
to  write  installable  device  drivers. 


COLUMNS 


C  CHEST  106 

by  Allen  Holub 

Allen  examines  the  use  of  the  PC  system  clock  and  DOS 
interrupts  in  a  metronome  program  he’s  written  for  a  MIDI 
application. 

STRUCTURED  PROGRAMMING  123 

by  Namir  Clement  Shammas 

Namir’s column begins with a look at V.I.P. (Visual Interactive
Programming),  an  icon-based  programming  environment  for 
the  Mac.  He  also  offers  algorithms  for  optimizing  searches  and 
sorts  of  clustered  binary  trees. 

ARTIFICIAL  INTELLIGENCE  128 

by  Ernest  R.  Tello 

Ernie takes a look at Digitalk’s Smalltalk/V for the PC.


EDITORIAL 6
by Tyler Sperry

RUNNING LIGHT 8
by Tyler Sperry

ARCHIVES 8

LETTERS 12
by you

SWAINE’S FLAMES 152
by Michael Swaine

ADVERTISER INDEX 137
Where to go for more information on products

OF INTEREST 142
Products for programmers



About  the  Cover 

It’s  a  problem  we’ve  all  faced, 
and  it  dates  back  to  Og,  the 
caveman.  You  struggle  with 
primitive  tools  and  hostile  envi¬ 
ronments,  only  to  discover  that 
someone  else  has  already 
invented  the  wheel.  Special 
thanks  to  Bob  Wynne,  our 
production  manager,  for  bat¬ 
tling  saber-toothed  tigers  every 
month  to  put  this  magazine  in 
your  hands — and  for  taking  the 
time  to  model  for  the  cover 
photo.  All  through  the  photo 
session,  we  heard  him  mut¬ 
tering  to  himself,  “What  goes 
round,  comes  round  ....’’ 


This  Issue 

In  case  you  hadn’t  guessed,  it’s 
algorithm  time  again.  We've  got 
a  good  selection  of  items  for 
your  toolbox:  everything  from 
going  in  circles  to  climbing 
trees.  Now  all  you  need  is  a 
generic  data  structure  (one  size 
fits  all). 


Next  Issue 

October's  ddj  is  our  annual 
Forth  issue.  Among  the  articles 
is  a  demonstration  of  how 
threaded  subroutines  let  a 
68000  Forth  approach  the  speed 
of  compiled  languages.  We  also 
have  a  special  feature  on 
hacking  the  AppleTalk  network 
to  add  remote  nodes  via  an 
ordinary  RS-232  serial  link. 


Dr.  Dobb's  Journal,  September  1987 



FORUM 


EDITORIAL 


Stone  Age  Computing 




This  month’s  ddj  has,  without 
argument,  one  of  the  most 
thought-provoking  covers  we've  pro¬ 
duced.  The  mix  of  Og,  the  caveman, 
and  a  pseudocoded  cave  wall  is  in¬ 
congruous  enough  to  provoke  a 
horde of questions in the reader’s
mind.  What  does  that  code  on  the 
wall  really  say?  Why  is  Og  staring  so 
intently  at  the  bottom  line  of  the 
algorithm?  And  perhaps  most  im¬ 
portant  of  all:  has  the  staff  at  ddj 
finally  gone  round  the  bend? 

Once  you  get  past  the  fun  cap¬ 
tions  and  lines  about  “recoding  the 
wheel,"  however,  there’s  an  aspect 
of  Og’s  situation  that  is  jarringly  simi¬ 
lar  to  the  present  day.  Simply  put: 
today’s  personal  computers  often 
seem  (despite  their  sexy  new  proces¬ 
sors  and  sophisticated  designs)  to 
be  the  product  of  Neanderthal  think¬ 
ing.  Again  and  again  we’re  presented 
with  products  that  are  inexplicably 
primitive  or  crippled. 

Apple’s  Macintosh,  for  example, 
seems  to  be  a  curious  mix  of  high- 
tech  and  high-Bronze  Age.  Much  of 
the  design  work  in  the  Macintosh  is 
wasted  in  a  machine  aimed  at 
people  only  slightly  more  sophisti¬ 
cated  than  our  friend  Og.  We  have 
icons  for  all  those  computer  users 
who  are  too  stupid  or  lazy  to  read 
English  and  a  single-button  mouse 
for  those  unable  to  handle  a  key¬ 
board.  To  its  credit,  Apple  has  re¬ 
cently  added  an  innovation  for  the 
Mac’s  most  advanced  users — the  key¬ 
boards  now  have  cursor  keys. 

The  problems  of  primitive  think¬ 
ing  and  design  are  hardly  an  Apple 
monopoly,  of  course.  Microsoft’s  ms- 
dos,  based  on  ideas  from  cp/m  and 
Unix,  is  so  primitive  that  it  defies 
tasteful  jokes.  On  the  other  hand, 
OS/2 — with  over  a  million  lines  of 
code  yet  to  be  debugged  by  adven¬ 
turous  users — has  tremendous  po¬ 
tential  for  being  as  good  a  running 
gag  as  Microsoft's  Windows.  Assum¬ 


ing,  of  course,  it  actually  runs. 

And  then  there’s  the  Intel  archi¬ 
tecture.  . . .  Has  anyone  out  there 
found  a  solution  to  switching  back 
and  forth  between  protected  and 
real  mode  in  the  80286?  Being  able 
to  use  both  modes  was  going  to  be 
a  major  feature  of  the  chip,  but  here 
we  are — years  after  the  introduc¬ 
tion — and  the  accepted  method  of 
switching  modes  is  to  reset  the  pro¬ 
cessor.  The  method  used  by  Micro¬ 
soft  in  OS/2  (a  reset  request  via  the 
keyboard  controller)  seems  to  be  the 
best  kludge  so  far.  That  is,  it’s  pain¬ 
fully  slow,  but  it  works.  You’d  think 
there’d  be  something  less,  well, 
"Stone  Age”  than  putting  the  chip 
to  sleep  for  a  couple  of  milliseconds. 

In  flaming,  it’s  generally  a  good 
idea  to  keep  your  historical  perspec¬ 
tive.  This  editorial  was  partially  in¬ 
spired  by  a  letter  from  one  of  our 
readers,  John  Dvorak.  Always  eager 
to  stir  things  up,  John  questioned 
our  ability  to  be  nasty  and  rude. 
This  editorial  could  be  seen  as  a 
reply.  Flaming  is  certainly  a  fun  way 
to  fill  space  in  a  magazine,  but  at  ddj 
we  try  to  remember  that  the  pur¬ 
pose  is  to  shed  light,  not  just  to 
blow  smoke.  Or,  as  Mrs.  Og  said 
when  she  saw  the  wheel  for  the  first 
time,  “That's  very  nice,  dear.  But 
what’s  it  for?” 


Tyler  Sperry 
editor 


Editorial 

Editor-in-Chief  Michael  Swaine 
Editor  Tyler  Sperry 
Managing  Editor  Vince  Leone 
Associate  Editor  Ron  Copeland 
Assistant  Editor  Sara  Noah  Ruddy 
Technical  Editors  Allen  Holub 

Richard  Relph 

Contributing  Editors  Namir  Shammas 
Ernest  R.  Tello 

Copy  Editor  Rhoda  Simmons 
Production 

Production  Manager  Bob  Wynne 

Art  Director  Michael  Hollister 
Assoc.  Art  Director  Joe  Sikoryak 
Technical  Illustrator  Frank  Pollifrone 
Typesetter  Mary  Lopez 
Cover  Photographer  Michael  Carr 
Circulation 

Circulation  Director  Maureen  Kaminski 
Book  Marketing  Mgr.  Jane  Sharninghouse 
Subscription  Supervisor  Kathleen  Shay 
Newsstand  Sales  Larry  Hupman 
Administration 
Finance  Director  Kate  Wheat 
Business  Manager  Betty  Trickett 
Accounts Payable Supv. Mayda Lopez-Quintana
Accts.  Receivable  Supv.  Laura  DiLazzaro 
Advertising  Director 
Ferris  Ferdon  (415)  366-3800 
Account  Managers  see  page  137 
Promotions/Srvcs.  Mgr.  Anna  Kittleson 
Advertising  Coordinator  Donna  Rogers 
Associate  Publisher 
Michael  Swaine 
Assistant  Sara  Noah  Ruddy 

Dr.  Dobb’s  Journal  of  Software  Tools  (USPS  307690) 
is  published  monthly  by  M&T  Publishing  Inc.,  501 
Galveston  Dr.,  Redwood  City,  CA  94063;  (415)  366-3600. 
Second-class  postage  paid  at  Redwood  City  and  at 
additional  entry  points.  DDJ  is  published  under  license 
from  People's  Computer  Company,  2682  Bishop  Dr., 
Suite  107,  San  Ramon,  CA  94583,  a  nonprofit  corpora¬ 
tion. 

Article  Submissions:  Send  manuscripts  and  disk  (with 
article  and  listings)  to  the  Editor. 

DDJ  on  CompuServe:  Type  GO  DDJ 
Address  Correction  Request:  Postmaster;  Send  Form 
3579  to  Dr.  Dobb's  Journal,  P.O.  Box  27809,  San  Diego, 
CA  92128.  ISSN  088-3076 

Customer  Service:  For  subscription  problems  call: 
outside  CA  (800)  321-3333;  in  CA  (619)  485-9623  or 
566-6947.  For  book/software  order  problems  call  (415) 
366-3600. 

Subscriptions:  $29.97  per  1  year;  $56.97  for  2  years. 
Canada  and  Mexico  add  $27  per  year  airmail  or  $10  per 
year surface. All other countries add $27 per year
airmail.  Foreign  subscriptions  must  be  prepaid  in  U.S. 
funds  drawn  on  a  U.S.  bank.  For  foreign  subscriptions, 
TELEX:  752-351. 

Foreign  Newsstand  Distributor:  Worldwide  Media 
Service  Inc.,  386  Park  Ave.  South,  New  York,  NY  10016; 
(212)  686-1520  TELEX  620430  (WUI). 

Entire contents copyright © 1987 by M&T
Publishing,  Inc.,  unless  otherwise  noted 
on  specific  articles.  All  rights  reserved. 


M&T  Publishing  Inc. 

Chairman  of  the  Board  Otmar  Weber 
Director  C.F.  von  Quadt 
President  and  Publisher  Laird  Foshay 




FORUM 


RUNNING  LIGHT 


I  am  often  struck  by 
the  similarity  be¬ 
tween  editing  a  maga¬ 
zine  and  the  plight  of 
Billy  Pilgrim  in  Slaugh¬ 
terhouse  Five.  Von- 
negut’s  hero,  you  may 
recall,  had  become  ‘‘un¬ 
stuck  in  time”  and 
began  to  experience 
the  moments  of  his 
life — both  past  and 
future — out  of  the  normal  sequence. 
A  day  spent  in  old  age  might  be 
followed  by  a  week  of  his  youth,  the 
shift  coming  without  any  apparent 
plan  or  warning. 

Editing  a  magazine  is  a  lot  like 
that. 

From  the  writer’s  perspective, 
things  are  pretty  straightforward. 
You  write  and  debug  your  program, 
write  an  article  about  it,  and  send  it 
in  to  ddj.  After  an  awfully  long  time, 
you  get  a  letter  telling  you  if  we 
liked  the  manuscript  enough  to 
print  it. 

The  view  from  the  editor’s  desk  is 
a  lot  different.  As  I  write  this 
column,  it’s  the  middle  of  July,  and 
this  piece  is  literally  the  last  thing 
to  go  into  the  magazine  before  we 
ship  it  out  the  door  in  preparation 
for  printing.  As  soon  as  I  finish  this 
column,  there  are  neglected  Forth 
articles  to  edit  for  the  October  issue. 
All  this  happens  when  I  should  be 
soliciting  authors  and  articles  for 
the  December  operating  systems 
issue  or  at  least  checking  on  the 
progress  of  the  authors  and  editors 
working  on  stuff  for  the  November 
graphics  issue.  All  this  high-speed 
flipping  through  the  calendar  is 
enough  to  drive  a  normal  person 
quite  mad.  Having  spent  so  much 
time  with  computers,  of  course,  I 
am  already  certifiable. 

I  mention  all  this  chaos,  not  in  an 
appeal  for  sympathy — although  it 
would  be  nice  if  they’d  quit  sticking 
pins  in  the  voodoo  doll — but  so  that 
you’ll  understand  why  it  sometimes 


takes  months  to  evalu¬ 
ate  a  manuscript.  It's 
the  old  business  of 
trying to drain the
swamp  when  the  alliga¬ 
tors  keep  demanding 
your  attention.  Some¬ 
times  a  week  will  pass 
and  I’ll  suddenly  real¬ 
ize  I  haven't  looked  at 
the  latest  batch  of  man¬ 
uscripts,  or  checked 
into  the  ddj  forum  on  CompuServe. 
Worse,  even  after  we’ve  considered 
a  piece,  there’s  a  good  chance  it  will 
have  to  be  sent  off  to  a  referee  or 
Technical  Editor. 

Things  are  getting  better,  and 
we’re  speeding  up  the  evaluation 
cycle.  Ron  Copeland  has  joined  our 
staff  as  Associate  Editor,  and  has 
already  been  an  immense  help  both 
in  tracking  down  manuscripts  that 
were  lost  on  my  desk  and  in  routing 
them  to  the  proper  referees.  Richard 
Relph,  who  coordinated  our  infa¬ 
mous  giant  C  Compiler  roundup 
two  years  running  and  has  written 
a  number  of  articles  for  us,  has  just 
joined  the  staff  as  a  Technical 
Editor.  By  the  time  you  read  this, 
our  response  time  on  manuscripts 
should  have  dropped  to  less  than  a 
month. 

So,  to  those  of  you  who  got  caught 
in  the  old  delay  loops,  my  apologies. 
For  those  of  you  who  haven’t  sent 
in  an  article  or  query,  there’s  no 
longer  any  excuse.  If  you’ve  done 
something  particularly  neat  or 
solved  a  particularly  vexing  prob¬ 
lem,  let  our  readers  know  about  it 
by  submitting  an  article.  If  you  want 
to  discuss  an  article  idea  with  me, 
give  me  a  call  at  (415)  366-3600.  You 
can  also  send  me  e-mail  via  Com¬ 
puServe  (76703,4266),  or  BIX  (with 
the  clever  nom-de-baud  of  “tyler”). 


editor 


ARCHIVES 


Intel  Needs  Help 

"From  the  January-February  issue  of 
Solutions,  a  public-relations  magazine  put 
out  by  Intel  Corp.,  we  learn  that  Intel  has 
prepared  a  Support  Library  for  their  8087 
numeric  co-processor.  It  ‘consists’  of  the 
function  library,  a  decimal  conversion 
module, . .  .  and  a  full  8087  software 
emulator  for  debugging  without  the  8087 
component.  The  library  . . .  supports  the 
proposed  IEEE  Floating  Point  Standard. 

"Well,  that's  a  lot  of  software  (even  if 
the  8087  does  do  most  of  the  work),  so 
we  suppose  that  Intel’s  asking  price  of 
$1,250  (gulp!)  is  justified.  But  is  the  8087 
so  hard  to  work  with  that  you  need 
twelve  hundred  dollars  worth  of  interface 
code?  Surely  not!  Obviously,  Intel  doesn't 
know  how  to  sell  chips.  Let’s  show  them 
how  it’s  done:  let's  put  some  8087  knowl¬ 
edge  into  the  public  domain  so  that  ex¬ 
perimenters  can  use  the  device.  Many 
readers  working  with  it?  What  kind  of 
programming  does  it  take?” — D.  E. 
Cortesi,  Dr.  Dobb’s  Clinic,  ddj,  June  1982. 

Ten  Years  Ago  in  ddj 

"OLDIE  BUT  GOODIE  AVAILABLE 

“Our glorious(?) Editor spent much of
a  decade  (no  pun  intended)  in  mutually 
supportive  relationships  with  the  8- 
Family  of  Maynard,  and  has  long  had  a 
warm  spot  in  his  heart  for  these  quaint 
machines.  (For  the  younger  generation 
and  the  uninitiated,  this  eccentric  dis¬ 
cussion  concerns  one  of  the  first — and, 
for  many  years,  the  most  popular — mini¬ 
computers:  the  pdp-8  manufactured  by 
Digital  Equipment  Corporation  (dec)  in 
Maynard, MA.)

"Back  in  the  Summer  of  ’76,  the  blos¬ 
soming  of  ddj,  the  Computer  Faire,  the 
Silicon  Gulch  Gazette,  the  ACM,  teaching 

at Stanford, ncc in Dallas, PC’77 . . . and
the list goes on and on. And our Editor
has  had  little  time  to  do  more  than  turn 
his  old  friend  on  from  time  to  time,  and 
watch  it  glow  and  blink.  Lest  this  old  8/1 
feel  spurned  and  lonely  — for  it  is  an 
honorable  machine  with  a  strong  Puritan 
work  ethic,  not  accustomed  to  having  its 
diodes  idle — Jim  is  seeking  a  good  home 
for  it.  Price  for  the  whole  system  fob 
Menlo  Park,  California  is  $4,000.” — Jim 
Warren,  DDJ  September  1977. 


Dr. Dobb's Journal of Computer Calisthenics & Orthodontia
Running Light Without Overbyte




FORUM

LETTERS 


Dvorak  Responds 

Dear  DDJ, 

I  was  greatly  amused  (Swaine's 
Flames,  June  1987)  by  the  rambling 
cockroach-like  insect  that  supposed¬ 
ly  jumped  out  of  Swaine's  computer 
and  onto  the  keyboard  late  one  night 
while  Mike  was  puking  his  guts  out 
after  having  been  to  one  of  those  du¬ 
bious  and  fashionable  sushi  bars 
found  far  too  often  south  of  Palo  Alto 
where  fresh  fish  is  a  rarity  and 
where  the  wallpaper,  when  exam¬ 
ined  closely,  turns  out  to  be  posted 
health  department  notices  of 
violation. 

No  doubt  was  left  in  the  reader’s 
mind  that  the  whole  incident  was  a 
hallucination  of  grand  pro¬ 
portion  or  (and  we  all  know 
that  this  is  the  case)  the  col¬ 
umn  was  a  gimmicky  idea 
possibly  dreamed  up  while 
suffering  the  ill  effects  of  poi¬ 
soning  and  designed  to  solve 
one  of  two  problems.  They 
are:  1)  how  to  make  a  point 
and  blame  someone  (or  some¬ 
thing)  else  for  making  the 
point;  2)  how  to  avoid  the  rav¬ 
ages  of  mean  copy  editors. 

Since  the  second  point  only 
applies  to  the  profession  of 
writing,  one  must  assume 
that  the  first  point  applies. 

Now  obviously  some  asser¬ 
tion  I  made  sometime  when 
and  who  knows  where  struck 
a  nerve  inside  the  Dr.  Dobb's 
editors  cage.  Thus  the  essay 
by  the  roach  about  me.  Unfor¬ 
tunately,  for  Mike  and  the 
crew,  my  point  concerning 
outspokenness  was  only  reaf¬ 
firmed  by  the  technique  of 
using  an  imagined  insect  to 


express  an  opinion.  It  surely  wasn’t 
done  to  avoid  being  nasty  but  only 
done  to  avoid  confrontation.  That's 
the  problem  (if  there  is  a  problem),  in 
my  opinion,  with  Dr.  Dobb's — a  cer¬ 
tain  inexplicable  timidity  mistakenly 
interpreted  by  some  as  politeness. 

Now  I  advise  you  to  look  at  my  cur¬ 
rent  benefactor  and  publisher,  PC 
Magazine.  The  commentary  in  there 
is  sometimes  downright  rude.  Edi¬ 
tors  even  have  the  gall  to  make  Edi¬ 
tors  Choices,  a  concept  that  flies  in 
the  face  of  advertising  theory.  Yet 
the  magazine  is  as  fat  as  imaginable 
and  growing  in  circulation  like  noth¬ 
ing  I've  ever  seen.  Why?  Because  to¬ 
day's  reader  is  bored  and  prefers  to 
read  copy  written  with  a  sharp  edge. 
There  is  too  much  dreary  stuff  out 
there  taking  up  too  much  space  in 
our  lives. 

The  editors  of  Dobb's,  I  know  for  a 
fact,  have  this  same  ability  to  pro¬ 
duce  enjoyable  vitriol  but  prefer  to 
be  well  liked  instead.  The  out-of- 
place  and  gratuitous  swipe  at  me  by 
the  verbose  roach  claiming  I  was  an 
incarnation  of  Max  Headroom  is  a 
good  example  of  the  kind  of  nasti¬ 
ness  that  lurks  within  the  Dobb's 


staff. 

While  the  entire  piece  was  a  gem 
of  creativity  insofar  as  finding  some 
unusual  way  to  couch  an  essay,  I  still 
maintain  that  using  a  bug  to  make  a 
complaint  is  symbolic  not  of  dignity 
but  of  a  certain  kind  of  shyness  that 
weakens  the  impact  of  the  printed 
word.  Then  again,  the  whole  story 
may  be  true.  If  so,  I  advise  Swaine  to 
vacuum  more  often. 

John  C.  Dvorak 
PC  Magazine 
Ziff-Davis  Publishing  Co. 

One  Park  Ave. 

New  York,  NY  10016 

Michael  Swaine  replies: 

Thanks,  John.  It’s  an  honor  to  get  a 
lesson  from  the  leading  practitioner 
of  no-nonsense,  outspoken,  sharp- 
edged  computer  journalism.  By  the 
way,  I  enjoyed  your  recent  PC  Maga¬ 
zine  column  tracing  the  origin  of  the 
word  nerd  back  to  Dr.  Seuss. 

XOR  Chains 

Dear  DDJ, 

David  Cortesi’s  article  on  XOR  chains 
(June  1987)  is  an  excellent  example  of 
isolating  information  in  modules. 

I  disagree  with  one  of  his 
conclusions, though: the Insert
procedure  can  do  the  same 
damage  to  a  list  as  the  Delete 
procedure  can.  When  two 
different  Scans  each  insert  an 
item  into  the  same  position  in 
the  list,  the  list  structure  is 
destroyed. 

For  example:  let  Scans  P 
and  Q  each  have  item  A  as 
next  and  item  B  as  prev.  Then 
Insert  (C,P)  will  insert  an  item 
C  between  A  and  B  in  the  list. 
Scan  P  will  correctly  show 
item A as next and item C as
prev,  but  Scan  Q  will  still  treat 
items  A  and  B  as  adjacent 
items. Calling Insert (D,Q)
now  will  cause  Insert  to  try, 
incorrectly,  to  XOR  B  to  the 
link  in  A  as  if  they  were  still 
adjacent  items.  This  XOR  oper¬ 
ation  and  the  XOR  of  A  to  the 
link  in  B  will  leave  A  and  B 
with  invalid  links. 

This  flaw  in  Insert  can  be 
corrected  by  having  Insert 




check  if  the  elements  in  the  Scan  are 
still  adjacent  before  inserting  the 
new  item. 

Andrew  Ginter 
208-21  Ave.  NW 
Calgary,  AB 
Canada  T2M  1J3 

Dear  DDJ, 

In  his  article  Mr.  Cortesi  mentions 
clearing  a  register  in  assembly  lan¬ 
guage  by  XORing  it  with  itself.  It  also 
might  be  worth  mentioning  that  you 
can  swap  two  fields  quickly  without 
using  a  work  area  via  use  of  the  XOR 
command  thrice: 

XC  FIELD1,FIELD2 
XC  FIELD2,FIELD1 
XC  FIELD1,FIELD2 

This  will  put  the  contents  of  FIELD1 
into FIELD2 and vice versa. Note that
the  fields  must  be  of  equal  length. 
I'm  sure  that  this  also  works  just  as 
quickly  in  other  languages  because 
typically  Boolean  algebra  commands 
are  built-in  commands  in  almost  any 
processor. 
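
For readers who work in C rather than assembly language, the three-XOR swap described above might be sketched as follows. This is only an illustration under stated assumptions: the function name xor_swap_fields and its byte-array interface are hypothetical, not taken from the letter. One caveat the sketch guards against is swapping a field with itself, which would zero it; fields that partly overlap are not handled at all.

```c
#include <assert.h>
#include <stddef.h>

/* Swap two equal-length byte fields in place, without a work area,
 * by applying exclusive-OR three times to each pair of bytes.
 * (Hypothetical helper; the name and interface are illustrative.) */
static void xor_swap_fields(unsigned char *a, unsigned char *b, size_t len)
{
    size_t k;

    assert(a != NULL && b != NULL);
    if (a == b)
        return;             /* XORing a field with itself would clear it */

    for (k = 0; k < len; k++) {
        a[k] ^= b[k];       /* a now holds a XOR b             */
        b[k] ^= a[k];       /* b now holds the original a      */
        a[k] ^= b[k];       /* a now holds the original b      */
    }
}
```

Whether this beats a swap through a temporary depends on the compiler and processor, so treat it as a curiosity rather than an optimization.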

Jack  Gillette 
30  Harvest  Rust  Ct. 

St.  Charles,  MO  63303 

Dear  DDJ, 

David E. Cortesi's fine article presented
your readers with some valuable
information.  However,  the  algorithms 
he  describes  are  not  the  simplest  possi¬ 
ble  because  he  uses  an  unnecessarily 
complicated  logical  tautology.  The  re¬ 
lation  he  used  in  his  article  was: 

(A  XOR  B  XOR  C)  XOR  B  XOR  C  =  A 

This  relation  is  actually  one  case  of  a 
more  general  relationship,  namely: 

(X_1 XOR X_2 XOR . . . XOR X_N) XOR X_2 XOR . . . XOR X_N = X_1

I’m  not  sure  if  this  general  relation  is 
of  much  value,  but  its  simplest  form 
is  more  suitable  for  the  task  of  storing 
both  front  and  back  pointers  in  a  single 
field.  The  simplest  form  merely  says: 

(A  XOR  B)  XOR  B  =  A 

Nowhere  in  the  algorithms  that 


Mr.  Cortesi  describes  does  he  require 
the  fact  that  his  link  field  stores  the 
address  of  the  current  item  in  the  list. 
With  this  in  mind,  his  implementa¬ 
tion  can  be  improved  with  the  fol¬ 
lowing  changes: 

1.  For  an  item  in  the  list,  if  A  is  the 
address  of  its  predecessor  and  B  is  the 
address  of  its  successor,  then  set  the 
single  link  word  W  of  the  item  to: 

W = A XOR B

You  can  now  recover  B  as  (W  XOR  A) 
or  A  as  (W  XOR  B). 

2.  The  stepping  routines  GoFwd  and 
GoBak  should  be  modified  to  remove 
the  use  of  the  current  item  address. 
For  example: 

void GoFwd(s) struct Scan *s;
{
    struct Item *i;
    i = s->prior Xor s->next->link;
    s->prior = s->next;
    s->next = i;
}

3.  The  insertion  and  de¬ 
letion  routines  Insert 
and  Delete  should  be 
modified  in  a  similar 
fashion.  For  the  sake  of 
accuracy  (and  to  correct 
some  minor  errors  in 
the  original  article),  I 
have  given  each  of  these 
in  full  in  Example  1, 
right. 

The  changes  de¬ 
scribed  in  Example  1 
amount  to  a  savings  of 
one  XOR  operation  per 
call  to  any  of  these  four 
routines.  Although  this 
may  not  seem  like 
much,  it  can  make  a  sig¬ 
nificant  difference  in 
the  time  required  to 
process  a  large  list.  Of 
course,  the  effect  on 
code  size  is  minimal. 
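
As a rough illustration of changes 1 and 2, here is one way the single-link scheme could be written in present-day C. It is a sketch under assumptions, not the code from either article: the helper names (xor_ptr, go_fwd, go_bak, link_chain) are hypothetical, the Xor operator is spelled out with casts through uintptr_t, and a null pointer is assumed to convert to the integer 0, which holds on common platforms but is not guaranteed by the language.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct Item {
    int value;
    uintptr_t link;          /* address of predecessor XOR address of successor */
};

struct Scan {
    struct Item *prior;      /* NULL when the scan sits at the head */
    struct Item *next;       /* NULL when the scan sits at the tail */
};

/* Recover one neighbor from the link word and the other neighbor. */
static struct Item *xor_ptr(struct Item *known, uintptr_t link)
{
    return (struct Item *)((uintptr_t)known ^ link);
}

/* Step the scan one item toward the tail: (W XOR A) yields B. */
static void go_fwd(struct Scan *s)
{
    struct Item *i;

    assert(s->next != NULL);
    i = xor_ptr(s->prior, s->next->link);
    s->prior = s->next;
    s->next = i;
}

/* Step the scan one item toward the head: (W XOR B) yields A. */
static void go_bak(struct Scan *s)
{
    struct Item *i;

    assert(s->prior != NULL);
    i = xor_ptr(s->next, s->prior->link);
    s->next = s->prior;
    s->prior = i;
}

/* Link an array of n items into a chain, in array order. */
static void link_chain(struct Item *items, size_t n)
{
    size_t k;

    for (k = 0; k < n; k++) {
        struct Item *pred = (k > 0) ? &items[k - 1] : NULL;
        struct Item *succ = (k + 1 < n) ? &items[k + 1] : NULL;
        items[k].link = (uintptr_t)pred ^ (uintptr_t)succ;
    }
}
```

A scan that starts with prior set to NULL and next set to the first item can walk the whole chain with go_fwd and return with go_bak; note that the current item's own address never enters the link word.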

Mr.  Cortesi  correctly 
asserts  that  traversing  a 
data  structure  linked  in 
this  fashion  requires  the 
addresses  of  two  logical¬ 
ly  adjacent  items.  How¬ 


ever,  this  does  not  lock  you  into  a  list- 
type  structure.  I  am  submitting  a 
follow-up  article  entitled  "The  XOR 
Chain  Revisited”  that  describes  how 
such  links  can  be  used  effectively  in 
manipulating  tree-type  data 
structures. 

Bennette  R.  Harris 
231 S.  Janesville  St. 

Whitewater,  WI  53190 

Mr.  Harris'  article  begins  on  page  36  in 
this  issue.  — eds. 

68000  Dynamic  Halt 

Dear  DDJ, 

The  FOR  TOP  to  BOT  loop  in  Brian  An¬ 
derson's  Viewpoint  (June  1987)  will 
only  reach  BOT  in  the  unlikely  event 
that TOP = BOT. Else we have yet another
fine Dynamic Halt (aka endless
loop). CLR.B 0(A0,D0) does not alter
the value of D0. So did your typesetter
omit ADDQ.W #1,D0 before the loop
branch?  Brian  would  never  do  such  a 
thing  (and  yet  his  line  numbers  seem 
(continued  on  page  138) 


void Insert(i, s)
struct Item *i;
struct Scan *s;
{
    if (AtHead(s))
        s->base->head = i;
    else
        s->prior->link Xor= (i Xor s->next);
    if (AtTail(s))
        s->base->tail = i;
    else
        s->next->link Xor= (i Xor s->prior);
    i->link = s->prior Xor s->next;
    s->prior = i;
}

void Delete(s) struct Scan *s;
{
    struct Item *i;

    if (AtTail(s)) return;
    i = s->next->link Xor s->prior;
    if (AtHead(s))
        s->base->head = i;
    else
        s->prior->link Xor= (s->next Xor i);
    if (i == Nil)
        s->base->tail = s->prior;
    else
        i->link Xor= (s->next Xor s->prior);
    DropItem(s->next);
    s->next = i;
}


Example  1:  Modified  insertion  and  deletion 
routines for "The XOR Chain"
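
The routines in Example 1 use an Xor operator that standard C does not have. A compilable approximation, offered only as a sketch, is shown below. The names insert_item, delete_item, and addr are hypothetical; struct List stands in for whatever the scan's base points to; AtHead and AtTail are replaced by tests for a NULL prior or next; and delete_item hands the unlinked item back to the caller instead of calling DropItem. As before, a null pointer is assumed to convert to the integer 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct Item { int value; uintptr_t link; };      /* link = pred XOR succ */
struct List { struct Item *head, *tail; };
struct Scan { struct List *base; struct Item *prior, *next; };

static uintptr_t addr(struct Item *p) { return (uintptr_t)p; }

/* Insert item i between s->prior and s->next; i becomes the new prior. */
static void insert_item(struct Item *i, struct Scan *s)
{
    assert(i != NULL);
    if (s->prior == NULL)                        /* scan is at the head */
        s->base->head = i;
    else                                         /* swap next for i in prior's link */
        s->prior->link ^= addr(i) ^ addr(s->next);
    if (s->next == NULL)                         /* scan is at the tail */
        s->base->tail = i;
    else                                         /* swap prior for i in next's link */
        s->next->link ^= addr(i) ^ addr(s->prior);
    i->link = addr(s->prior) ^ addr(s->next);
    s->prior = i;
}

/* Unlink the item at s->next and return it (NULL if at the tail). */
static struct Item *delete_item(struct Scan *s)
{
    struct Item *gone = s->next;
    struct Item *i;

    if (gone == NULL)
        return NULL;
    i = (struct Item *)(gone->link ^ addr(s->prior));   /* successor of gone */
    if (s->prior == NULL)
        s->base->head = i;
    else
        s->prior->link ^= addr(gone) ^ addr(i);
    if (i == NULL)
        s->base->tail = s->prior;
    else
        i->link ^= addr(gone) ^ addr(s->prior);
    s->next = i;
    return gone;
}
```

Each link update is a single XOR-assignment that replaces one neighbor's address with another, which is the saving the letter describes.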




ARTICLES 


How  Many  Ways 
Can  You 
Draw  a  Circle? 


I  like  to  collect  things. 

When  I  was  young  I  col¬ 
lected  stamps;  now  I 
collect  empty  margarine 
tubs — and  algorithms  for 
drawing  circles.  Because  this 
is  an  algorithms  issue,  I  will 
restrain  myself  from  talking 
about  stamps  or  margarine 
tubs;  instead,  I’ll  show  you  my  album  of  circle-drawing 
algorithms. 

It's  traditional  at  this  point  in  any  discussion  of  geome¬ 
try  to  drag  in  the  ancient  Greeks  and  mention  how  they 
considered  the  circle  to  be  the  most  perfect  shape.  Even 
though  a  circle  is  such  an  apparently  simple  shape,  it  is 
interesting to see how many essentially different algorithms
you can find for drawing the Greeks' favorite
curve. 

I will be brief about some pretty obvious techniques to leave space to play with the more interesting and subtle techniques. Note that many of these algorithms might be ridiculously inefficient but are included to pad out the article. (OK, OK, they're actually included for completeness.)

James F. Blinn, 195 South Wilson, Apt. 8, Pasadena, CA 91106. Dr. Blinn has been actively involved in computer graphics for the past 19 years. Since 1978 he has been at the Jet Propulsion Laboratory of the California Institute of Technology, producing animations depicting various space missions. He also produced computer graphics effects for the PBS series "COSMOS." He currently teaches courses in computer graphics at Cal Tech and at the Art Center College of Design in Pasadena.

This is a revised version of an article published in the August 1987 issue of IEEE Computer Graphics and Applications. © 1987 IEEE.

I'm not sure where I first heard of some of these. I will cite inventors where known, but let me just thank the world at large in case I've missed anybody.

A word about the programming language I use: I am not using any formal algorithm display language here. These algorithms are meant to be read by human beings, not computers, so the language I present is a mishmash of several programming constructs that I'm sure will be perfectly clear to you.

The  collection  can  be  categorized  by  the  two  types  of 
output — line  endpoints  and  pixel  coordinates.  This  comes 
from  the  general  dichotomy  of  curve  representation — 
parametric  vs.  algebraic. 

Line  Drawings 

First, I'll look at line output. All the algorithms in this section operate in floating point and generate a series of x,y points on a unit-radius circle centered at the origin. You then play connect the dots.

1.  Trigonometry 

Evaluate  sin  and  cos  at  equally  spaced  angles: 

MOVE(1,0)
FOR DEGREES = 1 TO 360
    RADIANS = DEGREES*2*3.14159/360.
    DRAW(COS(RADIANS),SIN(RADIANS))

This  has  to  evaluate  the  two  trig  functions  at  each  loop, 
ick. 
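As a concrete reading of the pseudocode, here is a minimal Python sketch of algorithm 1. It assumes the MOVE and DRAW pen calls are replaced by collecting points in a list; the name `trig_circle` and the `steps` parameter are illustrative and do not come from the article.

```python
import math

def trig_circle(steps=360):
    """Return points on the unit circle at equally spaced angles."""
    points = [(1.0, 0.0)]                           # MOVE(1,0)
    for k in range(1, steps + 1):
        a = 2.0 * math.pi * k / steps               # angle in radians
        points.append((math.cos(a), math.sin(a)))   # DRAW(cos a, sin a)
    return points

pts = trig_circle()
```

Every point costs one `cos` and one `sin` call, which is the expense the remaining line-drawing algorithms try to avoid.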


by  James  F.  Blinn 


A  comprehensive  roundup 
of  circle-drawing  algorithms 




2.  Polynomial  Approximation 

You  can  get  a  fair  approximation  of  a  circle  by  evaluating 
simple  polynomial  approximations  to  sin  and  cos.  The 
first  ones  that  come  to  mind  are  the  Taylor's  series: 

$\cos a \approx 1 - \frac{1}{2}a^2 + \frac{1}{24}a^4$
$\sin a \approx a - \frac{1}{6}a^3 + \frac{1}{120}a^5$

These  require  fairly  high-order  terms  to  get  very  close. 

A  better  approach  is  to  fit  lower-order  polynomials  to 
the  desired  endpoints  and  end  slopes.  This  is  effectively 
what  happens  with  various  commonly  used  Bezier 
curves — for  example,  the  four  control  points  (1,0), 
(1,0.552),  (0.552,1),  (0,1)  describe  a  good  approximation  to 
the  upper-right  quarter  of  a  circle.  You  can  get  the  other 
three  quadrants  by  rotating  the  control  points. 

When transformed to polynomial form, the first quadrant is:

$x(t) = 1 - 1.344t^2 + 0.344t^3$
$y(t) = 1.656t - 0.312t^2 - 0.344t^3$

with the parameter t going from 0 to 1.

MOVE(1,0)
FOR T = 0 TO 1 BY .01
    X = 1 + T*T*(-1.344 + T*.344)
    Y = T*(1.656 - T*(.312 + T*.344))
    DRAW(X,Y)

This makes a pretty good circle: the maximum radius error is about 0.0004 at t=0.2 and t=0.8.
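The cubic above can be checked numerically. The following Python sketch evaluates it in the same nested (Horner) form as the pseudocode and measures how far the radius strays from 1; the function name `bezier_quadrant` and the 101-sample grid are assumptions made for illustration.

```python
import math

def bezier_quadrant(t):
    """First-quadrant cubic from the text, for t in [0, 1]."""
    x = 1.0 + t * t * (-1.344 + t * 0.344)
    y = t * (1.656 - t * (0.312 + t * 0.344))
    return x, y

# Largest deviation of the radius from 1 over 101 evenly spaced samples.
max_err = max(abs(math.hypot(*bezier_quadrant(i / 100.0)) - 1.0)
              for i in range(101))
```

The curve starts at (1,0) and ends at (0,1), as the Bezier control points require, and `max_err` stays well below one part in a thousand.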

3.  Forward  Differences 

Polynomials can be evaluated quickly by the technique known as forward differences. Briefly, for the polynomial:

$f(t) = f_0 + f_1 t + f_2 t^2 + f_3 t^3$

if you start at t=0 and increment by equal steps of size $\delta$, the forward differences are:

$\Delta f = f_1\delta + f_2\delta^2 + f_3\delta^3$
$\Delta\Delta f = 2f_2\delta^2 + 6f_3\delta^3$
$\Delta\Delta\Delta f = 6f_3\delta^3$

Then, for polynomials stepping in units of 0.01:

X = 1; DX = -.000134056; DDX = -.000266736; DDDX = .000002064
Y = 0; DY = .016528456; DDY = -.000064464; DDDY = -.000002064
MOVE(X,Y)
FOR I = 1 TO 100
    X = X + DX; DX = DX + DDX; DDX = DDX + DDDX
    Y = Y + DY; DY = DY + DDY; DDY = DDY + DDDY
    DRAW(X,Y)

Trust  me,  I'm  a  doctor.  If  you  don't  believe  it,  look  up 
forward  differences  on  page  328  of  Newman  and 
Sproull — I'm  not  going  to  do  all  the  work  here. 

Notice the number of significant digits in the constants. It might seem as though that many digits would require double precision, but in practice the accumulated round-off error using single precision is less than the error due to the polynomial approximation.
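For readers who would rather not take the constants on trust, this Python sketch derives them from the polynomial coefficients with the difference formulas given above and then steps the curve using additions only. The helper names `forward_differences` and `step` are illustrative, and Python floats are double precision, so the single-precision remark is not exercised here.

```python
def forward_differences(f0, f1, f2, f3, d):
    """Value and first, second, third forward differences at t = 0 for
    f(t) = f0 + f1*t + f2*t**2 + f3*t**3 sampled with step d."""
    return [f0,
            f1 * d + f2 * d**2 + f3 * d**3,
            2 * f2 * d**2 + 6 * f3 * d**3,
            6 * f3 * d**3]

def step(s):
    """Advance the polynomial one step using additions only."""
    s[0] += s[1]
    s[1] += s[2]
    s[2] += s[3]

xs = forward_differences(1.0, 0.0, -1.344, 0.344, 0.01)
ys = forward_differences(0.0, 1.656, -0.312, -0.344, 0.01)
points = [(xs[0], ys[0])]
for _ in range(100):
    step(xs)
    step(ys)
    points.append((xs[0], ys[0]))
```

The second and following entries of the freshly computed lists reproduce DX, DDX, DDDX and DY, DDY, DDDY from the listing, and the last point lands on (0,1) up to round-off.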

4.  Incremental  Rotation 

Let's back off from the approximation route and try another approach. Start with the vector (1,0) and multiply it by a one-degree rotation matrix each time through the loop:

DELTA = 2*3.14159/360.
SINA = SIN(DELTA)
COSA = COS(DELTA)
X = 1; Y = 0
MOVE(X,Y)
FOR I = 1 TO 360
    XNEW = X*COSA - Y*SINA
    Y = X*SINA + Y*COSA
    X = XNEW
    DRAW(X,Y)

5. Extreme Approximation

If the incremental angle is small enough, you can make the approximations cos a = 1 and sin a = a. The number of times through the loop is n = 2π/a, or conversely, the angle is a = 2π/n, depending on which you want to use as input:

A = .015; N = 2*3.14159/A
X = 1; Y = 0
MOVE(X,Y)
FOR I = 1 TO N
    XNEW = X - Y*A
    Y = X*A + Y
    X = XNEW
    DRAW(X,Y)

But there's a problem. Each time through the loop, you are forming the product:

$(x_{new}, y_{new}) = (x_{old}, y_{old}) \begin{pmatrix} 1 & a \\ -a & 1 \end{pmatrix}$

The matrix is almost a rotation matrix, but its determinant equals $1+a^2$. This is bad. It means that the running (x,y) is magnified by this amount on each iteration, so what you get is a spiral that gets bigger and bigger. How to fix this? Introduce a bug into the algorithm.

6. Unskewing the Approximation

Because vector multiplication and assignment don't occur in one statement, you had to calculate y carefully, using the old value for x. Suppose you were dumb and did it the naive way:

A = .015; N = 2*3.14159/A
X = 1; Y = 0
MOVE(X,Y)
FOR I = 1 TO N
    X = X - Y*A
    Y = X*A + Y
    DRAW(X,Y)

Now, what is the effect of this? Really what you get is:

$x_{new} = x_{old} - y_{old}a$
$y_{new} = x_{new}a + y_{old} = x_{old}a + y_{old}(1 - a^2)$

In other words:

$(x_{new}, y_{new}) = (x_{old}, y_{old}) \begin{pmatrix} 1 & a \\ -a & 1-a^2 \end{pmatrix}$

This matrix has a determinant of 1, and there is no net spiraling effect. What you get is actually an ellipse that is stretched slightly in the northeast-southwest direction and squeezed slightly in the northwest-southeast direction. The maximum radius error in these directions is approximately a/4.

Now comes the really interesting part. Because you can start out with any vector, let's try (1000,0). Now, cleverly select a to be an inverse power of 2 and the multiplication becomes just a shift. For example, a value of a = 1/64 is just (shift right 6) and generates the circle in about 402 steps. So, you can do all this just with integer arithmetic and no multiplication. This is how you used to draw circles quickly, and in fact do rotation incrementally, before the age of hardware floating point and even hardware multiplication. (This was probably invented by Ivan Sutherland.)

7. Rational Polynomials

Another polynomial tack can be taken by looking in your hat and pulling out the following rabbit:

if:

$x = (1 - t^2)/(1 + t^2)$
$y = 2t/(1 + t^2)$

then:

$x^2 + y^2 = 1$

no matter what t is (or identically, as mathematicians would say). Running t from 0 to 1 gives the upper-right quadrant of the circle. You can again evaluate these polynomials by forward differences, stepping t in increments of 0.01, and get:

X = 1; DX = -.0001; DDX = -.0002
Y = 0; DY = .02
W = 1; DW = .0001; DDW = .0002
MOVE(X,Y)
FOR I = 1 TO 100
    X = X + DX; DX = DX + DDX
    Y = Y + DY
    W = W + DW; DW = DW + DDW
    DRAW(X/W,Y/W)

Note that this is not an approximation like the last few tries were. It is exact, except for round-off error. Even round-off error can be removed, either by calculating the polynomials directly or by scaling all numbers by 10,000 and doing it with integers. (The divisions x/w and y/w must still be done in floating point.)

This one has always amazed me: effectively, you get to evaluate two transcendental functions exactly with only a few additions. What's the catch? It's an application of the No Free Lunch theorem: you don't get to pick the angles. If you watch the points, you see that they are not equally spaced around the circle. In fact, as t goes to infinity, the point keeps going counterclockwise but slows down, finally running out of juice at (-1,0). If you go backward to minus infinity, the point goes clockwise, finally stopping again at (-1,0). (Yet more evidence that -∞ = +∞.) To draw a complete circle, you are best advised to run t from -1 to +1, which draws the whole right half, and then mirror it to get the left half.
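The integer-scaled variant and the mirroring advice can be sketched in Python as follows. This is one possible arrangement, not the article's listing: it assumes t = k/n for integer k from -n to n, scales numerators and denominator by n*n so the running sums stay exact integers, and uses the illustrative name `rational_circle`.

```python
def rational_circle(n=100):
    """Right half of the unit circle for t = -1 ... +1 in steps of 1/n.
    With t = k/n, the scaled quantities are
        x = n*n - k*k,   y = 2*k*n,   w = n*n + k*k,
    all advanced by integer forward differences."""
    x, dx, ddx = 0, 2 * n - 1, -2          # values at k = -n
    y, dy = -2 * n * n, 2 * n
    w, dw, ddw = 2 * n * n, 1 - 2 * n, 2
    points = [(x / w, y / w)]
    for _ in range(2 * n):
        x += dx
        dx += ddx
        y += dy
        w += dw
        dw += ddw
        points.append((x / w, y / w))      # the only floating-point step
    return points

right_half = rational_circle()
left_half = [(-px, py) for px, py in right_half]   # mirror across the y axis
```

Because the sums are integers, the only rounding happens in the final divisions, and the points run from (0,-1) through (1,0) to (0,1) with uneven angular spacing.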

8.  Differential  Equations 

An entirely different technique is to describe the motion of (x,y) dynamically. Imagine the point rotating about the center as a function of time t. The position, velocity, and acceleration of the point will be:

$(x, y) = (\cos t, \sin t)$
$(x', y') = (-\sin t, \cos t) = (-y, x)$
$(x'', y'') = (-\cos t, -\sin t) = (-x, -y)$

You can cast these into differential equations and use several numerical integration techniques to solve them.

The dumbest one, Euler integration, is just:

$x_{new} = x_{old} + x'_{old}\,\Delta t = x_{old} - y_{old}\,\Delta t$
$y_{new} = y_{old} + y'_{old}\,\Delta t = y_{old} + x_{old}\,\Delta t$

This looks a lot like algorithm 5 and has the same spiraling-out problem. You can generate better circles by using better integration techniques. My two favorites are the "leapfrog" technique and the Runge-Kutta technique.

Leapfrog calculates the position and acceleration at times:

$t,\ t + \Delta t,\ t + 2\Delta t, \ldots$

but calculates the velocity at times halfway between them:

$t + \frac{1}{2}\Delta t,\ t + \frac{3}{2}\Delta t, \ldots$

Advancing time one step then looks similar to Euler with just the evaluation times offset:

$x_{t+\Delta t} = x_t + x'_{t+\frac{1}{2}\Delta t}\,\Delta t$
$x'_{t+\frac{3}{2}\Delta t} = x'_{t+\frac{1}{2}\Delta t} + x''_{t+\Delta t}\,\Delta t$

(with similar equations for y). The position and velocity leapfrog over each other on even/odd half-time steps, so you have to keep separate variables for the velocities x' and y'. The code has a lot in common with algorithm 6 and probably for good reason:

X = 1; Y = 0
VX = -SIN(DT/2); VY = COS(DT/2)
MOVE(X,Y)
FOR I = 1 TO N
    X = X + VX*DT      "update posn"
    Y = Y + VY*DT
    VX = VX - X*DT     "update veloc, AX = -X"
    VY = VY - Y*DT     "              AY = -Y"
    DRAW(X,Y)
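To see the difference between the two integrators, the Python sketch below runs Euler and leapfrog for roughly one revolution and compares the final radius. The step size 0.015 is borrowed from algorithm 5; the function names and the choice to return only the final point are assumptions for illustration.

```python
import math

def euler(dt, n):
    """Euler integration: both updates use the old position."""
    x, y = 1.0, 0.0
    for _ in range(n):
        x, y = x - y * dt, y + x * dt
    return x, y

def leapfrog(dt, n):
    """Leapfrog: velocities live half a step ahead of positions."""
    x, y = 1.0, 0.0
    vx, vy = -math.sin(dt / 2), math.cos(dt / 2)   # velocity at t = dt/2
    for _ in range(n):
        x += vx * dt          # position uses the half-step velocity
        y += vy * dt
        vx -= x * dt          # acceleration is (-x, -y)
        vy -= y * dt
    return x, y

dt = 0.015
n = round(2 * math.pi / dt)               # about one revolution
r_euler = math.hypot(*euler(dt, n))
r_leap = math.hypot(*leapfrog(dt, n))
```

After one turn the Euler radius has grown by several percent, while the leapfrog radius stays very close to 1.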

Runge-Kutta is a slightly involved process that takes a fractional Euler step, reevaluates the derivatives there, applies the derivative at the original point, steps off in this new direction, generally screws around, and finally takes some average of all these to get the new time step. Plugging the differential equation into the formulas and simplifying requires about a page of algebra. You can look up the actual equations; they're not incredibly complicated, but their derivation is "beyond the scope" of almost all numerical analysis textbooks I have seen.

One advantage of Runge-Kutta is that it finds the position and velocity at the same time step, so for circles you can generate x and y with the same computation. Another advantage is that it comes in second-order, third-order, fourth-order, and so on versions for higher orders of precision than does leapfrog.

Plugging in for second-order RK, the ultimate result is:

$x_{new} = x_{old}(1 - \frac{1}{2}\Delta t^2) + y_{old}(-\Delta t)$
$y_{new} = x_{old}(\Delta t) + y_{old}(1 - \frac{1}{2}\Delta t^2)$

Does this look familiar? The third-order RK and another page of algebra leads to:

$x_{new} = x_{old}(1 - \frac{1}{2}\Delta t^2) + y_{old}(-\Delta t + \frac{1}{6}\Delta t^3)$
$y_{new} = x_{old}(\Delta t - \frac{1}{6}\Delta t^3) + y_{old}(1 - \frac{1}{2}\Delta t^2)$

Guess what fourth-order RK gives . . . you're right. I won't even bore you with the code.
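The pattern being hinted at is that the coefficients are the Taylor series of cos Δt and sin Δt cut off after the term whose power equals the order. The Python sketch below builds the coefficients that way; it assumes the classical Runge-Kutta formulas with as many stages as the order (orders 2 through 4), and the name `rk_coefficients` is illustrative.

```python
import math

def rk_coefficients(order, dt):
    """Return (c, s) with x_new = c*x_old - s*y_old and
    y_new = s*x_old + c*y_old, where c and s are the Taylor series of
    cos(dt) and sin(dt) truncated after the dt**order term."""
    c = sum((-1) ** (k // 2) * dt ** k / math.factorial(k)
            for k in range(0, order + 1, 2))
    s = sum((-1) ** (k // 2) * dt ** k / math.factorial(k)
            for k in range(1, order + 1, 2))
    return c, s

c2, s2 = rk_coefficients(2, 0.1)   # 1 - dt^2/2 and dt
c4, s4 = rk_coefficients(4, 0.1)   # adds the dt^3/6 and dt^4/24 terms
```

Each increase in order adds one more series term, so the step matrix moves closer to the exact rotation of algorithm 4.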




9. Half Interval

The half-interval method (suggested by Jim Kajiya) assumes you have two endpoints of an arc and wish to fill in the middle with points on the circle. At each step you insert a new point between two others. Assuming a circle centered at the origin, the new point will be approximately halfway between the surrounding ones:

$(x_m, y_m) = \left(\frac{x_1 + x_2}{2}, \frac{y_1 + y_2}{2}\right)$

It just needs to be moved outward to lie on the circle, which involves scaling the previous expression to length 1. If the original points are at unit distance from the origin, this means dividing by $\sqrt{1 + x_1 x_2 + y_1 y_2}/\sqrt{2}$.

By doing this recursively, you can keep splitting until some error tolerance is met. The code is something like:

X1 = 1; Y1 = 0
X2 = 0; Y2 = 1
MOVE(X1,Y1)
SPLIT(X1,Y1, X2,Y2)

where SPLIT(X1,Y1, X2,Y2) is defined to be:

D = SQRT(2*(1 + X1*X2 + Y1*Y2))
XM = (X1 + X2)/D
YM = (Y1 + Y2)/D
IF error_tolerance_ok
    DRAW(XM,YM)
    DRAW(X2,Y2)
ELSE
    SPLIT(X1,Y1, XM,YM)
    SPLIT(XM,YM, X2,Y2)

The error tolerance could be just a recursion depth counter, stopping at a fixed recursion depth. This is nice because, for a given pair of initial points, the value of D is just a function of recursion depth and can be precomputed and placed in a table.

Pixel-Based Techniques

The other major category of algorithms involves output more directly suited to raster displays. Here the question is not where to move the "pen" next but which of the grid of pixels to light up. The preceding algorithms can, of course, be applied to pixels by generating coordinates and feeding them to a line-to-pixel drawing routine, but I won't pursue these methods. I'll just look at ways to generate the desired pixels directly. For simplicity I will assume you are drawing a 100-pixel-radius circle with pixels addressed so that (0,0) is in the center and that negative coordinates are OK. The algorithms operate in integer pixel space, assuming square pixels. Note that the following variables start with I, indicating that they are integers.

10. Fill Disk

Perhaps the dumbest algorithm is just to see how far each pixel is from the center and color it in if it's inside the circle:

FOR IY = -100 TO 100
    FOR IX = -100 TO 100
        IF (IX*IX + IY*IY < 10000) SETPXL(IX,IY)

This, of course, fills in the entire disk instead of just drawing lines, but who's being picky? You would be correct in assuming that this might be a bit slow. Some quick speedups: calculate the value of $x^2$ by forward differences and calculate the allowable range of $x^2$ outside the x loop (forward differences probably aren't worth the trouble for this).

FOR IY = -100 TO 100
    IX2MAX = 10000 - IY*IY
    IX2 = 10000; IDX2 = -199; IDDX2 = 2
    FOR IX = -100 TO 100
        IF (IX2 < IX2MAX) SETPXL(IX,IY)
        IX2 = IX2 + IDX2; IDX2 = IDX2 + IDDX2

11. Solve for X Range Covered

The preceding algorithm still examines every pixel on the screen. You can skip some of this by explicitly solving for the range in x:

FOR IY = 100 TO -100 BY -1
    IXRNG = SQRT(10000 - IY*IY)
    FOR IX = -IXRNG TO IXRNG
        SETPXL(IX,IY)

Or just plot the endpoints instead of filling in the whole disk:

FOR IY = 100 TO -100 BY -1
    IX = SQRT(10000 - IY*IY)
    SETPXL(-IX,IY)
    SETPXL(IX,IY)

This leaves unsightly gaps near the top and bottom.

12. Various Approximations to SQRT

Make a polynomial, or rational polynomial, approximation to $\sqrt{10000 - y^2}$ that is good for the range -100 . . . +100. Evaluate it with forward differences.

13. Driving X Away

Let's just do the upper-right quarter of the circle and follow the point (0,100). For each downward step in y, you move to the right some distance in x. Start at the x that's left over from last time and step it to the right until it hits the circle, leaving a trail of pixels behind:

IX = 0
FOR IY = 100 TO 0 BY -1
    IX2MAX = 10000 - IY*IY
    DO UNTIL (IX*IX) > IX2MAX
        SETPXL(IX,IY)
        IX = IX + 1

Calculation of IX2MAX and IX2 can be done by forward differences:

IX = 0
IX2 = 0; IDX2 = 1; IDDX2 = 2
IX2MAX = 0; IDX2MAX = 199; IDDX2MAX = -2
FOR IY = 100 TO 0 BY -1
    DO UNTIL IX2 > IX2MAX
        SETPXL(IX,IY)
        IX = IX + 1
        IX2 = IX2 + IDX2; IDX2 = IDX2 + IDDX2
    IX2MAX = IX2MAX + IDX2MAX
    IDX2MAX = IDX2MAX + IDDX2MAX

This  still  has  a  few  problems,  but  I  won't  pursue  them 
because  the  next  two  algorithms  are  so  much  better. 
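Before moving on, here is a Python sketch of algorithm 13 in both forms, so the forward-difference bookkeeping can be checked against the direct multiplication. It assumes the DO UNTIL test is made before each pass through the loop body, which is one reading of the pseudocode; the function names are illustrative and pixels are collected in a list instead of calling SETPXL.

```python
def quadrant_direct(r=100):
    """Step x to the right until x*x exceeds r*r - y*y, for each y."""
    pixels = []
    ix = 0
    for iy in range(r, -1, -1):
        ix2max = r * r - iy * iy
        while ix * ix <= ix2max:
            pixels.append((ix, iy))
            ix += 1
    return pixels

def quadrant_forward_diff(r=100):
    """Same walk, with both squares maintained by additions only."""
    pixels = []
    ix = 0
    ix2, idx2 = 0, 1                  # ix*ix and its first difference
    ix2max, idx2max = 0, 2 * r - 1    # r*r - iy*iy and its first difference
    for iy in range(r, -1, -1):
        while ix2 <= ix2max:
            pixels.append((ix, iy))
            ix += 1
            ix2 += idx2
            idx2 += 2
        ix2max += idx2max
        idx2max -= 2
    return pixels
```

Both functions visit the same pixels, one per x value. Under this reading, rows where x has already run past the circle get no pixel at all, which is one of the problems alluded to above.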

14. Bresenham

The previous algorithm begins to look like Bresenham's algorithm, which is the top of the line in pixel-oriented circle algorithms. It endeavors to generate the best possible placement of pixels describing the circle with the smallest amount of (integer) code in the inner loop. It operates with two basic concepts.

First, the curve is defined by an "error" function. For my circle, this is $E = 10000 - x^2 - y^2$. For points exactly on the circle, E=0; inside the circle, E>0; and outside the circle, E<0.

Second, the current point is nudged by one pixel in a direction that moves "forward" and in a direction that minimizes E. Consider just the octant of the circle from (0,100), moving to the right by 45 degrees. At each iteration, you can choose to move either to the right (R), x=x+1, or diagonally (D), x=x+1 and y=y-1.

The nice thing about this is that the value of E can be tracked incrementally. If the error at the current (x,y) is:

$E_{cur} = 10000 - x^2 - y^2$

then an R step will make:

$E_{new} = 10000 - (x+1)^2 - y^2 = E_{cur} - (2x+1)$

and a D step will make:

$E_{new} = 10000 - (x+1)^2 - (y-1)^2 = E_{cur} - (2x+1) + (2y-1)$

Now, for the octant in question, x<y, x>0, and y>0. So, an R step subtracts something from E and a D step adds something to E. The naive version of the algorithm determines which way to go by looking at the current sign of E, always striving to drive it toward its opposite sign:

IX = 0; IY = 100
IE = 0
WHILE IX <= IY
    IF (IE < 0)
        IE = IE + IY + IY - 1
        IY = IY - 1
    IE = IE - IX - IX - 1
    IX = IX + 1
    SETPXL(IX,IY)

15.  Improved  Bresenham 

You can do better. What you want to do at each step is actually to pick the direction that generates the smallest-size error, |E|. You want to look ahead at the two possible new error values:

$E_R = E - (2x+1)$
$E_D = E - (2x+1) + (2y-1)$

and test the sign of $|E_D| - |E_R|$. The trick is to avoid calculating absolute values. Table 1, below, shows the possibilities.

Now comes the tricky part. You can define a "biased" error from the (+ -) case:

$G = E_D + E_R$

and use this as the test for all three cases. This works for the following reason: In the (+ +) case, $|E_D| - |E_R| = 2y-1$ is positive, but so is G. In the (- -) case, $|E_D| - |E_R| = -(2y-1)$ is negative, but so is G.

G can be calculated incrementally, just like E was. The new values due to R and D steps are:

$G_R = G - 4x - 6$
$G_D = G - 4x + 4y - 10$

Further, the increments to G can be calculated incrementally. You get the idea by now . . .

IR = 100
IX = 0; IY = IR
IG = 2*IR - 3
IDGR = -6; IDGD = 4*IR - 10
WHILE IX <= IY
    IF IG < 0
        IG = IG + IDGD     "go diagonally"
        IDGD = IDGD - 8
        IY = IY - 1
    ELSE
        IG = IG + IDGR     "go right"
        IDGD = IDGD - 4
    IDGR = IDGR - 4
    IX = IX + 1
    SETPXL(IX,IY)
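The listing translates almost line for line into Python. The sketch below keeps the article's variable roles (G and its two increments) but collects pixels in a list instead of calling SETPXL; the function name `bresenham_octant` is illustrative. As in the listing, the pixel is recorded after the step, so the starting point (0, r) itself is not emitted.

```python
def bresenham_octant(r=100):
    """One octant of a radius-r circle, starting from (0, r) and moving
    right until x passes y, driven by the biased error G = E_D + E_R."""
    pixels = []
    ix, iy = 0, r
    ig = 2 * r - 3                   # G at the starting point (0, r)
    idgr, idgd = -6, 4 * r - 10      # changes to G for an R or a D step
    while ix <= iy:
        if ig < 0:                   # |E_D| < |E_R|: go diagonally
            ig += idgd
            idgd -= 8
            iy -= 1
        else:                        # go right
            ig += idgr
            idgd -= 4
        idgr -= 4
        ix += 1
        pixels.append((ix, iy))
    return pixels

octant = bresenham_octant(100)
```

With r = 100 the walk takes nine right steps along y = 100 and then makes its first diagonal move to (10, 99). The remaining seven octants would follow by swapping and negating coordinates, which is left out here.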


E_D   E_R   |E_D| - |E_R|
 +     +    E_D - E_R = 2y-1, always positive
 +     -    E_D + E_R
 -     +    can never happen
 -     -    -E_D + E_R = -(2y-1), always negative

Table 1: Testing the sign of |E_D| - |E_R|.




Whew! 

Conclusion 

So, why is all this interesting, aside from the pack rat joy of collecting things?

Well, you can certainly use these algorithms to optimize your circle-drawing programs if you're into circles. Each algorithm has its own little niche in the speed-accuracy-complexity trade-off space. Sometimes economy is misleading: the SETPXL routine often gobbles up any time you saved being clever with Bresenham's algorithm. Let's face it: unless there's something very time-critical, I usually use algorithm 1 because it's easiest to remember.

The really interesting thing about all these algorithms is the directions in which they lead you when you try to generalize them. Algorithms 2 and 7 lead to general polynomial curves. Algorithm 4 leads to iterated function theory. Algorithm 5 leads to the CORDIC method of function evaluation. Algorithm 11 has to do with rendering spheres. (I wonder what happens to algorithm 15 if you use some other simple functions of x and y for G, G_R, and G_D.) In fact, although many of these algorithms look quite similar when applied to circles, their generalizations lead to very different things. It sort of shows the underlying unity of the universe. Maybe the Greeks had something there.

Bibliography 

Newman, William M.; and Sproull, Robert F. Principles of Interactive Computer Graphics. New York: McGraw-Hill, 1979.

Turkowski, Kenneth. "Anti-Aliasing Through the Use of Coordinate Transformations." ACM Transactions on Graphics, vol. 1, no. 3 (July 1982): 215-234. Refers to CORDIC.

For references to Runge-Kutta, see any numerical analysis text, for example, Schaum's Outlines.



ARTICLES 


File  Comparison 
Algorithms 


by  Tom  Steppe 


Several popular algorithms exist for comparing two files. All of these actually look first for matches rather than differences. After the matching process has been completed, the remainders of the files that are not included in the matches are then reported as differences. (See Figure 1, page 29.)

The algorithms differ greatly in their conceptualization of the problem, however. In this article, I examine several algorithms for comparing text files (specifically, source code files) using a line as the basic unit of comparison. The ideas and algorithms I present here, however, can be extended to other types of files and other units of comparison as well. I also present a new algorithm with some interesting properties.

Evaluating  The  Algorithms 

Any file comparison algorithm should be evaluated according to several criteria:

• Is it efficient? Time efficiency (speed) and space efficiency (memory usage) are both practical considerations. Usually they are related to the lengths of the files being compared.

• Is it robust? No algorithm is flawless. For any given file comparison algorithm, it is always possible to concoct devious situations in which its performance appears less than perfect. The algorithm should, however, be able to produce reasonable difference reports for a variety of test cases.

• Can it let differences go undetected? No algorithm should allow a file difference to go undetected.

Tom Steppe, P.O. Box 2887, Ann Arbor, MI 48106. Tom designs and develops software written exclusively in C.


Determining  which 
files  are  more  equal 
than  others 


• Can it let matches go undetected? If an algorithm can overlook matching lines, it will report these lines as differences when they are not. If the file comparison is being performed to produce a delta file, this usually is not a major problem, even though each undetected match does increase the size of the delta file unnecessarily. If the differences are to be inspected visually, however, a report of false differences can be a serious drawback.

Say, for example, that you do not have a file comparison utility and so you have to compare two files by eye. This process is certainly tedious and prone to error, especially if some of the differences are subtle. If you now use a file comparison utility that is known to report false differences, you have to inspect the output by eye and decide which reported differences are true differences. The utility has not really done the job for you; it has only made your "by eye" inspection a smaller job that is still prone to error.

• Can it detect blocks of text that have been moved? Typically, if a block of text has been moved, it simply shows up in the report of differences as a large deletion of text at one location and a large insertion of text at another. Unfortunately, no differences within the moved block are highlighted.

When a file comparison is used to create a delta file, the ability to detect moved blocks of text is probably desirable because it can lead to smaller delta files. But, when a file comparison is performed so that the differences can be inspected visually, the ability to detect moved blocks is not always as handy as it might seem to be. Trying to report the moved blocks is often difficult and can lead to complicated reports of the differences, especially when a large block of text is moved, a piece of that block is moved to another location, a piece of that piece is moved to still another location, and so on. Also, the difference report can sometimes be overburdened by uninteresting reports of small blocks (one-line and two-line blocks of text) being moved all over the place.

Only one algorithm discussed here can inherently detect moved blocks of text. The other algorithms, however, can be extended to do so, as follows. After applying the algorithm, replace each matching line in each file with a line that is guaranteed never to match. This leaves only the differences, which could contain moved blocks of text. Next, reapply the algorithm to the transformed files. Any match that is found in this pass will represent a moved block of text (see Figure 2, page 29). Continue this process iteratively until no new matches can be found. Of course, the cost of this iterative behavior is longer execution time.
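The iterative extension can be sketched in Python as follows. The matcher here is a stand-in: it uses `difflib.SequenceMatcher` from the Python standard library, which is not one of the algorithms surveyed in this article, and the function names are illustrative. Fresh `object()` instances play the role of lines that are guaranteed never to match.

```python
from difflib import SequenceMatcher

def find_matches(a, b):
    """Stand-in matcher: matching blocks as (i, j, size) triples."""
    sm = SequenceMatcher(None, a, b, autojunk=False)
    return [tuple(m) for m in sm.get_matching_blocks() if m.size > 0]

def find_moved_blocks(file1, file2):
    """Reapply the matcher until no new matches appear; every match found
    after the first pass is reported as a moved block."""
    a, b = list(file1), list(file2)
    moved = []
    first_pass = True
    while True:
        matches = find_matches(a, b)
        if not matches:
            return moved
        if not first_pass:
            moved.extend(matches)
        for i, j, size in matches:
            for k in range(size):
                # A fresh object() equals only itself, so a replaced
                # line can never take part in a later match.
                a[i + k] = object()
                b[j + k] = object()
        first_pass = False

blocks = find_moved_blocks(list("ABCDXY"), list("XYABCD"))
```

In this small example the first pass pairs up the four-line block, and the second pass finds the two-line block that moved from the end of the first file to the start of the second. Each extra pass reruns the whole comparison, which is the execution-time cost mentioned above.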

These  criteria  help  to  provide  a 
useful  basis  for  surveying  popular 
file  comparison  algorithms. 

Popular  Algorithms 
for  Finding  Matches 

Scan  Until  Next  Match 

The "scan until next matching sequence" algorithm is probably the oldest method of file comparison. This algorithm starts at the tops of both files and matches as many lines as possible. When a difference is detected, the next M lines are scanned until at least N consecutive matching lines are found. If a sequence of N or more consecutive matching lines is found, the process begins again after the matching sequence. If such a sequence is not found, the process begins again M lines further down in the files. This process is repeated until the ends of the files are reached.

The values of M and N can be adjusted to affect the algorithm's performance. The value of M is used to control efficiency by restricting the number of lines that will be examined while searching for a sequence of matching lines. When an improper sequence of matching lines is discovered, the algorithm can be reapplied using a new value for N that is larger than the length of the improper sequence. In this way, the algorithm will overlook the undesirable sequence because it contains fewer than N matching lines, but as is always the case, the algorithm will also overlook any legitimate matching sequences that contain fewer than N lines (see Figure 3, page 30). Unfortunately, these matching lines are then reported as differences. All too often, this algorithm produces bad reports in common situations.

Although this algorithm is often highly time efficient, requires minimal memory, and frequently produces good difference reports, it does not take long to become frustrated with its shortcomings and inherent problems and begin looking for a better solution.
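The description leaves several details open, so the following Python sketch is only one possible reading of it. It assumes that, at a difference, both files are searched up to `m` lines ahead for the nearest pair of positions where `n` consecutive lines agree, and that each difference is reported as a pair of half-open line ranges. The function name and the report format are illustrative.

```python
def scan_until_next_match(a, b, m=20, n=2):
    """Return difference regions as (a_start, a_end, b_start, b_end)."""
    diffs = []
    i = j = 0
    while i < len(a) or j < len(b):
        if i < len(a) and j < len(b) and a[i] == b[j]:
            i += 1
            j += 1
            continue
        # Difference detected: look up to m lines ahead in each file for
        # n consecutive matching lines, trying the nearest offsets first.
        resync = None
        for total in range(1, 2 * m + 1):
            for di in range(max(0, total - m), min(total, m) + 1):
                dj = total - di
                run_a = a[i + di:i + di + n]
                run_b = b[j + dj:j + dj + n]
                if len(run_a) == n and run_a == run_b:
                    resync = (di, dj)
                    break
            if resync is not None:
                break
        if resync is None:
            # No matching sequence found: give up on the next m lines.
            resync = (min(m, len(a) - i), min(m, len(b) - j))
        di, dj = resync
        diffs.append((i, i + di, j, j + dj))
        i += di
        j += dj
    return diffs

file1 = ["A", "B", "C", "D", "E"]
file2 = ["A", "BB", "C", "E", "F"]
report = scan_until_next_match(file1, file2, m=3, n=1)
```

On the two files of Figure 1, with n = 1, `report` contains one change, one deletion, and one insertion. Raising `n` makes the resynchronization stricter, at the price described above: genuine matches shorter than `n` lines are swallowed into the surrounding difference.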

Longest  Common  Subsequence 

Think of a file as representing a sequence of lines. A subsequence of those lines is defined simply as any sequence of lines that results from removing zero or more lines from the original sequence. For example, the longest subsequence of any sequence of lines is the sequence itself, with zero lines removed. Also, a sequence of zero lines would be a subsequence of any sequence because it could be created by removing all the lines from any sequence.

The "longest common subsequence" approach to file comparison takes the two files to be compared and finds the longest sequence of lines that is a subsequence of each of the files' lines: the longest common subsequence (see Figure 4, page 30). The details of the algorithm are not discussed here, but sources of such discussions are included in the bibliography. The Unix diff command is based on this algorithm.


This algorithm provides a simple, compact formalization of the file comparison problem and produces reasonable difference reports in a variety of test cases. The reports are quite acceptable whether the comparison is being used for visual inspection of the differences or for creating a delta file. In fact, among all the algorithms discussed here, it is probably safe to say that this one consistently produces the best reports when comparing files that do not involve blocks of text that have been moved.

    file1    file2
    A        A
    B        BB
    C        C
    D        E
    E        F

    Difference report:

    ######00001 ######################file1
    #  2  B
    # ====================== changed to
    #  2  BB
    ######00001 ######################file2
    ######00001 ######################file1
    #  4  D
    # deleted
    ######00001 ######################file2
    ######00001 ######################file1
    # inserted
    #  5  F
    ######00001 ######################file2

Figure 1: File comparison algorithms actually look first for line matches and then report lines that are not included in the matches as differences. The differences are usually expressed as the changes, insertions, and deletions that can be applied to one file to make it identical to the other.

Dr. Dobb's Journal, September 1987

Sometimes  the  quality  of  the  re¬ 
ports  can  be  overshadowed  by  issues 
of  time  and  space  efficiency.  This  is 
not  always  true,  but  situations  that 
include  a  poor  combination  of  large 
files  and  limited  computer  resources 
can  lead  to  less  than  desirable  per¬ 
formance  by  this  algorithm.  A  basic 
implementation  of  the  algorithm  re¬ 
quires  linear  space  and  quadratic 
time.  In  some  cases,  the  quadratic 
time  can  prove  to  be  unacceptable.  In 
summary,  the  "longest  common  sub¬ 
sequence”  algorithm  produces  excel¬ 
lent  reports,  but  it  can  be  slow. 
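The space and time figures quoted above can be illustrated with the textbook dynamic-programming recurrence. This is a minimal sketch, not the Unix diff implementation and not the author's code: it computes only the length of the longest common subsequence, keeps two rows of the table (linear space), and fills one cell per pair of lines (quadratic time). The name `lcs_length` is hypothetical. Recovering the matched lines themselves in linear space takes more work; see the Hirschberg entry in the bibliography.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Length of the longest common subsequence of lines a[0..n-1] and
   b[0..m-1].  Returns -1 if memory cannot be allocated. */
int lcs_length(const char **a, int n, const char **b, int m)
{
    int *prev, *curr, *tmp;
    int i, j, result;

    assert(n >= 0 && m >= 0);
    prev = calloc((size_t)m + 1, sizeof *prev);   /* row i-1, all zeros */
    curr = calloc((size_t)m + 1, sizeof *curr);   /* row i */
    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return -1;
    }
    for (i = 1; i <= n; i++) {
        curr[0] = 0;
        for (j = 1; j <= m; j++) {
            if (strcmp(a[i - 1], b[j - 1]) == 0)
                curr[j] = prev[j - 1] + 1;        /* extend a common line */
            else
                curr[j] = prev[j] > curr[j - 1] ? prev[j] : curr[j - 1];
        }
        tmp = prev;                               /* reuse the old row */
        prev = curr;
        curr = tmp;
    }
    result = prev[m];
    free(prev);
    free(curr);
    return result;
}
```

For the two files of Figure 1 (A, B, C, D, E and A, BB, C, E, F) the common subsequence A, C, E gives a length of 3.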

Extended  Unique  Line  Matching 

The "extended unique line matching" algorithm is based on the idea that a line that occurs once and only once in each file must be the same line. These pairs of "unique" lines determine the initial set of matched lines. (Imaginary lines at the tops and the bottoms of the files are also added to the set of matched lines.) Then, in each file, the lines adjacent to each match are examined and, if identical, are added to the set of matched lines. This process is repeated until no new matches can be found.

Figure 3: The "scan until next matching sequence" algorithm often produces bad reports in common situations. When N=3, the algorithm settles for matches of three lines, never realizing that a match of eight lines is possible. When N=4, it discovers the match of eight lines but does not detect the remaining match of three lines (A, B, C).

This  algorithm  has  strong  intuitive 
appeal.  It  is  efficient,  being  linear  in 
both  time  and  space.  Also,  it  is  the 
only  popular  algorithm  that  inher¬ 
ently  detects  blocks  of  text  that  have 
been  moved  (even  if  some  differ¬ 
ences  exist  within  the  blocks).  Moved 
blocks  can  be  detected  because  the 
search  for  pairs  of  unique  lines  is  in 
no  way  sequential  and,  therefore, 
can  result  in  matches  that  indicate 
that  a  block  of  text  has  been  moved. 
Note  that  the  algorithm  can  find  a 
moved  block  of  text  only  if  it  contains 
a  unique  line  match  within  it. 
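The matching procedure described in this section can be sketched in a few dozen lines. This is an illustration under stated assumptions rather than a published implementation: files are arrays of C strings, the names `occurrences` and `unique_line_match` are hypothetical, and uniqueness is found by brute-force counting, which is quadratic. A real implementation would use a symbol table to reach the linear time mentioned above.

```c
#include <assert.h>
#include <string.h>

/* Number of times line s occurs among the n lines of f. */
static int occurrences(const char **f, int n, const char *s)
{
    int i, c = 0;
    for (i = 0; i < n; i++)
        if (strcmp(f[i], s) == 0)
            c++;
    return c;
}

/* On return, m1[i] is the index in f2 of the line matched with line i
   of f1 (or -1 if unmatched); m2[j] is the reverse mapping. */
void unique_line_match(const char **f1, int n1, const char **f2, int n2,
                       int *m1, int *m2)
{
    int i, j;

    assert(n1 >= 0 && n2 >= 0);
    for (i = 0; i < n1; i++) m1[i] = -1;
    for (j = 0; j < n2; j++) m2[j] = -1;

    /* Step 1: pair lines that occur exactly once in each file. */
    for (i = 0; i < n1; i++) {
        if (occurrences(f1, n1, f1[i]) != 1 ||
            occurrences(f2, n2, f1[i]) != 1)
            continue;
        for (j = 0; j < n2; j++)
            if (strcmp(f1[i], f2[j]) == 0) {
                m1[i] = j;
                m2[j] = i;
            }
    }

    /* Step 2: grow matches downward.  The imaginary line above each
       file counts as matched, so line 0 may pair with line 0. */
    for (i = 0; i < n1; i++) {
        if (m1[i] != -1 || (i > 0 && m1[i - 1] == -1))
            continue;
        j = (i == 0) ? 0 : m1[i - 1] + 1;
        if (j < n2 && m2[j] == -1 && strcmp(f1[i], f2[j]) == 0) {
            m1[i] = j;
            m2[j] = i;
        }
    }

    /* Step 3: grow matches upward, starting from the imaginary line
       below each file. */
    for (i = n1 - 1; i >= 0; i--) {
        if (m1[i] != -1 || (i < n1 - 1 && m1[i + 1] == -1))
            continue;
        j = (i == n1 - 1) ? n2 - 1 : m1[i + 1] - 1;
        if (j >= 0 && m2[j] == -1 && strcmp(f1[i], f2[j]) == 0) {
            m1[i] = j;
            m2[j] = i;
        }
    }
}
```

Because Step 1 pairs lines without regard to their order, a unique line that has moved still matches, and Steps 2 and 3 then grow the moved block around it.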

A  significant  problem  with  this  al¬ 
gorithm  is  that  it  is  prone  to  allowing 
some  matches  to  go  undetected.  This 
occurs  when  matching  lines  are  not 
neatly  flanked  by  either  unique  line 
matches  or  the  adjacent  matches  that 
have  grown  outward  from  unique 
line  matches  (see  Figure  5,  below). 

This  algorithm  is  fast  and  can  fre¬ 
quently  detect  moved  blocks  of  text, 
but  a  sacrifice  is  often  made  in  the 
quality  of  the  difference  report. 
Probably  its  best  application  is  in  the 
generation  of  delta  files  when  speed 
is  the  primary  concern. 

A  New  Algorithm 

The "recursive longest matching sequence" algorithm uses a simple yet effective approach to the problem.

This method first scans both files from beginning to end, looking for the longest sequence of consecutive matching lines. That sequence is then thought of as dividing each of the two files into an upper section and a lower section. Then, the algorithm proceeds by scanning both upper sections looking for the longest sequence of consecutive matching lines and, similarly, both lower sections for the same. These matching sequences then divide their respective sections, and the process continues recursively until no more matches can be found.

Figure 4: The "longest common subsequence" algorithm finds the longest (not necessarily consecutive) sequence of lines that is contained in both files.

This method of file comparison is easy to understand and produces acceptable difference reports across a spectrum of test cases. It uses linear space but quadratic time. Because time efficiency can be a problem in some situations, a simple modification of the algorithm is needed. An explanation of the modification requires an understanding of the method used to locate the longest sequence of matching lines between sections of two files.

First of all, once the longest sequence is known, it can be identified by a pair of starting lines: one line from each file that specifies where the sequence begins in that file. So, when searching for the longest sequence, candidate pairs of starting lines are examined successively (in some intelligent order that starts at the beginnings of both file sections), and information is continually maintained about the length and location of the longest sequence of matching lines that has been discovered so far. When the ends of the file sections are reached, the longest sequence is known and information about the sequence is reported.

The modification to this algorithm allows the searching to stop if a sequence of N matching lines is found, realizing that it might not be the longest sequence that would be discovered if the searching were allowed to continue to the ends of the sections. This allows the searching to end prematurely (before the longest sequence has been assured) and can save considerable time. N is called the "long-enough" value. The effects of the long-enough value can be examined by choosing some test pairs of files and comparing the behavior of the algorithm when a long-enough value is used and when one is not used. Quite often, the use of a reasonable long-enough value will find exactly the same sequences of matching lines (although the discoveries may occur in a different order), thus producing an identical report of the differences but with a significant improvement in speed (see Figure 6, page 32). In fact, the use of a reasonable long-enough value allows this algorithm to perform in essentially linear time for typical cases, overcoming the previous worry of time efficiency.

The long-enough value is a parameter that you can specify. To determine a good value for your purposes, first guess at the length of the longest sequence of lines you can imagine appearing more than once in a typical file. The long-enough value should be at least one larger than your guess. This will help the algorithm to avoid matching the wrong instance when a sequence of lines appears multiple times in a file. If a particular choice of long-enough value produces unsatisfactory difference reports, the algorithm can always be applied again with a larger value. When comparing C source code, I typically choose a generous value of 25, and I rarely have to rerun the comparison.

    file1    file2
    A        A
    A        A
    A        A
    B        B
    B        B
    B        B

Figure 5: The "extended unique line matching" algorithm is prone to detecting false differences. In this case, no matches are found (because there are no unique line matches) and all lines are reported as differences.

    No long-enough value:               Long-enough value = 2:

    file1  file2                        file1  file2
    A      A      3rd sequence          A      A      2nd sequence
    -      =                            -      =
    A      A      2nd sequence          A      A      1st sequence
    B      B                            B      B
    -      =                            -      =
    A      A                            A      A
    B      B      1st sequence          B      B      3rd sequence
    C      C                            C      C
    -      =                            -      =
    A      A      4th sequence          A      A      4th sequence

Figure 6: With the "recursive longest matching sequence" algorithm, the use of a long-enough value often finds exactly the same sequences of matching lines although the discoveries may occur in a different order.

Delta Files and User Reports

A file comparison utility is a versatile tool for a range of situations. It is useful to partition these situations into two distinct cases.

In the most common case, a file comparison is performed so that the differences between two versions of a text file can be inspected visually. The differences are usually expressed as the changes, insertions, and deletions that can be applied to one file to make it identical to the other file. In this case, the primary job of the comparison is to produce a concise and readable report of the differences.

In the course of editing, a file comparison can be used in this way to highlight the differences between a previous version of a file and the current version. Valid modifications can be verified, and spurious edits can be detected. As another example, if a new version of a program is produced, a partial test of its integrity could include a file comparison of its output with the output from a previous version of the program that is known to be correct. If the two outputs compare favorably, the new program passes this integrity test. If they do not compare favorably, another file comparison can be used in the debugging process to highlight the changes between a version of a source code file that is known to work and the version that does not work.

In the second case, a file comparison is performed to generate a delta file, that is, a file that contains a report of the differences between the two files. If the file comparison is thought of as comparing an old file with a new file, a backward delta file is designed so that it contains all the information necessary to recreate the old file, given the new file. A forward delta file is designed to be able to recreate the new file, given the old file. In either case, one of the original files can be eliminated without loss of information. If the delta file is smaller than the file it allows to be eliminated, this will result in a savings of disk space. The primary job of a file comparison in this case is to produce a compact delta file.

This use of a file comparison utility is particularly common in version control systems that maintain multiple historical versions of source code files. Only the current version of a source code file is saved, whereas a backward delta file is saved for each historical version. Any historical version can be recreated by applying the appropriate delta files to the current version of the file. The savings in disk space can be tremendous. (Alternatively, some version control systems save the first version of the file and the subsequent forward delta files.)

This usage is also common in telecommunications applications where a file at one or more remote sites has to be updated from a host. A forward delta file is created on the host by comparing the new file with the old file (a copy of the file that exists at the remote site). If the delta file is small, it is often more efficient to transmit the forward delta file and apply it to the old file than it is to transmit the new file in its entirety.

The  "recursive  longest  matching 
sequence”  algorithm  is  particularly 
well  suited  to  take  advantage  of  some 
common  hash  code  technology  as  a 
means  of  improving  time  perform¬ 
ance  even  more.  In  applications  that 
involve  repetitive  string  compari¬ 
sons,  it  is  often  useful  to  calculate 
hash  codes  initially  for  all  the  strings. 
Then,  the  hash  codes  are  compared 
instead  of  the  strings  themselves.  The 
comparison  of  two  hash  code  values 
is  much  quicker  than  is  the  compari¬ 
son  of  two  strings.  If  the  hash  codes 
are  not  equal,  the  strings  cannot  pos¬ 
sibly  be  the  same  and  need  not  be 
compared.  If  the  hash  codes  are 
equal,  only  then  must  the  strings  be 
compared  to  prove  or  disprove  their 
equality. 
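The compare-hashes-first idea can be sketched as follows. The hash function shown is an arbitrary multiplicative hash chosen for illustration; the article does not say which hash Listing One uses, and the names `line_hash` and `lines_equal` are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* A simple multiplicative string hash (an arbitrary choice). */
unsigned line_hash(const char *s)
{
    unsigned h = 0;

    assert(s != NULL);
    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return h;
}

/* Compare two lines whose hash codes were computed ahead of time.
   The strings are examined only when the hash codes agree. */
int lines_equal(const char *a, unsigned ha, const char *b, unsigned hb)
{
    if (ha != hb)
        return 0;               /* different hashes: cannot be equal */
    return strcmp(a, b) == 0;   /* same hash: confirm with the strings */
}
```

The string comparison in the second step is still required, because two different lines can share a hash code.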

The  performance  benefits  are 
even  more  dramatic  when  hash 
codes  are  used  with  the  "recursive 
longest  matching  sequence”  algo¬ 
rithm.  When  searching  for  the  lon¬ 
gest  sequence  of  matching  lines, 
strings  do  not  have  to  be  compared 
every  time  a  pair  of  matching  hash 
codes  is  found.  Instead,  strings  only 
have  to  be  compared  once  a  se¬ 
quence  of  matching  hash  codes  is 
found  that  is  longer  than  the  longest 
sequence  yet  found. 

The  time  efficiency  can  be  im¬ 
proved  even  further  if  a  hash  code 
table  is  maintained  for  each  file.  The 
table  should  consist  of  an  array  that 
contains  as  many  elements  as  there 
are  possible  hash  code  values.  Each 
element  of  the  array  should  consist 
of  a  linked  list  of  line  numbers  for 
lines  whose  hash  code  values  are 
equal  to  the  array  index.  This  table 


can  easily  be  created  by  processing 
each  line  in  the  file,  calculating  its 
hash  code  value,  and  adding  its  line 
number  to  the  proper  linked  list. 
Now,  while  searching  for  the  longest 
sequence  of  matching  lines  by  exam¬ 
ining  pairs  of  starting  line  numbers, 
the  number  of  candidate  pairs  can  be 
greatly  reduced.  For  any  given  line 
in  One  file,  only  those  lines  in  the 
other  file  that  have  the  same  hash 
code  value  (as  can  be  easily  deter¬ 
mined  from  the  file’s  hash  code  ta¬ 
ble)  need  to  be  considered. 
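A hash code table of the kind just described might look like the sketch below. It assumes a deliberately small range of 256 hash values and reuses an arbitrary multiplicative hash; `HASH_SIZE`, `LineNode`, `hash_line`, `build_table`, and `free_table` are hypothetical names, not taken from Listing One.

```c
#include <assert.h>
#include <stdlib.h>

#define HASH_SIZE 256   /* number of possible hash code values (assumed) */

struct LineNode {
    int line;                 /* line number within the file */
    struct LineNode *next;    /* next line with the same hash code */
};

/* Arbitrary hash reduced to the range 0..HASH_SIZE-1. */
unsigned hash_line(const char *s)
{
    unsigned h = 0;
    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return h % HASH_SIZE;
}

/* Fill table[] so that table[h] lists, in ascending order, the numbers
   of all lines whose hash code is h.  Returns 0, or -1 if out of memory. */
int build_table(const char **lines, int n, struct LineNode *table[HASH_SIZE])
{
    int i;

    assert(n >= 0);
    for (i = 0; i < HASH_SIZE; i++)
        table[i] = NULL;
    /* Walk backward so that pushing at the head leaves lists ascending. */
    for (i = n - 1; i >= 0; i--) {
        unsigned h = hash_line(lines[i]);
        struct LineNode *node = malloc(sizeof *node);
        if (node == NULL)
            return -1;
        node->line = i;
        node->next = table[h];
        table[h] = node;
    }
    return 0;
}

/* Release every list in the table. */
void free_table(struct LineNode *table[HASH_SIZE])
{
    int i;
    for (i = 0; i < HASH_SIZE; i++)
        while (table[i] != NULL) {
            struct LineNode *next = table[i]->next;
            free(table[i]);
            table[i] = next;
        }
}
```

To find candidate partners for a line of one file, hash the line and walk only the matching list in the other file's table instead of scanning every line.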

A  basic  C  implementation  of  the 
“recursive  longest  matching  se¬ 
quence”  algorithm  is  shown  in  List¬ 
ing  One,  page  54.  Its  simplicity,  com¬ 
bined  with  a  long-enough  value 
modification  and  some  clever  use  of 
hash  codes,  makes  it  a  viable  solution 
to  the  file  comparison  problem.  It  is 
suitable  for  both  delta  creation  and 
visual  inspection  purposes. 
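For readers without Listing One at hand, the recursive search and its long-enough cutoff can be sketched as below. This is not Listing One: it is a naive illustration that compares strings directly, uses no hash codes, and can take more than quadratic time in the worst case. The names `longest_run` and `match_sections` and the convention that a `long_enough` of 0 disables the cutoff are assumptions made for this example.

```c
#include <assert.h>
#include <string.h>

/* Find the longest run of consecutive matching lines between
   f1[lo1..hi1-1] and f2[lo2..hi2-1].  The search stops early once a
   run of at least long_enough lines is found (0 disables the cutoff).
   Returns the run length and stores its starting lines in *s1, *s2. */
static int longest_run(const char **f1, int lo1, int hi1,
                       const char **f2, int lo2, int hi2,
                       int long_enough, int *s1, int *s2)
{
    int i, j, k, best = 0;

    for (i = lo1; i < hi1; i++)
        for (j = lo2; j < hi2; j++) {
            k = 0;
            while (i + k < hi1 && j + k < hi2 &&
                   strcmp(f1[i + k], f2[j + k]) == 0)
                k++;
            if (k > best) {
                best = k;
                *s1 = i;
                *s2 = j;
                if (long_enough > 0 && best >= long_enough)
                    return best;       /* long enough: stop searching */
            }
        }
    return best;
}

/* Record matches in m1: m1[i] becomes the line of f2 matched with line
   i of f1.  The caller fills m1 with -1 beforehand. */
void match_sections(const char **f1, int lo1, int hi1,
                    const char **f2, int lo2, int hi2,
                    int long_enough, int *m1)
{
    int s1 = 0, s2 = 0, len, k;

    assert(lo1 <= hi1 && lo2 <= hi2);
    len = longest_run(f1, lo1, hi1, f2, lo2, hi2, long_enough, &s1, &s2);
    if (len == 0)
        return;                        /* no more matches in this section */
    for (k = 0; k < len; k++)
        m1[s1 + k] = s2 + k;
    /* The run divides each file into an upper and a lower section. */
    match_sections(f1, lo1, s1, f2, lo2, s2, long_enough, m1);
    match_sections(f1, s1 + len, hi1, f2, s2 + len, hi2, long_enough, m1);
}
```

Run on two ten-line files shaped like those in Figure 6, this sketch records the same matches with a long-enough value of 2 as with no cutoff, although it discovers them in a different order.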

Availability 

All the source code for articles in this issue is available on a single disk. To order, send $14.95 to Dr. Dobb's Journal, 501 Galveston Dr., Redwood City, CA 94063, or call (415) 366-3600, ext. 216. Please specify issue number and format (MS-DOS, Macintosh, Kaypro).

You can also purchase a full-featured executable version of this algorithm from Stepping Stone Software, P.O. Box 2887, Ann Arbor, MI 48106 for $30. The available format is MS-DOS 5 1/4-inch DSDD.

Bibliography 

Heckel, Paul. "A Technique for Isolating Differences Between Files." Communications of the ACM, vol. 21, no. 4 (April 1978): 264-268.

Hirschberg, D. S. "A Linear Space Algorithm for Computing Maximal Common Subsequences." Communications of the ACM, vol. 18, no. 6 (June 1975): 341-343.

Wagner, Robert A.; and Fischer, Michael J. "The String-to-String Correction Problem." Journal of the Association for Computing Machinery, vol. 21, no. 1 (January 1974): 168-173.

DDJ 

(Listing  begins  on  page  54.) 



ARTICLES 


The  XOR  Chain 
Revisited 

by  Bennette  R.  Harris 


In  "The  XOR  Chain”  ( DDJ ,  June 
1987),  David  Cortesi  showed  how 
to  use  the  exclusive-OR  (XOR)  op¬ 
eration  to  compress  two  pointer 
items  into  one  link  field  in  a  doubly 
linked  list.  This  trick  is  just  one  of 
many  that  have  been  used  for  this 
purpose  over  the  years.  In  this  article 
I’ll  describe  some  similar  tricks  for 
doing  the  same  thing,  discuss  their 
relative  advantages  and  disadvan¬ 
tages,  and  then  present  a  package  of 
C  functions  that  use  the  XOR  tech¬ 
nique  to  manipulate  binary  trees  of 
arbitrary  depth  without  using  stacks. 

The  Tricks  of  the  Trade 

Cortesi 's  article  discussed  a  few  im¬ 
portant  properties  of  the  XOR 
operation: 


A  XOR  0  =  A  (1) 

A  XOR  A  =  0  (2) 

(A  XOR  C)  XOR  A  =  C  (3) 

(A  XOR  C)  XOR  C  =  A  (4) 


Properties 3 and 4 make it possible to store two pointers, A and C, in a single field as A XOR C and then recover either one of the two provided the other is known. Thus, if the addresses A, B, and C point to three nodes in sequence, and if the link field W of node B is set so that:

    W = A XOR C

then W XOR A yields C, permitting travel in one direction through the list, whereas W XOR C yields A, permitting travel in the other direction.

Bennette H. Harris, 231 S. Janesville St., Whitewater, WI 53190. Dr. Harris is an assistant professor of mathematics and computer science at the University of Wisconsin-Whitewater. He teaches in the Management Computer Systems program and he also serves as a consultant both privately and through the university's Small Business Development Center.

Managing trees without stacks

The  basic  purpose  of  a  storage 
technique  such  as  this,  as  the  preced¬ 
ing  example  illustrates,  is  to  permit 
travel  in  two  directions  through  a 
data  structure  while  storing  only  a 
single  link  field.  It  assumes  that  you 
entered  the  data  structure  at  a  prede¬ 
fined  entry  point  (such  as  a  head  or 
root  node)  and  that  an  adjacent  pair 
of  nodes  are  referenced  at  any  given 
point  in  time. 
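In present-day C the XOR link for a list might be sketched as follows. This is an illustration, not Cortesi's code: it assumes a platform that provides `uintptr_t` and on which a null pointer converts to the integer 0 and back, and the names `Node`, `xor_ptr`, `step`, and `chain` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct Node {
    int value;
    uintptr_t link;   /* address of previous node XOR address of next node */
};

/* Combine two node addresses into one link field (W = A XOR C). */
static uintptr_t xor_ptr(struct Node *a, struct Node *c)
{
    return (uintptr_t)a ^ (uintptr_t)c;
}

/* Given the node we came from and the node being visited, return the
   node on the far side (W XOR A yields C, W XOR C yields A). */
struct Node *step(struct Node *from, struct Node *cur)
{
    assert(cur != NULL);
    return (struct Node *)(cur->link ^ (uintptr_t)from);
}

/* Link an array of n nodes into a chain.  The ends use NULL as the
   missing neighbor. */
void chain(struct Node *nodes, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        struct Node *prev = (i > 0) ? &nodes[i - 1] : NULL;
        struct Node *next = (i < n - 1) ? &nodes[i + 1] : NULL;
        nodes[i].link = xor_ptr(prev, next);
    }
}
```

Note that `step` needs both members of the adjacent pair: the same function walks the chain in either direction, depending only on which neighbor is supplied as `from`.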

It  is  not  essential  that  you  use  the 
XOR  operation  for  combining  the  ad¬ 
dresses — you  can  use  almost  any  re¬ 
versible  operation.  For  example,  pro¬ 
vided  the  addresses  are  in  a  range 
that  won’t  generate  overflow  errors, 
you  could  use  the  operations: 

W  =  A  +  C 
C  =  W  -  A 
A  =  W-C 

to  move  about  in  the  data  structure. 
This  observation  might  be  handy  if 
you  wanted  to  use  this  technique  in 
an  environment  in  which  XOR  was 
not  available.  The  chief  advantages 
of  using  XOR  are  its  speed  and  never 


getting  results  that  are  out  of  range. 
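The additive variant can be sketched in the same way. The example assumes addresses are handled as `uintptr_t` values; because unsigned arithmetic in C wraps around rather than overflowing, the subtraction recovers the other address even when the sum wraps, so the overflow caveat applies mainly to signed or range-checked arithmetic. The names `combine`, `recover`, and `check_round_trip` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* W = A + C */
uintptr_t combine(uintptr_t a, uintptr_t c)
{
    return a + c;
}

/* C = W - A, or equally A = W - C */
uintptr_t recover(uintptr_t w, uintptr_t known)
{
    return w - known;
}

/* Confirm that either address can be recovered from the combined link. */
void check_round_trip(uintptr_t a, uintptr_t c)
{
    uintptr_t w = combine(a, c);

    assert(recover(w, a) == c);
    assert(recover(w, c) == a);
}
```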

Any  such  technique  exchanges  a 
fairly  minimal  amount  of  computa¬ 
tion  for  a  reduction  in  the  amount  of 
storage  required  to  hold  pointers. 
All,  however,  suffer  from  the  prob¬ 
lem  that  the  data  stored  in  a  link  field 
is  "unnatural” — it  does  not  actually 
represent  a  pointer  to  any  one  node 
of  the  data  structure.  Instead,  its  con¬ 
tents  are  encoded,  and  the  informa¬ 
tion  can  only  be  recovered  using  an 
adjacent  node’s  pointer  as  a  key.  If  a 
portion  of  the  data  structure  be¬ 
comes  corrupted,  the  data  structure 
as  a  whole  becomes  practically 
worthless  because  it's  virtually  im¬ 
possible  to  make  any  sense  out  of 
links  in  nodes  located  beyond  the 
point  of  corruption.  Also,  such  link 
fields  are  hard  to  trace  from  a  dump 
of  the  data  structure. 

You  can  achieve  the  same  space 
savings  while  maintaining  "natural” 
links  by  reversing  the  link  field  of 
the  current  node  to  point  backward 
to  the  previous  node  as  you  go  down 
the  data  structure  and  then  reversing 
the  link  again  as  you  go  back  up.  This 
way,  the  link  field  remains  a  true 
pointer  to  nodes  of  the  data  struc¬ 
ture.  Supposing  that  addresses  A,  B, 
and  C  pointed  to  three  nodes  in  se¬ 
quence,  the  link  field  W  of  node  B 
would  initially  be  set  so  that  W  =  C. 
Then,  as  node  B  was  accessed  travel¬ 
ing  forward,  a  series  of  statements 
similar  to  the  following  would  be 
executed: 

NextNode  =  W 
W  =  PrevNode 

On the return back through the structure, of course, the link field of node B would be reset using the same statements. For an example of this technique in action, see Allen Holub's discussion of nonrecursive tree traversal in C Chest, DDJ, July 1986.
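A minimal sketch of link reversal on a singly linked list follows. It is an illustration of the idea only, not Holub's code; the names `Cell`, `Cursor`, `forward`, and `backward` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct Cell {
    int value;
    struct Cell *link;   /* next cell, or the previous cell once reversed */
};

struct Cursor {
    struct Cell *prev;   /* cell we came from (NULL at the head) */
    struct Cell *cur;    /* cell being visited (NULL past the tail) */
};

/* Move forward one cell, reversing the visited cell's link. */
void forward(struct Cursor *c)
{
    struct Cell *next;

    assert(c->cur != NULL);
    next = c->cur->link;       /* NextNode = W */
    c->cur->link = c->prev;    /* W = PrevNode */
    c->prev = c->cur;
    c->cur = next;
}

/* Move back one cell, restoring the link that forward() reversed. */
void backward(struct Cursor *c)
{
    struct Cell *before;

    assert(c->prev != NULL);
    before = c->prev->link;    /* the reversed link leads further back */
    c->prev->link = c->cur;    /* put the forward link back */
    c->cur = c->prev;
    c->prev = before;
}
```

Every link is a true pointer at all times, but the list is only back in its original state after as many `backward` calls as `forward` calls.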

This traversal technique exchanges the XOR computations for the overhead required to reverse the link fields. Although links are always true pointers to interesting places, a disadvantage in this case is that the data structure must be exited from completely in the reverse manner to that in which it was entered in order to restore pointers to a useful state. This is undesirable if the data structure is a large linked list, although it's perhaps acceptable in a tree. A second undesirable feature of this method is that, if a program or computer were to go down while in the middle of such an operation, the data would again be corrupted, although not irreparably so. Because the likelihood of such problems seems to be relatively high in a microcomputer environment, this is usually not acceptable. Finally, observe that the data must be accessed twice: once to read it and once to write it again with the new link. In some applications this could represent a significant amount of overhead.

Given  the  alternatives  described 
here,  the  XOR  chain  stands  out  as  one 
of  the  safest  link  compression  tech¬ 
niques  because  the  link  fields  are  not 
changed  in  traversing  the  data 
structure. 

Climbing  the  Tree 

To  keep  matters  simple,  I  will  focus 
on  binary  trees  in  this  article.  Those 
of  you  who  have  experience  with 
more  general  tree  structures  will  be 
able  to  make  the  necessary 
adjustments. 

In  a  binary  tree,  each  node  con¬ 
tains  two  link  fields — a  left  link  and  a 
right  link — which  point  to  the  two 
children  of  the  current  (parent)  node. 
Routines  for  manipulating  trees  of¬ 
ten  make  use  of  stacks  or  recursive 
programming  techniques  to  main¬ 
tain  the  back-pointers  required  by 
these  routines.  In  many  cases  the 
necessary  stack  space  must  be  set 
aside  in  advance,  limiting  the  poten¬ 
tial  depth  of  the  tree.  With  the  XOR 
chain  technique,  the  link  fields  can 
be  coded  so  they  can  be  used  to  travel 
either  from  parent  to  child  or  from 


child  to  parent,  much  as  in  the  case 
of  linked  lists.  This  makes  it  possible 
to  implement  a  binary  tree  without 
setting  aside  a  stack  to  maintain  back- 
pointers  for  traversing  the  tree. 

The  only  reason  why  implement¬ 
ing  a  tree  is  not  trivial  has  to  do  with 
the  fact  that  a  node  has  two  children. 
Suppose  you  wish  to  visit  the  nodes 
in  a  tree  in  the  fashion  known  as  "in- 
order,”  visiting  the  left  child,  then 
the  parent,  and  finally  the  right 
child.  When  you  return  from  a  child 
to  its  parent,  you  need  to  know 
whether  the  child  was  a  left  child  or 
a  right  child.  If  the  child  was  a  left 
child,  then  the  parent  must  be  visited 




next.  If  the  child  was  a  right  child, 
then  the  parent  has  already  been  vis¬ 
ited,  and  it  is  time  to  return  to  the 
parent's  parent.  This  is  not  hard  to 
determine  if  the  links  of  the  parent 
can  be  matched  with  the  node  just 
visited,  but  when  these  links  have 
been  coded  as  described  earlier,  it  is 
not  so  simple. 

There  are  at  least  two  ways  out  of 
the  difficulty.  One  is  to  include  an  ex¬ 
tra  bit  in  each  node,  indicating 
whether  it  is  a  left  or  right  child.  This 
extra  bit  is  a  small  sacrifice  for  the 
convenience  of  the  doubly  linked 
pointers  but  still  might  be  wasteful  in 
a  large  data  structure.  The  other  way 
is  available  only  if  the  binary  tree  is 
sorted.  If  the  tree  is  sorted  so  that  (for 
example)  the  key  field  of  the  left 
child  is  less  than  the  key  of  the  par¬ 
ent,  which  is  in  turn  less  than  or 
equal  to  the  key  of  the  right  child, 
then  whether  a  node  is  a  left  child  or 
a  right  child  can  be  determined  by 
comparing  the  child’s  key  field  with 
that  of  its  parent.  Because  binary 
trees  are  frequently  used  to  index 
other  data  structures,  this  criterion  is 
often  met. 


Defining  an  Item 

The  structure  of  an  item  (or  node)  in 
the  tree  is  defined  as  follows: 

    struct Item
    {
        int key;
        unsigned llink,
                 rlink;
        /* other stuff */
    };

The  key  field  contains  the  key  based 
on  which  the  tree  is  sorted.  Here,  I 
have  chosen  key  to  be  of  type  int,  but 
you  could  use  other  types,  depend¬ 
ing  on  your  application.  The  link 
fields  contain  the  XORed  linking  ad¬ 
dresses.  The  "other  stuff”  is  applica¬ 
tion  specific. 

I  assume  that  Items  are  created  and 
destroyed  as  in  Cortesi's  article: 

    extern struct Item *MakeItem();
    extern void DropItem();    /* called as DropItem(i), with struct Item *i */

Defining  a  Tree 

A  binary  tree  consists  of  a  collection 
of  zero  or  more  Items  that  satisfy  a 
hierarchical  relationship.  If  a  tree  is 
not  empty,  then  it  has  a  unique  root 
item  that  may  have  zero,  one,  or  two 
children.  Each  child  item  is  the  root 
of  a  subtree  that  is  itself  a  tree.  The 
root  item  of  a  tree  is  an  Item  just  like 
any  other  in  the  tree,  except  that  it  is 
not  the  child  of  any  other  item.  It  is 
convenient  not  to  store  any  data  in 
the  root  item  so  that  an  empty  tree 
consists  of  a  root  item  with  no  chil¬ 
dren.  This  makes  it  possible  to  define 
a  function  to  test  for  an  empty  tree: 

    int Empty(r)
    struct Item *r;
    {
        return( (NULL == r->llink) && (NULL == r->rlink) );
    }

I'm assuming that the tree is sorted in-order fashion; that is, the key of the left child must be less than the key of the parent and the key of the right child must be greater than or equal to the key of the parent. I'm giving the root item an artificially high key so that the data-containing items are all found in the root's left subtree. The root should be initialized with:

    root->key = MaxKey;

where MaxKey is an appropriately defined value greater than any other key in the tree.

Traversing  the  Tree 

To  travel  within  a  tree,  you  must  en¬ 
ter  at  its  root  and  then  move  up  or 
down  the  tree  from  child  to  parent 
or  parent  to  child.  At  any  point  with¬ 
in  the  tree,  your  position  is  described 
by  a  child-parent  pair  of  pointers 
that  contains  the  addresses  of  the 
item  (child)  being  visited  and  its  par¬ 
ent.  As  in  Cortesi’s  article,  these  ad¬ 
dresses  are  stored,  together  with  the 
address  of  the  tree's  root,  in  a  record 
called  a  Scan: 

    struct Scan
    {
        struct Item *parent,
                    *child,
                    *root;
    };

Before  it  can  be  used,  a  Scan  must 
first  be  associated  with  a  particular 
tree  by  having  its  root  field  set: 

    void Associate(s, r)
    struct Scan *s;
    struct Item *r;
    {
        s->root = r;
    }


Once  a  Scan  is  associated  with  a 
particular  tree,  it  can  be  positioned  at 
the  root  of  the  tree  to  begin  the  tree 
traversal: 

    void ToRoot(s)
    struct Scan *s;
    {
        s->parent = NULL;
        s->child = s->root;
    }


Because  the  root  item  of  a  tree  has 
no  parent,  you  can  very  easily  test  to 
see  if  a  Scan  is  at  the  root  of  the  tree: 

    int AtRoot(s)
    struct Scan *s;
    {
        return(s->parent == NULL);
    }


This  function  can  be  used  to  detect 
the  end  of  a  sequential  walk  through 
the  tree  because  the  root's  key  value 
is  greater  than  all  other  keys  in  the 
tree  and  so  it  will  be  visited  last. 

Moving  the  Scan 

Three  fundamental  actions  must  be 
described  for  moving  around  within 
the  tree:  a  move  from  the  parent  to 
the  left  child,  a  move  from  the  par¬ 
ent  to  the  right  child,  and  a  move 
from  a  child  to  its  parent.  The  first 
two  are  easy  and  are  described  by 
the  code  in  Example  1,  below. 


The third move is more interesting (see Example 2, below). You must be able to recover the parent's parent to implement a move from the child to its parent. To do so, you must determine whether the child is a left or right child so you can XOR with the proper link field. Naturally, the IsLeft function needs to be modified if the necessary position information is coded differently in the tree, as would be the case for height-balanced trees that allowed nonunique keys.

Notice  that  the  procedures  for 
these  three  actions  are  not  recursive 
and  do  not  require  the  use  of  a  stack 
for  back-pointers. 
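As a rough illustration of the three moves, here is a minimal sketch in present-day C. It assumes that each link field holds the parent's address XORed with the child's address, and it stores the links as uintptr_t because standard C does not permit ^ on pointers directly. The helper names xor_link and other_end, the integer key, and the assumption that NULL converts to the integer 0 are choices made for this sketch, not part of the published listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct Item {
    int       key;
    uintptr_t llink;   /* parent address XOR left-child address  */
    uintptr_t rlink;   /* parent address XOR right-child address */
};

struct Scan {
    struct Item *parent, *child, *root;
};

/* Hypothetical helper: combine two addresses into one link field. */
static uintptr_t xor_link(const struct Item *a, const struct Item *b)
{
    return (uintptr_t)a ^ (uintptr_t)b;
}

/* Recover the unknown neighbor from a link field and the known neighbor. */
static struct Item *other_end(uintptr_t link, const struct Item *known)
{
    return (struct Item *)(link ^ (uintptr_t)known);
}

/* Parent to left child: XOR the left link with the parent's address. */
static void GoLeft(struct Scan *s)
{
    struct Item *i = other_end(s->child->llink, s->parent);
    s->parent = s->child;
    s->child = i;
}

/* Parent to right child: same idea, using the right link. */
static void GoRight(struct Scan *s)
{
    struct Item *i = other_end(s->child->rlink, s->parent);
    s->parent = s->child;
    s->child = i;
}

/* A child is a left child if its key is less than its parent's key. */
static int IsLeft(const struct Scan *s)
{
    assert(s->parent != NULL);   /* the root has no parent to compare with */
    return s->child->key < s->parent->key;
}

/* Child to parent: XOR the proper link of the parent with the child. */
static void GoParent(struct Scan *s)
{
    uintptr_t link = IsLeft(s) ? s->parent->llink : s->parent->rlink;
    struct Item *i = other_end(link, s->child);
    s->child = s->parent;
    s->parent = i;
}
```

None of the three moves loops or recurses; each one touches a single link field and the two pointers held in the Scan.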

Once the three fundamental actions have been described, you can write
procedures that will process the tree sequentially in either direction
as if it were a linked list. I'll do an in-order traversal here; that
is, visit the left subtree recursively, then the root, then the right
subtree. First, you need to be able to move forward and backward within
the tree's list, as shown in Example 3, page 39. Both these procedures
are designed to stop at the root item to make it easy to test for the
ends of the tree's list.

To  start  sequential  processing,  you 
first  have  to  locate  the  head  or  tail  of 
the  list: 

void ToHead(s)
struct Scan *s;
{
    ToRoot(s);
    while (s->child->llink != s->parent)
        GoLeft(s);
}

void ToTail(s)
struct Scan *s;
{
    ToRoot(s);
    GoBak(s);
}

void GoLeft(s)
struct Scan *s;
{
    struct Item *i;

    i = s->parent ^ s->child->llink;
    s->parent = s->child;
    s->child = i;
}

void GoRight(s)
struct Scan *s;
{
    struct Item *i;

    i = s->parent ^ s->child->rlink;
    s->parent = s->child;
    s->child = i;
}

Example 1: Moving from the parent to its children

int IsLeft(s)
struct Scan *s;
{
    return(s->child->key < s->parent->key);
}

void GoParent(s)
struct Scan *s;
{
    struct Item *i;

    if (IsLeft(s))
        i = s->parent->llink ^ s->child;
    else
        i = s->parent->rlink ^ s->child;
    s->child = s->parent;
    s->parent = i;
}

Example 2: Moving from a child to its parent


void GoFwd(s)
struct Scan *s;
{
    if (s->child->rlink != s->parent)
    {
        GoRight(s);
        while (s->child->llink != s->parent)
            GoLeft(s);
    }
    else
    {
        while (!AtRoot(s) && !IsLeft(s))
            GoParent(s);
        if (!AtRoot(s))
            GoParent(s);
    }
}

void GoBak(s)
struct Scan *s;
{
    if (s->child->llink != s->parent)
    {
        GoLeft(s);
        while (s->child->rlink != s->parent)
            GoRight(s);
    }
    else
    {
        while (!AtRoot(s) && IsLeft(s))
            GoParent(s);
        if (!AtRoot(s))
            GoParent(s);
    }
}

Example 3: Moving backward and forward within the tree's list

Inserting an Item

Unlike the case for a linked list, new items cannot be inserted at
arbitrary places within the tree; instead, new items must be added to
the tree in such a way that the tree remains sorted.

In the insertion procedure shown in Example 4, below, the scan is
passed to identify the tree in which the item is to be inserted. The
tree is traversed (not sequentially!) until the proper insertion point
for the item is found. The item is then inserted as a leaf of the tree,
with no children of its own. First, the appropriate link field of the
item's parent is set to point to the new item (using XOR), and then the
item's links are set to point back to its parent. Finally, the scan is
returned with the new item as the current active child.

void Insert(i, s)
struct Item *i;
struct Scan *s;
{
    ToRoot(s);
    while (s->child != NULL)
        if (i->key < s->child->key)
            GoLeft(s);
        else
            GoRight(s);
    if (i->key < s->parent->key)
        s->parent->llink ^= i;
    else
        s->parent->rlink ^= i;
    i->llink = s->parent;
    i->rlink = s->parent;
    s->child = i;
}

Example 4: The insertion procedure

Deleting an Item

Deleting an item from a binary tree is much more complicated because
the remaining items must still maintain the binary structure and sorted
arrangement that the tree had before the item was deleted.

In my implementation, if the item has only one child, then I delete it
by pointing its parent to the nonempty subtree, and vice versa. If the
item has two children, I delete the item by removing it and replacing
it with the greatest item in the left subtree. This item will have at
most one child (a left child), and so the item can be repositioned with
a minimum of manipulation of the link fields. Thus, the procedure used
here must handle four cases:

1. The item to be deleted has no children (an item has no children if
its link fields are equal to each other because both would point back
to the item's parent).

2. The item has a right child but no left child (an item has no left
child if the left link equals the pointer to the item's parent).

3. The item has a left child but no right child.

4. The item has both a left and a right child. There are two
possibilities:

a. If the left child has no right child of its own, then the left child
replaces the item and picks up the item's right subtree.

b. If the left child has a right child, you travel down through right
children as far as you can go to locate the greatest item less than the
item to be deleted.

These cases are more difficult than usual, even for a tree delete
routine, because not only must parent pointers be adjusted to point to
their new children but also child pointers must be adjusted to point
back to their new parents. This means testing to make sure the children
exist before accessing their link fields.

When the deletion procedure (see Listing One, page 66) is called, the
node to be deleted is pointed to by s->child. The scan is set back to
the parent (by a call to GoParent) before the routine returns so that
no scan is ever returned with a NULL child (because the child item of
the scan is the item being visited). In fact, GoParent is called early
in the procedure, before any pointers are rearranged.
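The insertion procedure and the sequential walk described earlier can be exercised together. The following is a minimal, self-contained sketch in present-day C, not the published listing: links are stored as uintptr_t so that XOR is legal, the helper name x is hypothetical, keys are assumed to be unique integers, and the tree is assumed to start as a lone root item whose key (INT_MAX here) is greater than every key that will be inserted.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

struct Item { int key; uintptr_t llink, rlink; };   /* link = parent XOR child */
struct Scan { struct Item *parent, *child, *root; };

/* Hypothetical helper: XOR a link with the neighbor that is already known. */
static struct Item *x(uintptr_t link, const struct Item *known)
{
    return (struct Item *)(link ^ (uintptr_t)known);
}

static void ToRoot(struct Scan *s) { s->parent = NULL; s->child = s->root; }
static int  AtRoot(const struct Scan *s) { return s->parent == NULL; }
static int  IsLeft(const struct Scan *s) { return s->child->key < s->parent->key; }

static void GoLeft(struct Scan *s)
{
    struct Item *i = x(s->child->llink, s->parent);
    s->parent = s->child;
    s->child = i;
}

static void GoRight(struct Scan *s)
{
    struct Item *i = x(s->child->rlink, s->parent);
    s->parent = s->child;
    s->child = i;
}

static void GoParent(struct Scan *s)
{
    struct Item *i = x(IsLeft(s) ? s->parent->llink : s->parent->rlink, s->child);
    s->child = s->parent;
    s->parent = i;
}

/* Descend to the empty slot, then splice the new leaf in with XOR. */
static void Insert(struct Item *i, struct Scan *s)
{
    ToRoot(s);
    while (s->child != NULL) {
        if (i->key < s->child->key) GoLeft(s); else GoRight(s);
    }
    if (i->key < s->parent->key) s->parent->llink ^= (uintptr_t)i;
    else                         s->parent->rlink ^= (uintptr_t)i;
    i->llink = i->rlink = (uintptr_t)s->parent;   /* leaf: both links = parent */
    s->child = i;
}

/* A link equal to the parent's address means "no child on that side". */
static void ToHead(struct Scan *s)
{
    ToRoot(s);
    while (s->child->llink != (uintptr_t)s->parent) GoLeft(s);
}

/* Step to the in-order successor; stops at the root item. */
static void GoFwd(struct Scan *s)
{
    if (s->child->rlink != (uintptr_t)s->parent) {
        GoRight(s);
        while (s->child->llink != (uintptr_t)s->parent) GoLeft(s);
    } else {
        while (!AtRoot(s) && !IsLeft(s)) GoParent(s);
        if (!AtRoot(s)) GoParent(s);
    }
}
```

A walk then reads like a linked-list loop: start with ToHead, visit s.child, and call GoFwd until AtRoot reports that the root item has been reached.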

Conclusion 

As with any arbitrary binary tree system, it is possible for the tree
to become quite unbalanced after several insertions or deletions,
degrading the performance of the system. A good question is whether the
same idea could be woven into an AVL height-balanced tree system such
as that discussed by Allen Holub (DDJ, August 1986, page 20). The
answer is no, not in the form presented here. My routines rely heavily
on the fact that the key of any item is strictly greater than the key
of its left child. This cannot be guaranteed in an AVL tree if
nonunique keys are allowed. In that case, it would be necessary to use
the alternative approach I mentioned earlier: code a bit in each item,
marking it as either a left child or a right child. With this
modification, height balancing is possible.

Another interesting implementation would involve B-trees and other
generalized tree structures. Such implementations are possible with
unique keys, or with a few code bits added to each tree item, and can
be coded in much the same way as the examples shown here.

Finally, I should say a few words about another nonrecursive traversal
method, namely threaded trees. The main problem with this technique is
maintaining the threads as items are added to and deleted from the
tree. By their very nature, threaded links are links to items that
usually are not adjacent to the item being added or deleted, and so
there are very few easy cases in the add and delete algorithms. I will
leave it to you to determine which has the more substantial overhead.

A wise instructor once said, "Ninety-nine percent of all programs using
recursion could be written more effectively without it." The XOR chain
technique makes it possible to apply this maxim to tree structures as
well. With a little thought and creativity, you should be able to
customize these ideas to fit your needs.

Availability 

All the source code for articles in this issue is available on a single
disk. To order, send $14.95 to Dr. Dobb's Journal, 501 Galveston Dr.,
Redwood City, CA 94063 or call (415) 366-3600, ext. 215. Please specify
the issue number and format (MS-DOS, Macintosh, Kaypro).

Bibliography 

Holub, Allen. "C Chest: Nonrecursive Tree Traversal." DDJ 117 (July
1986): 18-20. There's a bug fix in DDJ 123 (January 1987): 103.

Holub, Allen. "C Chest." DDJ 118 (August 1986): 20-29.

Knuth, Donald E. The Art of Computer Programming, Volume 3: Sorting and
Searching. Reading, Mass.: Addison-Wesley, 1973.

Singh, Bhagat; and Naps, Thomas L. Introduction to Data Structures. St.
Paul, Minn.: West Publishing Co., 1985.

DDJ 


(Listing  begins  on  page  62.) 



ARTICLES 


Writing MS-DOS Device Drivers in C

An alternative to assembly language

by Andy Klein

Andy Klein, 3801 N. 16th St. #222, Phoenix, AZ 85106. Andy's company,
Micro Quantitative Sciences, develops software interfaces for the IBM
PC and peripheral devices.

The MS-DOS/PC-DOS installable device driver facility has been available
since Version 2.0 and allows you to add extra devices to your system
without making any modifications to DOS itself. You specify devices to
be added in an ASCII file called CONFIG.SYS using the command
device=xxxxxxxx.yyy. Installable device drivers allow hardware
independence as hardware-specific code is isolated in the driver.

This article shows how to write the functions that implement a device
in C. To do this, I discuss the format of a device driver and how to
create a driver in this format using (mostly) object code produced by a
C compiler. Some of the material is compiler specific and pertains to
Aztec C, but it should be portable to other C compilers (even to other
high-level languages). The example device driver I present, prndrv.asm
in Listing One, page 68, is a simple one that implements a parallel
printer and replaces the standard PRN device. I've chosen a simple
device deliberately so that I can concentrate on how to construct
drivers in C instead of on the details of a specific device and at the
same time provide a real, working device driver that you can modify and
experiment with on your PC.

Request Header

DOS communicates with its device drivers through a packet called a
request header, which is a formatted block of system memory that
contains the command code that the driver is to perform, a status word
that the driver uses to report the success or failure of the operation
back to DOS, and the length of the packet. The header is followed by
data needed for the operation, which varies depending on the operation
to be performed. For my driver, this variable area can be considered
fixed, containing the segment, offset, and number of bytes to transfer.
Req_hdr is a C structure that is typedefed in rh.h (Listing Two, page
71) and is used by the driver implementation functions to access the
request header data. An important task you need to accomplish is to
make the request header addressable by C compiled functions (discussed
later).
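To make the packet layout concrete, here is a minimal sketch of a request header as a packed C structure. It is not the Req_hdr from rh.h, which is not reproduced in this text; the structure name and all field names are assumptions. The byte offsets follow the commonly documented DOS layout for read/write style requests (a 13-byte fixed header, then a media descriptor byte, the transfer address, and the count), and #pragma pack is a compiler extension that Aztec C may not have offered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)            /* the packet has no padding between fields */
struct req_hdr_sketch {
    uint8_t  length;             /* length of the packet in bytes            */
    uint8_t  unit;               /* subunit number (block devices)           */
    uint8_t  command;            /* command code the driver is to perform    */
    uint16_t status;             /* status word reported back to DOS         */
    uint8_t  reserved[8];
    /* variable area, in the form used by input and output requests */
    uint8_t  media;              /* media descriptor byte                    */
    uint16_t xfer_offset;        /* offset of the transfer buffer            */
    uint16_t xfer_segment;       /* segment of the transfer buffer           */
    uint16_t count;              /* number of bytes to transfer              */
};
#pragma pack(pop)

/* Sanity check of the assumed layout; aborts if the compiler padded it. */
static void check_req_hdr_layout(void)
{
    assert(sizeof(struct req_hdr_sketch) == 20);
    assert(offsetof(struct req_hdr_sketch, status) == 3);
    assert(offsetof(struct req_hdr_sketch, xfer_offset) == 14);
    assert(offsetof(struct req_hdr_sketch, count) == 18);
}
```

With a structure like this, the implementation functions can read the command code and transfer fields by name instead of by raw byte offset.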

When DOS calls a driver to perform a task, it does so in two separate
stages. The first stage is referred to as the device strategy and
occurs when the device is passed a long (segment:offset) pointer to the
request header in the es:bx register pair. The device does not perform
the request at this stage; it just saves the pointer to the request
header. The next stage is called the device interrupt. It receives no
parameters; instead, the interrupt function retrieves the previously
saved pointer to the request header and then performs the operation
indicated by the command code in the header. It is the interrupt
function that actually does the work of the device, and it also has the
task of setting things up so that you can call C compiled functions.
DOS always calls the strategy function, then immediately calls the
interrupt function. Following this pattern, device drivers have two
entry points, called the strategy and interrupt routines. These entry
points are implemented in prndrv.asm as labels dev_strategy and
dev_interrupt.

Format  of  a  Driver 

DOS device drivers must be in a specific format in order to be
incorporated into MS-DOS. This format differs from that of a .COM or
.EXE file. Drivers must start with a device header that starts at
offset 0h from the start of the file, and all the logical segments of
the driver must be in one physical segment, as in a .COM file, because
the driver is simply loaded into memory by DOS; it will not get any of
the segment fix-ups that an .EXE file receives when loaded. The largest
hurdle you'll face when writing your first device driver is to get it
to load at boot time without hanging up the computer. Once you get past
this point, the rest is cake.

Device  Header 

The device header's purpose is to provide DOS with the attributes of
the device, the offsets of the strategy and interrupt entry points into
the driver, and the name of the device or the number of units it
controls. The fields of the header are:

• pointer to next header (4 bytes)

• attribute (2 bytes)

• pointer to strategy routine (2 bytes)

• pointer to interrupt routine (2 bytes)

• name or number of units (8 bytes)
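The five fields listed above add up to 18 bytes. The following is a minimal sketch of that layout as a packed C structure, with a helper that fills in a space-padded character device name. In the example driver the header lives in prndrv.asm, so the structure name, field names, and helper here are illustrative assumptions only; #pragma pack is likewise a modern compiler extension.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
struct dev_hdr_sketch {
    uint32_t next;          /* pointer to next header; all 0xff bytes in the file */
    uint16_t attribute;     /* bit-mapped attribute word                          */
    uint16_t strategy;      /* offset of the strategy entry point                 */
    uint16_t interrupt;     /* offset of the interrupt entry point                */
    char     name[8];       /* name (character device) or number of units         */
};
#pragma pack(pop)

#define ATTR_CHAR_DEVICE 0x8000u    /* bit 15 set: character device */

/* Store a device name left-justified and padded with spaces to 8 bytes. */
static void set_char_name(struct dev_hdr_sketch *h, const char *name)
{
    size_t n = strlen(name);
    if (n > sizeof h->name)
        n = sizeof h->name;         /* longer names are truncated */
    memset(h->name, ' ', sizeof h->name);
    memcpy(h->name, name, n);
}
```

For a driver that replaces PRN, the name field would hold the three letters followed by five spaces, and the attribute word would be 0x8000.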

The pointer to next header is a 4-byte field that DOS uses to store a
pointer to the next driver in the chain of all drivers. These four
bytes should be filled with 0xff in prndrv.asm to indicate to DOS that
there is only one driver in this file.

The attribute field is bit-mapped as shown in Table 1, below. My
example driver is a character device and has bit 15 set; the others are
all 0s.

The name or number of units depends on the driver type. If the driver
is a character device, then this field is an eight-character name,
left-justified and padded with spaces. The name assigned here will be
the name DOS uses for the device. When character devices are loaded at
boot time, installable devices are linked into the chain of all drivers
before the default or built-in character drivers. When DOS searches
this linked list for a named driver, it stops on the first match, so if
you use the name of a default driver as the name of your driver, you
replace the default driver with your own. This is exactly what the
example driver does, replacing the default PRN device with my PRN
device.

Note that there is one exception to this rule: the NUL device cannot be
replaced because it is the starting point of the linked list of
devices. Block devices do not have names; instead, they are assigned
drive letters by DOS in the order in which they are loaded. Unlike
character device drivers, the default block drivers are loaded first
and cannot be replaced. A block driver can control multiple subunits,
and the number of units controlled is entered in this field for block
devices with the last 7 bytes of the field filled with spaces. The
number of subunits can be overridden by the initialization function for
block devices. An example of the use of subunits is a fixed disk
controller that can have two fixed disk drives attached.

Device  Driver  Start-Up  Code 

In order to write the functions that implement a device in C, a few
special things have to be done. First, before any C compiled functions
are called, the environment (segment registers and stack pointer) must
be set up to correspond with what the compiler-generated code assumes.
Normally this setup is done by the start-up code that is provided with
the compiler, which for Aztec C using the small model is in the Aztec
function sbegin.asm. All Aztec C compiled functions make an external
reference to a function called $begin, causing the linker to drag it in
from the standard library c.lib. $begin in turn makes an external
reference to Croot, which in turn references main. In the normal case,
$begin sets up the environment and calls Croot, which then calls main.
Other compilers follow the same pattern.

Much of what is done in the compiler start-up functions does not have
to be replicated in the driver start-up code, including parsing argv[],
allocating a heap, getting access to the DOS environment variables, and
other initializations that are required for some library functions.
There is a penalty exacted for this approach, and it is that your C
functions cannot call library functions that require some setup that
you have not done; notably, malloc(), free(), and all those functions
pertaining to I/O are off limits. Also, variables declared outside
functions must be initialized to some value. An essential part of the
compiler's start-up that must be incorporated into the device driver
start-up code is establishing a stack frame.

In the example driver, the start-up tasks are accomplished by replacing
the normal compiler start-up code with the device interrupt function.
This function takes care of setting up the segment register to what
Aztec C expects. There is no main() because the entry point is the
label dev_interrupt in prndrv.asm. It is necessary to provide a
function called $begin to prevent the linker from reporting an error,
so prndrv.asm has a function of this name but it is never called. Code
generated by earlier versions of Aztec C also has a reference to a
function called $cswt, which is also in prndrv.asm. Users of other
compilers will have to replace these functions with whatever their
compiler demands. If the source for the start-up function for your
compiler is available, it is fairly straightforward. (Compiler
publishers that don't make this source available significantly limit
the applicability of their products for serious system development.)
You can discover much of what your compiler does by compiling a few
functions and getting the compiler to output the assembly-language
result of the compilation.

Device  Strategy  Function 

The strategy function is called with a long pointer to a request header
in the es:bx register pair. The strategy function does not perform the
requested function; it just saves the pointer to the request header and
then returns. The request header pointer segment is stored in the code
segment variable req_hdr_seg, and the offset is stored in the code
segment variable req_hdr_off (see the listing).

bit 15      1 = character device, 0 = block device
bit 14      1 = supports ioctl, 0 = no ioctl
bit 13      1 = non-IBM format, 0 = IBM format (block device)
            1 = supports output until busy, 0 = doesn't (char dev)
bit 12      0 (reserved by Microsoft)
bit 11      1 = supports removable media (block device only)
bits 10-5   reserved, should be 0
bit 3       1 = clock device, 0 = not clock device
bit 2       1 = nul device, 0 = not nul device
bit 1       1 = stdout device, 0 = not stdout device
bit 0       1 = stdin device, 0 = not stdin device

Table 1: Bit mapping of the attribute field

Device  Interrupt  Function 

The device interrupt function is where the command stored in the
request header is carried out. Although this function is called the
interrupt function, DOS invokes it as a far call as opposed to a
processor interrupt. This function has seven crucial tasks to perform,
some of which replace the C compiler's start-up code. The listing of
prndrv.asm has the sections marked as ;STEP 0, ;STEP 1, and so on,
where:

• STEP 0 saves the machine state. On entry, the interrupt function
saves the machine state by pushing all the registers onto DOS's stack,
which has enough space for them. Then the DOS stack frame (the ss and
sp registers) is saved in the code segment variables caller_ss and
caller_sp.

• STEP 1 gets the driver's segment. The first step in establishing
addressability is to get the driver's segment address and hold it in
the ax register. Remember that the driver's segments are all in one
physical segment, so when the driver is called, the value of the cs
register is the segment value for all segments.

• STEP 2 makes the request header addressable. The long pointer to the
request header was previously saved in the strategy function. Now the
interrupt function uses this pointer to copy DOS's request header into
the driver's private request header.

• STEP 3 finishes establishing addressability. The interrupt function
moves the segment address of the driver from ax into the ds, es, and ss
registers; gets the offset of the private stack and puts it into sp to
establish the stack frame; and finally, sets the bp register equal to
the sp register as this condition is needed to call C functions that
rely heavily on bp.

• STEP 4 calls the first C function. Now the interrupt function is
ready to call the first C function, which will in turn call others to
perform the requested task. It calls a function named
driver_functions(cmd_code) and passes the command code from the request
header as a parameter. Calling a C function from assembler requires
pushing the parameters onto the stack, calling the function, and
removing the parameters from the stack.

• STEP 5 updates DOS's request header. The task DOS wanted the driver
to perform has finished. The status of the operation is stored in the
addressable copy of the request header along with any function-specific
return info. In order for DOS to get this info, the private copy must
be copied back to the request header that DOS gave a pointer to way
back in the strategy function.

• STEP 6 cleans up. To clean up, the original DOS stack frame that was
saved in caller_ss and caller_sp must be restored, then the registers
pushed onto DOS's stack must be popped. Finally, the interrupt function
returns, and its tasks are finished.
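The data flow through the two entry points (save the pointer, copy the header in, dispatch, copy the header back) can be mimicked in portable C. This is only a simulation sketch under stated assumptions: the real entry points are assembly-language labels in prndrv.asm, the register and stack work of STEPs 0, 1, 3, and 6 has no C equivalent, and the cut-down header structure and the stand-in driver_functions body are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Cut-down request header, just enough for the simulation. */
struct req_hdr {
    uint8_t  length, unit, command;
    uint16_t status;
};

#define DONE_BIT 0x0100u

static struct req_hdr *saved_ptr;     /* plays the role of req_hdr_seg:req_hdr_off */
static struct req_hdr  private_hdr;   /* the driver's own addressable copy         */

/* Stand-in for the C dispatcher; here every command simply succeeds. */
static void driver_functions(int cmd_code)
{
    (void)cmd_code;
    private_hdr.status = DONE_BIT;
}

/* Strategy stage: remember where DOS's request header is, nothing more. */
static void dev_strategy(struct req_hdr *from_dos)
{
    saved_ptr = from_dos;
}

/* Interrupt stage: the parts of the sequence that move data. */
static void dev_interrupt(void)
{
    assert(saved_ptr != NULL);                              /* strategy ran first */
    memcpy(&private_hdr, saved_ptr, sizeof private_hdr);    /* STEP 2 */
    driver_functions(private_hdr.command);                  /* STEP 4 */
    memcpy(saved_ptr, &private_hdr, sizeof private_hdr);    /* STEP 5 */
}
```

The copy back in STEP 5 is what makes the status word visible to DOS; without it the result would stay in the driver's private copy.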


Implementing  a  Device 

Once you've completed the DOS device start-up code, you need to write
the functions that actually implement the device. Each device will have
its own unique requirements, and the implementor must be on intimate
terms with the device to be controlled. The device presented here is
extremely simple as devices go; being a printer, it is a character
device that has output functions only. The start-up code does not
change (except for the attribute word and device name, both in the
device header) with the complexity of the specific device, however.

After the start-up code is set up for calling C compiled functions, it
calls driver_functions(cmd_code), passing the command code from the
request header. Driver_functions() is in drvfunc.c (Listing Three, page
72). The function driver_functions() is implemented as a finite state
machine that calls the device implementation functions indirectly.
Function_table is declared in drvfunc.c as an array of pointers to
functions returning void, and this array is initialized to contain
pointers to those functions that this device implements. Slots for
unused functions are initialized to a pointer to bad_cmd(), a function
that sets the request header status word to indicate an invalid command
(see error.c, Listing Four, page 74). The command code is the state
that selects which function to call.

For those not familiar with an array of pointers to functions, here's
an explanation: Each element in the array is a pointer to, or the
address of, a function that (for an 8086 small model) is an offset into
the code segment. The index into the array determines which address to
call, just as the index into an array of integers determines which
integer to access. If you have the address of a function, you can call
it indirectly as (void)(*function_pointer)(); and by extending this to
an array, you get (void)(*function_pointer_array[index])();

There's nothing really esoteric about it; assembly-language programmers
call it a jump table. This construct produces efficient code, which is
what you want in a driver. The alternative is a switch statement or a
series of if...else pairs that compile into a lot of compare/jump
instructions. Note that all the functions to which pointers are placed
in the array must take the same number of parameters or chaos will
result. In C, a function name (without the parentheses) evaluates to a
pointer to that function.
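A jump table of this kind can be sketched in a few lines. The version below is an illustration, not the code from drvfunc.c: the 13-slot table assumes the standard DOS command codes 0 through 12, the choice of which slots are implemented (0 for initialization, 8 for output) is a guess at what a printer driver would need, the bounds check is an addition of this sketch, and the status word is a plain variable rather than a field of the request header. Error code 3 is the DOS driver error for an unknown command.

```c
#include <assert.h>
#include <stddef.h>

#define DONE_BIT            0x0100u
#define ERROR_BIT           0x8000u
#define ERR_UNKNOWN_COMMAND 0x0003u

static unsigned status_word;    /* stands in for the request header status */

static void init(void)    { status_word = DONE_BIT; }
static void output(void)  { status_word = DONE_BIT; }
static void bad_cmd(void) { status_word = ERROR_BIT | ERR_UNKNOWN_COMMAND; }

/* One slot per command code; unused slots point at bad_cmd. */
static void (*function_table[])(void) = {
    init,       /*  0: initialization        */
    bad_cmd,    /*  1: media check           */
    bad_cmd,    /*  2: build BPB             */
    bad_cmd,    /*  3: ioctl input           */
    bad_cmd,    /*  4: input                 */
    bad_cmd,    /*  5: nondestructive input  */
    bad_cmd,    /*  6: input status          */
    bad_cmd,    /*  7: input flush           */
    output,     /*  8: output                */
    bad_cmd,    /*  9: output with verify    */
    bad_cmd,    /* 10: output status         */
    bad_cmd,    /* 11: output flush          */
    bad_cmd     /* 12: ioctl output          */
};

/* The command code is the index that selects which function to call. */
static void driver_functions(int cmd_code)
{
    size_t slots = sizeof function_table / sizeof function_table[0];

    if (cmd_code < 0 || (size_t)cmd_code >= slots)
        bad_cmd();                       /* out-of-range codes are rejected */
    else
        (*function_table[cmd_code])();
}
```

Because every entry has the same signature, adding support for another command means replacing one bad_cmd entry with the new function's name.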

All DOS device drivers must implement an initialization function,
called only once when the driver is first loaded. Mine is called init()
and is found in init.c (Listing Five, page 74). The initialization
function for a character device must return to DOS a long
(segment:offset) pointer to the end of the driver and set the done bit
in the status word. The ending address of the driver is the segment
address obtained by a call to show_cs() (in show_cs.asm, Listing Six,
page 75) and a short pointer (offset only) to an unsigned integer
called _Uend. _Uend is inserted by the Aztec linker, ln, at the end of
the uninitialized data segment. Those using other compilers/linkers
need to replace this by creating another module called end.c or
whatever with a global unsigned called _Uend and make this the last
module in the list when you link. My driver also displays a message on
the screen, resets the parallel port, and sets the printer font to
elite (12 characters/inch). For a different device, the initialization
will be quite different. If this were a block device driver, in
addition to all this, the initialization function would have to set the
number of units and return a long pointer to an array of BPBs (BIOS
parameter blocks).

Being a printer, the primary function this device performs is output(),
found in output.c (Listing Seven, page 75). Here you get from the
request header the number of bytes to transfer and the segment:offset
at which you find the first byte. For each character, the output
function gets the character, sends it to the parallel port by calling
char_2_prn() (in prlport.asm, Listing Eight, page 77), checks for an
error return status, and finally increments the transfer offset and the
count of characters transferred. If an error condition is returned by
char_2_prn(), the error handler function error() is called; otherwise,
when you have finished, you set the done bit in the request header
status word. Again, this is a simple device, but you apply the same
technique to develop drivers for more complex devices.
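The shape of such an output loop can be sketched as follows. This is not the code from output.c: the xfer structure, the stub char_2_prn_stub (which stands in for the parallel-port routine and fails on the byte 0xFF purely for demonstration), and the decision to record the number of bytes actually sent when an error stops the transfer are all assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define DONE_BIT  0x0100u
#define ERROR_BIT 0x8000u

/* Hypothetical view of the transfer fields of a request header. */
struct xfer {
    const unsigned char *data;   /* first byte to send               */
    unsigned count;              /* in: bytes requested, out: sent   */
    unsigned status;             /* status word reported back        */
};

static unsigned char printed[64];   /* what the fake printer received */
static unsigned nprinted;

/* Stub for the port routine: returns 0 on success, an error code otherwise. */
static int char_2_prn_stub(unsigned char c)
{
    if (c == 0xFF || nprinted >= sizeof printed)
        return 2;                    /* pretend the device is not ready */
    printed[nprinted++] = c;
    return 0;
}

/* Send each byte in turn; stop and report on the first error. */
static void output_sketch(struct xfer *rh)
{
    unsigned sent = 0;

    while (sent < rh->count) {
        int err = char_2_prn_stub(rh->data[sent]);
        if (err != 0) {
            rh->count = sent;                         /* bytes actually sent */
            rh->status = ERROR_BIT | (unsigned)err;   /* error bit plus code */
            return;
        }
        sent++;
    }
    rh->status = DONE_BIT;
}
```

The loop keeps the error path and the success path separate, so the done bit is set only when every requested byte has gone out.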


Return  Info 

Drivers return status information to DOS via the status word in the
request header. This is represented by an unsigned integer called
status in the request header structure req_hdr (see rh.h). Bit 15 is
the error bit, and it is set to 1 if the driver needs to report an
error condition to DOS, with the 8 least significant bits holding the
error code. Code to implement an error condition is in error.c, and for
my printer driver, the only possible error codes are those that are
returned by the parallel port BIOS. This function is called only when
errors occur; otherwise, the driver functions set bit 8, the done bit,
to 1, indicating to DOS that the device has completed its task
successfully.

Linking 

Linking the driver is different from linking a normal C program. You
must make all the logical segments fall into the same physical segment,
and the device header must be at offset 0 in the file. One of the
reasons why I use Aztec C is because its linker, ln, can accomplish all
the preceding tasks in response to a few command-line arguments, making
it a useful and powerful tool for dealing with the 8086 architecture.

To link the driver using ln, you type:

ln -t -b 0 -c 0 -o prndrv.com prndrv.o [list of files and libraries]

where -t saves the symbol table in prndrv.sym, -b 0 sets the base
address to 0h (same as ORG 0), -c 0 makes the code segment start at
offset 0h in the physical segment, and -o prndrv.com means that the
output file is prndrv.com (specifying an extension of .COM causes all
logical segments to be in one physical segment).

Again,  users  of  other  compilers/ 
linkers  will  have  to  adapt  this  to  their 
systems.  Hint:  Try  using  a  GROUP 
statement  in  prndrv.asm  to  cause  the 
code  and  data  segments  to  be  joined 
into  one  physical  segment,  and  re¬ 
member  to  take  the  offset  into  the 
group  instead  of  into  the  segment 
when  using  the  offset  operator. 


Known  Shortcomings 

All  variables  declared  outside  a  func¬ 
tion  must  be  initialized  to  some  val¬ 
ue — they  cannot  be  assumed  to  be 
initialized  to  0,  and  if  they  are  not  ini¬ 
tialized,  the  scheme  gets  buggy.  Driv¬ 
ers  written  in  C  tend  to  be  large — 
probably  all  those  references  to  bp. 

Availability 

All the source code for articles in this issue is available on a
single disk. To order, send $14.95 to Dr. Dobb's Journal, 501
Galveston Dr., Redwood City, CA 94063, or call (415) 366-3600, ext.
216. Please specify the issue number and format (MS-DOS, Macintosh,
Kaypro).

DDJ 

(Listings  begin  on  page  68.) 



FILE COMPARISONS

Listing  One  (Text  begins  on  page  28. ) 

/*
**  Copyright (c) 1987, Tom Steppe.  All rights reserved.
**
**  This module compares two arrays of lines (representing
**  files) and reports the sequences of consecutive matching
**  lines between them using the "recursive longest matching
**  sequence" algorithm.  This is useful for implementing a
**  file comparison utility.
**
**  Compiler:  Microsoft (R) C Compiler Version 4.00
*/

#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <malloc.h>

/* Boolean type and values. */
typedef int BOOLEAN;
#define TRUE    1
#define FALSE   0

/* Minimum macro. */
#define min(x, y)   (((x) <= (y)) ? (x) : (y))

/* Value to indicate identical strings with strcmp. */
#define ALIKE   0

/* Result of hashing function for a line of text. */
typedef unsigned int HASH;

/* Mask for number of bits in hash code.  (12 bits). */
#define MASK    ((unsigned int) 0x0FFF)

/* Number of possible hash codes. */
#define HASHSIZ (MASK + 1)

/* Information about an entry in a hash table. */
typedef struct tblentry
{
    int frst;           /* First line # with this hash code. */
    int last;           /* Last line # with this hash code.  */
} TBLENTRY;

/* Information about a line of text. */
typedef struct lineinf
{
    HASH hash;          /* Hash code value.                  */
    int nxtln;          /* Next line with same hash (or 0).  */
} LINEINF;

/* Information about a file. */
typedef struct fileinf
{
    char **txt;         /* Array of lines of text.           */
    LINEINF *line;      /* Array of line info structs.       */
    TBLENTRY *hashtbl;  /* Hash table.                       */
} FILEINF;

/* Function declarations.  (Names and storage classes match */
/* the definitions that follow.)                            */
BOOLEAN compare (char **, int, char **, int, int);
static BOOLEAN get_inf (char **, int, FILEINF *);
static HASH calc_hash (char *);
static void fnd_seq (FILEINF *, int, int,
                     FILEINF *, int, int, int);
static BOOLEAN chk_hashes (LINEINF *, LINEINF *, int);
static int cnt_matches (char **, char **, int);
static void rpt_seq (int, int, int);

/*********************************************************** 
**  compare  compares  two  arrays  of  lines  and  reports  the 
**  sequences  of  consecutive  matching  lines.  The  zeroth 




Dr.  Dobb's  Journal,  September  1987 


**  element of each array is unused so that the index into
**  the array is identical to the associated line number.
**
**  RETURNS:  TRUE if comparison succeeded.
**            FALSE if not enough memory.
***********************************************************/

BOOLEAN compare (a1, n1, a2, n2, lngval)
char **a1;      /* (I) Array of lines of text in #1.    */
int n1;         /* (I) Number of lines in a1.
                       (Does not count 0th element.)    */
char **a2;      /* (I) Array of lines of text in #2.    */
int n2;         /* (I) Number of lines in a2.
                       (Does not count 0th element.)    */
int lngval;     /* (I) "Long enough" value.             */
{
    FILEINF f1;     /* File information for #1. */
    FILEINF f2;     /* File information for #2. */
    BOOLEAN rtn;    /* Return value.            */

    /* Gather information for each file, then compare. */
    if (rtn =
        (get_inf (a1, n1, &f1) && get_inf (a2, n2, &f2)))
    {
        fnd_seq (&f1, 1, n1, &f2, 1, n2, lngval);
    }

    return (rtn);
}

/***********************************************************
**  get_inf calculates hash codes and builds a hash table.
**
**  RETURNS:  TRUE if get_inf succeeded.
**            FALSE if not enough memory.
***********************************************************/

static BOOLEAN get_inf (a, n, f)
char **a;       /* (I) Array of lines of text.  */
int n;          /* (I) Number of lines in a.    */
FILEINF *f;     /* (O) File information.        */
{
    unsigned int size;  /* Size of hash table.  */
    register int i;     /* Counter.             */
    TBLENTRY *entry;    /* Entry in hash table. */

    /* Assign the array of text. */
    f->txt = a;

    /* Allocate and initialize a hash table. */
    size = HASHSIZ * sizeof (TBLENTRY);
    if (f->hashtbl = (TBLENTRY *) malloc (size))
    {
        memset ((char *) f->hashtbl, '\0', size);
    }
    else
    {
        return (FALSE);
    }

    /* If there are any lines: */
    if (n > 0)
    {
        /* Allocate an array of line structures.  The element */
        /* size is sizeof (LINEINF), not the pointer size.    */
        if (f->line = (LINEINF *)
            malloc ((n + 1) * sizeof (LINEINF)))
        {
            /* Loop through the lines. */
            for (i = 1; i <= n; i++)
            {
                /* Calculate the hash code value. */
                f->line[i].hash = calc_hash (f->txt[i]);

                /* Locate the entry in the hash table. */
                entry = f->hashtbl + f->line[i].hash;

                /* Update the linked list of lines with */
                /* the same hash code.                  */
                f->line[entry->last].nxtln = i;
                f->line[i].nxtln = 0;

                /* Update the first and last line   */
                /* information in the hash table.   */
                if (entry->frst == 0)
                {
                    entry->frst = i;
                }
                entry->last = i;
            }
        }
        else
        {
            return (FALSE);
        }
    }
    else
    {
        f->line = NULL;
    }

    return (TRUE);
}

/***********************************************************
**  calc_hash calculates a hash code for a line of text.
**
**  RETURNS:  a hash code value.
***********************************************************/

static HASH calc_hash (buf)
char *buf;      /* (I) Line of text. */
{
    register unsigned int chksum;   /* Checksum.        */
    char *s;                        /* Pointer.         */
    HASH hash;                      /* Hash code value. */

    /* Build up a checksum of the characters in the text. */
    for (chksum = 0, s = buf; *s; chksum ^= *s++)
    {
        ;
    }

    /* Combine the 7-bit checksum and as much of the */
    /* length as is possible.                        */
    hash = ((chksum & 0x7F) | ((s - buf) << 7)) & MASK;

    return (hash);
}

/***********************************************************
**  Given starting and ending line numbers, fnd_seq finds a
**  "good sequence" of lines within those ranges.  fnd_seq
**  then recursively finds "good sequences" in the sections
**  of lines above the "good sequence" and below it.
***********************************************************/

static void fnd_seq (f1, beg1, end1, f2, beg2, end2, lngval)
FILEINF *f1;    /* (I) File information for #1.         */
int beg1;       /* (I) First line # to compare in #1.   */
int end1;       /* (I) Last line # to compare in #1.    */
FILEINF *f2;    /* (I) File information for #2.         */
int beg2;       /* (I) First line # to compare in #2.   */
int end2;       /* (I) Last line # to compare in #2.    */
int lngval;     /* (I) "Long enough" value.             */
{
    LINEINF *line1;     /* Line information ptr in #1.  */
    LINEINF *line2;     /* Line information ptr in #2.  */
    register int limit; /* Looping limit.               */
    int ln1;            /* Line number in #1.           */
    int ln2;            /* Line number in #2.           */
    register int ln;    /* Working line number.         */
    BOOLEAN go;         /* Continue to loop?            */
    int most;           /* Longest possible seq.        */
    int most1;          /* Longest possible due to #1.  */
    int most2;          /* Longest possible due to #2.  */
    int cnt;            /* Length of longest seq.       */
    int oldcnt;         /* Length of prev longest seq.  */
    int n;              /* Length of cur longest seq.   */
    int m1;             /* Line of longest seq. in #1.  */
    int m2;             /* Line of longest seq. in #2.  */

    /* Initialize. */
    go = TRUE;
    line1 = f1->line;
    line2 = f2->line;

    /* Initialize longest sequence information. */
    cnt = 0;            /* Length of longest seq.           */
    m1 = beg1 - 1;      /* Line # of longest seq. in #1.    */
    m2 = beg2 - 1;      /* Line # of longest seq. in #2.    */
    oldcnt = 0;         /* Length of prev longest seq.      */

    /* Calculate maximum possible number of consecutive */
    /* lines that can match (based on line # ranges).   */
    most1 = end1 - beg1 + 1;
    most2 = end2 - beg2 + 1;

    /* Scan lines looking for a "good sequence".
    ** Compare lines in the following order of line numbers:
    **
    **   (1, 1)
    **   (1, 2), (2, 1), (2, 2)
    **   (1, 3), (2, 3), (3, 1), (3, 2), (3, 3)
    **   etc.
    */
    for (ln1 = beg1, ln2 = beg2; TRUE; ln1++, ln2++)
    {
        if (ln2 <= end2 - cnt)
        /* There are enough lines left in #2 such that it */
        /* is possible to find a longer sequence.         */
        {
            /* Determine the limit in #1 that both       */
            /* enforces the order scheme and still makes */
            /* it possible to find a longer sequence.    */
            limit = min (ln1 - 1, end1 - cnt);

            /* Calculate first potential match in #1. */
            for (ln = f1->hashtbl[line2[ln2].hash].frst;
                 ln && ln < beg1; ln = line1[ln].nxtln)
            {
                ;
            }

            /* Loop through the lines in #1. */
            for (; ln && ln <= limit; ln = line1[ln].nxtln)
            {
                if (line1[ln].hash == line2[ln2].hash &&
                    line1[ln + cnt].hash ==
                    line2[ln2 + cnt].hash &&
                    !(ln - m1 == ln2 - m2 &&
                      ln < m1 + cnt && m1 != beg1 - 1))
                /* A candidate for a longer sequence has */
                /* been located.  The current lines      */
                /* match, the current lines + cnt match, */
                /* and this sequence is not a subset of  */
                /* the longest sequence so far.          */
                {
                    /* Calculate most possible matches. */
                    most = min (end1 - ln + 1, most2);

                    /* First compare hash codes.  If the */
                    /* number of matches exceeds the     */
                    /* longest sequence so far, then     */
                    /* compare the actual text.          */
                    if (chk_hashes (line1 + ln,
                                    line2 + ln2, cnt) &&
                        (n = cnt_matches (f1->txt + ln,
                            f2->txt + ln2, most)) > cnt)
                    /* This is the longest seq. so far. */
                    {
                        /* Update longest sequence info. */
                        oldcnt = cnt;
                        cnt = n;
                        m1 = ln;
                        m2 = ln2;

                        /* If it's long enough, end the */
                        /* search.                      */
                        if (cnt >= lngval)
                        {
                            break;
                        }

                        /* Update limit, using new count. */
                        limit = min (ln1 - 1, end1 - cnt);
                    }
                }
            }

            /* If it's long enough, end the search. */
            if (cnt >= lngval)
            {
                break;
            }

            most2--;
        }
        else
        {
            go = FALSE;     /* This file is exhausted. */
        }

        /* Repeat the process for the other file. */
        if (ln1 <= end1 - cnt)
        {
            limit = min (ln2, end2 - cnt);

            for (ln = f2->hashtbl[line1[ln1].hash].frst;
                 ln && ln < beg2; ln = line2[ln].nxtln)
            {
                ;
            }

            for (; ln && ln <= limit; ln = line2[ln].nxtln)
            {
                if (line1[ln1].hash == line2[ln].hash &&
                    line1[ln1 + cnt].hash ==
                    line2[ln + cnt].hash &&
                    !(ln1 - m1 == ln - m2 &&
                      ln1 < m1 + cnt && m2 != beg2 - 1))
                {
                    most = min (end2 - ln + 1, most1);

                    if (chk_hashes (line1 + ln1,
                                    line2 + ln, cnt) &&
                        (n = cnt_matches (f1->txt + ln1,
                            f2->txt + ln, most)) > cnt)
                    {
                        oldcnt = cnt;
                        cnt = n;
                        m1 = ln1;
                        m2 = ln;

                        if (cnt >= lngval)
                        {
                            break;
                        }

                        limit = min (ln2, end2 - cnt);
                    }
                }
            }

            if (cnt >= lngval)
            {
                break;
            }

            most1--;
        }
        else if (!go)
        {
            break;      /* This file is exhausted, also. */
        }
    }

    /* If the longest sequence is shorter than the "long */
    /* enough" value, the "long enough" value can be     */
    /* adjusted for the rest of the comparison process.  */
    if (cnt < lngval)
    {
        lngval = cnt;
    }

    if (cnt >= 1)
    /* Longest sequence exceeds minimum necessary size. */
    {
        if (m1 != beg1 && m2 != beg2 && oldcnt > 0)
        /* There is still something worth comparing */
        /* previous to the sequence.                */
        {
            /* Use knowledge of the previous longest seq. */
            fnd_seq (f1, beg1, m1 - 1,
                     f2, beg2, m2 - 1, oldcnt);
        }

        /* Report the sequence. */
        rpt_seq (m1, m2, cnt);

        if (m1 + cnt - 1 != end1 && m2 + cnt - 1 != end2)
        /* There is still something worth comparing */
        /* subsequent to the sequence.              */
        {
            fnd_seq (f1, m1 + cnt, end1,
                     f2, m2 + cnt, end2, lngval);
        }
    }
}

/***********************************************************
**  chk_hashes determines whether this sequence of matching
**  hash codes is longer than cnt.  It knows that the first
**  pair of hash codes is guaranteed to match.
**
**  RETURNS:  TRUE if this sequence is longer than cnt.
**            FALSE if this sequence is not longer than cnt.
***********************************************************/

static BOOLEAN chk_hashes (line1, line2, cnt)
LINEINF *line1;     /* (I) Line information for #1.  */
LINEINF *line2;     /* (I) Line information for #2.  */
register int cnt;   /* (I) Count to try to exceed.   */
{
    register int n;     /* Count of consecutive matches. */

    for (n = 1; n <= cnt &&
         ((++line1)->hash == (++line2)->hash); n++)
    {
        ;
    }

    return (n > cnt);
}

/***********************************************************
**  cnt_matches counts the number of consecutive matching
**  lines of text.
**
**  RETURNS:  number of consecutive matching lines.
***********************************************************/

static int cnt_matches (s1, s2, most)
char **s1;          /* (I) Starting line in file #1.        */
char **s2;          /* (I) Starting line in file #2.        */
register int most;  /* (I) Most matching lines possible.    */
{
    register int n;     /* Count of consecutive matches. */

    /* Count the consecutive matches. */
    for (n = 0; n < most && strcmp (*s1++, *s2++) == ALIKE;
         n++)
    {
        ;
    }

    return (n);
}

/***********************************************************
**  rpt_seq reports a matching sequence of lines.
***********************************************************/

static void rpt_seq (m1, m2, cnt)
int m1;     /* (I) Location of matching sequence in #1.  */
int m2;     /* (I) Location of matching sequence in #2.  */
int cnt;    /* (I) Number of lines in matching sequence. */
{
    fprintf (stdout,
        "Matched %5d lines:  (%5d - %5d) and (%5d - %5d)\n",
        cnt, m1, m1 + cnt - 1, m2, m2 + cnt - 1);
}


End  Listing 
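
To see what the 12-bit hash code in the listing actually holds, the following is a small restatement of calc_hash in ANSI C that can be compiled on its own. It is a sketch for experimenting, not a replacement for the listing; the names line_hash and HASH_MASK are made up here. The low 7 bits are the XOR checksum of the characters, and the remaining 5 bits are the low bits of the line length.

```c
#include <stddef.h>

#define HASH_MASK 0x0FFFu   /* 12-bit hash code, as in the listing */

/* XOR all characters together, keep 7 bits of that checksum, and
   pack as much of the line length as fits above it (5 bits). */
static unsigned int line_hash(const char *buf)
{
    unsigned int chksum = 0;
    const char *s;

    for (s = buf; *s != '\0'; s++)
        chksum ^= (unsigned char)*s;

    return ((chksum & 0x7Fu) | ((unsigned int)(s - buf) << 7))
           & HASH_MASK;
}
```

Because XOR ignores order, "ab" and "ba" hash to the same value, and a 32-character line loses its length bits entirely. That is why the listing treats equal hash codes only as a hint and still calls cnt_matches to compare the text.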




XOR  CHAIN 


Listing  One  ( Text  begins  on  page  36. ) 

/* Binary tree delete procedure
 *
 * Input parameter "s" points to a scan for an XOR
 * chained binary tree.
 * The procedure deletes s->child from the tree,
 * and returns with s set by GoParent(s)
 */

void Delete (s)
struct Scan *s;
{
    struct Scan temp, *t;
    struct Item *i, *j, *k;

    i = s->child;               /* i is the item to be deleted */
    GoParent (s);

    if (i->llink == i->rlink)           /* Case 1 */
    {
        /*
         * adjust the pointers for s->child,
         * i's parent
         */
        if (i->key < s->child->key)
            s->child->llink = s->parent;
        else
            s->child->rlink = s->parent;
    }
    else if (i->llink == s->child)      /* Case 2 */
    {
        /*
         * adjust the pointers for s->child,
         * i's parent
         */
        if (i->key < s->child->key)
            s->child->llink = i->rlink ^
                              s->parent ^ s->child;
        else
            s->child->rlink = i->rlink ^
                              s->parent ^ s->child;

        /*
         * adjust the pointers for i's child
         */
        j = i->rlink ^ s->child;
        j->rlink ^= i ^ s->child;
        j->llink ^= i ^ s->child;
    }
    else if (i->rlink == s->child)      /* Case 3 */
    {
        /*
         * adjust the pointers for s->child,
         * i's parent
         */
        if ( i->key < s->child->key )
            s->child->llink = i->llink ^
                              s->parent ^
                              s->child;
        else
            s->child->rlink = i->llink ^
                              s->parent ^
                              s->child;

        /*
         * adjust the pointers for i's child
         */
        j = i->llink ^ s->child;
        j->rlink ^= i ^ s->child;
        j->llink ^= i ^ s->child;
    }

    else                                /* Case 4 */
    {
        j = i->llink ^ s->child;
        if (j->rlink == i)              /* Case 4a */
        {
            /*
             * adjust the pointers for s->child,
             * i's parent
             */
            if (i->key < s->child->key)
                s->child->llink = j ^ s->parent;
            else
                s->child->rlink = j ^ s->parent;

            /*
             * adjust the pointers for i's children
             */
            j->llink ^= i ^ s->child;
            j->rlink = i->rlink;
            k = i->rlink ^ s->child;
            k->llink ^= i ^ j;
            k->rlink ^= i ^ j;
        }
        else                            /* Case 4b */
        {
            /*
             * locate the replacement item
             */
            t = &temp;
            Associate (t, s->root);
            t->parent = s->child;
            t->child = i;
            GoLeft (t);
            while ( t->child->rlink != t->parent )
                GoRight (t);

            /*
             * adjust the pointers to free t->child
             */
            t->parent->rlink ^= t->child->llink ^
                                t->parent ^
                                t->child;
            if (t->child->llink != t->parent)
            {
                k = t->child->llink ^ t->parent;
                k->llink ^= t->parent ^ t->child;
                k->rlink ^= t->parent ^ t->child;
            }

            /*
             * adjust the pointers for s->child,
             * i's parent
             */
            if (i->key < s->child->key)
                s->child->llink = t->child ^ s->parent;
            else
                s->child->rlink = t->child ^ s->parent;
            t->child->llink = i->llink;
            t->child->rlink = i->rlink;

            /*
             * adjust the pointers for i's children
             */
            j->llink ^= i ^ t->child;
            j->rlink ^= i ^ t->child;
            k = i->rlink ^ s->child;
            k->llink ^= i ^ j;
            k->rlink ^= i ^ j;
        }
    }
    DropItem (i);
}


End  Listing 
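
The listing leans throughout on one identity: if a link field stores the XOR of two neighbor addresses, XORing it with one neighbor yields the other. The sketch below shows that identity on the simplest possible case, a doubly linked XOR list, using uintptr_t so that it compiles as standard C. It is an illustration only; struct Node, xor_ptr, xor_step, and xor_sum are hypothetical names and do not correspond to the Scan and Item structures of the article.

```c
#include <stdint.h>
#include <stddef.h>

struct Node {
    int key;
    uintptr_t link;     /* (address of one neighbor) XOR
                           (address of the other neighbor) */
};

/* Combine two neighbor addresses into one link value. */
static uintptr_t xor_ptr(const struct Node *a, const struct Node *b)
{
    return (uintptr_t)a ^ (uintptr_t)b;
}

/* Knowing where we came from, recover the neighbor on the far side. */
static struct Node *xor_step(const struct Node *from,
                             const struct Node *cur)
{
    return (struct Node *)(cur->link ^ (uintptr_t)from);
}

/* Walk the chain from either end (the end node's missing neighbor
   is NULL) and add up the keys. */
static int xor_sum(struct Node *end)
{
    struct Node *prev = NULL, *cur = end, *next;
    int sum = 0;

    while (cur != NULL) {
        sum += cur->key;
        next = xor_step(prev, cur);
        prev = cur;
        cur = next;
    }
    return sum;
}
```

A three-node chain a, b, c is built by setting a.link = xor_ptr(NULL, &b), b.link = xor_ptr(&a, &c), and c.link = xor_ptr(&b, NULL); the same walk then works from &a or from &c. The price, as in the tree listing, is that a node can only be interpreted when you also know one of its neighbors.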




DEVICE  DRIVERS 


Listing One (Text begins on page 44.)

; prndrv.asm
; printer driver startup code
; ms/pc-dos 2.x, 3.x installable device driver
; copyright (c) Andy Klein 1987

PRIVATE_STACK_SIZ   equ     256     ;private stack size, probably much larger
                                    ; than necessary
IS_CHAR_DEV         equ     32768   ;bit to set for this driver

; define first
codeseg segment para public 'CODE'
codeseg ends
dataseg segment para public 'DATA'
dataseg ends

        assume  cs:codeseg, ds:dataseg, es:dataseg, ss:dataseg

codeseg segment para public 'CODE'
        org     0                   ;drivers are always org'ed at 0
                                    ; with the device header first

        public  prn_driver, dev_strategy_, dev_interrupt_

prn_driver  proc    far             ;drivers invoked as far calls by DOS
                                    ;device header starts here
next_dev    db      4 dup(255)      ;no next driver in this file.
                                    ;(long)-1 which is 4 bytes of 0FFh
attribute   dw      IS_CHAR_DEV
strategy    dw      dev_strategy_   ;device strategy entry point
interrupt   dw      dev_interrupt_  ;device interrupt entry point
dev_name    db      'PRN     '      ;device name, blank-padded to 8 bytes
                                    ;end of the device header

;code segment variables, these will be addressable before we setup
; our data and stack segment

req_hdr_seg dw      (0)             ;pointer to request header, segment part
req_hdr_off dw      (0)             ;pointer to request header, offset part
caller_ss   dw      (0)             ;caller's ss
caller_sp   dw      (0)             ;caller's sp

;strategy - this strategy function saves a long pointer to a
; request header in code segment variables req_hdr_seg and
; req_hdr_off.
; The address of the request header is passed in es:bx
dev_strategy_:
        mov     word ptr cs:req_hdr_off,bx  ;save request header pointer
        mov     word ptr cs:req_hdr_seg,es  ; in code segment variables
        ret                                 ; ...and back to dos

        public  $cswt, $begin       ;keep Aztec linker happy, all Aztec C compiled
                                    ; functions have a reference to these 2 functions
                                    ; which normally drag in the startup code
$cswt   proc    near
$cswt   endp

$begin  proc    near
$begin  endp




        extrn   driver_functions_:near

;interrupt - not a true interrupt handler (it ends with a ret as opposed
; to an iret), this function receives no parameters.  Instead it
; uses the information stored in the request header we received and
; saved a pointer to in the strategy function to determine what to do.

dev_interrupt_:
;STEP 0
; Preserve machine state
        cli                         ;no interrupts when dealing with machine state
        push    ds                  ;save machine state on caller's stack
        push    es
        push    ax
        push    bx
        push    cx
        push    dx
        push    si
        push    di
        push    bp
        pushf
        mov     cs:caller_ss,ss     ;save caller stack frame (ss and sp)
        mov     cs:caller_sp,sp
        sti                         ;interrupts OK

;Now the real work starts ...
;STEP 1
; Get the driver's segment into the ax register
; This begins our task of establishing addressability

        mov     ax,cs

;STEP 2
; Copy the request header stored at req_hdr_seg:req_hdr_off
; into our private and C addressable request header

        mov     si,cs:req_hdr_off       ;load up the source string segment:offset
        mov     bx,cs:req_hdr_seg       ; into ds:si registers
        mov     ds,bx
        mov     es,ax                   ;load up the destination string segment:offset
        mov     di,offset cs:driver_rh_ ; into es:di registers
        cld                             ;set direction flag to increment si and di
        xor     cx,cx                   ;clear out cx
        mov     cl,[si]                 ;first byte of request header is its length
        rep     movsb                   ; ... and copy it ...

;STEP 3
; This step establishes addressability of data segment
; and sets up driver's stack frame.
; It also sets register bp equal to sp, a state required
; for calling the first C function
; Remember that register ax contains the driver segment
; We will set ds, es, ss all equal to cs ... 8080 model??

        mov     bx,offset c_stack_top   ;this will go into sp
        cli                             ;no interrupts while we mess with these registers
        mov     ds,ax
        mov     es,ax
        mov     ss,ax                   ;now establish our stack frame
        mov     sp,bx
        mov     bp,bx                   ;bp = sp this is critical for C
        sti                             ;ok for an interrupt to occur

;now we have set up the environment for C
;STEP 4
; Call our first C function,
; passing the command code from the request header as
; a parameter.  (void)driver_functions(cmd_code);

        xor     ax,ax                   ;clear out ax
        mov     bx,offset driver_rh_    ;bx points to our request header
        mov     al,byte ptr [bx + 2]    ;3rd byte of request header is command code
        push    ax                      ;to pass parameter(s) to a C function,
                                        ; push it (them) on stack and call function
        call    driver_functions_
        pop     ax                      ;caller must remove parameter(s) from stack

;STEP 5
; At this point we are done with the function our driver
; was to perform.  Now we must copy our private request header
; back into the original request header DOS gave us a pointer
; to back in the strategy function.

        mov     es,cs:req_hdr_seg       ;load address for orig RH into es:di
        mov     di,cs:req_hdr_off
        mov     si,offset driver_rh_    ;ds:si our (updated) copy
        cld                             ;set direction to increment
        xor     cx,cx                   ;clear out cx
        mov     cl,[si]                 ;length is at first byte
        rep     movsb                   ; ... and copy it

;STEP 6
; Restore machine state saved in STEP 0
; and return to DOS

        cli                             ;machine state restore should not be interrupted
        mov     ax,cs:caller_ss         ;switch to caller's stack frame
        mov     ss,ax
        mov     ax,cs:caller_sp
        mov     sp,ax
        popf                            ; ... pop all of it ...
        pop     bp
        pop     di
        pop     si
        pop     dx
        pop     cx
        pop     bx
        pop     ax
        pop     es
        pop     ds
        sti                             ;enable interrupts
        ret                             ; ... back to dos and bye bye
prn_driver  endp                        ;end of driver code

codeseg ends

;Data segment, contains our local stack space
; and C addressable copy of request header

dataseg segment para public 'DATA'
        public  c_stack_top
            db      PRIVATE_STACK_SIZ dup (0)
c_stack_top label   word

        public  driver_rh_
driver_rh_  db      32 dup (0)          ;C addressable copy of request header
dataseg ends

        END

End Listing One




Listing  Two 

/** rh.h
    Device Driver Request Header structure definition
    and status word #define's
    copyright (c) 1987 Andy Klein
**/

typedef struct
{
    unsigned char   length,         /* length of header */
                    unit,           /* unit code which unit to use */
                    cmd;            /* command to execute */
    unsigned int    status;         /* status of operation */
    unsigned char   reserved[8],
                    media_type;     /* media descriptor byte, block dev only */
    unsigned int    xfer_buf_offset,
                    xfer_buf_segment,
                    xfer_count;
    char            dummy[32 - 20]; /* data for operation */
} request_hdr;

/* status word bits: bit 15 = error, bit 9 = busy, bit 8 = done */
#define ERROR_MASK          32768
#define BUSY_MASK           512
#define DONE_MASK           256

/* error codes, returned in the low 8 bits of the status word */
#define WRITE_PROTECTED     0x00
#define UNKNOWN_UNIT        0x01
#define DEV_NOT_READY       0x02
#define UNKNOWN_CMD         0x03
#define CRC_ERROR           0x04
#define BAD_DRIVE_REQ_LEN   0x05
#define SEEK_ERROR          0x06
#define UNKOWN_MEDIA        0x07
#define SECTOR_NOT_FOUND    0x08
#define PRN_NO_PAPER        0x09
#define WRITE_FAULT         0x0A
#define READ_FAULT          0x0B
#define GENERAL_FAILURE     0x0C
#define INVALID_DISK_CHG    0x0F

End Listing Two

Listing Three

/** drvfunc.c
    Device Driver for DOS 2.x, 3.x
    This is the function table that is called once
    the interrupt function has established addressability
    copyright (c) 1987 Andy Klein
**/

#define LAST_FUNCTION   15
#define Q_FUNCTIONS     (LAST_FUNCTION + 1)

extern void init(), output_status(), output(), output_flush(),
            output_verf(), bad_cmd();

/**
    function_table is an array of pointers to functions
    returning nothing (void).  It is analogous to a jump table
    in assembler.
**/

void (* function_table[Q_FUNCTIONS])() = {
    /*  0 */ init,
    /*  1 */ bad_cmd,           /* media check */
    /*  2 */ bad_cmd,           /* build bpb */
    /*  3 */ bad_cmd,           /* ioctl input */
    /*  4 */ bad_cmd,           /* input */
    /*  5 */ bad_cmd,           /* input no wait */
    /*  6 */ bad_cmd,           /* input status */
    /*  7 */ bad_cmd,           /* input_flush */
    /*  8 */ output,
    /*  9 */ output_verf,
    /* 10 */ output_status,
    /* 11 */ output_flush,
    /* 12 */ bad_cmd,           /* ioctl output */
    /* 13 */ bad_cmd,           /* open device */
    /* 14 */ bad_cmd,           /* close device */
    /* 15 */ bad_cmd            /* removable_media */
};

void
driver_functions(cmd)
int cmd;
{
    if ( cmd > LAST_FUNCTION )
        (void) bad_cmd();
    else
        (void) (* function_table[cmd])();
}/* driver_functions() */

End Listing Three

Listing  Four 

/** error.c
    Error handler for printer driver
    copyright (c) 1987 Andy Klein
**/

#include "rh.h"

extern request_hdr driver_rh;

void
bad_cmd()
{
    driver_rh.status = ERROR_MASK | UNKNOWN_CMD;
}/* bad_cmd() */

/* status bits returned by the parallel port BIOS */
#define TIME_OUT    1
#define IO_ERR      8
#define NO_PAPER    32
#define BUSY        128

void
driver_error(stat)
int stat;
{
    int err_code;

    if ( stat & ERROR_MASK )
        stat ^= ERROR_MASK;
    switch ( stat )
    {
        case TIME_OUT:  err_code = DEV_NOT_READY;
                        break;
        case IO_ERR:    err_code = GENERAL_FAILURE;
                        break;
        case NO_PAPER:  err_code = PRN_NO_PAPER;
                        break;
        case BUSY:      err_code = DEV_NOT_READY;
                        break;
        default:        err_code = GENERAL_FAILURE;
                        break;
    }
    driver_rh.status = ERROR_MASK | err_code;
}/* driver_error() */

End Listing Four

Listing  Five 

/** init.c
    Printer driver initialization
    copyright (c) 1987 Andy Klein
**/

#include "rh.h"

extern request_hdr driver_rh;

char *title = "\nPrinter Device Driver for Okidata 92";
char *copyw = "copyright 1987 (c) Andy Klein\n";

void
init()
{
    extern unsigned _Uend;
    /* _Uend is inserted by the Aztec linker */
    unsigned show_cs();
    void reset_printer(), initialize_oki92();

    puts(title);
    puts(copyw);
    (void) reset_printer();
    (void) initialize_oki92();

    /* set ending address of driver, set status word */
    driver_rh.xfer_buf_segment = show_cs();
    driver_rh.xfer_buf_offset = (unsigned int)&_Uend;
    driver_rh.status = DONE_MASK;
}/* init() */

#define ELITE_FONT  28

void
initialize_oki92()
{
    char_2_prn(ELITE_FONT);
}/* initialize_oki92() */

End Listing Five




Listing  Six 

; show_cs.asm
; C call is (unsigned) show_cs();
; Returns contents of cs register
;copyright (c) 1987 Andy Klein

include lmacros.h

        procdef show_cs
        mov     ax,cs
        pret
        pend    show_cs

        finish

End Listing Six


Listing  Seven 

/** output.c

Printer driver output functions
copyright (c) 1987 Andy Klein
**/

#include "rh.h"

extern request_hdr driver_rh;

void
output()
{
    int  stat = 0;      /* stays 0 if there is nothing to transfer */
    unsigned xfer_off, bytes_xferd;
    register char outch;
    char fetch_char();

    xfer_off = driver_rh.xfer_buf_offset;

    for ( bytes_xferd = 0; bytes_xferd < driver_rh.xfer_count;
                            ++bytes_xferd, ++xfer_off )
    {
        outch = fetch_char(driver_rh.xfer_buf_segment, xfer_off);
        if ( (stat = char_2_prn(outch)) )
        {
            (void) driver_error(stat);
            break;
        }
    }
    driver_rh.xfer_count = bytes_xferd;
    if ( !stat )
        driver_rh.status = DONE_MASK;
    else
        driver_rh.status = stat;
}/* output() */


void
output_verf()
{
    (void) output();
}/* output_verf() */


void
output_status()
{
    /**
        if device is currently doing an operation,
            driver_rh.status = BUSY_MASK | DONE_MASK;
        else if a write can begin immediately,
            driver_rh.status = DONE_MASK;
    **/

    if ( printer_busy() )
        driver_rh.status = BUSY_MASK | DONE_MASK;
    else
        driver_rh.status = DONE_MASK;
}/* output_status() */


void
output_flush()
{
    /**
        terminate all pending requests,
        meaningless for this device
    **/

    driver_rh.status = DONE_MASK;
}/* output_flush() */

End Listing Seven

Listing  Eight 

; prlport.asm
; low level parallel port output functions
;copyright (c) Andy Klein 1987

include lmacros.h   ;Aztec C macros for assembly language functions
                    ; for other environments replace with calling
                    ; conventions for your compiler

ERROR_MASK      equ     32768
PRINTER_INT     equ     017H
PRINTER_ID      equ     0       ;lpt1 = 0, lpt2 = 1, lpt3 = 2

;printer commands, put into ah
PRINT_CHAR      equ     0
INIT_PRN        equ     1
READ_STAT       equ     2

;printer status byte error codes, returned in ah
TIME_OUT        equ     1
IO_ERR          equ     8
NO_PAPER        equ     32
BUSY            equ     128
PRINTER_FAILURE equ     TIME_OUT+IO_ERR+NO_PAPER

;C call is (void) reset_printer();
;no value returned
        procdef reset_printer
        mov     ah,INIT_PRN
        mov     dx,PRINTER_ID
        sti
        int     PRINTER_INT
        xor     ax,ax
        pret
        pend    reset_printer


;C call is printer_busy();
;returns 0 (FALSE) if not busy, nonzero (TRUE) if busy
        procdef printer_busy
        mov     ah,READ_STAT
        mov     dx,PRINTER_ID
        sti
        int     PRINTER_INT
        and     ah,BUSY
        jz      printer_is_busy
        xor     ax,ax
        pret
printer_is_busy:
        mov     ax,1
        pret
        pend    printer_busy


; C call is (char) fetch_char(segm, offs);
; gets the character at segm:offs and
; returns it
        procdef fetch_char, <<segm,word>,<offs,word>>
        push    ds
        mov     bx,offs
        mov     ds,segm
        mov     al,byte ptr ds:[bx]
        and     ax,0ffH
        pop     ds
        pret
        pend    fetch_char


;C call is char_2_prn(outch);
; prints outch on PRINTER_ID
; returns 0 if ok, error code otherwise
        procdef char_2_prn, <<outch,byte>>
        mov     al,outch
        mov     ah,PRINT_CHAR
        mov     dx,PRINTER_ID
        sti
        int     PRINTER_INT
        test    ah,PRINTER_FAILURE
        jnz     printer_has_failed
        xor     ax,ax
        pret
printer_has_failed:
        mov     al,ah           ;result code in ah, mov it to al
        and     ax,PRINTER_FAILURE
        pret
        pend    char_2_prn

        finish


End  Listings 
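One detail of Listing Eight is easy to misread: printer_busy() branches to printer_is_busy when the masked BUSY bit is zero, so a set bit 7 in the status byte means the printer is idle. The sketch below restates the two tests from the assembly in C. The function names and the `ST_` constants are made up for the illustration; the bit values are the ones from the listing.

```c
/* Status byte bits returned in AH, with the values used in the listing. */
#define ST_TIME_OUT  0x01
#define ST_IO_ERR    0x08
#define ST_NO_PAPER  0x20
#define ST_NOT_BUSY  0x80   /* set when the printer is idle */
#define ST_FAILURE   (ST_TIME_OUT | ST_IO_ERR | ST_NO_PAPER)

/* Mirrors printer_busy(): nonzero when bit 7 is clear. */
int status_is_busy(unsigned char ah)
{
    return (ah & ST_NOT_BUSY) == 0;
}

/* Mirrors the tail of char_2_prn(): 0 if ok, else the failure bits. */
int status_failure(unsigned char ah)
{
    return ah & ST_FAILURE;
}
```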




C  CHEST 


Listing  One  (Text  begins  on  page  106.) 


#define PUBLIC

#ifdef DEBUG
#       define PRIVATE
#       define D(x) x
#else
#       define PRIVATE static
#       define D(x)
#endif

#ifdef MSDOS
#       define MS(x) x
#       define UX(x)
#else
#       define MS(x)
#       define UX(x) x
#endif

End Listing One

Listing Two


/* Defines for IBM-PC Hardware */

/* Timer defines:
 *
 * TIMR_CLK         The number of system clock cycles in a second
 *                  (ie. the input frequency of the counter/timer).
 * TIMR_TICKS(d)    The number of ticks needed to get a delay of d
 *                  seconds, 'd' can be a fraction: TIMR_TICKS(.5).
 *
 * TIMR_CTRL        Address of control port
 * TIMR_2_DATA      Address of counter 2 data port
 * TIMR_2_LOAD      Control word to load new count into
 *                  timer 2 (the speaker--2 bytes, lsb first).
 *                  Timer initialized in mode 3 (square wave)
 *
 * TIMR_0_LOAD
 * TIMR_0_DATA      Same but for timer 0 (system clock).
 */

#define TIMR_CLK        1193180L
#define TIMR_TICKS(d)   (int)( (double)(d) * (TIMR_CLK/65536.0) )

#define TIMR_CTRL       0x43

#define TIMR_0_DATA     0x40
#define TIMR_0_LOAD     0x36

#define TIMR_2_DATA     0x42
#define TIMR_2_LOAD     0xb6

/* Programmable peripheral interface:
 *
 * PPI              Base address of interface
 * PPI_SPKR         Bit mask to enable speaker (bit 0 is
 *                  gate on timer chip, bit 1 actually
 *                  enables the speaker).
 */

#define PPI             0x61
#define PPI_SPKR        0x03


/*
 * Make the speaker beep at a particular frequency.
 *
 *      SETFRQ( freq );     Sets the frequency.
 *
 *      SPKR_ON();          Turns the speaker on.
 *      SPKR_OFF();         Turns it off again.
 */

#define SETFRQ( freq )                                  \
        if( 1 )                                         \
        {                                               \
            unsigned int count;                         \
                                                        \
            count = TIMR_CLK / freq ;                   \
                                                        \
            outp( TIMR_CTRL  , TIMR_2_LOAD         );   \
            outp( TIMR_2_DATA, count        & 0xff );   \
            outp( TIMR_2_DATA, (count >> 8) & 0xff );   \
        }                                               \
        else

#define SPKR_ON()   outp( PPI, inp(PPI) |  PPI_SPKR )
#define SPKR_OFF()  outp( PPI, inp(PPI) & ~PPI_SPKR )

End Listing Two

Listing Three


/* These #defines are the frequencies of the 12 notes of the octave
 * starting with middle C. Multiply by two to go up an octave,
 * divide by two to go down. This is an equal-tempered scale
 * so each note is derived by multiplying the previous note
 * by the twelfth root of two. Note that there's a little
 * round-off error here but this error isn't audible.
 */

#define TWELFTH_ROOT_OF_TWO     1.059463095

#define C4              (261.6256)      /* C  4 */
#define C4_SHARP        (277.1826)      /* C# 4 */
#define D4              (293.6648)      /* D  4 */
#define D4_SHARP        (311.1270)      /* D# 4 */
#define E4              (329.6276)      /* E  4 */
#define F4              (349.2282)      /* F  4 */
#define F4_SHARP        (369.9944)      /* F# 4 */
#define G4              (391.9954)      /* G  4 */
#define G4_SHARP        (415.3047)      /* G# 4 */
#define A4              (440.0000)      /* A  4 */
#define A4_SHARP        (466.1638)      /* A# 4 */
#define B4              (493.8833)      /* B  4 */

End Listing Three

Listing Four

#include <stdio.h>



#include <tools/hardware.h>

beep( freq, duration )
double  freq;
double  duration;
{
    /* Beep the bell on the IBM-PC for the indicated time
     * (which may be fractional) at the indicated frequency.
     * Frequencies for various notes in an equal-tempered
     * scale are in <tools/notes.h>.
     */

    SETFRQ( freq );
    SPKR_ON();
    delay( (double)duration );
    SPKR_OFF();
}

/*----------------------------------------------------------------*/

#ifdef MAIN

#include <tools/notes.h>

main( argc , argv )
char    **argv;
{
    double  atof();

#   if 0
    beep( C * 4 , atof(argv[1]) );
#   endif

    beep( C     , 0.5 );
    beep( D     , 0.5 );
    beep( E     , 0.5 );
    beep( F     , 0.5 );
    beep( G     , 0.5 );
    beep( A     , 0.5 );
    beep( B     , 0.5 );
    beep( C * 2 , 0.5 );
}
#endif

End Listing Four
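The note table in Listing Three and the divisor arithmetic inside SETFRQ() can be checked on any machine, without touching the timer ports. The following is a minimal sketch under that assumption: `note_hz` and `timer_count` are names invented for the example, and the constants are copied from the listings.

```c
#define TIMR_CLK             1193180L      /* from Listing Two */
#define TWELFTH_ROOT_OF_TWO  1.059463095   /* from Listing Three */
#define C4                   261.6256      /* middle C, from Listing Three */

/* Frequency of the note `semitones` half steps above (or, if negative,
 * below) middle C, built up one semitone at a time. */
double note_hz(int semitones)
{
    double hz = C4;
    int    i;

    for (i = 0; i < semitones; i++)
        hz *= TWELFTH_ROOT_OF_TWO;
    for (i = 0; i > semitones; i--)
        hz /= TWELFTH_ROOT_OF_TWO;
    return hz;
}

/* The count SETFRQ() computes for timer 2: input clock / frequency,
 * truncated to an integer. */
unsigned timer_count(double hz)
{
    return (unsigned)(TIMR_CLK / hz);
}
```

Nine semitones above middle C gives A at about 440 Hz, for which the count works out to 2711. Frequencies below roughly 18.2 Hz would need a count larger than the timer's 16 bits can hold.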

Listing  Five 

#include <tools/hardware.h>
#include <dos.h>

delay( duration )
double  duration;
{
    /* Delay for the indicated number of seconds (may be
     * fractional).
     */

    unsigned long   ticks();
    unsigned long   start, elapsed ;
    unsigned long   i, t;

    elapsed = TIMR_TICKS( duration );

    for( start = ticks();  ticks() - start < elapsed ;)
        ;
}

unsigned long ticks()
{
    /* Return the number of BIOS clock ticks since midnight.
     * The routine rolls over successfully at midnight (1573040
     * is the number of clock ticks in a day and AL is zero
     * if the timer has not passed midnight since the last
     * call).
     */

    union REGS regs;

    regs.h.ah = 0;

    int86( 0x1a, &regs, &regs );    /* Time-of-day interrupt */

    return ( regs.h.al ? 1573040L : 0 )
                + ( (unsigned long)regs.x.cx << 16 )
                + regs.x.dx
                ;
}

#ifdef MAIN
main()
{
    printf("Should be five seconds between beeps\n\007");
    delay( 5.0 );
    printf("\007");
}
#endif

End Listing Five
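The tick arithmetic behind delay() is easy to verify in isolation. The sketch below is a portable illustration, not the author's code: `seconds_to_ticks` repeats the TIMR_TICKS() calculation, and `ticks_elapsed` shows an alternative way of coping with the midnight rollover by correcting a negative difference, instead of adding a day's worth of ticks inside ticks() as the listing does. Both function names are assumptions for the example.

```c
#define TIMR_CLK       1193180L
#define TICKS_PER_DAY  1573040L     /* from the comment in ticks() */

/* Seconds to BIOS clock ticks (about 18.2 per second), truncated,
 * as TIMR_TICKS() does. */
long seconds_to_ticks(double seconds)
{
    return (long)(seconds * (TIMR_CLK / 65536.0));
}

/* Ticks elapsed between two readings of the time-of-day counter,
 * allowing for at most one wrap at midnight. */
long ticks_elapsed(long start, long now)
{
    long diff = now - start;

    if (diff < 0)
        diff += TICKS_PER_DAY;
    return diff;
}
```

A five-second delay comes out as 91 ticks, so the resolution of delay() is about a twentieth of a second.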

Listing  Six 

        PAGE    56,132
        TITLE   SPEEDUP.ASM: System-clock-modification routines
;
DEBUG   equ     1       ; Set to 1 to make internal symbols public
;
_TEXT   SEGMENT BYTE PUBLIC 'CODE'
_TEXT   ENDS
_DATA   SEGMENT WORD PUBLIC 'DATA'
_DATA   ENDS
CONST   SEGMENT WORD PUBLIC 'CONST'
CONST   ENDS
_BSS    SEGMENT WORD PUBLIC 'BSS'
_BSS    ENDS
DGROUP  GROUP   CONST, _BSS, _DATA
        ASSUME  CS: _TEXT, DS: DGROUP

        EXTRN   __chkstk:NEAR

TIMR_CTRL       = 43H   ; address of timer control port
TIMR_0_DATA     = 40H   ; address of counter 0 data port
TIMR_0_LOAD     = 36H   ; control word for timer

_TEXT   SEGMENT

; Misc. variables. Note that I'm putting all these in the
; code (_TEXT) segment so that I can find them when an
; interrupt comes along. The PUBLIC statements are just for
; debugging.

old_int         equ     $
old_off         dw      ?       ; Offset of old timer interrupt
                                ; service routine.
old_seg         dw      ?       ; Segment address of same.

service         dw      ?       ; User-supplied interrupt service
                                ; routine (offset).

old_ds          dw      ?       ; segments for running program.
tick_reset      dw      ?
numticks        dw      ?       ; Initialized to tick_reset,
                                ; decremented on each timer interrupt.
                                ; reset to the speedup factor (and the
                                ; old service routine is called) when
                                ; it reaches zero.

stack           dw      64 dup (0)  ; Local stack for service routine
                                    ; 18 bytes are used by real service
                                    ; routine, the rest is available
                                    ; for the user service routine.
stack_end       dw      ?
old_sp          dw      ?
old_ss          dw      ?

IF DEBUG
        PUBLIC  old_int, old_off, old_seg, old_ds
        PUBLIC  service, tick_reset, numticks, serv
ENDIF

; cli(); sti();  Disable and enable interrupts.

        PUBLIC  _cli, _sti

_cli    PROC    NEAR
        cli
        ret
_cli    ENDP

_sti    PROC    NEAR
        sti
        ret
_sti    ENDP


; speedup( factor, routine )
; int factor, (*routine)();
;
; Speed up the system clock by the indicated factor.
; Call the indicated subroutine on every timer interrupt,
; and call the default clock routine as well every "factor"
; ticks.
;
; Offsets to arguments:
;       factor  = [bp+4]
;       routine = [bp+6]
;
        PUBLIC  _speedup

_speedup PROC   NEAR
        push    bp
        mov     bp,sp
        xor     ax,ax
        call    __chkstk

        mov     ax,[bp+6]               ; service = offset of new
        mov     _TEXT:service,ax        ;   routine.
        mov     _TEXT:old_ds,ds         ; remember current DS too.
        mov     ax,[bp+4]               ; tick_reset = numticks
        mov     _TEXT:tick_reset,ax     ;            = ax = factor
        mov     _TEXT:numticks,ax       ;

        mov     al,TIMR_0_LOAD          ; Set up timer for load
        out     TIMR_CTRL,al
        mov     ax,[bp+4]               ; if( factor == 1 )
        cmp     ax,01H                  ; {
        jne     do_div                  ;     use 0 for the output count
        mov     ax,0                    ; }
        jmp     load                    ; else
do_div:                                 ; {
        mov     ax,00000H               ;     Number of ticks =
        mov     dx,00001H               ;         65536/factor
        mov     bx,[bp+4]               ;     BX = factor.
        div     bx                      ;     AX = number of ticks
load:                                   ; }
        out     TIMR_0_DATA,al          ; Send new count to timer
        mov     al,ah                   ;
        out     TIMR_0_DATA,al

                                        ; Get the old vector
        mov     ah,35H                  ;
        mov     al,08H
        int     21H
        mov     _TEXT:old_off,bx        ;
        mov     _TEXT:old_seg,es        ;

                                        ; set up the new vector
        mov     ah,25H                  ;
        mov     al,08H
        mov     dx,OFFSET _TEXT:serv
        push    ds
        push    cs
        pop     ds
        int     21H
        pop     ds

        mov     sp,bp
        pop     bp
        ret

_speedup ENDP

; Actual interrupt service routine. This routine saves the
; environment, calls the user-supplied C service routine,
; and then calls the default service routine if necessary.
; The service routine runs under its own stack so stack
; checking should be disabled with either the /Gs command-
; line switch or the "#pragma check_stack[+|-]" directive.
;
serv    PROC    NEAR

        push    ax                      ; Save AX on old stack

        mov     _TEXT:old_sp,sp         ; Set up local
        mov     _TEXT:old_ss,ss         ; stack
        push    cs                      ;
        pop     ss                      ;
        mov     sp,offset _TEXT:stack_end ;

        push    bx                      ; Set up C environment:
        push    cx                      ; save everything (the
        push    dx                      ; flags are saved as
        push    bp                      ; part of the interrupt
        push    si                      ; processing).
        push    di
        push    ds
        push    es

        mov     ds,_TEXT:old_ds         ; fix the data segment
        mov     es,_TEXT:old_ds         ; and extra segment

        cli
        call    word ptr _TEXT:service  ; Call C subroutine
        sti

        pop     es                      ; restore everything
        pop     ds                      ; but AX
        pop     di
        pop     si
        pop     bp
        pop     dx
        pop     cx
        pop     bx

        mov     ss,_TEXT:old_ss         ; Restore original
        mov     sp,_TEXT:old_sp         ; stack.

        dec     _TEXT:numticks          ; if( --numticks > 0 )
        jle     do_old_int              ; {
        mov     al,20h                  ;     send EOI
        out     20h,al
        pop     ax                      ;     restore ax
        iret                            ; }
                                        ; else
do_old_int:                             ; {
        mov     ax,_TEXT:tick_reset     ;     numticks
        mov     _TEXT:numticks,ax       ;         = tick_reset;
        pop     ax                      ;     restore ax
        jmp     dword ptr _TEXT:old_int ;     jmp to old vector
                                        ; }

serv    ENDP

        PUBLIC  _slowdown

_slowdown PROC  NEAR

        push    bp
        mov     bp,sp
        xor     ax,ax
        call    __chkstk

        mov     ax,_TEXT:old_off        ; See if the interrupts have
        or      ax,ax                   ; changed.
        jz      no_int                  ; No, don't fix them then

                                        ; restore old timer interrupt
        push    ds                      ;
        mov     ah,25H
        mov     al,08H
        mov     ds,_TEXT:old_seg        ;
        mov     dx,_TEXT:old_off        ;
        int     21H
        pop     ds                      ;

no_int:
        mov     al,TIMR_0_LOAD          ; Restore default system
        out     TIMR_CTRL,al            ; clock tick rate
        mov     al,0
        out     TIMR_0_DATA,al
        out     TIMR_0_DATA,al

        mov     sp,bp
        pop     bp
        ret

_slowdown ENDP

_TEXT   ENDS
        END

End Listing Six
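Two pieces of arithmetic in Listing Six are worth seeing in C: the reload count sent to counter 0 (65536 divided by the speedup factor, with 0 standing for the full 65536) and the countdown in serv that passes only every "factor"-th interrupt on to the original handler. The sketch below models both without any hardware access. The names `reload_count`, `struct divider`, `divider_init`, and `divider_tick` are hypothetical, and treating a factor of 0 like a factor of 1 is a choice made for the example (the assembly would fault on the divide).

```c
/* Reload value for counter 0. A count of 0 means 65536, which is the
 * normal 18.2 Hz rate, so factors of 0 and 1 both map to 0. */
unsigned reload_count(unsigned factor)
{
    return (factor <= 1) ? 0u : (unsigned)(65536UL / factor);
}

/* Model of serv's countdown (tick_reset and numticks in the listing). */
struct divider {
    unsigned tick_reset;
    int      numticks;
};

void divider_init(struct divider *d, unsigned factor)
{
    d->tick_reset = factor;
    d->numticks   = (int)factor;
}

/* Call once per fast interrupt. Returns 1 when the interrupt should be
 * chained to the old INT 08h handler, 0 when it is handled locally. */
int divider_tick(struct divider *d)
{
    if (--d->numticks > 0)
        return 0;                   /* send EOI and return */
    d->numticks = (int)d->tick_reset;
    return 1;                       /* jump to the old vector */
}
```

With a factor of 4 the old handler runs once in every four fast interrupts, so the BIOS time-of-day count keeps advancing at its usual rate.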

Listing  Seven 


#include <stdio.h>
#include <signal.h>
#include <stdarg.h>
#include <ctype.h>
#include <tools/notes.h>        /* Frequencies of notes */
#include <tools/hardware.h>     /* TIMR_CLK             */
#include <tools/debug.h>        /* D() macro            */

/* CLICK.C      A polyrhythmic metronome. Usage is described
 *              in the Usage_msg[], below.
 *
 *      (c) 1987, Allen I. Holub. All rights reserved.
 */

extern double ceil ( double );
extern double floor( double );

/*----------------------------------------------------------------
 * The compiler truncates floating point numbers when they're
 * converted to int. This macro rounds as it converts.
 */

#define ROUND(x) ( (int)( ((x) > 0.5) ? ceil ( (double)(x) ) \
                                      : floor( (double)(x) ) ) )

/*----------------------------------------------------------------
 * TICKS(x) converts a metronome count to clock ticks.
 *
 * With a speedup factor 4, a tick happens 72.84 times/second
 * (every .01373 seconds, more or less). A speedup factor of
 * 2 yields half this number: 36.42 times/sec, or an interrupt
 * every .02746 seconds.
 *
 * A metronome 60 is 1 Hz, 120 is 2 Hz, etc.
 *
 * seconds         == 60 / metronome_count
 * ticks in second == ( 60 / metronome_count ) * ONE_TICK
 *                 == ( (60 * ONE_TICK) / metronome_count )
 */

#define FACTOR          16                      /* Speedup factor */
#define DEFAULT_TICK    (TIMR_CLK / 65536.0)    /* ticks / second */
#define ONE_TICK        (DEFAULT_TICK * FACTOR)

#define TICKS(x)        ROUND( (60.0 * ONE_TICK) / (x) )
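The conversion described in the TICKS() comment can be worked through with ordinary functions. The sketch below is an illustration only: `round_nearest` and `ticks_per_beat` are hypothetical names, and the rounding deliberately differs from the listing. ROUND() as printed takes the ceiling of any value above 0.5, whereas this version rounds to the nearest integer, which is what the comment on ROUND() describes.

```c
#define TIMR_CLK   1193180L
#define FACTOR     16
#define ONE_TICK   ((TIMR_CLK / 65536.0) * FACTOR)  /* fast ticks per second */

/* Round a non-negative value to the nearest int. */
int round_nearest(double x)
{
    return (int)(x + 0.5);
}

/* Fast clock ticks between beats at the given metronome marking
 * (beats per minute): (60 * ONE_TICK) / metronome. */
int ticks_per_beat(int metronome)
{
    return round_nearest((60.0 * ONE_TICK) / metronome);
}
```

At a speedup factor of 16 there are about 291.3 ticks per second, so metronome 60 needs 291 ticks per beat and metronome 120 needs 146.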




/*----------------------------------------------------------------*/

#define min(a,b)        ((a) < (b) ? (a) : (b))
#define max(a,b)        ((a) > (b) ? (a) : (b))

#define MAX_MEASURES    1280    /* Max # of measures/track */
#define NUM_TRACKS      4       /* Number of tracks        */

#define WARNING         (NUM_TRACKS + 1)

/*----------------------------------------------------------------*/

typedef unsigned char   uchar;
typedef unsigned int    uint;

typedef struct
{
    uchar num_beats;            /* # of beats remaining in measure */
    uchar remainder;            /* Number of ticks to get in synch */
    uint  cur_tick;             /* Current tick for this beat      */
    uint  ticks_per_beat;       /* # of clock ticks between beats  */
    uint  metronome : 14;       /* metronome count in this measure */
    uint  silent    : 1;        /* this measure is silent          */
    uint  warning   : 1;        /* warning tone at downbeat        */
}
MEASURE;

typedef MEASURE TRACK[ MAX_MEASURES ];

TRACK   Tape    [ NUM_TRACKS ];
MEASURE *Measure[ NUM_TRACKS ]; /* Current measure on  */
                                /* each track of Tape. */
int     Lineno    = 0;          /* Input line number   */

int     Ring_bell = 0;          /* 0 if the bell shouldn't ring.
                                 * Set to 1 for track 1, 2 for track
                                 * 2, etc.
                                 */

int     Collision = 0;          /* If two tracks collide, the track
                                 * number of the second one is
                                 * put here.
                                 */

int     Downbeat  = 1;          /* Incremented by the interrupt
                                 * service routine on every
                                 * downbeat from track 0;
                                 */

int     Done      = 0;          /* Set to 1 by interrupt service
                                 * routine when it gets to the
                                 * end of the tape.
                                 */

int     Numticks  = 0;          /* For debugging, incremented on
                                 * every timer interrupt.
                                 */

/*----------------------------------------------------------------*/

char    *Usage_msg[] =
{
    "Usage: click [-d] inputfile",
    "",
    "  -d (for dull) don't use different notes for different",
    "     tracks",
    "",
    "A polyrhythmic metronome. Four independent \"tracks\"",
    "are supported, with rhythms specified as follows:",
    "",
    "track 0: #4 @120 5/8, #6 @120 6/8, @120 4/4",
    "track 1: #4      6  , #6 @100 6/8",
    "",
    "No line can be longer than 132 characters, but several",
    "track specifiers can be given. The basic notation is:",
    "\"[#N] [@Z] X[/Y],\" interpreted as N measures of X/Y at",
    "metronome Z. If the \"#N\" is missing, 1 is used. If",
    "the \"@Z\" is missing, the measure is stretched to",
    "synch with track one on the down beat. The \"/Y\" is",
    "optional. So, in the above example, the six beats in",
    "the first four measures of track 2 will synch up with",
    "track one, coming into synch on the downbeat of each",
    "measure. Each track may be up to 1000 measures long.",
    "Lines that don't begin with \"track\" are ignored",
    NULL
};


/*----------------------------------------------------------------*/

char    *skipwhite(p)
char    *p;
{
    /* Skip all characters that aren't part of a command */

    while( *p && (isspace(*p) || *p == '\n') )
        p++ ;

    return p;
}

/*----------------------------------------------------------------*/

err( fmt )
char    *fmt;
{
    /* Works like printf() but writes to standard error and
     * prints an input line number along with the message.
     */

    va_list args;
    va_start( args, fmt );

    fprintf ( stderr, "line %d: ", Lineno );
    vfprintf( stderr, fmt, args );
    va_end( args );
}

/*----------------------------------------------------------------*/

init()
{
    /* Initialize the Measure array to point
     * at the first measure of each track on the Tape.
     */

    register int i;

    for( i = NUM_TRACKS; --i >= 0; )
        Measure[i] = Tape[i] ;
}


/*----------------------------------------------------------------*/

print_tape( this_many )
{
    /* Print out the tape. If this_many is 0, only the
     * initialized measures are printed; otherwise, the
     * indicated number of measures are printed.
     */

    MEASURE      *p ;
    register int i , measure_num ;
    int          amt;

    for( i = 0; i < NUM_TRACKS ; i++ )
    {
        printf( "Track %d:\n", i );
        measure_num = 0;
        amt = this_many;

        for( p=Tape[i]; p->num_beats>0 || --amt >= 0; p++)
        {
            printf("  measure %2d: ",   ++measure_num     );
            printf("[%2d ticks/beat " , p->ticks_per_beat );
            printf("+ %d], "          , p->remainder      );
            printf("cur_tick=%d, "    , p->cur_tick       );
            printf("%d beats "        , p->num_beats      );

            if( p->metronome )
                printf( "at %-5d ", p->metronome );
            else
                printf( "in synch " );

            printf("%s", p->silent  ? "(mute)" : "" );
            printf("%s", p->warning ? "(warn)" : "" );
            printf("\n");
        }
    }
}


/*----------------------------------------------------------------*/

build_tracks( file_name )
char    *file_name;
{
    FILE    *fp;
    MEASURE *mp;
    char    buf[133], *line;
    int     i;
    int     track;          /* Track number             */
    int     num_measures;   /* # of measures to repeat  */
    int     metronome;      /* metronome count          */
    int     beats;          /* beats per measure        */
    int     ticks;          /* ticks per measure        */
    int     measure;        /* Current measure number   */
    int     silent;         /* measure is silent        */
    int     warning;        /* warning on last repeat   */

    if( !(fp = fopen(file_name,"r")) )
        return 0;

    init();

    while( line = fgets(buf, 133, fp) )
    {
        ++Lineno;

        line = skipwhite( line );

        if( !(  line[0]=='t' && line[1]=='r'
             && line[2]=='a' && line[3]=='c' && line[4]=='k'
             )
          ) continue;

        line += 5;                      /* Get track number */
        line  = skipwhite( line );
        track = stoi( &line );
        line  = skipwhite( line );

        if( *line == ',' || *line == ':' )
            line++;

        mp      = Measure[ track ];     /* starting measure # */
        measure = mp - Tape[ track ];

        while( *line )
        {
            num_measures = 1;
            metronome    = 0;
            silent       = 0;
            warning      = 0;


            for( line=skipwhite(line); *line; line=skipwhite(line) )
            {
                if( *line == ';' )              /* comment */
                {
                    *line = 0;
                    break;
                }

else  if (  *line  '  II  *line  ==  ) 

                {
                    ++line;                     /* end of measure */
                    break;
                }
                if( *line == '#' )              /* # of measures */
                {
                    line = skipwhite( ++line );
                    num_measures = stoi( &line );
                }
                else if( *line == '@' )         /* metronome count */
                {
                    line = skipwhite( ++line );
                    metronome = stoi( &line );
                }
                else if( *line == 'w' || *line == 'W' )
                {
                    while( isalpha(*line) )
                        ++line;

                    warning = 1;
                }
                else if( isdigit( *line ) )     /* Time signature */
                {
                    if( (beats = stoi(&line)) == 0 )
                    {
                        err( "Illegal time signature\n" );
                        exit(1);
                    }

                    if( *line == '/' )      /* Throw away bottom   */
                        ++line;             /* of time signature.  */

                    while( isdigit( *line ) )
                        line++;
                }
                else if( *line == '(' || *line == ')' )
                {
                    ++line ;
                    silent = 1;
                }
                else
                    err("<%c> is illegal in measure spec.\n", *line );
            }

            for(; --num_measures >= 0; mp++, measure++ )
            {
                if( metronome )
                {
                    mp->metronome = metronome;
                    ticks = TICKS(metronome) * beats ;
                }
                else
                {
                    mp->metronome = 0;
                    ticks = Tape[0][measure].ticks_per_beat
                          * Tape[0][measure].num_beats
                          ;
                }

                if( num_measures == 0 )     /* last in series */
                    mp->warning = warning;

                mp->silent    = silent ;
                mp->num_beats = beats ;
                mp->cur_tick  = ticks / beats;



                mp->ticks_per_beat = ticks / beats;
                mp->remainder      = ticks % beats;

                D( printf("loading track %d, ", track );         )
                D( printf("measure %d: "      , measure );       )
                D( printf("%d beats/measure " , beats );         )
                D( printf("at metronome %d\n" , mp->metronome ); )
            }
        }

        Measure[ track ] = mp ;
    }

    init();
    D( print_tape( 0 ); )
    return 1;
}


Spragma  check_stack-  /*  Turn  off  stack  probes.  This  pragma  */ 
/*  is  Microsoft-compiler  dependent.  */ 

timer_intr  () 

{ 

/*  Interrupt  service  routine  for  timer  interrupt  */ 

MEASURE  **mp,  *p  ; 

int  i,  did_nothing  ; 

Done  =  1; 

++  Numticks; 

for (  i  =  0,  mp  =  Measure;  ++i  <=  NUM  TRACKS  ;  mp++  ) 

( 

if (  (P  =  *mp) ->num_beats  >  0  ) 

( 

if (  p->cur_tick==p->ticks_per_beat  ) 

( 

/*  Ring  bell  on  first  tick  of  measure 

*  unless  this  is  a  silent  measure. 

*  Warnings  take  precedence  over 

*  everything. 


if (  p->warning  ) 

( 

Ring_bell  =  WARNING  ; 
p->warning  =  0; 


else  if (  !p->silent  ) 

( 

if (  !Ring_bell  ) 
Ring_bell  =  1; 

else 

Collision  =  i; 


if  (  —  p->cur_tick  <=  0  ) 

( 

if (  —  p->num_beats  <=  0  ) 

< 

if  (  i  ==  1  ) 
++Downbeat; 


++(  *mp  );  /*  go  to  next  measure  */ 


p->cur_tick  =■  p->ticks_per_beat; 

if (  p->remainder  >  0  ) 

*  (continued  on  next  page ) 


Dr.  Dobb  s  Journal,  September  1987 


C  CHEST 


Listing  Seven 


(Listing  continued,  text  begins  on  page  106.) 

                {
                    --p->remainder;
                    ++p->cur_tick;
                }
            }

            Done = 0;
        }
    }
}


#pragma  check_stack+ 


on_break()
{
    /* Routine for signal(), tries to put the clock rate
     * back to normal on a Ctrl-Break. Note that this
     * routine can fail if Ctrl-Break is hit several times
     * in quick succession.
     */

    signal( SIGINT, SIG_IGN );
    slowdown();
    exit(0);
}


print_stats()
{
}


usage()
{
    char **p;

    for( p = Usage_msg; *p; fprintf(stderr, "%s\n", *p++) )
        ;

    exit(1);
}


main( argc, argv )
char **argv;
{
    static int measure = 0;     /* current measure in track 0 */
    static int dull    = 0;     /* Dull output                */
    static int stats   = 0;     /* Statistics only            */
    static int i;

    if( argc == 3 )
    {
        if( **(++argv) != '-' )
            usage();

        switch( argv[0][1] )
        {
        case 'd':   dull = 1;   break;
        default:    usage();
        }
    }
    else if( argc != 2 || *argv[1] == '-' )
        usage();

    if( !build_tracks( argv[1] ) )
        exit( 2 );


    if( stats )
    {
        print_stats();
        exit(0);
    }

    signal( SIGINT, on_break );

    for( speedup(FACTOR, timer_intr); !Done; )
    {
        cli();      /* Interrupts off */

        if( Ring_bell == WARNING )
        {
            Collision = 0;
            Ring_bell = 0;
            sti();
            beep( C4*8, 0.125 );
            delay( 0.1 );
            beep( C4*8, 0.125 );
        }
        else if( Collision )
        {
            /* the higher track wins */

            i = max( Collision, Ring_bell );
            Collision = 0;
            Ring_bell = 0;
            sti();
            beep( dull ? C4*2 : C4 * i, 0.125 );
        }
        else if( Ring_bell )
        {
            /* You must set Ring_bell to
             * 0 before calling beep so that
             * a note won't collide with itself.
             */



STRUCTURED  PROGRAMMING 


Listing  One  (Text  begins  on  page  122.) 

PROGRAM CBTree;
{==================================================
  Program to test and compare the time needed in creating and
  searching a binary tree and a CB-tree.

  Author: Namir Clement Shammas
==================================================}

CONST SIZE          = 1000;
      RANGE         = 10000;
      MainLoopCount = 100;

TYPE  Bin_Ptr = ^Bin_Node;
      { normal binary tree structures }
      Bin_Node = RECORD
         Value       : INTEGER;
         Left, Right : Bin_Ptr;
      END;

      CB_Ptr = ^CB_Node;
      { Record structure for clustered binary tree }
      CB_Node = RECORD
         Value         : INTEGER;
         HLeft, HRight,
         LLeft, LRight : CB_Ptr;
      END;

      REGTYPE = RECORD
         AX, BX, CX, DX, BP,
         DI, SI, DS, ES, FLAGS : INTEGER
      END;

      Time_Rec = RECORD
         HOUR, MIN, SEC, HSEC : BYTE
      END;

VAR   Numbers : ARRAY [1..SIZE] OF INTEGER;
      Iter, I, Choice, CDV, Diff : INTEGER;
      BinTreeRoot : Bin_Ptr;
      CBTreeRoot  : CB_Ptr;
      Timer_Start, Timer_Stop, Time_Elapsed : Time_Rec;


PROCEDURE Get_Time(VAR TIME : Time_Rec { output });
VAR REGISTER : REGTYPE;
    AH       : BYTE;
BEGIN
   AH := $2C;
   WITH REGISTER, TIME DO BEGIN
      AX := AH SHL 8;
      MSDOS(REGISTER);
      HOUR := Hi(CX);
      MIN  := Lo(CX);
      SEC  := Hi(DX);
      HSEC := Lo(DX);
   END;
END;

{--------------------------------------------------}

PROCEDURE Time_Diff(T_Start,
                    T_Finish       : Time_Rec;  { input  }
                    VAR Delta_Time : Time_Rec   { output });




BEGIN
   WITH Delta_Time DO BEGIN
      HOUR := T_Finish.HOUR - T_Start.HOUR;
      IF T_Start.MIN > T_Finish.MIN THEN BEGIN
         HOUR := HOUR - 1;
         T_Finish.MIN := T_Finish.MIN + 60;
      END;
      MIN := T_Finish.MIN - T_Start.MIN;
      IF T_Start.SEC > T_Finish.SEC THEN BEGIN
         MIN := MIN - 1;
         T_Finish.SEC := T_Finish.SEC + 60;
      END;
      SEC := T_Finish.SEC - T_Start.SEC;
      IF T_Start.HSEC > T_Finish.HSEC THEN BEGIN
         SEC := SEC - 1;
         T_Finish.HSEC := T_Finish.HSEC + 100;
      END;
      HSEC := T_Finish.HSEC - T_Start.HSEC;
   END; (* WITH *)
END; (* Time_Diff *)

{--------------------------------------------------}

PROCEDURE Show_Time(T : Time_Rec { input });
BEGIN
   WITH T DO BEGIN
      WRITE('Time elapsed = ', HOUR, ' : ', MIN, ' : ', SEC, '.', HSEC);
   END;
END; (* Show_Time *)

{--------------------------------------------------}

PROCEDURE Create(Choice : INTEGER { input });
VAR J : INTEGER;
    Angle, FloatSize, Increment : REAL;
BEGIN
   CASE Choice OF
      1 : BEGIN
             Angle := 0.0;
             Increment := Pi / 360.0;
             FOR J := 1 TO SIZE DO BEGIN
                Numbers[J] := Trunc(SIN(Angle) * RANGE);
                Angle := Angle + Increment;
             END;
          END;
      2 : BEGIN
             Angle := 0.0;
             Increment := Pi / 360.0;
             FOR J := 1 TO SIZE DO BEGIN
                Numbers[J] := Trunc(COS(Angle) * RANGE);
                Angle := Angle + Increment;
             END;
          END;
      ELSE BEGIN
             Numbers[1] := RANGE div 2;
             FOR J := 2 TO SIZE DO
                Numbers[J] := Trunc(Random * RANGE);
          END;
   END; { CASE }
END;


{--------------------------------------------------}

PROCEDURE Bin_Insert(VAR Root : Bin_Ptr;  { input }
                         Item : INTEGER   { input });
(* Insert element in binary-tree *)
BEGIN
   IF Root = NIL THEN BEGIN
      NEW(Root);
      Root^.Value := Item;
      Root^.Left  := NIL;
      Root^.Right := NIL
   END
   ELSE
      WITH Root^ DO
         IF Item < Value THEN Bin_Insert(Left, Item)
                         ELSE Bin_Insert(Right, Item);
END;

{--------------------------------------------------}

PROCEDURE Bin_Search(VAR Root   : Bin_Ptr;  { input }
                         Target : INTEGER   { input });
(* Recursive procedure to search for Target value *)
BEGIN
   IF Root <> NIL THEN
      IF Target <> Root^.Value THEN
         IF Target < Root^.Value THEN BEGIN
            Root := Root^.Left;
            Bin_Search(Root, Target)
         END
         ELSE BEGIN
            Root := Root^.Right;
            Bin_Search(Root, Target)
         END;
END;


{--------------------------------------------------}

PROCEDURE CB_Insert(VAR Root : CB_Ptr;   { input }
                    VAR Item : INTEGER   { input });
(* Insert element in a CB-tree *)
BEGIN
   IF Root = NIL THEN BEGIN
      NEW(Root);
      Root^.Value  := Item;
      Root^.LLeft  := NIL;
      Root^.LRight := NIL;
      Root^.HLeft  := NIL;
      Root^.HRight := NIL;
   END
   ELSE
      WITH Root^ DO BEGIN
         Diff := Value - Item;
         IF Diff > 0 THEN
            IF ABS(Diff) <= CDV THEN CB_Insert(LLeft, Item)
                                ELSE CB_Insert(HLeft, Item)
         ELSE
            IF ABS(Diff) <= CDV THEN CB_Insert(LRight, Item)
                                ELSE CB_Insert(HRight, Item);
      END; (* WITH *)
END;

{--------------------------------------------------}

PROCEDURE CB_Search(VAR Root   : CB_Ptr;   { input }
                    VAR Target : INTEGER   { input });
(* Recursive procedure to search for Target value *)
BEGIN
   IF Root <> NIL THEN
      IF Target <> Root^.Value THEN BEGIN
         Diff := Root^.Value - Target;
         IF Target < Root^.Value THEN BEGIN
            IF ABS(Diff) <= CDV THEN Root := Root^.LLeft
                                ELSE Root := Root^.HLeft;
            CB_Search(Root, Target)
         END
         ELSE BEGIN
            IF ABS(Diff) <= CDV THEN Root := Root^.LRight
                                ELSE Root := Root^.HRight;
            CB_Search(Root, Target)
         END;
      END;
END;

{--------------------------------------------------}

BEGIN (* MAIN *)
   CDV := Trunc(0.05 * RANGE);
   ClrScr;
   WRITE(' ':10);
   WRITELN('Menu for Method of Generating Numbers');
   WRITELN;
   WRITELN(' 0) Random numbers ');
   WRITELN(' 1) Sine function ');
   WRITELN(' 2) Cosine function ');
   WRITELN;
   WRITE('Enter code for array creation : ');
   READLN(Choice); WRITELN;
   Create(Choice);

   WRITELN; WRITELN('Created array'); WRITELN;
   (* Building the binary tree *)
   BinTreeRoot := NIL;
   Get_Time(Timer_Start);
   FOR I := 1 TO SIZE DO
      Bin_Insert(BinTreeRoot, Numbers[I]);
   Get_Time(Timer_Stop);
   Time_Diff(Timer_Start, Timer_Stop, Time_Elapsed);
   Show_Time(Time_Elapsed);
   WRITELN(' for creating binary Tree'); WRITELN;
   (* Building the CB-tree *)
   CBTreeRoot := NIL;
   Get_Time(Timer_Start);
   FOR I := 1 TO SIZE DO
      CB_Insert(CBTreeRoot, Numbers[I]);
   Get_Time(Timer_Stop);
   Time_Diff(Timer_Start, Timer_Stop, Time_Elapsed);
   Show_Time(Time_Elapsed);
   WRITELN(' for creating CB-Tree'); WRITELN;
   Get_Time(Timer_Start);
   FOR Iter := 1 TO MainLoopCount DO
      FOR I := SIZE DOWNTO 1 DO
         Bin_Search(BinTreeRoot, Numbers[SIZE + 1 - I]);
   Get_Time(Timer_Stop);
   Time_Diff(Timer_Start, Timer_Stop, Time_Elapsed);
   Show_Time(Time_Elapsed); WRITELN(' for binary tree search');
   WRITELN;

   Get_Time(Timer_Start);
   FOR Iter := 1 TO MainLoopCount DO
      FOR I := SIZE DOWNTO 1 DO
         CB_Search(CBTreeRoot, Numbers[SIZE + 1 - I]);
   Get_Time(Timer_Stop);
   Time_Diff(Timer_Start, Timer_Stop, Time_Elapsed);
   Show_Time(Time_Elapsed); WRITELN(' for CB-tree search');
   WRITELN;
   WRITELN(' DONE '); WRITELN;
END.

End Listing One



Listing Two (Text begins on page 122.)

PROGRAM Test_Clustered_Lists;
{$R+,V-}
{==================================================
  Turbo Pascal program that implements routines for inserting,
  searching and viewing clustered list structures.

  Copyright (c) 1987 Namir Clement Shammas
==================================================}

CONST LENSTRING = 30;
      MAX_LIST  = 100;

MDNM0 

      CDV = 0; { Critical Difference value }

TYPE  LSTRING  = STRING[LENSTRING];
      KeyArray = ARRAY [1..MAX_LIST] OF LSTRING;
      ListPtr  = ^ListNode;
      { Clustered List structure }
      ListNode = RECORD
         Key : LSTRING;
         { other fields may be placed here }
         NextPtr, NextHi : ListPtr;
      END;

VAR   Head     : ListPtr;
      LesKeys  : KeyArray;
      I, Count : INTEGER;

{--------------------------------------------------}

PROCEDURE Search_Node(Head       : ListPtr;   { input  }
                      SearchData : LSTRING;   { input  }
                      VAR Found  : BOOLEAN;   { output }
                      VAR LastPtr,
                          ThisPtr : ListPtr   { output });
{ search for 'SearchData' in list }
VAR HiTrack    : BOOLEAN;
    Ord1, Diff : INTEGER;
BEGIN
   Found   := FALSE;
   HiTrack := TRUE;
   LastPtr := NIL;
   ThisPtr := Head;
   Ord1 := ORD(SearchData[1]);
   WHILE (ThisPtr <> NIL) AND (ThisPtr^.Key < SearchData) DO BEGIN
      LastPtr := ThisPtr;
      IF HiTrack THEN BEGIN
         Diff := ORD(ThisPtr^.Key[1]) - Ord1;
         IF ABS(Diff) > CDV
         THEN
            ThisPtr := ThisPtr^.NextHi
         ELSE BEGIN
            ThisPtr := ThisPtr^.NextPtr;
            HiTrack := FALSE { switch to low track }
         END; { IF ABS(Diff) }
      END
      ELSE
         ThisPtr := ThisPtr^.NextPtr;
      { END IF HiTrack }
   END; { WHILE }
   IF ThisPtr <> NIL THEN Found := (ThisPtr^.Key = SearchData);
END; { Search_Node }

{--------------------------------------------------}

PROCEDURE Insert_List(VAR Head    : ListPtr;  { in/out }
                          NewData : LSTRING   { input  });
{ Insert new data string into the list }
VAR Found      : BOOLEAN;
    Ord1, Diff : INTEGER;
    Tempo      : LSTRING;
    Node, LastPtr, ThisPtr : ListPtr;
BEGIN
   Ord1 := ORD(NewData[1]); { get ascii code of first char }
   IF Head = NIL THEN BEGIN { start a new list }
      new(Head);
      WITH Head^ DO BEGIN
         NextPtr := NIL;
         NextHi  := NIL;
         Key     := NewData
      END; { WITH }
   END
   ELSE BEGIN { expand list }
      new(Node);
      WITH Node^ DO BEGIN
         Key     := NewData;
         NextPtr := NIL;
         NextHi  := NIL
      END; { WITH }
      Search_Node(Head, Node^.Key, Found, LastPtr, ThisPtr);
      IF LastPtr = NIL THEN BEGIN { insert as new list head }
         Diff := ORD(Head^.Key[1]) - Ord1;
         IF ABS(Diff) > CDV THEN
            Node^.NextHi := Head
         ELSE
            Node^.NextHi := Head^.NextHi;
         Node^.NextPtr := Head;
         { END IF }
         Head := Node;
      END
      ELSE BEGIN { insert new data in the middle or at the tail }
         Diff := Ord1 - ORD(LastPtr^.Key[1]);
         IF Diff <= CDV THEN BEGIN
            { insert inside a clustered sublist }
            { LastPtr may be a high or low track node }
            Node^.NextPtr := LastPtr^.NextPtr;
            LastPtr^.NextPtr := Node
         END
         ELSE BEGIN
            IF ThisPtr <> NIL
            THEN BEGIN
               Diff := Ord1 - ORD(ThisPtr^.Key[1]);
               IF ABS(Diff) > CDV
               THEN BEGIN { insert between two high track nodes }
                  Node^.NextHi := LastPtr^.NextHi;
                  LastPtr^.NextHi := Node
               END
               ELSE BEGIN { swap names in the next high track node }
                  Tempo := Node^.Key;
                  Node^.Key := ThisPtr^.Key;
                  ThisPtr^.Key := Tempo;
                  { insert a new swapped first element }
                  { in clustered sublist }
                  Node^.NextPtr := ThisPtr^.NextPtr;
                  ThisPtr^.NextPtr := Node
               END; { IF }
            END
            ELSE BEGIN { insert as last high track node }
               Node^.NextHi := LastPtr^.NextHi;
               LastPtr^.NextHi := Node
            END; { IF }
         END; { IF }
      END; { IF LastPtr = NIL }
   END; { IF Head = NIL }
END; { Insert_List }

{--------------------------------------------------}

PROCEDURE List_to_Array(Head      : ListPtr;   { input  }
                        VAR Keys  : KeyArray;  { output }
                        VAR Count : INTEGER    { output });
{ Converts the list to an array containing sorted names }

{--------------------------------------------------}

   PROCEDURE Visit_Low_Node(VAR Node : ListPtr);
   { Local recursive routine to visit low tracks of a list }
   BEGIN
      IF (Node <> NIL) AND (Count < MAX_LIST) THEN BEGIN
         Count := Count + 1;
         Keys[Count] := Node^.Key;
         WRITE(' ', Keys[Count]:10);
         Visit_Low_Node(Node^.NextPtr);
      END
      ELSE WRITELN(' -]');
   END; { Visit_Low_Node }

{--------------------------------------------------}

   PROCEDURE Visit_Hi_Node(VAR Node : ListPtr);
   { Local recursive routine to visit high tracks of a list }
   BEGIN
      IF (Node <> NIL) AND (Count < MAX_LIST) THEN BEGIN
         Count := Count + 1;
         Keys[Count] := Node^.Key;
         WRITE(Keys[Count]:10);
         Visit_Low_Node(Node^.NextPtr);
         Visit_Hi_Node(Node^.NextHi);
      END;
   END; { Visit_Hi_Node }

{--------------------------------------------------}

BEGIN
   IF Head <> NIL THEN BEGIN
      Count := 0;
      Visit_Hi_Node(Head);
   END
   ELSE
      Count := 0;
   { END IF }
END; { List_to_Array }

{--------------------------------------------------}

BEGIN
   ClrScr;
   IF CDV < 0 THEN BEGIN
      WRITELN('Adjust Critical Difference Value Please');
      HALT
   END;
   Head := NIL;
   WRITELN('List of sorted capitals'); WRITELN;
   Insert_List(Head, 'Athens');    Insert_List(Head, 'London');
   Insert_List(Head, 'Bonn');      Insert_List(Head, 'Ankara');
   Insert_List(Head, 'Sau Paulo'); Insert_List(Head, 'Moscow');
   Insert_List(Head, 'Otawa');     Insert_List(Head, 'Tokyo');
   Insert_List(Head, 'Bern');      Insert_List(Head, 'Warsaw');
   Insert_List(Head, 'Cairo');     Insert_List(Head, 'Rome');
   Insert_List(Head, 'Madrid');    Insert_List(Head, 'Lisbon');
   Insert_List(Head, 'Paris');     Insert_List(Head, 'Baghdad');
   List_to_Array(Head, LesKeys, Count);
END.




COLUMNS 


C  CHEST 


The Ultimate Metronome:
Writing Interrupt Service Routines in C

by Allen Holub

The ostensible topic of this month's column is a fancy metronome
program capable of doing polyrhythms and even more complex rhythmic
patterns. As is often the case, however, the pieces are as interesting
as the whole. For example, in order to get adequate timing resolution,
I had to speed up the IBM PC's system clock and then supply my own
interrupt service routine, written largely in C, though there's a
little assembly language. The routines used for this purpose are
useful in other applications as well (such as a preemptive
multitasking scheduler that I'm also writing).

The metronome is pretty spiffy in its own right, though. I use it for
my own compositional work to create "click tracks" for electronic
pieces. (A click track is one track of a multitrack tape that holds
nothing but timing information. You use it as a reference as you add
additional tracks to the piece and then remove it from the final mix.)
The program lets you plot all the complex rhythmic patterns required
for a complete piece of music. (The maximum time varies; the program
can do up to 1,280 measures, and if these are measures of 6/8 at
metronome 120, that's 64 minutes.) Though I'm just ringing the bell on
the PC to mark beats, it wouldn't be too hard to modify this program
to use a MIDI drum machine or the like to make the sounds.

Metronomes and Time Signatures

When metronomes were introduced in the early 19th century (Beethoven
was the first major composer to use one), it finally became possible
for composers to present timing information accurately to performers.
To understand why this wasn't possible before, you have to look at how
a standard time signature works. Music, by nature, is cyclical, and a
measure is one complete cycle of the underlying rhythmic pattern. A
march, for example, has two beats in a measure (oom pah, oom pah), a
waltz has three (oom pah pah, oom pah pah), and rock and roll usually
has four-beat measures. The main purpose of the measure structure is
to tell the performer where to put emphasis. That is, the first note
of the measure (called the downbeat) is usually played a little more
forcefully than the other notes in the measure (OOM! pah pah).


Coupled with the measure are the individual notes. A whole note
requires an entire measure to play. If you think in terms of an organ,
you hold the key down for the entire measure. A half note requires
half a measure. There are two half notes in a whole note, two quarter
notes in a half note, two eighth notes in a quarter note, and so
forth. Each of these types of notes is represented by a slightly
different symbol in a musical score. You'll note (so to speak) that
this system is self-relative. There's no absolute durational
information in a particular note symbol; everything's relative to the
other notes.

A standard time signature gives you two pieces of information: the
number of beats in a measure and which of the various types of notes
in the score represents one beat. For example, the time signature 3/4
(pronounced "three four") has three beats in the measure and a quarter
note is one beat long. So a complete three-beat measure can be
composed of three quarter notes, six eighth notes, two quarter notes
and two eighth notes, and so forth. A time signature is not a
fraction; 6/8 is not the same thing as 3/4. The emphasis in the former
is on every sixth beat, whereas the emphasis in the latter is on every
third beat. There's no information in a time signature about the
actual duration of a beat in seconds. This duration is usually
detailed with written instructions, most often in Italian, that are
quite inexact (fast, slow, medium-fast, and so on).

The metronome changed all this. The mechanical metronome used by
Beethoven is still in use today. It consists of a spring-driven
pendulum with a weight on it. The position of the weight determines
the rate at which the pendulum swings, and the box ticks at the end of
each swing. Metronomes are calibrated in beats per minute, so a
metronome 60 is one beat per second and a metronome 120 is two beats
per second. The composer can now put an instruction at the top of the
score saying something such as "an eighth note is played at metronome
100."

      |<----- Measure 1 ----->|<----- Measure 2 ----->|
      | | | | | | | | | | | | | | | | | | | | | | | | |
3/4:  X       X       X       X       X       X       X
4/4:  X     X     X     X     X     X     X     X     X

Figure 1: A 3 against 4 polyrhythm

Traditional metronomes are fine if you're writing waltzes, but they
aren't really adequate for most modern music for three reasons. First,
some pieces have different time signatures on literally every measure,
and you obviously can't stop after every measure to adjust your
metronome. A second, related, problem is a piece with measures of
alternating time signatures (say, alternating measures of 5/8 and
6/8). Dave Brubeck and Steve Reich both use this technique
extensively. Finally, there's the problem of polyrhythms. A polyrhythm
is a rhythmic pattern in which two dissimilar time signatures are
superimposed. For example, in three against four, a three-beat pattern
is played in one hand and a four-beat pattern in the other. These come
together in at least one place (usually on the downbeat) of every
measure. This timing is illustrated in Figure 1, page 106. The
three-beat pattern beats on every fourth tick, the four-beat pattern
on every third tick. Polyrhythmic music is usually great dance music;
you hear it in Latin, Brazilian, and West African pop music and more
and more in contemporary Western music.

Click: A Four-Channel Programmable Metronome

The program I present this month, called click, is a programmable
metronome that solves all the problems I've just discussed. Its
invocation syntax is:

    click [-d] file

where file holds an input program. Click outputs two things: beats in
the form of sounds and a running count of the number of measures
played so far.

Click is programmed using a multitrack tape recorder model. Four
tracks (numbered 0 to 3) are supported, each of which can contain up
to 1,280 measures. Track 0 is always required; the others are
optional. The four tracks can be programmed independently, or they can
synchronize to each other at the beginning of each measure. Different
notes are used for the different tracks so that you can distinguish
them. Track 0 uses middle C, track 1 uses the C one octave up, track 2
uses the G above that, and track 3 uses the C above that. When there's
a conflict (a beat occurring at the same time on more than one track),
the note associated with the higher track is used. The optional -d
command-line switch forces all four tracks to use the same note (C5)
for their beats.

A short click program is shown in Example 1, page 112. The input
language works as follows. White space (spaces, tabs, blank lines, and
so on) is ignored except as is needed to separate commands. By the
same token, all lines that don't start with the word track and all
text on a line that follows a semicolon are ignored. You can use this
mechanism to comment your programs. The virtual tape is assembled
using track commands, which take the following form:

    track n: <measure> [, <measure>]

where n is a number in the range 0-3 that specifies a track on the
virtual tape. A <measure> is all text starting with a colon or comma,
up to the next comma or end of line. Several comma-delimited measures
can be placed on a single line. Tracks are accumulated by successively
adding the information in each track command to the indicated track.

A measure comprises one or more of the following commands, where n and
m are numbers, and the commands can occur in any order:

n/m: A command that starts with a digit represents either the time
signature or the number of beats in the measure. Here, n is the number
of beats in the current measure and /m is ignored. In fact, you can
omit the /m entirely if you like. A time signature is required in
every measure; I'll show you some examples in a moment. The maximum
number of beats in a measure (n) is 255.

@n: A number following an at sign (@) is used to set the metronome
count. For example, you'd use the following to play 100 beats at
metronome 120:

    track 0: 100 @120

One measure of 3/4, with the quarter note at metronome 120, is
represented as follows:

    track 0: 3/4 @120

Alternating measures of 5/8 and 6/8 at different metronome counts can
be done with the following:

    track 0: 6/8 @120, 5/8 @100
    track 0: 6/8 @120, 5/8 @100
    track 0: 6/8 @120, 5/8 @100

Click supports a resolution as low as one metronome count; a metronome
1 is one beat/minute. There is no effective maximum count because the
maximum is in the audio-frequency range.

The @ command is required for every measure on track 0. It's optional,
however, on the other tracks. If the @ command is omitted, the time
between beats is stretched out so that the measure takes up the same
time as the corresponding measure on track 0. For example, you can
take advantage of the higher note being used in a conflict to accent
the downbeat of a measure as follows:

    track 0: 3/4 @120, 3/4 @120
    track 1: 1/4     , 1/4

Here, each measure of track 1 takes the same amount of time to play as
the corresponding measure of track 0. You could do the same thing
with:

    track 0: 3/4 @120, 3/4 @120
    track 1: 1/4 @ 30, 1/4 @ 30

but then you'd have to do the math yourself. This synchronization
feature can also be used to do polyrhythms. A three against four
polyrhythm can be specified with:

    track 0: 3/4 @120, 3/4 @120
    track 1: 4/4     , 4/4

Four beats on track 1 are played in the same amount of time that it
takes to play three beats on track 0 (1 1/2 seconds).

#n: A # sign followed by a number is a repeat count. For example, you
could specify ten measures of 6/8 at metronome 100 with:

    track 0: #10 6/8 @100

Note that 10 of the 1,280 measures are being used here. This isn't
usually a problem, but if it is, the following command outputs the
same number of beats but uses only one measure:

    track 0: 60 @100

The earlier, two-measure-long, polyrhythm can be restated as:

    track 0: #2 3/4 @120
    track 1: #2 4/4

( ): A measure surrounded by parentheses is counted silently; no
sounds are output during that measure. This command is useful if you
need measure-long rests in your music. For example, three measures of
3/4, followed by three measures of a three against four polyrhythm,
followed by three more measures of 4/4, can be specified as follows:

    track 0: #3 3/4 @100, #3 3/4 @100, (#3 3/4 @100)
    track 1: (#3 4/4 ),   #3 4/4,      #3 4/4

You could do the same thing with:

    track 0: #3 3/4 @100, #3 3/4 @100
    track 1: (#3 4/4), #6 4/4


; Play four measures of 3/4, followed by four measures of a 3/4
; against 4/4 polyrhythm, followed by four measures of 3 against
; 4 against 5, followed by four measures of 3 against 4 against 5
; against 7. Sound a warning tone one measure before each change.

track 0: #4 3/4 @120w, #4 3/4 @120w, #4 3/4 @120w, #4 3/4 @120w
track 1: (#4 3/4), #4 4/4, #4 4/4, #4 4/4
track 2: (#4 3/4), (#4 5/4), #4 5/4, #4 5/4
track 3: (#4 3/4), (#4 7/4), (#4 7/4), #4 7/4

; Now go into Steve Reich mode. Play 20 measures of 4/4 at
; metronome 150 and at the same time play 20 measures of 4/4
; at metronome 151. The clicks start out on the same
; beat and they very gradually go out of phase and then
; come back into phase again.

track 0: #20 4/4 @150
track 1: #20 4/4 @151

Example 1: A click program

warning: The warning command causes a warning tone (consisting of two
high-pitched beeps) to be played rather than the normal downbeat of a
measure.

If the measure is created with a #n command, only the last measure of
the series is affected. For example, if you want to do four measures
of 3/4, followed by a change to 6/8 but with a warning just before the
change, you'd say:

    track 0: #4 3/4 @120 warning,
             #10 6/8 @100

The program plays three measures of 3/4 as usual; then it plays a
final measure of 3/4, but in this measure the warning tone is used for
the

Multiple-Statement Macros

Flotsam and Jetsam

This month's Flotsam and Jetsam discusses how to write macros that
have more than one statement in them. I've touched on some of this
stuff in previous columns, but it's worthwhile to get everything in
one place.

Two-statement macros such as:

    #define X( )  stmt1( ); stmt2( )

are not usually a good idea. This macro, when used in:

    if( condition )
        X( );
    else
        something( );

expands as:

    if( condition )
        stmt1( );
        stmt2( );
    else
        something( );

Adding curly braces does not solve the problem:

    #define X( )  {stmt1( ); stmt2( );}

when used in:

    if( condition )
        X( );
    else
        something( );

expands to:

    if( condition )
    {
        stmt1( );
        stmt2( );
    };
    else
        something( );

The semicolon is a legitimate statement in C, so the else tries to
bind to the preceding semicolon rather than the if.

One workable solution to the problem was used in the program I
discussed this month, in which I've used an if statement to create a
block. The general form is:

    #define X( ) \
        if(1)\
        {\
            stmt1( ); \
            stmt2( ); \
        }else

You can even declare variables after the open curly brace if you like
(see Listing Two, lines 50-61, page 82, for an example). These
variables shadow (take precedence over) any variables that might have
the same name in an outer block (they'll be different variables). The
trailing, but empty, else clause is required. Without it, the
following wouldn't work:

    if( condition )
        X( );
    else
        something( );

It would expand as:

    if( condition )
        if(1)
        {
        }
        else
            something( );

The extra else statement forces the else something( ) to bind
properly.

A disadvantage of this method is that an if statement is not an
expression. Consequently, you can't say y = X( ); because it would
evaluate to the illegal y = if(1) ... . A way around this problem is
to use the comma (or sequence) operator. The comma operator evaluates
left to right and evaluates to the rightmost thing in the list. So:

    #define X( ) \
        (\
            stmt1( ), \
            stmt2( ) \
        )

executes stmt1( ) first, then stmt2( ). The entire expression
evaluates to the return value of stmt2( ). Note that I'm using
parentheses, not braces, for grouping.

A disadvantage of the comma operator is that you can't declare
variables local to the macro, as you can with braces.

Don't confuse the comma operator with the comma that's used in a
subroutine call. They're different things. In order to get a comma
operator into a subroutine call, you have to surround the expression
with parentheses, as in foo( (a,b), c ). Here, the left comma is a
comma operator, and the one on the right is an argument-list
separator. This same caveat applies to the D( ) macro that I discussed
in the November 1986 Flotsam and Jetsam. The preprocessor accepts
D( printf("%d",x); ) but only because the comma is surrounded by
parentheses. The following won't work:

    D( printf("%d", )
    D( x); )

because the preprocessor intercepts the trailing comma and looks for a
second macro argument. Note that the comma used in a for statement,
such as for( i=0, j=9; i<j ;), is indeed a comma operator.

Dr.  Dobb's  Journal,  September  1987 

724 


113 


Dr.  Dobb's  Journal,  September  1987 


C  CHEST 

(continued  from  page  113) 


downbeat;  finally,  it  plays  ten  mea¬ 
sures  of  6/8.  That  is,  you  get  a  warn¬ 
ing  at  the  beginning  of  the  measure 
that  immediately  precedes  the  new 
time  signature. 

This  command  can  be  abbreviated 
to  w  or  W  if  you  like  (actually,  any 
word  that  starts  with  a  w  will  do). 
The  frequency  of  the  warning  tone  is 
not  affected  by  the  -d  command-line 
switch,  and  it  cannot  be  suppressed 
with  parentheses. 
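The two working macro forms from this month's Flotsam and Jetsam can be tried in a few lines. What follows is only a sketch, not code from the listings: stmt1( ), stmt2( ), X_BLOCK, X_EXPR, call_count( ), and the two demo functions are hypothetical names made up for the illustration.

```c
static int calls;                       /* how many statements have run */

static void stmt1(void) { ++calls; }
static int  stmt2(void) { ++calls; return 42; }

int call_count(void) { return calls; }

/* Block form: the trailing else absorbs the semicolon the caller writes. */
#define X_BLOCK()   \
    if (1)          \
    {               \
        stmt1();    \
        stmt2();    \
    } else

/* Expression form: the comma operator yields the value of stmt2(). */
#define X_EXPR()    ( stmt1(), stmt2() )

/* Returns 1 if the caller's else branch ran, 0 if the macro body ran. */
int demo_block(int condition)
{
    int took_else = 0;

    if (condition)
        X_BLOCK();
    else
        took_else = 1;

    return took_else;
}

/* The whole macro is an expression, so it can sit on the right of an =. */
int demo_expr(void)
{
    int y = X_EXPR();
    return y;
}
```

A modern compiler may warn about the nested if and the empty else in demo_block( ); that warning is the price of the block form, and the binding is still the one described above.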

Implementation 

Click is implemented in Listings One through Seven, pages 82-96. Listing
One, debug.h, is a general-purpose file that contains macros I use
regularly. The D( ) macro was discussed in the November 1986 C Chest (it's
also in Bound Volume 11, page 740). Its argument goes away when DEBUG isn't
defined. The PUBLIC and PRIVATE macros just declare a subroutine as static
or not. They're useful because you can easily change the storage classes of
all private subroutines to nonstatic when you're debugging. The UX( ) and
MS( ) macros work just like the D( ) macro does. Here the argument to MS( )
is used only if MS-DOS is defined, and the argument to UX( ) is used only
if MS-DOS isn't defined (UX stands for Unix).
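Macros of the kind just described might look something like the following. This is a guess at their shape, not a copy of Listing One. In particular, the MSDOS test symbol is an assumption about what the compiler predefines, and bump( ), next_id( ), and which_system( ) are hypothetical functions that exist only to exercise the macros.

```c
#include <stdio.h>

#ifdef DEBUG
#   define D(x)     x           /* keep debugging code                   */
#   define PRIVATE              /* nonstatic, so a debugger can see it   */
#else
#   define D(x)                 /* argument goes away                    */
#   define PRIVATE  static
#endif
#define PUBLIC

#ifdef MSDOS                    /* assumed name of the predefined symbol */
#   define MS(x)    x
#   define UX(x)
#else
#   define MS(x)
#   define UX(x)    x
#endif

/* A private helper: static unless DEBUG is defined. */
PRIVATE int bump(int n)
{
    D( printf("bump(%d)\n", n); )
    return n + 1;
}

PUBLIC int next_id(int n)
{
    return bump(n);
}

/* Returns 1 when built for MS-DOS, 2 otherwise. */
PUBLIC int which_system(void)
{
    int code = 0;

    MS( code = 1; )
    UX( code = 2; )
    return code;
}
```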

Listing Two presents hardware.h, a catchall file for IBM PC hardware
defines. These macros define various numbers used for talking to the
counter-timer chip in the PC and for turning the speaker on and off. The
process is described in detail in the books listed in the bibliography.
Timer channel 0 is used for the system clock interrupt, channel 1 triggers
a memory refresh and shouldn't be touched by your program, and channel 2 is
used to control the frequency of the PC's speaker.

The SETFRQ macro is interesting because it does more than one thing in a
single macro. It's described in depth in this month's Flotsam and Jetsam.
You pass it a desired frequency (in Hz), and it causes the bell to ring at
that frequency the next time the speaker is enabled (with a SPKR_ON( )
invocation). The frequencies of the 12 notes in the middle octave of an
equal-tempered chromatic scale (as on a piano) are defined in notes.h
(Listing Three); C4 is middle C on a piano. Go up an octave by multiplying
by 2, down by dividing. Go up a half step (from C to C#, for example) by
multiplying a note by TWELFTH_ROOT_OF_TWO.
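The arithmetic behind a frequency-setting macro and a table of notes can be sketched in portable C. This is not the code from Listings Two and Three. The 1,193,180 Hz figure is the commonly quoted input clock of the PC's timer chip; the value given for C4 and the function names timer_count( ) and half_steps( ) are assumptions made for the example.

```c
#define PIT_CLOCK_HZ         1193180L            /* timer chip input clock, about 1.19318 MHz */
#define TWELFTH_ROOT_OF_TWO  1.0594630943592953  /* 2 raised to the power 1/12                */
#define C4                   261.63              /* middle C in Hz (assumed value)            */

/* 16-bit count to load into the speaker's timer channel for a frequency. */
unsigned timer_count(double freq_hz)
{
    return (unsigned)(PIT_CLOCK_HZ / freq_hz);
}

/* Frequency n half steps above (n > 0) or below (n < 0) a note. */
double half_steps(double freq_hz, int n)
{
    while (n > 0) { freq_hz *= TWELFTH_ROOT_OF_TWO; --n; }
    while (n < 0) { freq_hz /= TWELFTH_ROOT_OF_TWO; ++n; }
    return freq_hz;
}
```

Twelve half steps multiply the frequency by 2, which is why the octave rule and the half-step rule agree: half_steps(C4, 12) comes out at twice C4, and half_steps(C4, 7) lands on the G above middle C at about 392 Hz.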

The beep(freq, duration) subroutine (Listing Four) uses all this stuff to
ring the bell on the PC at a specified frequency for an indicated number
of seconds. Both of these numbers can be fractional. For example:

    #include <notes.h>
    beep( G4, 0.5 );

plays the G above middle C for a half second.



The note duration is determined by the argument sent to the delay( )
subroutine, called on line 18 and declared in Listing Five. The ROM BIOS
keeps a running count of clock ticks, accessible via function 0 of
interrupt 0x1a. A count of 0 is midnight on the day that the machine was
started, and the interrupt returns with AL set to nonzero when you roll
past midnight. Delay( ) just waits in a tight loop for the correct number
of ticks to pass.
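The bookkeeping behind such a delay loop can be sketched without touching the BIOS. The code below is an illustration under stated assumptions, not Listing Five: read_ticks is a hypothetical stand-in for the interrupt 0x1a call, and the names seconds_to_ticks( ), ticks_elapsed( ), and delay_ticks( ) are invented here. The ticks-per-day constant (0x1800B0) is the value at which the BIOS count wraps at midnight.

```c
#define TICKS_PER_SEC   18.2        /* default tick rate, more or less         */
#define TICKS_PER_DAY   1573040L    /* 0x1800B0: the count wraps here          */

/* Convert a (possibly fractional) duration in seconds to whole ticks. */
long seconds_to_ticks(double seconds)
{
    return (long)(seconds * TICKS_PER_SEC + 0.5);
}

/* Ticks elapsed between two readings, allowing for one midnight wrap. */
long ticks_elapsed(long start, long now)
{
    return (now >= start) ? now - start
                          : now + TICKS_PER_DAY - start;
}

/* Busy-wait on any tick source; read_ticks stands in for the BIOS call. */
void delay_ticks(long (*read_ticks)(void), long ticks)
{
    long start = read_ticks();

    while (ticks_elapsed(start, read_ticks()) < ticks)
        ;                           /* spin until enough ticks have passed */
}
```

Note how coarse the result is: a half-second delay is 9 ticks, and the next available step is 10.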

Hitherto I've looked at pretty straightforward code, described in most
books on DOS programming. With Listing Six, things get more interesting.
The IBM PC system clock ticks 18.2 times per second (more or less). This
resolution isn't adequate for musical applications: for example, click
couldn't distinguish between metronome 150 and 152 at the default
resolution. Moreover, sitting in a tight loop waiting for the clock to tick
isn't a good way to do things; it's too easy to miss a tick.

These problems are solved in two ways. First, you have to speed up the
system clock in such a way that you don't mess up the real DOS clock.
Second, you have to provide the interrupt service routine to take care of
the faster clock tick. If you speed up the clock by a factor of 4, for
example, your interrupt service routine is activated on every tick and your
routine jumps to the default service routine on every fourth tick. As a
final convenience, you want to be able to write your own interrupt service
routine in C, without having to go to assembly language.

All of this is done by the routines in Listing Six. The speedup( )
subroutine is called with:

    speedup( factor, funct )
    int factor;
    int (*funct)( );

Factor is the speedup factor. Set it to 4 for a fourfold increase in the
clock speed. The funct argument is a pointer to an interrupt service
routine that is called on every clock tick. I'll look at one of these
shortly.

Speedup( ) begins on line 93 of Listing Six. The first thing it does is
remember the funct and factor arguments in two local variables: service and
tick_reset (declared on lines 41 and 45). It also remembers the value of
the current data segment (on line 103). You need to do this because it's
convenient for the C interrupt service routine to access static data and
global variables. Because the timer interrupt can come at any time, you
have no idea what number is in the DS register when the interrupt is
serviced; odds are it is incorrect (it will be the DS associated with the
interrupted process). By remembering the DS register now, you can restore
it later. Note that all the variables used by the interrupt service routine
are being put into the _TEXT segment, which is normally used only for code.
I'm doing this because the contents of the CS register must be valid when
the service routine is entered (or you couldn't be there). So, you can
always get at a variable in the _TEXT segment by using a _TEXT: or CS:
segment override. Then you can retrieve the previously stored DS contents
from a variable in the code segment.

Lines 108 to 124 of Listing Six speed up the clock hardware. Note that you
have to test explicitly for a factor of 1 on lines 110 and 111 to avoid an
arithmetic overflow exception. In this case, you want to load the counter
with the default count of 0, which yields 64K counts between interrupts
(that's the default 18.2 ticks per second). A speedup( 1, ... ) call is
useful if you don't want to change the tick rate but do want to install
your own service routine. Finally, the routine gets the old timer interrupt
(8) vector, using DOS function 0x35 on line 133; saves it in
oldseg:old_off; and then installs a new interrupt using DOS function 0x25
(on lines 133-140). You have to push the DS register (on line 136) because
function 0x25 is passed the new vector in DS:DX. It's restored on line 140.
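The special case for a factor of 1 is easier to see when the count is computed in C. The sketch below is not the code on lines 108 to 124; timer0_count( ) is a hypothetical name, and the explanation in the comment (that 65,536 will not fit in a 16-bit result) is my reading of where the overflow comes from.

```c
/* Count to load into timer channel 0 for a given speedup factor.
 * The hardware treats a count of 0 as 65,536, which is the default rate. */
unsigned short timer0_count(unsigned factor)
{
    if (factor <= 1)
        return 0;                       /* 65536 / 1 won't fit in 16 bits */

    return (unsigned short)(65536L / factor);
}
```

A factor of 4 gives a count of 16,384 and so about 72.8 interrupts per second; a factor of 16 gives 4,096 and about 291 per second.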

You'll note that I didn't actually install the user-supplied service
routine in this last step; rather, I installed a pointer to the serv
subroutine (lines 156-207). This routine sets up the machine environment so
that the C routine can be called safely, and only then does it call the C
function. Serv starts out by setting up a new stack (on lines 160-164). A
small, 128-byte stack (declared on line 52) is used for this purpose. Then
it pushes all the registers onto the new stack (lines 166-173). The old
data segment is restored next so that the C function can get at global and
local-static variables. Finally, the C routine is called (on line 179)
indirectly through the pointer I set up earlier (on line 102).

When the C routine returns, the service routine decides whether or not to
call the default timer interrupt. A running count (numticks) is decremented
on each call. When it reaches 0, you reset it to tick_reset (the first
argument to speedup( )) and then jump to the old service routine (on line
205), again indirectly through a pointer. If numticks hasn't gone to 0, you
send a nonspecific EOI (end of interrupt) to the hardware (on line 197) and
then do an iret to terminate the interrupt.
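The decision made after the C routine returns can be modeled in ordinary C. This is a simulation of the logic only, not the assembly-language serv routine: there is no real interrupt, old_handler stands in for the saved vector, and install( ), fast_tick( ), simulate( ), and the two counting functions are hypothetical names.

```c
typedef void (*handler_t)(void);

static handler_t service;       /* user routine, run on every fast tick    */
static handler_t old_handler;   /* stands in for the original timer vector */
static int tick_reset;          /* the speedup factor                      */
static int numticks;            /* fast ticks left before chaining         */

static int user_calls, old_calls;
static void count_user(void) { ++user_calls; }
static void count_old(void)  { ++old_calls;  }

void install(int factor, handler_t user, handler_t old)
{
    service     = user;
    old_handler = old;
    tick_reset  = factor;
    numticks    = factor;
}

/* One fast tick. Returns 1 if the old handler was chained to, and 0
 * where the real routine would send an EOI and do an iret instead. */
int fast_tick(void)
{
    service();
    if (--numticks > 0)
        return 0;

    numticks = tick_reset;
    old_handler();
    return 1;
}

/* Run a number of fast ticks; report how often the old handler ran. */
int simulate(int factor, int ticks)
{
    int chained = 0;

    install(factor, count_user, count_old);
    while (ticks-- > 0)
        chained += fast_tick();
    return chained;
}
```

With a factor of 4, eight fast ticks reach the old handler twice, so the DOS clock still sees its usual rate.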

The remainder of Listing Six is straightforward. Slowdown( ), defined on
lines 213-247, just slows the clock down to its original rate and
uninstalls the custom service routine. Slowdown( ) must be called by your
program before it terminates or the machine will go into outer space the
next time a timer interrupt occurs. The cli( ) and sti( ) routines on lines
70-78 just disable and enable interrupts from C. They're useful when trying
to avoid various communication problems between the interrupt service
routine and the rest of the program. You'll see how in a moment.

Now that I've laid the groundwork, I can actually discuss the metronome
program. Click.c is presented in Listing Seven. A few of the #defines at
the top of the listing are interesting. ROUND(y) takes as input a
floating-point number (either float or double) and converts it to an int.
It rounds to the nearest integer value, however, rather than just
truncating (the default behavior of the Microsoft compiler). The various
definitions needed to figure the clock rates are on lines 42-46. FACTOR is
the speedup( ) factor. If the clock runs at less than 16 times the default
tick rate, the program won't be able to resolve the timing by a single
metronome count; it won't be able to differentiate between metronome 150
and 151, for example. DEFAULT_TICK is the default 18.2 times/second used by
the system clock. ONE_TICK is the number of times that the clock ticks in a
second, given the speedup factor defined earlier. Finally, TICKS(y)
converts a metronome setting y into a number of clock ticks.
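Definitions along these lines might look like the sketch below. It is a guess at their shape, not a copy of lines 42-46 of Listing Seven: the value chosen for FACTOR, the exact form of each macro, and the helper ticks_per_beat( ) are all assumptions made for illustration.

```c
#define ROUND(x)        ((int)((x) + 0.5))              /* nearest int, positive x only   */
#define FACTOR          16                              /* assumed speedup factor         */
#define DEFAULT_TICK    18.2                            /* default ticks per second       */
#define ONE_TICK        (DEFAULT_TICK * FACTOR)         /* ticks per second after speedup */
#define TICKS(bpm)      ROUND(ONE_TICK * 60.0 / (bpm))  /* clock ticks per beat           */

/* Ticks per beat at an arbitrary speedup, for comparing resolutions. */
int ticks_per_beat(double bpm, int factor)
{
    return ROUND(DEFAULT_TICK * factor * 60.0 / bpm);
}
```

At the default rate, metronome 150 and metronome 152 both work out to 7 ticks per beat, so they sound identical. At 16 times the default rate they work out to 116 and 115 ticks.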

The virtual tape is defined with a system of typedefs on lines 60-78. Each
measure is represented by the MEASURE structure defined on lines 63-72.
Num_beats is the number of beats in the measure, and ticks_per_beat is the
number of timer ticks between every beat. Num_beats is decremented on every
beat, and when it reaches 0, the program goes to the next measure. When
you're synchronizing with track 0 in a polyrhythmic mode, however,
num_beats * ticks_per_beat is not necessarily the exact number of clock
ticks in the corresponding measure. The remainder field makes up the
difference. The extra ticks represented by remainder are spread over the
first few beats of the measure to make up any discrepancy. This way the
downbeats of synchronized measures will always coincide. Cur_tick is the
current clock tick. It is initially set to ticks_per_beat and is
decremented on every timer interrupt. When it reaches 0, a tone is output
and the variable is reinitialized to ticks_per_beat. Num_beats is also
decremented at this point. The rest of the structure is just housekeeping:
silent is set when this measure is silent; warning is set when a warning
pulse is to be used on the downbeat of each measure. All these fields are
used (and modified) by the interrupt service routine on each clock tick.

A TRACK (on line 75) is an array of MEASURE structures, and the Tape (line
77) is an array of four TRACKS. The Measure array is an array of pointers,
one for each track. These pointers point to the current measure on each
track. They are incremented by the interrupt service routine every time it
rolls over to a new measure.

The next set of global variables (lines 82-102) are used by the interrupt
service routine to pass information to the main body of the program.
Ring_bell is set when the bell should be sounded. It is set to one more
than the track number or to the special value WARNING, defined on line 56,
if the warning tone should be sounded. Collision is set true when there's a
collision between two tracks (a beat occurs on both tracks at the same
time). Downbeat is incremented on the downbeat of every measure on track 0;
it's used to print the measure number to the screen. Done is set to true
when all tracks are exhausted, and Numticks is incremented on each timer
interrupt (it's useful primarily for debugging).

By far the largest subroutine in the program is build_tracks( ) on lines
219-358, which parses the input file and initializes the Tape[ ] array. In
spite of its length, it's pretty straightforward and requires little
additional comment here.

More important is the interrupt service routine itself, timr_intr( ) on
lines 362-427. Though the routine is just a C subroutine, there are several
issues that must be kept in mind while writing it. First, MS-DOS is not
reentrant. In practice, this failing means that you can't call most DOS
functions from an interrupt service routine (because the program might be
in DOS when it was interrupted). The second issue is return values. Because
the service routine isn't called in the normal way, it can't be passed
parameters and it can't return a value in the normal way. Global variables
must be used for this purpose. The remaining problems are stack-related.
The interrupt service routine uses its own stack, so you have to disable
the default stack-overflow checking that's inserted by the compiler (which
will almost always fail because the stack isn't where it's supposed to be).
This disabling is often done with a compiler command-line switch, but the
Microsoft compiler lets you do it with the #pragma check_stack directives
(on lines 362 and 427). The #pragma lets you disable the checking for one
subroutine only, instead of affecting the entire module. A trailing minus
sign disables checking and a plus sign enables it. The final stack issue is
its size. The service routine uses a 128-byte stack, of which 18 bytes are
used to save registers and do the subroutine call. You have to be careful
not only about the amount of stack used by your own local variables but
also the amount of stack used by subroutines that the service routine might
call. If you need more stack, change the declaration for stack on line 52
of Listing Six.

The service routine itself modifies the global variables described earlier.
The for loop is executed four times, one iteration for each track. The test
on line 377 is for the end of track. If the beat count is 0, there are no
more measures on this track. The next test (on line 379) checks to see if
the bell should sound (this is the case if the current count is the
maximum). The current tick is then decremented on line 402, and if it goes
to 0, the number of beats is also decremented (on line 404). The routine
advances to the next measure, if necessary, on line 409. The else clause on
lines 411-421 takes care of the remainder: the extra clock ticks that have
to be inserted to keep synchronized with track 0. I add 1 to the current
tick count on line 418 to do this. That is, I stretch the current beat out
by one clock tick. This way the extra ticks are spread over the first few
beats in the measure instead of being piled in one place.
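The way the remainder is spread over the first few beats can be seen in a small model. The code below is my own simplified variant of the per-tick bookkeeping, not timr_intr( ) from Listing Seven: the MEASURE fields are cut down to the four that matter here, a beat is taken to start when cur_tick is 0, and measure_tick( ) and ticks_used( ) are hypothetical names.

```c
typedef struct
{
    int num_beats;          /* beats left in this measure             */
    int ticks_per_beat;     /* timer ticks between beats              */
    int remainder;          /* extra ticks still to be handed out     */
    int cur_tick;           /* ticks left in the current beat         */
} MEASURE;

/* Process one timer tick. Returns 1 if a beat starts on this tick. */
int measure_tick(MEASURE *m)
{
    int beat = 0;

    if (m->num_beats <= 0)
        return 0;                       /* measure is finished */

    if (m->cur_tick == 0)               /* start of a beat */
    {
        beat = 1;
        m->cur_tick = m->ticks_per_beat;
        if (m->remainder > 0)           /* stretch this beat by one tick */
        {
            ++m->cur_tick;
            --m->remainder;
        }
    }

    if (--m->cur_tick == 0)
        --m->num_beats;

    return beat;
}

/* Total ticks a measure occupies, found by running it to completion. */
int ticks_used(int num_beats, int ticks_per_beat, int remainder)
{
    MEASURE m;
    int     n = 0;

    m.num_beats      = num_beats;
    m.ticks_per_beat = ticks_per_beat;
    m.remainder      = remainder;
    m.cur_tick       = 0;

    while (m.num_beats > 0)
    {
        measure_tick(&m);
        ++n;
    }
    return n;
}
```

A three-beat measure at 4 ticks per beat with a remainder of 2 occupies 14 ticks: the first two beats last 5 ticks each and the third lasts 4, so the next downbeat arrives exactly where track 0 expects it.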

The main( ) subroutine has to do several things to get the ball rolling.
First, it must call signal( ) to guarantee that slowdown( ) is called if
you abort the program with Ctrl-Break. Signal( ) is called on line 495, and
it installs the on_break( ) subroutine (lines 431-442) to call slowdown( ).
Speedup( ) is called in the initialization part of the for loop on line
497. The loop terminates when the interrupt service routine says that it's
finished by setting the Done flag. The code in the body of the loop just
tests to see if it should ring the bell and rings it if necessary. I'm
doing this here rather than in the service routine because it's easier. The
problem is the delay between turning the bell on and then off again. You
can't wait in the service routine itself because the timing would be thrown
off. By waiting here, the wait time is essentially independent from the
start-of-beat timing. Of course, you could turn the bell on in the service
routine and then set a flag, decrementing the flag on each timer tick and
turning the bell off when the flag gets to 0. This method seems a lot of
bother, however, so I took the easy way out. The final point worth
mentioning is that interrupts have to be disabled while you do the tests
because there are two variables involved and you don't want one of these to
magically change its value halfway through the test. Disabling interrupts
prevents this change.
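The reason for disabling interrupts around the test can be shown with a portable model. This is a sketch, not the loop from main( ): the cli( ) and sti( ) here are stand-ins that only set a flag (the real ones need assembly language), fake_interrupt( ) imitates the service routine, and poll_bell( ) is a hypothetical name for the test done in the loop body.

```c
static volatile int Ring_bell;  /* set by the service routine: track + 1 */
static volatile int Collision;  /* set when two tracks beat together     */

/* Stand-ins for the real cli( ) and sti( ). */
static int ints_off;
static void cli(void) { ints_off = 1; }
static void sti(void) { ints_off = 0; }

/* What a (simulated) timer interrupt would do to the two flags. */
void fake_interrupt(int track, int collided)
{
    if (ints_off)
        return;                 /* interrupts disabled: nothing arrives */

    Ring_bell = track + 1;
    Collision = collided;
}

/* Take a consistent snapshot of both flags, then clear them.
 * Returns the Ring_bell value; the collision flag comes back via pointer. */
int poll_bell(int *collision)
{
    int bell;

    cli();                      /* neither flag can change from here... */
    bell       = Ring_bell;
    *collision = Collision;
    Ring_bell  = 0;
    Collision  = 0;
    sti();                      /* ...to here */

    return bell;
}
```

Without the cli( ) and sti( ) pair, an interrupt arriving between the two reads could leave the caller with a Ring_bell value from one beat and a Collision value from the next.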

Availability 

All the source code for articles in this issue is available on a single
disk. To order, send $14.95 to Dr. Dobb's Journal, 501 Galveston Dr.,
Redwood City, CA 94063 or call (415) 336-3600, ext. 216. Please specify
issue number and format (MS-DOS, Macintosh, Kaypro).

Because the code this month is pretty compiler dependent, I'm distributing
executable versions (along with the full source code) through Software
Engineering Consultants, P.O. Box 5679, Berkeley, CA 94705. This version
has been enhanced to allow several different kinds of warning tones and to
allow duration to be specified in minutes and seconds as well as beats. The
cost is $20.

Bibliography 

Brickner, Ralph G. "An Execution Profiler for the PC." PC Tech Journal 4:11
(November 1986): 120-142. This article describes another program that
steals the system timer interrupt; the code of interest is in Listing 2,
pages 140-142.

IBM Technical Reference. The BIOS listing contains the default timer
interrupt service routine. It's on page 5-162 of the AT technical reference
and on page A-79 of the XT reference. The routine is called TIMER_INT in
both listings.

Norton, Peter. The Peter Norton Programmer's Guide to the IBM PC. Bellevue,
Wash.: Microsoft Press, 1985. This book contains information on how the
bell on the IBM PC works as well as information on the other DOS interrupts
used by click. The information is duplicated in innumerable other books
about programming the IBM PC.

DDJ 

(Listings  begin  on  page  82.) 


COLUMNS 


STRUCTURED  PROGRAMMING 


V.I.P., Clustered Binary Trees, and Clustered List Data Structures

by Namir Clement Shammas

This month I discuss the Visual Interactive Programming (V.I.P.) language,
a new icon-based interpreter for the Apple Macintosh computer. My second
topic conforms with this issue's theme, algorithms, by presenting modified
structures for binary trees and linked lists.

The Macintosh has been blessed with numerous language interpreters and
compilers: BASIC, FORTRAN, Pascal, Modula-2, C, LISP, PROLOG, Forth, and so
on. Now, Mainstay of Agoura Hills, California, has developed a new language
that truly takes advantage of icons and the Macintosh user interface.
V.I.P. is an interpreter that breathes life into a flowchart: instead of
typing text for the source code, you can assemble a program using special
flowchartlike symbols.

V.I.P. also incorporates a programming environment. Its appearance
resembles MacPaint: a menu bar across the top; a flowchart viewing port;
and to the left of this port, three groups of icons. These are object
(data-type) icons, icons for loops and decision-making constructs, and
icons for several classes of predefined routines. To build a program you
point to an icon with the mouse and click. The environment's response takes
one of two forms: a window or a flowchart icon. The window form is fairly
typical of the Macintosh and involves a more sophisticated level of
interaction with the user. The icon form lets you fully define the
flowchart symbol and represents a simpler level of interaction. Each
flowchart symbol has a comment line at the bottom.

The product uses the same geometric shape (namely, a horizontal rectangular
strip) for all the icons. At the two edges of the rectangular icon are two
squares: one to open the icon, the other to close it. Figure 1, page 123,
shows a sketch of an opened FOR-NEXT icon. The rectangle containing the for
keyword is the permanently visible part of the icon. The two columns below
it are for input or inspection. The left column of rectangles labels the
information required and encloses the object type in parentheses. V.I.P.
uses lowercase and uppercase letters for the object-type codes to indicate
input and output, respectively. The last row is reserved for comments.

V.I.P. supports a fixed set of object (data and constant) types. They are:

• BYTE, which uses 1 byte of storage. Short integers (-128 to +127) or
single characters can use this type.

• INTEGER, which uses 4 bytes to accommodate long integers (between minus
and plus 2 billion).

• REAL, which occupies 10 bytes with a 64-bit mantissa and 15-bit exponent.
Thus, floating-point numbers with 19 significant figures are supported with
an exponent varying from -4,932 to +4,932.

• POINT, which requires 4 bytes of storage to represent a vertical and
horizontal set of coordinates. (The range of values for each axis is
-32,768 to +32,767.) This is equivalent to a predefined record structure in
a structured language such as Pascal or C.

• RECTANGLE, which uses 8 bytes to represent four integers that define the
upper-left and lower-right corners of a rectangle.

• CONSTANT, which is used to implement symbolic constants. V.I.P. supports
four types of constants: character, integer, real, and string. The true,
false, e, and pi constants are predefined.
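For readers who think in C, the POINT and RECTANGLE types correspond to small records like the ones sketched below. This only illustrates the sizes given in the list; the type names, the field names, and the two helper functions are hypothetical and are not taken from V.I.P.

```c
#include <stdint.h>

/* A vertical and a horizontal coordinate, 2 bytes each: 4 bytes in all. */
typedef struct
{
    int16_t v;
    int16_t h;
} Point;

/* Upper-left and lower-right corners as four 2-byte integers: 8 bytes. */
typedef struct
{
    int16_t top;
    int16_t left;
    int16_t bottom;
    int16_t right;
} Rect;

int rect_width(const Rect *r)  { return r->right  - r->left; }
int rect_height(const Rect *r) { return r->bottom - r->top;  }
```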

Arrays of up to three dimensions are supported, with the lower array bound
fixed as 1. When you select a data-type icon, a window directory opens to
display a list of all the variables of the selected type. The window lets
you choose between global and local variables. You can also insert new
variables, delete or change existing definitions, rename variables (V.I.P.
is case sensitive), convert scalar ones into arrays and vice versa, and
alter array sizes and the number of dimensions.

V.I.P. offers two decision-making constructs: IF and CASE statements. The
IF statement comes in the form of the IF-THEN-ELSE icon, with two flowchart
branches. You can simulate an ELSEIF by using nested IF statements. The IF
icon is defined by entering a logical expression. The CASE statement
supports up to 30 cases. When you choose the CASE icon, you are prompted
for the number of cases to create an icon with the correct number of
alternatives. You enter the selector (switch) variable and the values
associated with each CASE clause. V.I.P. has no CASE ELSE clause.

The language supports two loop constructs: FOR ... NEXT and WHILE ... DO.
When you use a FOR loop, you must specify the control variable as well as
its initial, final, and step values. To insert a WHILE loop icon, you need
to specify the logical expression used to iterate the loop.

V.I.P. lets you divide your task into smaller routines, a feature useful
for maintaining a clear set of flowcharts instead of using one big
flowchart. You can choose to invoke a routine's option from the top menu
bar. This opens a new window in which you declare all your routines. You
can easily switch between the main routine and any other routine to edit or
inspect any program components. Each routine has its own workspace. When
you select a routine for the first time, you start with a clear flowchart.

User-defined routines in V.I.P. can take parameters. A special window for
parameters opens when you select the "Set argument ..." option. The
parameter window resembles the one for declaring variables, with a few
differences. First, you must indicate the type and the input/output status.
V.I.P. does not allow the same parameter to pass data back and forth
between the routine and its caller. You can define the parameters as scalar
or arrays (with up to three dimensions). Parameters can be inserted,
deleted, or altered. V.I.P. maintains control over parameter passing and
verifies that the arguments in the calls correspond to the parameters'
type, number, and sequence.

V.I.P. has a versatile set of intrinsic functions. Mathematical functions
include square root, logarithmic, trigonometric (and their inverses), and
hyperbolic (and their inverses). Other mathematical functions include the
absolute value, sign, modulus, minimum, maximum, and random-number
generator. Another collection of functions returns the vertical and
horizontal coordinates of a point and the four coordinates used to define a
rectangle.

The software comes with a powerful group of predefined routines. Each class
of routines is represented by an icon located on the left-hand side of the
flowchart viewing port. There are 15 classes of routines, listed along with
their functions in Table 1, right.

The V.I.P. editor enables you to perform search/replace operations on the
contents of the flowchart icons and also lets you cut, copy, paste, and
delete icons. You can view the flowchart using a smaller scale that can be
magnified by pressing the shift key and the mouse button simultaneously. In
this mode, however, the screen displays uncommented icons, which I find
annoying.

Finally, V.I.P. includes a versatile debugger that lets you single-step
through your flowchart icons, set and remove breakpoints, and set/examine
objects.

I ran some of the V.I.P. demonstration programs, which illustrate how easy
it is to write programs that use the Macintosh interface. I also wrote two
versions of the Sieve benchmark program. The first (see the text version in
Example 1, page 124) is written in a typical form; that is, no local
routines are used. One iteration of this program took 3 minutes and 35
seconds. Example 2, page 125, shows the text of the second version, which
uses two subroutines: init and body.

To  the  best  of  my  knowledge,  V.I.P. 


1. Assignment: includes simple assignment and filling/copying bytes.

2.  Mathematics:  includes  integer,  fraction,  power,  simple  financial  calculations,  and 
sorting  numbers. 

3. String manipulation: to carry out typical related operations (string concatenation,
comparison, append, copying, and conversions with numbers). Sorting an array
of strings is also included.

4.  Graphics:  a  large  set  of  routines  that  lets  you  obtain  the  best  of  the  Macintosh 
graphics. 

5.  Event  trapping:  to  inspect  the  status  of  the  mouse. 

6.  Menu  management:  to  create,  enable,  disable,  remove,  and  load  menus,  to  name  a 
few. 

7.  Window  management:  to  create,  set  up,  load,  remove,  and  activate  windows. 

8.  Text  editing:  to  perform  text  editing  functions,  such  as  copying,  pasting,  cutting, 
clearing,  and  inserting  text.  Text  file  I/O  operations  are  also  included. 

9.  Dialog  management:  to  permit  your  programs  to  display  Macintosh  dialog  windows. 

10.  Sound  effects:  to  play  tune,  set  voice,  set  notes,  and  turn  sound  on/off. 

11. Record management: to support dynamic record allocation, copying of records,
and  field  I/O. 

12.  I/O  operations:  a  set  of  I/O  routines  for  file  manipulation  (open,  close,  get  file 
position,  and  so  on)  and  I/O  of  objects,  records,  text,  and  pictures. 

13.  Printing:  to  set  up  the  printed  page  and  print  text  and  pictures. 

14.  Branching:  to  exit  from  loops  and  execute  another  program. 

15. Date and time management: to wait, read the clock, and obtain the time and the
date. 


Table 1: V.I.P.'s classes of routines and their functions


for
    control var (N)    I
    initial (n)        1
    boundary (n)       7001
    increment (n)      1
    Main loop

Figure  1:  Sketch  of  an  opened  FOR-NEXT  icon 


Dr.  Dobb's  Journal,  September  1987 



STRUCTURED  PROGRAMMING 

(continued  from  page  123) 


is  the  first  commercially  distributed 
program  of  its  kind  for  microcom¬ 
puters.  It  offers  several  interesting 
programming  features — in  particu¬ 
lar,  visual  programming,  which  is 
excellent  for  teaching  because  stu¬ 
dents  don't  look  at  dummy  flow¬ 
charts  but  at  active  ones.  Visual  pro¬ 
gramming  may  also  prove  valuable 
for  algorithm  design.  In  addition, 
V.I.P.'s library of routines makes it a
very capable language. I applaud
Mainstay for its efforts and creativity
in developing V.I.P. and look forward
to more powerful versions. Mainstay
has also announced that it will be
releasing V.I.P. translators for Pascal
and C.


I  have  praised  both  V.I.P.  and  the 
concept  of  visual  programming;  now 
let  me  mention  a  few  shortcomings. 
First,  programmers  who  type  quick¬ 
ly  and  use  keyboard  macros  may 
find  software  development  in  visual 
programming  languages  slow.  Sec¬ 
ond,  V.I.P.  does  not  provide  an  escape 
mechanism  once  a  flowchart  icon  is 
opened. Third, viewing large flowcharts
is cumbersome. Fourth, programmers
have to inspect V.I.P.'s
flowchart icons in order to view
their  contents.  And  finally,  V.I.P. 
makes  no  provision  for  user-defined 
objects. 

Clustered  Binary  Trees 

Binary  trees  are  useful  data  struc¬ 
tures  for  internal  sorting  and  search¬ 
ing.  Yet,  despite  all  the  praise  binary 


trees  receive,  they  are  vulnerable  to 
becoming  unbalanced.  The  ultimate 
nightmare  is  to  input  a  perfectly  sort¬ 
ed  array  into  a  binary  tree  and  end 
up  with  one  long  linked  list!  Two  de¬ 
cades  ago,  two  Russian  mathemati¬ 
cians  devised  a  few  algorithms  to 
maintain  a  binary  tree  in  a  near-per¬ 
fect  balanced  state.  This  type  of  bina¬ 
ry  tree  is  known  as  an  AVL  tree.  Yet 
both  the  basic  binary  tree  and  the 
AVL  tree  typically  use  less-than  com¬ 
parisons  to  insert  a  new  node  or  to 
search  for  one — they  take  no  advan¬ 
tage  of  the  difference  between  the 
values  of  a  resident  node  and  an  in¬ 
coming  data  item.  With  binary  and 
AVL  trees,  you  compare  the  value  of  a 
data  item  with  that  of  a  node.  If  it  is 
less  or  equal,  you  use  the  left  node 
pointer;  otherwise,  you  use  the  right 
node  pointer. 

I  suggest  the  following  modifica¬ 
tion  to  the  binary  tree.  In  the  new 
tree,  each  node  has  two  left  pointers 
and  two  right  pointers,  and  so  I  call  it 
the  clustered  binary  tree  (CB  tree  for 
short)  and  name  the  pointers  low 
left,  high  left,  low  right,  and  high 
right.  Why  duplicate  the  pointers  for 
each  side?  The  answer  lies  in  being 
able  to  select  the  proper  pointer  to 
follow,  based  on  the  difference  be¬ 
tween  the  values  of  the  node  key  and 
the  incoming  data  item.  A  critical  dif¬ 
ference  value  (CDV)  is  preassigned 
based  on  the  nature  of  the  data. 
With numeric keys, both the choice
of CDV and the algorithm itself are
easy to implement. If the magnitude
of the calculated difference is greater
than the CDV, the high-left or
high-right pointer is used,
depending on the sign of the difference;
otherwise, the corresponding low
pointer is used.
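
The pointer selection just described can be sketched in a few lines. This is a minimal illustration in Python rather than the Turbo Pascal of Listing One, and it reflects only my reading of the description above: the names CBNode, branch, insert, contains, and in_order are hypothetical, and equal keys are sent to the low-left pointer on the assumption that "less or equal" goes left.

```python
class CBNode:
    """Node of a clustered binary (CB) tree: two left and two right pointers."""

    def __init__(self, key):
        self.key = key
        self.low_left = self.high_left = None
        self.low_right = self.high_right = None


def branch(node_key, key, cdv):
    """Name the pointer to follow for `key` at a node holding `node_key`."""
    diff = key - node_key
    side = "left" if diff <= 0 else "right"      # sign of the difference
    track = "high" if abs(diff) > cdv else "low"  # magnitude against the CDV
    return f"{track}_{side}"


def insert(root, key, cdv):
    """Insert `key` and return the (possibly new) root."""
    if root is None:
        return CBNode(key)
    node = root
    while True:
        name = branch(node.key, key, cdv)
        child = getattr(node, name)
        if child is None:
            setattr(node, name, CBNode(key))
            return root
        node = child


def contains(root, key, cdv):
    """Search follows exactly the same pointer choice as insertion."""
    node = root
    while node is not None:
        if node.key == key:
            return True
        node = getattr(node, branch(node.key, key, cdv))
    return False


def in_order(node):
    """Yield keys in ascending order: far left, near left, node, near right, far right."""
    if node is not None:
        yield from in_order(node.high_left)
        yield from in_order(node.low_left)
        yield node.key
        yield from in_order(node.low_right)
        yield from in_order(node.high_right)
```

Building a tree is a matter of starting with root = None and calling root = insert(root, key, cdv) for each key. Note that the same CDV must be used for searching as was used for insertion, since it determines the path taken.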

The  use  of  these  four  pointers  in 
the  CB  tree  helps  to  separate  nodes 
with  keys  that  vary  widely  and  so 
speeds  up  tree  insertion  and  search¬ 
ing.  I  used  the  Turbo  Pascal  program 
shown  in  Listing  One,  page  98,  to 
compare  the  speed  of  insertion  and 
searching  in  binary  and  CB  trees. 
The  program  creates  an  array  of 
numbers  by  using  one  of  three  meth¬ 
ods:  random-number  generation,  the 
sine  function,  or  the  cosine  function. 
Although  the  speed  of  searching  was 
about  the  same,  the  CB  tree  was  four 
to  five  times  faster  in  inserting  new 
data.  I  used  the  sine  and  cosine  func- 


byte
    Ch

integer
    Count
    Diff
    Flag[7001]
    I
    Iter
    K
    Prime
    T1
    T2

constants
    SIZE = 7001

main
    read clock (T1)
    for (Iter,1,1,1)                  ** Main loop **
        assign (0,Count)
        for (I,1,SIZE,1)              ** Init. flags **
            assign (1,Flag[I])
        for (I,1,SIZE,1)
            if (Flag[I] = 1)
                assign (I+I+3,Prime)
                assign (I+Prime,K)
                while (K <= SIZE)
                    assign (0,Flag[K])
                    assign (K+Prime,K)
                assign (Count+1,Count)
            else
    read clock (T2)
    wait (5000)
    assign ((T2-T1)/60,Diff)
    write output (1,"@i",Diff)
    get key (100,Ch)


Example  1:  V.I.P.  text  output  for  the  first  version  of  the  Sieve  program 
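
For readers without V.I.P., the logic of Example 1 can be followed in a conventional language. The sketch below is a Python port under two assumptions of mine: that V.I.P. arrays are indexed from 1, and that the loop test is "less than or equal," as in the usual form of this benchmark. The function name sieve is hypothetical.

```python
SIZE = 7001  # same constant as in Example 1


def sieve(size=SIZE):
    """Port of Example 1; flag[i] stands for the odd number 2*i + 3."""
    flag = [1] * (size + 1)       # index 0 is unused (1-based indexing assumed)
    count = 0
    for i in range(1, size + 1):
        if flag[i] == 1:
            prime = i + i + 3
            k = i + prime
            while k <= size:      # comparison assumed to be <=
                flag[k] = 0       # strike out odd multiples of `prime`
                k += prime
            count += 1
    return count


if __name__ == "__main__":
    import time

    start = time.perf_counter()
    found = sieve()
    elapsed = time.perf_counter() - start
    print(f"{found} flags survived in {elapsed:.3f} s")
```

Because the loop index starts at 1, the number 3 is never used as a sieving factor, so the count is best read as a benchmark workload rather than as a checked count of primes.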




tions  to  generate  arrays  with  a  cer¬ 
tain  degree  of  order,  which  is  the 
weak  point  of  the  binary  tree. 

If string-type keys are used with
CB trees, then the ASCII code numbers
of the first few leading characters
are utilized. You use the Pascal
ORD() function to do this. The numeric
values used for critical difference
calculations are ORD(Key[1]) for
using the first character in the key,
and:

ORD(Key[1]) * 100 + ORD(Key[2])

for using the first two characters.
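
As a small illustration of the two formulas, here is how the numeric value and the resulting difference might be computed. This is a sketch in Python, where strings are indexed from 0 rather than from 1 as in Pascal; the names key_value and difference are hypothetical, and keys are assumed to be at least as long as the number of characters used.

```python
def key_value(key, chars=1):
    """Numeric stand-in for a string key, from its first one or two characters."""
    if chars == 1:
        return ord(key[0])                   # ORD(Key[1]) in Pascal
    return ord(key[0]) * 100 + ord(key[1])   # ORD(Key[1]) * 100 + ORD(Key[2])


def difference(node_key, new_key, chars=1):
    """Signed difference that is compared against the CDV."""
    return key_value(new_key, chars) - key_value(node_key, chars)
```

For example, difference("apple", "cherry") is 2, because "c" lies two ASCII codes above "a".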

Clustered  List  Structures 

You  can  also  use  the  concept  of  the 
clustered  binary  tree  structure  with 
lists.  Basically,  a  clustered  single-link 
list  (C  list  for  short)  is  a  list  with  two 
sets of pointers: one set forms the
"high-track," or fast search lane; the
other forms the "low track," or a
clustered  sublist.  Thus,  a  C  list  struc¬ 
ture  has  one  high-track  link  and 
many  clustered  sublists,  each  linked 
to  a  high-track  node,  and  so  a  high- 
track  node  becomes  an  index  that  in¬ 
dicates  the  range  of  data  stored  in  its 
linked  sublist.  The  effect  of  the  two 
sets  of  pointers  is  best  visualized  by 
the  two  lanes  of  a  highway,  in  which 
one  is  for  faster  traffic.  You  normally 
take  the  fast  lane,  as  long  as  you  are 
relatively  far  from  the  exit  you  seek. 
As  you  get  closer  to  your  exit,  you 
switch  to  the  slower  lane.  The  same 
idea  applies  to  C  lists.  By  using  nu¬ 
merical  differences  in  comparing  a 
node  with  an  incoming  datum,  you 
can  decide  whether  or  not  to  use  the 
high-track  pointers  and  so  bypass 
many  unnecessary  comparisons. 
This  approach  makes  C  lists  more  ef¬ 
ficient  in  searches  than  are  normal 
lists.  The  price  you  pay  is  the  extra 
set  of  pointers. 

Listing  Two,  page  102,  shows  a 
Turbo  Pascal  program  that  demon¬ 
strates  C  list  insertions  and  displays. 
The  program  contains  procedures 
for  searching,  inserting,  and  visiting 
C  lists.  Notice  the  following  aspects  of 
the  program: 

1.  A  string-type  key  is  used.  The  en¬ 
tire  key  of  any  node  and  incoming 
data  is  used  in  a  logical  comparison. 
The  numeric  difference  is  calculated 
for  the  first  character  only,  however. 


2.  A  CDV  of  0  is  used,  which  causes  the 
high-track  node  to  index  on  a  range 
of  one  character.  The  range  of  char¬ 
acters  is  (CDV  +  1)  when  the  ASCII 


number  of  one  key  character  is  used. 
3. The Search_Node procedure uses
a  Boolean  flag,  HiTrack,  to  switch 
from  using  high-track  pointers  to 


byte
    Ch

integer
    Count
    Diff
    Flag[7001]
    Iter
    T1
    T2

main
    read clock (T1)
    for (Iter,1,1,1)
        init                          ** Initialize flags **
        body (Count)                  ** Body of sieve **
    read clock (T2)
    assign (T2-T1,Diff)
    write output (1,"@i",Diff)
    get key (5000,Ch)


body (Count)
    <-- integer Count

    integer
        Flags[7001]
        I
        K
        Prime

    assign (0,Count)
    for (I,1,7001,1)
        if (Flag[I] = 1)
            assign (I+I+3,Prime)
            assign (I+Prime,K)
            while (K <= 7001)
                assign (0,Flag[K])
                assign (K+Prime,K)
            assign (Count+1,Count)
        else


init
    integer
        I

    for (I,1,7001,1)
        assign (1,Flag[I])


Example  2:  V.I.P.  text  output  for  the  modified  Sieve  program 




low-track  pointers  during  a  search. 

4.  A  new  datum  can  be  inserted  in 
one  of  the  following  locations: 

•  as  the  new  head  of  the  entire  list 

•  in  a  clustered  sublist 

•  as  a  new  high-track  node 

•  as  the  new  member  of  a  high-track 
node,  pushing  the  previous  one  in¬ 
side  the  clustered  sublist 
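
The aspects listed above can be sketched compactly. The following Python is not the Turbo Pascal of Listing Two; it is a minimal illustration under my own assumptions: keys are non-empty strings, the CDV is fixed at 0 so that each high-track node indexes one leading character, and the names Node, CList, insert, search, and keys are hypothetical.

```python
CDV = 0  # critical difference value; 0 gives one cluster per leading character


class Node:
    def __init__(self, key):
        self.key = key
        self.high = None  # next node on the high track (used by index nodes)
        self.low = None   # next node in this cluster's sublist


class CList:
    def __init__(self):
        self.head = None

    @staticmethod
    def _diff(key, node):
        # Numeric difference on the first character only.
        return ord(key[0]) - ord(node.key[0])

    def _link(self, prev, new):
        if prev is None:
            self.head = new       # new head of the entire list
        else:
            prev.high = new

    def insert(self, key):
        new = Node(key)
        prev, node = None, self.head
        # Fast lane: skip whole clusters while the difference exceeds the CDV.
        while node is not None and self._diff(key, node) > CDV:
            prev, node = node, node.high
        if node is None or self._diff(key, node) < 0:
            new.high = node       # a new high-track node
            self._link(prev, new)
        elif key < node.key:
            # Takes over the high-track slot; the old index node
            # is pushed inside the clustered sublist.
            new.high, new.low, node.high = node.high, node, None
            self._link(prev, new)
        else:
            # Slow lane: ordered insert within the clustered sublist.
            while node.low is not None and node.low.key <= key:
                node = node.low
            new.low, node.low = node.low, new

    def search(self, key):
        node, hi_track = self.head, True
        while node is not None:
            if node.key == key:   # the entire key is compared
                return True
            if hi_track:
                d = self._diff(key, node)
                if d > CDV:
                    node = node.high
                    continue
                if d < 0:
                    return False  # key would precede this cluster
                hi_track = False  # switch to the low-track pointers
            node = node.low
        return False

    def keys(self):
        """Visit every node: each high-track node, then its sublist."""
        out, index = [], self.head
        while index is not None:
            node = index
            while node is not None:
                out.append(node.key)
                node = node.low
            index = index.high
        return out
```

With a CDV of 0 the visit returns the keys in ascending order, and a search for a key beginning with "p" passes over the "a" and "b" clusters in one step each. A larger CDV would need a more careful rule for deciding which cluster a key belongs to than this sketch provides.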

The  unused  high  pointers  in  a  sub¬ 
list  node  can  be  used  to  form  a  dou¬ 
bly  linked  sublist.  I  feel  the  impact  of 
C  lists  on  list  structures  is  greater 
than  that  of  CB  trees  on  binary  trees. 
The  improved  performance  brought 
by  C  lists  is  less  affected  by  the  type 
and  variation  of  data  than  is  the  case 
with  CB  trees. 

Availability 

All  the  source  code  for  articles  in  this 
issue  is  available  on  a  single  disk.  To 
order,  send  $14.95  to  Dr.  Dobb's  Jour¬ 
nal,  501  Galveston  Dr.,  Redwood  City, 
CA  94063  or  call  (415)  366-3600,  ext. 
216.  Please  specify  the  issue  number 
and  format  (MS-DOS,  Macintosh, 
Kaypro). 

DDJ 

(Listings  begin  on  page  98.) 



COLUMNS 


ARTIFICIAL  INTELLIGENCE 


Smalltalk/V


Digitalk's  Smalltalk/V  program¬ 
ming  tool  is  a  bit-mapped  im¬ 
plementation  of  a  substantial  subset 
of  Smalltalk.  It  is  aimed  primarily  at 
the  AI  development  market  and  is 
code  compatible  with  the  earlier 
Methods  product,  also  by  Digitalk. 
The  feature  that  makes  it  suitable  for 
AI,  aside  from  the  object  orientation, 
is primarily the inclusion of a surprisingly
complete and robust PROLOG
compiler. This PROLOG includes
many predicates that Turbo Prolog
lacks, such as functor and univ.

Smalltalk/V  also  has  excellent 
graphics  capabilities,  such  as  turtle 
graphics,  and  offers  good  perform¬ 
ance  in  graphics  animation.  Also  in¬ 
cluded  with  the  product  is  a  large  on- 
disk  tutorial  that  provides  some 
substantial  program  examples.  Digi- 
talk’s  Methods  was  the  first  Smalltalk 
implementation  for  PCs,  and  it  was 
an  important  landmark  as  far  as 
many  programmers  were  con¬ 
cerned.  But  here  we  have  a  major 
subset  of  it  running  in  about  half  a 
megabyte. 

The  first  thing  to  realize  about 
Smalltalk/V  is  that  it  is  not  just  a  pro¬ 
gramming  language,  but  as  any 
Smalltalk  should  be,  it  is  a  full, 
multiwindow  environment  with 
drop-down  menus,  mouse  support, 
and  more  conveniences  than  you 
have  probably  ever  seen  in  a  pro¬ 
gramming  environment  on  a  PC.  A 


by  Ernest  R.  Tello 


mouse  is  not  necessarily  required  be¬ 
cause  Digitalk  has  implemented  an 
ingenious  use  of  the  keyboard  using 
two  main  clusters  of  keys  on  the  far 
left  and  far  right,  which  it  calls  the 
left-hand  mouse  and  right-hand 
mouse,  respectively.  If  you  choose 
not  to  use  a  mouse,  you  will  find  that 
after  a  while  this  works  quite  well. 


You  can  keep  both  hands  in  their 
place  and  reach  for  one  of  the  keys  in 
either  cluster  just  as  if  you  were 
pressing  the  buttons  on  two  mice. 
The  product  was  intended  to  be  used 
with  a  mouse,  however,  and  works 
much  better  with  one,  if  for  no  other 
reason  than  that  it  makes  the  cursor 
fly  across  the  screen  instead  of 
crawl.  At  this  time,  Smalltalk/V  sup¬ 
ports  the  Microsoft  mouse  and  the 
Mouse  Systems  mouse.  I  have  also 
had  no  trouble  using  it  with  the  Logi¬ 
tech  mouse. 

The  current  version  of  Smalltalk/V 
comes  on  three  disks — the  Image, 
Source,  and  Tutorial  disks — and  re¬ 
quires  512K.  It  is  a  considerable  ac¬ 
complishment  to  have  implemented 
so  much  of  Smalltalk-80  in  half  a 
megabyte. 

The  dialect  of  Smalltalk/V  is  so 
close  to  Smalltalk-80  that  most  of  the 
classes  and  examples  in  the  Small¬ 
talk-80  book  series  can  be  entered  “as 
is.”  The  main  exceptions  are  those 
that  make  use  of  multitasking,  such 
as  the  simulation  examples — the  sys¬ 
tem  accepts  even  these,  although 
they  won't  work  as  written.  This  is 
important  for  programmers  new  to 
Smalltalk  because  there  is  really  very 
little  published  material  available 
other  than  what  is  available  for 
Smalltalk-80  to  give  them  a  full  over¬ 
view  of  the  Smalltalk  system  and 
help  them  get  going  with  it. 

In  the  Class  Hierarchy  Browser, 
the  lower  classes  in  the  hierarchy  be¬ 
neath  those  that  are  the  immediate 
subclasses  of  Object,  the  root  class, 
can  be  either  hidden  or  visible  as  you 


choose.  Once  you  have  chosen  a  par¬ 
ticular  class  with  subclasses,  you  can 
choose  to  show  or  conceal  just  the 
subclasses  under  it.  Smalltalk/V  also 
has  some  additional  commands  on 
the  desktop — for  example,  now  you 
can  cycle  around  to  other  windows 
from  a  command  on  one  of  the  drop¬ 
down  menus.  This  was  needed  be¬ 
cause,  when  a  window  is  completely 
covered  by  another  window,  you 
cannot  select  it  with  the  mouse. 

The  basic  types  of  facilities  you  use 
with  Smalltalk/V  are  things  such  as 
workspace  windows,  browsers, 
menus,  and  occasionally  what  are 
known  as  inspectors.  The  main  types 
of  browsers  are  Class  Browsers,  Class 
Hierarchy  Browsers,  and  the  Disk 
Browser.  These  are  specialized  win¬ 
dow  facilities  that  give  you  a  view¬ 
point  onto  a  particular  aspect  of  the 
system.  And  as  I  mentioned  earlier, 
you  can  create  as  many  instances  of 
these  views  as  you  may  need. 

Class  Consciousness 

A  Class  Hierarchy  Browser  is  the 
Digitalk  version  of  the  Smalltalk  Sys¬ 
tem  Browser.  As  implemented  in 
Smalltalk/V,  this  type  of  browser  has 
four  separate  panels.  The  first  is  a 
scrolling  window  that  lists  the  main 
classes  and  subclasses  in  the  system. 
To  the  right  of  it  is  the  methods  pane, 
which  displays  the  list  of  applicable 
methods.  Beneath  it  are  two  small  se¬ 
lector  panes  containing  the  words 
class  and  instance,  respectively.  Fi¬ 
nally,  on  the  bottom  is  a  large  pane 
that  displays  the  actual  Smalltalk 
source  code  for  selected  items.  De¬ 
pending  upon  whether  you  select  on 
the  instance  or  class  pane,  either  the 
calling  names  of  instance  or  class 
methods  are  displayed  in  the  meth¬ 
ods  pane.  When  you  select  one  of 
these  method  names,  its  source  code 
is  displayed  in  the  lower  pane.  When 
the  source  pane  is  current,  it  acts  as  a 




text  editing  pane  in  which  you  can 
create  and  modify  source  code. 

What  this  type  of  browser  means 
to  a  software  developer  is  that  you 
have  a  built-in  overview  of  the  sys¬ 
tem  that  can  give  you  ready  access  to 
anything  in  the  system  at  all  times. 
The  way  you  generally  call  upon 
things  that  have  already  been  en¬ 
tered  into  the  system  is  by  creating 
instances  of  a  class  and  initializing 
variables  in  a  workspace  window 
and  then  sending  messages  to  it.  Any 
involved  interaction  can  itself  be 
made  into  a  program  by  adding  it  as  a 
new  subclass  with  its  own  variables 
and  methods  that  can  be  instantiated 
and  evaluated  more  easily. 

As  mentioned  earlier,  Smalltalk/V 
also  provides  inspectors.  These  are 
special-purpose  windows  you  can 
use  as  low-level  debugging  tools  that 
allow  you  to  examine  and  even 
change  objects  in  the  system.  You 
don't  open  an  Inspector  window  by 
accessing  a  menu;  instead,  you  use  a 
workspace  or  system  transcript  win¬ 
dow  to  send  the  inspect  message  to 
an  instantiated  object.  An  Inspector 
window  with  two  panes — one  on 
the  right  and  one  on  the  left — then 
opens.  The  pane  on  the  left  displays 
the  names  of  the  instance  variables, 
and  the  right  pane  shows  their 
values. 

Two  fonts  come  with  Smalltalk/V. 
The  fonts  differ  mainly  in  size — one 
has 8 x 8 pixels and the other 8 x 14
pixels.  Other  features  included  in 
Smalltalk/V  that  were  absent  in 
Methods  are  a  DOS  shell,  a  garbage 
collector,  and  virtual  memory 
management. 

The  Smalltalk/V  manual  is  the  Tu¬ 
torial  and  Programming  Handbook. 


In  many  respects  it  is  remarkably 
clear  and  well  written.  Its  main 
shortcoming  is  that,  in  spite  of  its 
thoroughness  and  readability,  it  is 
not  a  complete  reference  to  the  be¬ 
havior  of  the  Smalltalk/V  system.  I 
would  like  to  see  a  companion  refer¬ 
ence  guide  that  would  go  more  deep¬ 
ly  into  the  behavior  and  implemen¬ 
tation  of  the  garbage  collector  and 
virtual  memory,  for  example. 

Graphics 

An  interesting  approach  to  graphics 
has  been  adopted  in  Smalltalk/V. 
The  basic  class  that  implements  the 
graphics  capability  is  the  BitBlt  class, 
which  is  named  for  the  bit  block 
transfer  operation  and  is  much  as  in 
Smalltalk-80. Together with its immediate
subclasses, Pen and CharacterScanner,
and the subclasses of Pen
(Commander and Animation), BitBlt
provides the basis for how Smalltalk/V
creates bit-mapped displays.
The  block  transfers  occur  between 
two  Forms — a  source  Form  and  a  des¬ 
tination  Form.  Forms,  Points,  and 
Rectangles  constitute  the  main  struc¬ 
tures  used  in  Smalltalk/V  graphics.  A 
mechanism  called  a  clipping  rectan¬ 
gle  is  used — again,  much  as  in  Small¬ 
talk-80 — to  define  the  maximum  size 
of  the  bit  transfer.  This  clipping  rect¬ 
angle  restricts  the  size  of  the  rectan¬ 
gular  array  of  bits  that  will  constitute 
the  destination  Form. 
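
To make the role of the clipping rectangle concrete, here is a minimal sketch of a clipped block transfer. It is written in Python, not Smalltalk, and it is not Digitalk's implementation: Forms are modeled as lists of rows of 0/1 values, the function name bitblt and its parameters are hypothetical, and only a plain copy is shown, whereas a real BitBlt also supports rules for combining source and destination bits.

```python
def bitblt(dest, src, at, clip):
    """Copy `src` into `dest` with its top-left corner at `at` = (x, y).

    `clip` = (x0, y0, x1, y1) is a half-open rectangle in destination
    coordinates; bits that fall outside it are left untouched.
    Both forms are assumed to be non-empty lists of equal-length rows.
    """
    ax, ay = at
    cx0, cy0, cx1, cy1 = clip
    # Never write outside the destination form itself.
    cx0, cy0 = max(cx0, 0), max(cy0, 0)
    cx1, cy1 = min(cx1, len(dest[0])), min(cy1, len(dest))
    for sy, row in enumerate(src):
        dy = ay + sy
        if not cy0 <= dy < cy1:
            continue
        for sx, bit in enumerate(row):
            dx = ax + sx
            if cx0 <= dx < cx1:
                dest[dy][dx] = bit
    return dest
```

Transferring a 3 x 3 source to position (1, 1) of a 4 x 4 destination with a clipping rectangle of (0, 0, 3, 3) changes only the four bits that lie inside both the source block and the clipping rectangle.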

To  see  how  this  works,  first  look  at 
the  section  of  the  class  hierarchy  in 
question: 

BitBlt
    CharacterScanner
    Pen
        Animation
        Commander

The  CharacterScanner  class  has  the 


job  of  converting  ASCII  character 
codes  into  displayable  bit  patterns. 
The  Pen  class,  as  you  might  have  sur¬ 
mised,  is  the  class  that  implements 
turtle  graphics.  The  Animation  class 
constitutes  collections  of  pens  that 
represent  the  various  objects  being 
animated.  Finally,  the  Commander 
class  controls  arrays  of  pens  in  such  a 
way  that,  whenever  it  receives  pen- 
related  messages,  it  relays  the  same 
message  to  each  of  the  pens  under  its 
command. 
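
The relaying behavior of Commander is easy to picture with a small sketch. The Python below only illustrates the idea and makes no claim about how Digitalk wrote the class: this Pen is a stand-in with a single hypothetical move method, and the relay is done with Python's attribute lookup hook.

```python
class Pen:
    """Stand-in for a turtle-graphics pen; only a position is tracked."""

    def __init__(self):
        self.x, self.y = 0, 0

    def move(self, dx, dy):
        self.x += dx
        self.y += dy
        return (self.x, self.y)


class Commander:
    """Relays any message it does not understand to every pen it controls."""

    def __init__(self, pens):
        self.pens = list(pens)

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for pen messages.
        def relay(*args, **kwargs):
            return [getattr(pen, name)(*args, **kwargs) for pen in self.pens]
        return relay
```

Sending move(3, 4) to a Commander holding two pens moves both pens and returns both new positions.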

It  is  also  interesting  to  see  how 
Smalltalk/V  handles  windows.  Here, 
the  approach  is  very  different  from 
that  used  in  Smalltalk-80  but  is  essen¬ 
tially  the  same  as  that  used  in  Meth¬ 
ods.  The  following  segment  of  the 
class  hierarchy  comes  into  play: 

Dispatcher 

GraphDispatcher 

PointDispatcher 

ScreenDispatcher 

ScrollDispatcher 

FormEditor 

ListSelector 

TextEditor 

PromptEditor 

TopDispatcher 

DispatchManager 

Browsing  Drives 

The  Disk  Browser  is  one  of  the  more 
original  and  powerful  facilities  in  the 
Smalltalk/V  system.  Its  window  is 
composed  of  four  panes.  In  the  up¬ 
per-left  pane  is  the  directory  hierar¬ 
chy  list,  which  shows  all  the  directo¬ 
ries  on  a  disk.  In  the  pane  to  the  right 
of  this  is  the  file  list,  which  displays 
the  names  of  all  the  files  in  the  select¬ 
ed  directory.  A  large  pane  below 
these  is  the  contents  pane,  which  dis¬ 
plays  either  the  screen  of  directory 
information  as  it  would  appear  after 
an  MS-DOS  dir  command  or  the  con¬ 
tents  of  a  selected  file.  And  just  above 
this,  there  is  a  small  pane  called  the 
directory  order  pane,  which  displays 
the  way  directory  information  is  cur¬ 
rently  being  sorted  for  display  in  the 
contents  frame — such  as  by  date, 
name,  or  size.  With  this  facility  you 
can  create  or  remove  directories  and 
files  as  well  as  rename,  copy,  or  print 
them. 

With  the  command  menus  accessi¬ 
ble  from  the  contents  pane,  you  can 
perform  just  about  any  file  mainte¬ 
nance  operation,  including  cut  and 




paste,  copy,  read,  and  install.  The  full 
set  of  commands  available  as  dis¬ 
played  in  the  menu  varies  depend¬ 
ing  upon  the  size  of  the  file.  Normal¬ 
ly,  with  files  greater  than  6,000  bytes, 
the  contents  pane  displays  only  the 
first  and  last  2,000  bytes.  In  that 
event,  you  can  use  the  read  it  com¬ 
mand  to  read  in  the  complete  con¬ 
tents  of  a  large  file.  Also  available  is 
the  save  as  command,  which  allows 
a  file  to  be  saved  under  a  different 
name.  Finally,  the  install  command 
allows  source  files  to  be  compiled 
into  the  Smalltalk/V  system. 

One  thing  you  have  to  keep  track 
of  is  the  size  of  the  changes.log  file.  If 
it  starts  to  get  large,  then  there  is  a 
facility  for  compressing  it.  You  must 
use  this  before  the  changes  log  gets 
too  large  and  space  on  the  disk  runs 
low,  or  your  image  will  die  the 
death.  The  log  facility  is  an  essential 
one  for  those  who  can’t  resist  taking 
advantage  of  the  fact  that  Smalltalk  is 
internally extensible to a large degree,
as are Forth and LISP. While
you are modifying the internals to
create an image of a new dialect of
Smalltalk/V, the system is bound to
crash more than twice before you get
it debugged. The log file is insurance
that you will never lose anything
you've done for keeps, unless
for  some  reason  a  crash  clobbers  this 
file.  If  necessary,  you  can  even  use  it 
to  restore  the  system  image. 

A  Method  to  the  Madness 

At  the  very  center  of  all  this  are  the 
methods — the  actual  modular  sub¬ 
routines  that  do  the  message  passing. 
There  are  two  different  types  of 
methods — instance  methods  and 
class  methods,  which  are  analogous 
to  the  instance  and  class  variables. 

One  important  departure  of  Small¬ 
talk/V  from  the  Smalltalk-80  stand¬ 
ard  is  the  omission  of  the  ClassDe- 
scription  class.  In  Smalltalk-80,  the 
class  hierarchy  starting  with  the  Be¬ 
havior  class  is  organized  like  this: 

Behavior
    ClassDescription
        Class
        Metaclass


The  arrangement  in  Smalltalk/V  is 
the  same  except  that  ClassDescrip¬ 
tion  is  omitted.  As  a  result,  Smalltalk/ 
V  does  not  support  message  catego¬ 
ries — that  is,  the  grouping  of  meth¬ 
ods  for  a  given  class  under  various 
category  names.  In  many  cases,  I 
have  found  it  relatively  easy  to  add 
missing  Smalltalk-80  classes  to  Small¬ 
talk/V.  In  this  case  an  addition  is  not 
easy  to  make  because,  if  ClassDe¬ 
scription  is  added  as  a  subclass  of  Be¬ 
havior,  it  becomes  a  peer  class  of 
Class  and  Metaclass  rather  than  a  su¬ 
perclass  of  them. 

On  the  other  hand,  Smalltalk/V 
has  several  special  classes  that  are 
not  found  in  Smalltalk-80.  Table  1, 
below,  contains  a  list  of  those  that  are 
not  simply  machine-specific  classes 
but  have  to  do  with  the  IBM  PC  imple¬ 
mentation  and  user  interface. 

Example:  A  Small 
Inference  Engine 

To  give  you  a  better  grasp  of  specific 
things  that  Smalltalk/V  can  do,  I’ll 
now  discuss  an  example  of  a  simple 
rule-based  reasoning  system  that 
was  provided  by  Digitalk.  First,  let’s 
look  at  its  overall  structure. 

There  is  a  kind  of  application  hold¬ 
er  class  called  InferenceEngine  that 
has three subclasses (Expert, Fact,
and Rule) that do all the work. An
instance  of  the  class  Expert  is  a  work¬ 
ing  inference  engine  that  can  evalu¬ 
ate  rulebases  for  particular  subject 
areas  or  domains.  In  this  case  it  is  a 
tree  expert.  The  inference  engine  is  a 
simple  forward-chaining  one  that  ac¬ 
cepts  a  set  of  facts  and  applies  the 
rules  to  the  facts  it  already  knows  to 
determine  if  the  goal  is  true.  It  then 
allows  you  to  ask  it  to  explain  how  it 


CursorManager 

Directory 

DiskBrowser 

Dispatcher 

DisplayString

FixedSizeCollection 

IndexedCollection 

InfiniteForm 

LinkedListStream 

MethodDictionary 

StringBlt

StringModel 

SystemDictionary 


Table  1:  Classes  unique  to  Smalltalk/V 




arrived  at  its  result.  The  main  opera¬ 
tion  of  the  inference  engine  is  to  ap¬ 
ply  rules  to  test  a  result.  It  sends  a 
message  to  the  rules  collection  to  test 
the  facts  and  then  applies  the  eval: 
method  to  see  if  the  goal  to  be  tested 
matches.  If  so,  it  returns  true;  if  not,  it 
returns  false. 

The  action  method  fires  the  con¬ 
clusion  of  a  rule  and  in  doing  so  adds 
a  new  fact  to  the  facts  collection.  The 
way  you  use  this  little  demo  expert 
system  is  by  entering  everything  it 
needs  directly  in  a  text  area  and  then 
causing  the  interpreter  to  evaluate  it 
by  selecting  DO  IT  from  the  drop¬ 
down  menu.  This  does  the  following 
things:  First,  it  creates  an  instance  of 
the  Expert  class  called  by  the  global 
variable  name  Tree;  second,  it  initial¬ 
izes  an  instance  of  the  Fact  class;  and 
third,  it  submits  various  facts  about  a 
tree  for  evaluation  by  adding  them  to 
the  Fact  collection.  This  sends  the 
message  to  the  Tree  expert  to  prove 
the  goal  that,  given  the  previously 
entered  facts,  for  example,  the  tree  is 


of the cypress family.
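
The flow just described (add facts, apply rules until nothing new can be concluded, test the goal, then explain) can be sketched in a few lines. The Python below is not the Digitalk demo, which is written in Smalltalk; it is a minimal forward-chaining illustration in which the method names add_fact, prove, and explain, the rule names, and the facts about trees are all invented for the example.

```python
class Rule:
    def __init__(self, name, conditions, conclusion):
        self.name = name
        self.conditions = frozenset(conditions)
        self.conclusion = conclusion


class Expert:
    """A tiny forward-chaining inference engine."""

    def __init__(self, rules):
        self.rules = list(rules)
        self.facts = set()
        self.fired = []           # rules that fired, in order, for explanations

    def add_fact(self, fact):
        self.facts.add(fact)

    def prove(self, goal):
        progress = True
        while goal not in self.facts and progress:
            progress = False
            for rule in self.rules:
                if (rule.conclusion not in self.facts
                        and rule.conditions <= self.facts):
                    self.facts.add(rule.conclusion)   # firing adds a new fact
                    self.fired.append(rule)
                    progress = True
        return goal in self.facts

    def explain(self):
        return [
            f"{rule.name}: {' and '.join(sorted(rule.conditions))}"
            f" => {rule.conclusion}"
            for rule in self.fired
        ]


# Illustrative rule base; the rules and facts are made up for this sketch.
tree = Expert([
    Rule("R1", ["bears cones"], "is a conifer"),
    Rule("R2", ["is a conifer", "has scale-like leaves"],
         "is in the cypress family"),
])
tree.add_fact("bears cones")
tree.add_fact("has scale-like leaves")
```

Calling tree.prove("is in the cypress family") returns true after both rules fire, and tree.explain() then lists the two rules in the order in which they were applied.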

Several  things  should  be  noted 
about  this  inference  engine  demo, 
which  was  not  intended  to  be  any¬ 
thing  more  than  a  simple  toy  demon¬ 
stration.  First,  it  does  not  use  a  rule 
syntax  other  than  Smalltalk  itself,  so 
no  parser  is  needed.  This  allows  it  to 
run  quickly  but  makes  the  rules  less 
readable — a  familiar  trade-off.  An¬ 
other  point  is  that  it  lacks  the  ability 
to  read  in  the  fact  and  rule  collec¬ 
tions  directly  from  a  file — a  facility 
that  could  be  added  without  too 
much  difficulty.  There  is  also  no  real 
user  interface  other  than  the  explain 
method.  In  particular,  there  is  no  fa¬ 
cility  for  posing  questions  to  users 
about  values  that  the  system  cannot 
find  otherwise.  The  point  is  that,  al¬ 
though  this  is  a  toy  system,  it  points 
to  what  a  full  system  might  be.  Most 
of  these  features  are  not  too  difficult 
to  add,  once  you  grasp  the  basics  of 
how  to  implement  an  inference  en¬ 
gine  in  Smalltalk. 

The  advantages  of  implementing  a 
full  expert  system  shell  in  Smalltalk/ 
V  are  quite  easy  to  see.  First,  you  get 
the  lush,  easy-to-understand  user  in¬ 


terface  practically  for  free.  It  would 
not  be  a  major  development  project 
to  use  the  Smalltalk/V  menu  and 
windowing  facilities  to  build  a  supe¬ 
rior  expert  system  consultation  envi¬ 
ronment.  Another  important  plus  is 
the  multiple  instance  aspect  of  Small¬ 
talk.  You  can  have  as  many  Experts, 
like  the  Tree  expert,  initialized  as 
necessary,  each  with  a  separate  rule- 
set.  Smalltalk/V  could  then  be  pro¬ 
grammed  for  one  Expert  to  pass  a 
message  to  another  for  a  goal  to  be 
tested.  Also,  more  flexible  inference 
methods  could  be  implemented  for 
backward chaining and combining
both forward and backward chaining.
Finally, a parser could be written
that could accept a more "friendly"
rule syntax and compile it into
the  Smalltalk  format  as  used  here  for 
running  finished  knowledge  bases. 

Conclusions 

Smalltalk/V,  Version  1.2,  is  a  remark¬ 
able  accomplishment  and  a  very 
easy-to-understand  environment  for 
newcomers  to  Smalltalk  to  get  ac¬ 
quainted  with  the  language.  Its  per¬ 
formance  is  surprisingly  fast,  consid¬ 
ering  all  the  things  going  on. 

I  should  also  mention  that  the 
Goodies  Extension  Kit  disk,  which  is 
now  available  as  an  option  for  Small¬ 
talk/V,  contains  an  implementation 
of  multiprocessing  that  could  be  im¬ 
portant  if  discrete  simulation  is  what 
you're  after  in  a  Smalltalk  environ¬ 
ment.  The  simulation  examples  in 
the  Xerox  Smalltalk  series  presume 
the  multiprocessing  capability  of 
Smalltalk-80.  Now,  with  the  multi¬ 
processing  classes  provided  on  the 
optional  Goodies  disk,  this  should  not 
be  a  problem. 

DDJ 




Vendor 

Smalltalk/V 
Digitalk  Inc. 

5200  Century  Blvd. 

Los  Angeles,  CA  90045 
(213)  645-1082 
Reader  Service  No.  127 




Dr.  Dobb's  Journal,  September  1987 



LETTERS 

(continued  from  page  14) 


to.  . .  Oh  my  God — he  did  forget!). 
Stan  Kelly-Bootle 
25  Parkwood 
Mill  Valley,  CA  94941 

Dear  DDJ, 

Brian  Anderson  states  that  "often  the 
only  control  structures  available  in 
assembly  language  are  the  condition¬ 
al  and  unconditional  jump  and  the 
call  to  subroutine.”  The  68000  and 
most  other  modern  processors  have 
looping  structures,  generally  provid¬ 
ed  for  use  by  high-level  language 
compilers.  Therefore,  I  offer  my  Ex¬ 
ample  2,  below,  as  a  replacement  for 
his.  Also,  notice  that  this  code  does 
not  get  stuck  in  an  endless  loop,  un¬ 
like  its  predecessor. 

I  would  like  to  thank  Mr.  Anderson 
for  his  interesting  and  otherwise  ex¬ 
cellent  article. 

Matthew  Siegel 
192 Belvedere St., #9
San  Rafael,  CA  94901-4748 

Dear  DDJ, 

I  guess  I  still  have  a  bit  of  egg  on  my 
face — it  seems  that  I  can't  program 
my  way  out  of  an  infinite  loop  (at 
least  when  it  comes  to  68000  assem¬ 
bly  language). 

The  68000  example  that  I  cited  in 
my Viewpoint won’t work because I
forgot  to  increment  the  index  vari¬ 
able  inside  the  loop.  My  wife  thought 
that,  because  this  mistake  supports 
my  point  (sort  of),  I  should  claim  that 
I  planted  the  error  just  to  see  if  any¬ 
one  would  catch  it. 

Please  let  me  assure  you,  the  mis¬ 
take  was  an  honest  one.  My  apolo¬ 
gies  to  DDJ  readers  (especially  68000 
hackers). 

Brian  Anderson 
5105  Lorraine  Ave. 

Burnaby,  BC 
Canada  V5G  2S3 

Math  and  Programming 

Dear  DDJ, 

I feel I must take issue with Allen
Holub concerning his Viewpoint in the
April  1987  issue  of  DDJ. 

To  begin  with,  from  personal  ex¬ 
perience,  I  can  draw  a  strong  corre¬ 
lation  between  advanced  mathemat¬ 
ics  and  the  art  of  programming. 
There  is  a  definite  parallel  between 


juggling  a  large  system  of  algebraic 
equations  in  your  head  and  trying  to 
maintain  the  purpose  and  use  of 
many  interacting  variables  in  a  com¬ 
puter  program.  Obviously  imple¬ 
mentations  of  some  of  the  purer  algo¬ 
rithms  in  computer  science  such  as 
queueing  theory,  graphs  and  trees, 
sorting  algorithms,  vector  math,  au¬ 
tomata  theory,  frequency  analysis, 
formal  logic,  cubic  splines,  compres¬ 
sion  algorithms,  minimax  and  Bayes 
decision  theory,  branch  and  bound 
problems,  probabilistic  and  statistical 
concepts,  game  theory,  and  all  as¬ 
pects  of  operations  research  require 
a  deep  understanding  of  algebraic 
notation,  linear  algebra,  and  many 
other  forms  of  abstract 
representation. 

Allen  implies  that  mathematics  is 
nothing  more  than  applying  a  set  of 
memorized  rules  to  a  problem.  I  beg 
to  differ  if  he  feels  that  solving  a 
third-order  differential  equation  or  a 
partial  derivative  is  not  a  creative 
problem  solving  process.  Those 
"memorized  rules”  are  less  rules  and 
more  approaches  and  guidelines 


with  which  to  tackle  the  problem  as 
stated.  I  find  solving  calculus  and  dif¬ 
ferential  equations  not  at  all  unlike 
trying  to  come  up  with  my  own  algo¬ 
rithms  for  a  computer  simulation 
problem.  I  feel  that  those  students 
who  could  not  tackle  and  solve  ad¬ 
vanced  mathematics  would  also 
have  great  difficulty  in  implement¬ 
ing  their  own  fresh  approaches  to 
new  and  yet  unsolved  computer  sci¬ 
ence  problems. 

In  agreement  with  Allen,  I  would 
state  that  if  a  student's  goal  is  to  be¬ 
come  an  applications  programmer 
who  is  never  responsible  for  an  origi¬ 
nal  algorithm  but  who  instead  sim¬ 
ply  implements  code  and  algorithms 
found  in  books,  then  by  all  means 
forego  anything  beyond  Algebra  I. 
Allen  covers  himself  by  stating  that 
math  is  not  a  prerequisite  for  pro¬ 
gramming  as  it  is  for  engineering. 
What  does  he  think  the  name  com¬ 
puter  scientist  means  anyway?  And 
what  kind  of  jobs  do  you  think  com¬ 
puter  scientists  have?  They  work  on 
developing  new  approaches  in  medi¬ 
cine,  vision  processing,  knowledge 


************************************************************
*
* 68000 assembly language FOR loop
*
* INPUT
*   The first element to clear is FIRST, the last element LAST.
*   Character data begins at DATA.
*
* REGISTER USAGE
*   A0 points to the current element of the array.
*   D0 = one less than the number of elements left to clear.
*
************************************************************
*
FIRST   DS.W    1
LAST    DS.W    1
DATA    DS.B    500
*
*
CLEAR   LEA     DATA,A0         A0 points to character data
        MOVE.W  FIRST,D0        D0 = first element to clear
        LEA     0(A0,D0.W),A0   A0 points to first element to clear
        MOVE.W  LAST,D0         D0 = number of elements to clear
        SUB.W   FIRST,D0          (less 1, for DBcc loop use)
LOOP    CLR.B   (A0)+           clear an element and advance
        DBF     D0,LOOP         repeat
*
        END



Example  2:  Loop  structure  correction  for  Brian  Anderson's  68000 
assembly  code 
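
For readers without a 68000 at hand, the control flow of Example 2 can be traced in a few lines of Python. This is only a sketch of how the CLR.B/DBF pair behaves, under the assumption that LAST is not less than FIRST; the function name `clear_range` and the test buffer are inventions of this note, not part of Mr. Siegel's listing.

```python
def clear_range(data, first, last):
    """Clear data[first..last] the way the CLR.B / DBF loop does.

    Returns the number of elements cleared.
    """
    a0 = first                       # A0: index of the current element
    d0 = (last - first) & 0xFFFF     # D0.W: element count less one
    while True:
        data[a0] = 0                 # CLR.B (A0)+  clears the element ...
        a0 += 1                      #              ... and advances A0
        d0 = (d0 - 1) & 0xFFFF       # DBF decrements the low word of D0
        if d0 == 0xFFFF:             # and falls through once it reaches -1
            break
    return a0 - first


buffer = bytearray(b"x" * 10)
cleared = clear_range(buffer, 2, 5)  # elements 2, 3, 4 and 5
```

Because DBF tests the counter for -1 rather than 0, the loop body runs one more time than the starting value of D0, which is why the listing loads D0 with LAST minus FIRST and not with the element count itself.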




representation,  flight  simulation, 
computer  games,  the  defense  indus¬ 
try,  and  all  forms  of  real-world  simu¬ 
lation  and  control.  One  hardly  needs 
a  computer  scientist  to  write  a  dBASE 
III  application,  yet  the  fields  men¬ 
tioned  above  seek  out  and  demand  a 
computer  scientist  with  heavy  math¬ 
ematics  background.  This  group  of 
programmers  does  not  make  up  a 
small  specialized  percentage  but  in¬ 
stead  represents  what  a  computer 
science  degree  is  all  about.  Would 
you  hire  an  MIT  computer  science 
graduate  who  had  no  more  than 
high-school  geometry  to  his  or  her 
credit?  Neither  would  I. 

If Allen rewrote the entire Viewpoint
and replaced each occurrence
of  "computer  scientist”  with  "data 
processing  programmer,”  it  would 
be  a  valid  and  important  commen¬ 
tary.  A  dBASE  III,  4GL,  or  COBOL  pro¬ 
grammer  has  little  need  for  calculus, 
but  for  those  of  us  breaking  new 
frontiers  in  image  processing,  prob¬ 
lem  solving,  and  other  areas  of  com¬ 
puter  science,  the  need  for  a  strong 
background  in  mathematics  and  for¬ 
mal  symbolic  representation  is  clear. 

John  W.  Ratcliff 

2510  LaCaracas 

St.  Louis,  MO  63114 

Allen  Holub  replies: 

There  are  several  ways  to  learn  how 
to  manage  large  systems,  and  I  still 
believe  that  mathematics  is  among 
the  poorest  of  these,  primarily  be¬ 
cause  of  the  amount  of  background 
that  you  need  just  to  get  started.  It 
takes  a  year  and  a  half  of  college-lev¬ 
el  math  to  get  to  the  point  where  you 
can  start  solving  third-order  differ¬ 
ential  equations,  but  most  people 
(hopefully)  know  the  rudiments  of 
English  composition  before  they  get 
out  of  high  school.  Moreover,  most 
holders of CS bachelor's degrees don’t
know  how  to  solve  differential  equa¬ 
tions  at  all,  for  the  simple  reason  that 
the  courses  aren't  usually  required. 
Mr.  Ratcliff  is  correct  in  saying  that 
higher  mathematics  can  be  useful  to 
a  programmer.  Mathematics  at  the 
undergraduate  level  is  not.  It  seems 
to  me  a  waste  of  time  to  acquire  the 
skills  necessary  to  understand  differ¬ 
ential  equations  if  all  you  really  need 


to  do  is  understand  how  to  approach 
complex  systems.  It’s  not  that  mathe¬ 
matics  won't  get  you  there  eventual¬ 
ly  but  that  there  are  better  and  faster 
ways  to  acquire  the  same  skills,  nota¬ 
bly  English  composition. 

Mr. Ratcliff also contends that a
strong  background  in  "formal  sym¬ 
bolic  representation”  is  necessary. 
Again,  I  disagree.  An  ability  to  work 
with  abstraction  is,  of  course,  neces¬ 
sary,  but  remember  that  logic  started 
out  life  as  a  branch  of  philosophy, 
not  of  mathematics.  Much  of  the  ba¬ 
sic  work  in  compiler  theory  was 
done  by  linguists  (such  as  Noam 
Chomsky  at  MIT),  not  by  mathemati¬ 
cians.  More  often  than  not,  a  "formal 
symbolic  representation”  serves  to 
obfuscate,  rather  than  clarify.  Good 
examples  of  this  obfuscation  can  be 
found  in  the  "dragon”  book  (Aho, 
Sethi, and Ullman), written by mathematicians
and used in most compiler-design
classes (even, I’m reluctant
to  admit,  in  the  one  that  I  teach).  Aho 
spends  pages  inventing  "formal  sym¬ 
bolic  representations”  and  then 
spends  five  lines  actually  explaining 
something  useful.  I’d  rather  spend 
half  an  hour  reading  a  clear  descrip¬ 
tion  of  a  process  in  English  than 
spend  the  same  half  hour  trying  to 
decipher  five  lines  of  formal  sym¬ 
bols.  More  to  the  point,  I’ve  found 
that  students  who  find  Aho  incom¬ 
prehensible  have  no  trouble  at  all 
understanding  the  concepts,  once 
these  concepts  are  presented  to  them 
in  a  clear  way  that  doesn't  use  Aho’s 
formal  symbolic  notation. 

Call  for  Recipes 

Dear  DDJ, 

In  his  boffo  review  of  our  book,  Nu¬ 
merical  Recipes:  The  Art  of  Scientific 
Computing  (May  1987),  Joe  Marasco 
mentions  that  we  want  to  hear  from 
readers  who  would  like  to  see  a  C 
version  of  the  book  and  source-code 
"recipes.” 

Actually,  preparation  of  Numerical 
Recipes  in  C  is  well  underway.  In 
fact,  we  would  like  to  hear  from  DDJ 
readers  with  an  interest  in  beta-test¬ 
ing  the  C  recipes.  We'll  happily  send 
free  beta  diskettes  to  the  first  hun¬ 
dred  people  who  respond,  plus  an 
additional  number  to  readers  who 


can  describe  (in  a  sentence  or  two) 
their  strong  qualifications. 

We  view  Numerical  Recipes  as  a  co¬ 
operative  project  between  authors 
and  users.  It's  time  to  get  good, 
cheap,  open-source  numerical  soft¬ 
ware  into  the  C  world,  before  the 
vendors  of  proprietary  object-only  li¬ 
braries  get  too  established,  as  was  his¬ 
torically  the  case  in  the  FORTRAN 
world  (much  to  the  grief  of  FORTRAN 
programmers). 

William  Press 

Numerical  Recipes  Software 

P.O.  Box  243 

Cambridge,  MA  02238 

Correction 

Dear  DDJ, 

Thank  you  for  covering  PC-MOS/386 
from  The  Software  Link  in  two  arti¬ 
cles  in  your  July  1987  issue.  Unfortu¬ 
nately,  some  of  the  information  in 
each  article  is  incorrect.  I'd  like  to 
take  this  opportunity  to  bring  it  to 
your  attention. 

In  the  article  "Developing  80386 
Applications  .  .  .  Today,”  the  prices 
given  for  PC-MOS/386  are  wrong.  The 
single-user/multitasking  version  is 
$195,  the  five-user  version  is  $595, 
and  the  twenty-five-user  version  is 
$995. 

In  the  16-Bit  Software  Toolbox  col¬ 
umn  The  Software  Link  is  erroneous¬ 
ly  referred  to  on  page  106  as  "a  com¬ 
pany  previously  known  for  its  copy¬ 
protection  schemes."  The  Software 
Link  has  never  been  involved  in  any 
type of copy protection; we develop
and manufacture multiuser/multitasking
software.

Thank  you  for  the  opportunity  to 
point  out  these  errors  of  fact.  Again, 
we  appreciate  your  editorial  cover¬ 
age.  Our  programming  staff  consid¬ 
ers  Dr.  Dobb's  a  valuable  resource 
and  we  are  grateful  for  your  cover¬ 
age  of  our  products. 

Colleen  G.  Goidel 

The  Software  Link 

3577  Parkway  Ln. 

Atlanta,  GA  30092 


DDJ 




PROGRAMMER'S  SERVICES 


OF  INTEREST 


PS/2  Add  Ons 

Alloy  Computer  Products  has  an¬ 
nounced  three  products  for  the  PS/2 
line:  an  internal  tape  drive  and  two 
adapters  to  connect  its  external  tape 
drives  and  other  products  to  the  PS/2 
machines.  Reader  Service  No.  17. 
Alloy  Computer  Products  Inc. 

100  Pennsylvania  Ave. 

Framingham,  MA  01701 
(617)  875-6100 

Rodime  has  announced  a  hard  disk 
on  a  card  for  the  PS/2  Model  30, 
which  it  claims  is  "the  only  way  for 
Model  30  users  to  get  more  than  20 
megabytes  of  internal  storage."  The 
suggested  retail  price  is  $1,495.  Read¬ 
er  Service  No.  18. 

Rodime  Inc. 

Peripheral  Systems  Division 
29525  Chagrin  Blvd.,  Ste.  214 
Pepper  Pike,  OH  44122 
(216)  765-8414 

CMS  Enhancements  has  exhibited 
external  hard-disk  subsystems  for 
the  PS/2  Model  30  as  well  as  for  the 
entire PS/2 line and for the Macintosh
SE. CMS also has the first 5¼-inch
floppy  drive  for  the  PS/2,  which 
should  help  all  those  early  adopters 
move  their  software  over  to  the  PS/2 
hardware.  Reader  Service  No.  19. 

CMS  Enhancements  Inc. 

1372  Valencia  Ave. 

Tustin,  CA  92680 
(714)  259-9555 

Kodak's  diskette  subsidiary,  Verba¬ 
tim,  has  announced  a  2-megabyte  3.5- 
inch  diskette  (formatted  capacity  1.44 
megabytes).  Reader  Service  No.  20. 
Verbatim  Corp. 

1200  W.T.  Harris  Blvd. 

Charlotte,  NC  28213 
(704)  547-6500 


Development  Software 

Sterling  Castle,  a  new  publisher  of 
PC  software,  has  introduced  the 
Blackstar  C  function  library  designed 
to  support  the  new  ANSI  standard  and 
Microsoft (Versions 3.0/4.0) and Lattice
3.0 C compilers. Reader Service
No.  21. 

Sterling  Castle  Software 
702  Washington  St.,  Ste.  174 
Marina  del  Rey,  CA  90292 
(213)  206-3020 

The  macro  processing  program 
SmartKey  is  evolving  closer  to  a  pro¬ 
gramming  language  as  of  its  new  ver¬ 
sion,  5.2,  which  adds  context  sensitiv¬ 
ity  and  conditional  processing. 
Programmer  Nick  Hammond  wrote 
SmartKey  back  in  1979,  and  it  is  the 
original  macro  processor.  It  costs 
$69.95.  Reader  Service  No.  22. 
Software  Research  Technologies  Inc. 
2130  South  Vermont  Ave. 

Los  Angeles,  CA  90007 
(213)  737-7663 

Alan  Weiner  has  taken  macro  gener¬ 
ation  a  step  further.  He  has  devel¬ 
oped  a  memory-resident  program¬ 
ming  language,  which  he  calls  the 
Weiner  Shell  and  which  generates 
memory-resident  programs.  The 
program  is  written  in  assembly  lan¬ 
guage,  takes  up  less  than  50K,  and 
supports  the  LIM  expanded  memory 
spec,  so  any  program  written  with  it 
has  access  to  up  to  8  megabytes  of  LIM 
memory  as  well.  It  supports  DOS  in¬ 
terrupts,  user  I/O,  arrays,  and  float¬ 
ing-point  math.  The  idea  of  writing 
memory-resident  programs  on  the 
fly  that  play  around  with  DOS  inter¬ 
rupts  is  scary,  but  Alan  claims  the 
product  itself  is  robust.  Its  price  is 
$199.  Reader  Service  No.  23. 

Gryphon  Microproducts 
P.O.  Box  6543 
Silver  Spring,  MD  20906 
(301)  384-6868 

Caro  Research  has  a  C  code  genera¬ 
tor,  developed  by  an  outfit  called 
Chancelogic  PLC,  called  Pro-C  that 
supports  various  ISAM  file  handlers; 
produces  stand-alone  C  programs 
rapidly;  and  requires  no  end-user 
run-time  system,  royalties,  or  license. 
Reader  Service  No.  24. 

Caro  Research  Associates 


202  South  22nd  St. 

Tampa,  FL  33605 
(813)  248-0852 

Books  and  Stuff 

We  don’t  know  why  the  company  is 
called  Rabbit,  but  we  are  compelled 
by  this  nomenclature  to  tell  you  that 
Rabbit  has  announced  publication  of 
its  Portable  C  and  Unix  Programming, 
a  240-page  reference  guide  for  pro¬ 
grammers  writing  applications  in  C 
or  in  the  Unix  environment.  The 
book  is  available  from  Prentice-Hall 
for  $21.95.  Reader  Service  No.  25. 
Rabbit  Software  Corp. 

Great  Valley  Corporate  Center 
Seven  Great  Valley  Parkway  East 
Malvern,  PA  19355 
(215)  647-0440 

AT&T  also  wants  to  educate  you.  It 
has  announced  a  series  of  videotapes 
on  Unix  System  V,  Release  3.  The 
tapes  of  possible  interest  to  DDJ  read¬ 
ers  cover  C  and  command  shell  pro¬ 
gramming  and  can  be  leased  for 
$300-375  a  month  or  purchased  for 
more.  Reader  Service  No.  26. 

AT&T 

Videotape  Library 
(800)  247-1212 

Windows  Applications 

Palantir  Software  has  a  shelf  full  of 
Windows  applications,  including 
word  processing,  spelling  checking, 
scheduling,  report  generation, 
spreadsheeting,  drawing,  graphics 
scanning,  and  communications. 
Reader  Service  No.  27. 

Palantir  Software 
12777  Jones  Rd.,  Ste.  100 
Houston,  TX  77070 
(713)  955-8880 

Among  the  announced  Windows  ap¬ 
plications  at  Comdex  were  two  from 
hDC:  an  applications  organizer 
called  ClickStart,  and  EGA-16,  a  driver 
that doubles the number of displayable
colors under Windows. The idea
behind  ClickStart  seems  to  be  that 
corporate  users  of  Windows  will 
need  password  protection  of  applica¬ 
tions,  turnkey  menu  selections,  and 
restricted  access  to  DOS  functions. 
The  vision  of  a  traditional  DP/MIS  user 
interface  on  top  of  a  point-and-shoot 
graphical  user  interface  on  top  of  the 




venerable  A>  prompt  seemed  a  bit 
much,  but  later  the  same  day  we 
found  another  vendor  doing  like¬ 
wise.  EGA-16  is  $24.95,  and  ClickStart 
is  $79.95.  Reader  Service  No.  28. 
hDC  Computer  Corp. 

8405  165th  Ave.  NE 
Redmond,  WA  98052 
(212)  475-5550 

Modems 

Prices  of  2,400-bps  modems  are  com¬ 
ing  down  a  bit.  Okidata  has  an¬ 
nounced  two  new  2,400-bps  modems 
at  $599  (external)  and  $549  (internal), 
with  automatic  adaptive  equalization 
to  ameliorate  the  problems  of  line 
noise.  Reader  Service  No.  29. 

Okidata 

532  Fellowship  Rd. 

Mount  Laurel,  NJ  08054 
(609)  235-2600 

The  Zoom/Modem  PC  2400  HC  is  an 
internal  300/1,200/2,400-bps  IBM  PC, 
IBM  PC/XT,  IBM  PC/AT,  and  compati¬ 
ble  modem  with  a  suggested  retail 
price  of  $199.  It  supports  Bell  103a, 
212a,  and  CCITT  v.22  bis  protocols  and 
uses  the  Hayes  AT  command  set. 
Reader  Service  No.  30. 

Zoom Telephonics Inc.

207  South  St. 

Boston,  MA  02111 
(617)  423-1072 

Displays 

Jeff  Duntemann,  editor  of  Borland’s 
new  Turbo  Techniques,  often  nags 
software  vendors  about  support  for 
full-screen  displays.  Jeff  owns  a  Ge¬ 
nius  15-inch  full-page  display  and  has 
the  radical  idea  that  software  devel¬ 
opers  should  not  penalize  him  for 
owning  a  good  monitor.  Micro  Dis¬ 
play  Systems  announced  at  Comdex 
a  line  of  19-inch  Genius  full-screen 
displays  using  the  TI  34010  graphics 
coprocessor.  Reader  Service  No.  31. 
Micro  Display  Systems  Inc. 

1310  Vermilion  St. 

P.O.  Box  455 
Hastings,  MN  55033 
(612)  437-2233 

Cornerstone  Technology  has  intro¬ 
duced  the  Vista  1600,  a  $2,195  19-inch 
monitor  for  the  Macintosh  II  that  dis¬ 
plays  1,600  X  1,280-resolution,  the 
Mac's  max.  Its  noninterlaced  screen 


has  a  200-MHz  bandwidth  and  re¬ 
freshes  at  67  Hz.  The  monitor  comes 
with  a  NuBUS  controller  card  that  oc¬ 
cupies  one  slot  in  the  Mac  II. 

Cornerstone  has  also  announced  a 
version  of  the  monitor  for  PCs  and 
386  machines  for  $2,395.  Reader  Ser¬ 
vice  No.  32. 

Cornerstone  Technology 
175A  East  Tasman  Dr. 

San  Jose,  CA  95134-1620 
(408)  433-1600 

Hitachi  has  announced  more  hi-res 
monitors,  including  a  20-inch  1,280  X 
1,024-model  with  RGB  input  and  vari¬ 
ous  scanning  frequencies  for  $3,995 
suggested  retail.  Reader  Service  No. 
33. 

Hitachi  Sales  Corp.  of  America 
401  West  Artesia  Blvd. 

Compton,  CA  90220 
(213)  537-8383 

Amdek  has  announced  a  family  of 
color  and  monochrome  displays 
compatible  with  the  PS/2's  VGA  spec. 
Models  432  (monochrome)  and  732 
(color)  display  640  X  480  (VGA),  640  X 
400  (CGA),  or  640  X  350  (EGA)  pixels 
and  sell  for  a  suggested  retail  price  of 
$245  and  $625,  respectively.  Reader 
Service  No.  34. 

Amdek  Corp. 

1901  Zanker  Rd. 

San  Jose,  CA  95112 
(408)  436-8570 

Tseng  Labs  has  announced  a  new 
VLSI  chip  for  IBM  VGA  graphics,  for 
which  the  company  expects  to  show 
a  prototype  this  month.  The  ET3000 
chip  will  support  all  VGA  graphics  as 
well  as  MGA,  CGA,  and  EGA.  Reader 
Service  No.  35. 

Tseng  Laboratories  Inc. 

10  Pheasant  Run 
Newtown,  PA  18940 
(215)  968-0502 

CAD/CAM/CAE 

Vector  Automation  claims  that  its 
CADMAX  Version  3.0  software  for  386 
workstations,  formerly  available  on 
MicroVAX  II,  is  the  first  CAD  software 
to  take  full  advantage  of  the  386.  The 
price  is  $3,350  for  a  2-D  version.  Read¬ 
er  Service  No.  36. 

Vector  Automation  Inc. 

Village  of  Cross  Keys 


Baltimore,  MD  21210 
(301)  433-4200 

Modgraph’s  Pegasus  is  a  graphics 
card  that  supports  EGA  (or  CGA,  MGA, 
or  Hercules)  and  high-resolution 
graphics  on  one  monitor  of  an  IBM  PC, 
IBM  PC/XT,  IBM  PC/AT,  or  compatible 
computer,  for  CADD  work  and  the 
like.  The  hi-res  support  includes  an 
82786  graphics  engine,  there  is  a  chip 
set  supporting  EGA,  and  the  board  dis¬ 
cerns  which  mode  you  want.  Reader 
Service  No.  37. 

Modgraph 

149  Middlesex  Turnpike 
Burlington,  MA  01803 
(617)  229-4800 

The  Great  Western  Software  Com¬ 
pany  has  a  companion  software 
product  to  AutoCAD  called  Auto- 
Board  System  II  for  use  in  the  produc¬ 
tion  of  printed  circuit  boards.  This  is 
TGWSC’s  third  automatic  routing 
package  for  microcomputers;  it’s 
been  at  it  for  a  while.  Reader  Service 
No.  38. 

The  Great  Western  Software  Co. 

207  West  Hickory  St.,  Ste.  202 
Denton,  TX  76201 
(817)  383-4434 

The  MSA  Group,  developer  of  a  2-D 
CAD  system  for  PCs,  took  the  name  of 
its  product,  TurboCAD,  seriously,  and, 
following  the  Borland  lead,  cut  the 
product's  price  from  $395  to  $99  for 
all  modules.  Reader  Service  No.  39. 
MSA  Group 

12021  Wilshire  Blvd.,  Ste.  370 
West  Los  Angeles,  CA  90025 
(213)  473-8711 

Computervision  has  announced 
Version  3.0  of  its  PC-based  3-D  CAD/ 
CAM  package,  Personal  Designer. 
New  features  include  multiple  views, 
improved  dimensioning  capability, 
and  an  undo  feature.  Reader  Service 
No.  40. 

Computervision  Corp. 

100  Crosby  Dr. 

Bedford,  MA  01730 
(617)  275-1800 

DDJ 




FORUM 


SWAINE'S  FLAMES 


It  seems  that  look  and  feel  may  be 
less  of  a  problem  for  Lotus  Devel¬ 
opment  Corp.  than  compiled  spread¬ 
sheets  are.  Several  products  now 
turn  1-2-3  spreadsheet  data  into  exe¬ 
cutable  programs,  providing  a  legal 
way  to  share  and  manipulate  spread¬ 
sheet  data  while  only  buying  one 
copy  of  the  software.  What  you  can't 
do  with  these  programs  is  design,  or 
alter  the  design  of,  spreadsheets.  In  a 
company  where  spreadsheets  are  de¬ 
signed  by  one  person  and  used  by 
many,  these  products  could  save  a  lot 
of  bucks — bucks  that  Lotus  may  feel 
ought  to  be  flowing  its  way.  Should 
be  interesting. 

Personally,  I  find  it  weird  to  think 
of  1-2-3  as  a  programming 
environment. 

Speaking  of  weird  programming 
environments,  terminate-and-stay- 
resident  implementations  of  BASIC 
and  other  programming  languages 
are  popping  up  all  over  the  place.  If 
you  ask  me,  anyone  who  develops 
software  in  TSR  space  is  playing  with 
fire,  but  that  won't  stop  the  crazed 
code  junkies  from  snorting  this  stuff 
up. 

Well,  you  might  program  in  the 
ether,  but  you  don't  mess  around 
with  the  herald  of  Galactus.  Marvel 
Comics  has  leaned  on  Acius  about  the 
name  Silver  Surfer,  and  the  compa¬ 
ny's  Mac  database  is  now  called  4th 
Dimension.  In  an  unrelated  develop¬ 
ment  in  the  same  paragraph,  Living 
Videotext  recently  moved  to  Easy 
Street  (really),  and  there  were  ru¬ 
mors  that  the  company  was  about  to 
be  bought  by  Symantec.  More  inter¬ 
esting  were  the  hints  that  the  vendor 
of  a  very  successful  windowing  sys¬ 
tem  was  being  pressured  by  a  well- 
known  venture  capitalist  to  sell  out  to 
a  major  operating  system  vendor  for 
the  good  of  the  industry. 

And  you  thought  the  Ollie  North 
Show  was  the  best  soap  opera  in 
town. 


Here,  for  all  you  model  program¬ 
mers,  is  what  you  need  to  do  to  be¬ 
come  a  gatefold  in  a  major  computer 
magazine:  As  a  result  of  your  strate¬ 
gic  partnership  with  the  largest  com¬ 
puter  company,  you  deliver  a  multi¬ 
tasking  operating  system  (retaining 
the  right  to  license  it  to  your  part¬ 
ner’s  competitors),  and  when  your 
partner  lays  out  its  plans  to  add  com¬ 
petitive  value  to  the  product  with 
connectivity  extensions,  you  strike 
another  strategic  partnership  with  a 
leading  LAN  vendor,  intended  to  give 
you  a  piece  of  everybody  else's  an¬ 
swer  to  those  extensions.  Meanwhile 
you  hedge  your  bets  on  the  operating 
system  itself  by  striking  a  strategic 
partnership  with  a  leading  multiuser 
operating  system  vendor,  and  when 
you’ve  got  that  all  sewn  up,  you  con¬ 
vince  programmers  to  pay  you  thou¬ 
sands  of  dollars  to  learn  how  to  devel¬ 
op  the  software  to  run  under  your 
operating  system. 

Cousin  Corbett's  Secrets  of  Software 
Success,  Part  II:  Product  Names. 

The  following  advice  from  my 
cousin  Corbett's  forthcoming  book  is 
reprinted  here  without  comment. 

"Although  there  is  no  formula  for 
successful  product  naming,  there  are 
a  few  simple  rules  that  will  help  you 
avoid  the  most  common  mistakes. 

"Try for unambiguous pronounceability.
Unless you’re selling sexually
explicit  software,  people  ought  to  be 
able  to  go  into  a  store  and  ask  for  your 
product  without  embarrassment. 
Does  Digitalk  expect  people  to  ask  for 
Smalltalk-vee  or  Smalltalk-five? 
(Some  programmers  have  taken  to 


calling it ess-tee-vee.) TeX shouldn’t
be  spelled  like  tecks  if  it's  to  be  pro¬ 
nounced  teck.  Vision  is  pronounced 
vih-zhun,  not  vih-zee-on,  and  any  at¬ 
tempt  to  get  masses  of  people  to  pro¬ 
nounce  it  wrong  will  meet  with 
resistance. 

"By  all  means  use  your  own  name, 
especially  if  it  also  happens  to  have 
other,  favorable  connotations,  as  Nor¬ 
ton  suggests  the  authoritative  Norton 
Anthologies.  Consider  changing  your 
name  to  Webster  or  Roget.  Or 
Knuth.” 

And  this  is  from  Ray  Duncan: 

_i_hael s_aine
editor
dr. do__s _ournal
_i_e _ero one _al_eston dr.
red_ood _it_ _a
nine _our _ero si_ three

dear _i_hael.

_our ,s_aine.s _la_es. _olu_n in
the _ul_ one nine ei_ht se_en dd_
_as _uite thou_ht.
_ro_o_in_. the _on_e_t o_ an
en_r__tion _ethod that _an.t
_e used su__ess_ull_ _ithout
_oth a _a_hine .to en_r__t. and a
hu_an .to de_r__t. _as
_uite ne_ to _e. it o__urs to _e
that _erha_s this rather si__le
t__e o_ en_r__tion _an
_e_o_e the _asis o_ a _o_er_ul
_et eas_ to e__lain .to la__en. test
_or ,understandin_. o_ natural
lan_ua_e __ _a_hines.

sin_erel_ _ours.

ra_ dun_an
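
The letter never states its scheme, but the letters that survive in it appear to be exactly the twelve of the old Linotype sequence ETAOIN SHRDLU, with every other letter replaced by an underscore. On that assumption (an inference from the text above, not something Mr. Duncan confirms), the machine half of the method takes only a few lines of Python. The name `blank_rare_letters` is made up for this sketch, and punctuation is passed through untouched, whereas the letter seems to remap it.

```python
KEPT = set("etaoinshrdlu")   # assumed: the twelve letters left readable


def blank_rare_letters(text):
    """Lowercase the text and replace every letter outside KEPT with '_'."""
    out = []
    for ch in text.lower():
        if ch.isalpha() and ch not in KEPT:
            out.append("_")      # one underscore per hidden letter
        else:
            out.append(ch)       # kept letters, digits, spaces, punctuation
    return "".join(out)


salutation = blank_rare_letters("Michael Swaine")
street = blank_rare_letters("five zero one galveston dr.")
signature = blank_rare_letters("Ray Duncan")
```

There is no decryption routine because none is possible: several words can share one pattern, and choosing among them takes a reader who understands the sentence, which is the point of the letter.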


Michael  Swaine 
editor-in-chief 




#132  OCTOBER  1987 


Dr.Dobb’s  Journal  of 


2.95  (3.95  CANADA) 


Software  Tools 

FOR  THE  PROFESSIONAL  PROGRAMMER 


Stretching  AppleTalk 




Focus  on  Forth: 

Unifying  Dialects 
Faster  Forth 
A  New  Column 




OCTOBER  1987 


CONTENTS 


VOLUME  12,  ISSUE  10 


ARTICLES 


Networking ►  Async AppleTalk  18
by Richard E. Brown and Steve Ligett

Rich and Steve describe a desk accessory they’ve developed to
extend the AppleTalk network via ordinary serial links.

Forth ►  A Fast Forth for the 68000  32
by Lori Chavez

Lori reveals a Forth implementation method that yields
execution speeds approaching those of compiled languages
such as C.

Forth ►  A Forth Standard Prelude  40
by Martin Tracy

Martin provides a set of extensions (a Forth-83 prelude) that
can unify different Forth dialects and reduce project development
time.

Algorithms ►  Pattern Matching Using Finite State Machines  46
by Charles F. Bowman

Charlie simplifies command parsing with an implementation
of the Knuth-Morris-Pratt algorithm.


COLUMNS


C compilers ►  C CHEST  124
by Allen Holub

Allen wades into the new MS-DOS C compilers and comes up
with several winners.

New Forth column ►  THE FORTH COLUMN  132
by Martin Tracy

Martin joins the DDJ family with a column covering the Forth
estate: everything from news to reviews.

Opaque data types ►  STRUCTURED PROGRAMMING  140
by Namir Clement Shammas

Namir discusses data hiding in Ada and Modula-2 and offers a
few ways to add this feature to programs written in BASIC and
Pascal.


FORUM 


EDITORIAL  6 

by  Tyler  Sperry 
RUNNING  LIGHT  8 

by  Tyler  Sperry 
ARCHIVES  8 

LETTERS  12 

by  you 

SWAINE’S  FLAMES  102 

by  Michael  Swaine 


PROGRAMMER'S 

SERVICES 


ADVERTISER  INDEX:  129 

Where  to  go  for  information 
on  products 

OF  INTEREST:  140 

Products  for  programmers 




About  the  Cover 

“Apples  in  space"  is  just  our 
way  of  saying  that  physical 
distance,  be  it  a  hundred  feet 
or  a  few  thousand  miles,  is  no 
longer  an  obstacle  to  linking 
AppleTalk  nodes. 


This  Issue 

Our  annual  Forth  issue  intro¬ 
duces  a  new  bimonthly  Forth 
column  as  well  as  two  ways  to 
increase  Forth  speed:  the  first 
increases  execution  speed  on 
68000  machines;  the  second 
trims  development  time  for  PC 
Forths.  Finally,  our  cover  story 
looks  at  how  two  gentlemen 
from  Dartmouth  have  expanded 
AppleTalk  into  a  "nonlocal" 
area  network. 


Next  Issue 

November's  DDJ  features  a 
graphics  theme,  and  we’re  not 
talking  about  just  another 
pretty  face  here,  folks.  Pro¬ 
gramming  PC  graphics  is  the 
focus,  and  we'll  approach  it 
from  several  different  language 
directions. 


Dr.  Dobb's  Journal,  October  1987 



FORUM 


EDITORIAL 


Lebensraum  and  RAM 


Since  the  beginning  of  the  micro 
revolution,  DDJ  readers  have 
been  plagued  with  memory  short¬ 
ages.  To  be  sure,  our  perception  of 
what constitutes “enough” memory
has  been  subject  to  almost  constant 
revision.  In  the  heyday  of  the  S-100 
bus,  the  introduction  of  16K  dy¬ 
namic  RAM  cards  was  an  exciting 
development.  Now  just  about  every¬ 
one  in  the  PC  realm  is  complaining 
that 640K of RAM simply isn’t
enough room for all their software.
Deja vu.

A  few  years  ago,  the  folks  at  Lotus 
and  Intel  got  together  and  came  up 
with  a  way  of  expanding  the  amount 
of  memory  in  a  PC  beyond  the  640K 
barrier.  Before  long,  Microsoft  had 
gotten  into  the  act,  and  we  had  a 
formal  spec:  the  Lotus/Intel/Micro¬ 
soft  Expanded  Memory  Specification 
(EMS).  The  EMS  was  admittedly  a 
kludge,  but  bank-switched  RAM  was 
better  than  no  RAM  at  all. 

Given  the  nature  of  the  market,  it 
was  inevitable  that  some  people 
wouldn’t  be  all  that  thrilled  with 
some  of  the  limitations  of  the  LIM 
version.  Before  long  an  AST/Quad- 
ram/Ashton-Tate  coalition  produced 
the  Enhanced  Expanded  Memory 
Specification,  a  superset  of  the  origi¬ 
nal  plan. 

Mercifully,  developers  weren’t 
asked  to  deal  with  any  more  ver¬ 
sions  than  these  two. 

Of  course,  not  everyone  was 
happy  with  the  limits  of  the  EMS 
and  the  EEMS.  There  were  still  some 
rough  edges  in  the  standards,  some 
confusion  in  the  minds  of  program¬ 
mers  as  to  which  standard  to  write 
for,  and  that  annoying  upper  limit 
of  eight  megabytes.  And  with  the 
growing  realization  that  IBM’s  PS/2 
line  isn’t  going  to  instantly  make  all 
the  older  DOS  machines  obsolete, 
hardware and software that maximize
the performance of DOS machines have
begun to take on a new importance.

And this is where we have good
news to report. Recently the folks at
Lotus,  Intel,  and  Microsoft  an¬ 
nounced  a  new  version  of  the  EMS. 
Rather  than  yet  another  departure 
in standards, Version 4.0 of the Expanded
Memory Specification is a superset
of both previous standards
and  the  result  of  cooperation  be¬ 
tween  the  vendors. 

Representatives  from  other  com¬ 
panies  were  present  at  the  press 
conference  announcing  the  new  ver¬ 
sion.  People  from  AST  Research 
were  there,  as  you  might  expect,  but 
other  companies,  such  as  Ashton¬ 
Tate,  Borland,  and  Symantec,  were 
also  represented.  It  was  particularly 
gratifying  to  see  Terry  Myers  of 
Quarterdeck  Systems  announcing 
products  that  would  take  advantage 
of  the  new  standard  at  the  same  time 
the  standard  was  announced.  (At 
long  last,  the  EMS  will  support  multi¬ 
ple  processes.) 

Although  it  may  seem  at  times 
that  we  here  at  DDJ  can  only  flame 
when  we  hear  names  such  as  Lotus, 
Intel,  and  Microsoft,  that’s  not  really 
the  case.  We’d  like  to  take  this  op¬ 
portunity  to  commend  the  folks  at 
AST,  Intel,  Lotus,  Microsoft,  and  the 
other  companies  involved  for  coop¬ 
erating  in  the  new  EMS  standard.  In 
particular,  Microsoft  deserves  praise 
for  making  good  on  its  pledge  not 
to  abandon  the  millions  of  PC  users 
who  aren’t  going  to  abandon  DOS 
for  OS/2.  The  older  machines  will 
continue  to  need  new,  innovative 
software,  and  this  cooperative  EMS 
effort  will  ease  the  burdens  of  both 
users  and  programmers.  Indeed,  the 
only  question  I  had  after  the  press 
conference  was  “Why  only  32  mega- 


Dr. Dobb's Journal of

Software Tools

FOR  THE  PROFESSIONAL  PROGRAMMER 


Editorial 

Editor-in-Chief  Michael  Swaine 
Editor  Tyler  Sperry 
Managing  Editor  Vince  Leone 
Associate  Editor  Ron  Copeland 
Assistant  Editor  Sara  Noah  Ruddy 
Technical  Editors  Allen  Holub 

Richard  Relph 

Contributing  Editors  Namir  Shammas 
Ernest  R.  Tello 

Copy  Editor  Rhoda  Simmons 

Production 

Production  Manager  Bob  Wynne 

Art  Director  Michael  Hollister 
Assoc. Art Director Joe Sikoryak
Technical  Illustrator  Frank  Pollifrone 
Typesetter  Mary  Lopez 
Cover  Photographer  Michael  Carr 
Circulation 

Circulation  Director  Maureen  Kaminski 
Fulfillment  Coordinator  Francesca  Martin 
Book  Marketing  Mgr.  Jane  Sharninghouse 
Subscription  Supervisor  Kathleen  Shay 
Newsstand  Sales  Larry  Hupman 
Administration 
Finance  Director  Kate  Wheat 
Business  Manager  Betty  Trickett 
Accounts Payable Supv. Mayda Lopez-Quintana
Accts.  Receivable  Supv.  Laura  DiLazzaro 
Advertising  Director 
Ferris  Ferdon  (415)  366-3600 
Marketing  Mgr.  Michael  Wiener 
Trafficking  Coordinator  Donna  Rogers 
Account  Managers  see  page  129 
Associate  Publisher 
Michael  Swaine 
Assistant  Sara  Noah  Ruddy 


Dr. Dobb’s Journal of Software Tools (USPS 307690)
is published monthly by M&T Publishing Inc., 501
Galveston Dr., Redwood City, CA 94063; (415) 366-3600.
Second-class  postage  paid  at  Redwood  City  and  at 
additional  entry  points.  DDJ  is  published  under  license 
from  People's  Computer  Company,  2682  Bishop  Dr., 
Suite  107,  San  Ramon,  CA  94583,  a  nonprofit  corpora¬ 
tion. 

Article  Submissions:  Send  manuscripts  and  disk  (with 
article  and  listings)  to  the  Associate  Editor. 

DDJ  on  CompuServe:  Type  GO  DDJ 
Address  Correction  Request:  Postmaster:  Send  Form 
3579  to  Dr.  Dobbs  Journal,  P.O.  Box  27809,  San  Diego, 
CA  92128.  ISSN  088-3076 

Customer  Service:  For  subscription  problems  call: 
outside CA (800) 321-3333; in CA (619) 485-9623 or
566-6947. For book/software order problems call (415)
366-3600. 

Subscriptions: $29.97 per 1 year; $56.97 for 2 years.
Canada and Mexico add $27 per year airmail or $10 per
year surface. All other countries add $27 per year
airmail.  Foreign  subscriptions  must  be  prepaid  in  U.S. 
funds  drawn  on  a  U.S.  bank.  For  foreign  subscriptions, 
TELEX:  752-351. 

Foreign  Newsstand  Distributor:  Worldwide  Media 
Service  Inc.,  386  Park  Ave.  South,  New  York,  NY  10016; 
(212) 686-1520 TELEX 620430 (WUI).

Entire contents copyright © 1987 by M&T
Publishing,  Inc.,  unless  otherwise  noted 
on  specific  articles.  All  rights  reserved. 


M&T  Publishing  Inc. 

Chairman  of  the  Board  Otmar  Weber 
Director  C.F.  von  Quadt 
President and Publisher Laird Foshay




FORUM 


RUNNING  LIGHT 


In  my  first  column  I 
confessed  to  not 
being  much  of  a  lan¬ 
guage  zealot.  Though 
I  do  enjoy  playing 
around  with  Forth,  I 
don’t  have  the  reli¬ 
gious  fervor  that 
seems  to  afflict  so 
many  Forth  enthusi¬ 
asts.  I  will,  however, 
also  confess  to  not 
being  able  to  deny  myself  the  pleas¬ 
ure  of  baiting  C  enthusiasts  about 
how  wonderful  Forth  is  so  that  I  can 
watch  their  faces  turn  all  those 
pretty  colors.  (Forthians  aren't  the 
only  people  who  can  be  fanatic 
about  their  programming  language.) 
I  will  even,  on  occasion,  dirty  my 
hands  with  a  program  written  in 
Turbo  Pascal  or  QuickBASIC. 

From  what  I’ve  learned  in  talking 
with  DDJ  readers,  this  isn’t  all  that 
unusual.  Most  of  you  seem  to  be 
rather  fond  of  programming  in  C  on 
MS-DOS  machines,  but  that  isn’t  all 
that  surprising  given  the  current 
popularity  of  C  and  DDJ’ s  history  of 
exploring  C  compilers  from  the 
inside  out.  In  fact,  one  of  the  distin¬ 
guishing  characteristics  of  a  long¬ 
time  DDJ  reader  is  familiarity  with  a 
variety  of  languages. 

It  is  with  this  in  mind  that  we  put 
together  every  issue  of  DDJ.  Thus, 
you’ll  find  this  issue — our  "annual 
Forth  issue" — not  completely  over¬ 
run  with  articles  written  about 
Forth. True, there are a couple of rather
good  Forth  articles,  and  we’re  intro¬ 
ducing  a  new  Forth  column,  but 
there  is  plenty  more  here  as  well 
— and  our  coverage  of  Forth  doesn’t 
begin  and  end  with  this  issue. 

Because  I’m  in  a  confessing  mood, 
I’ll  unpack  another:  I  am  a  man 
without  a  number.  I  am  neither  a 
sixer  nor  an  eighter.  I  learned  to 
program  in  machine  language  on 
the  RCA  1802  (I  learned  enough  to 
know better; please don't send in
any 1802 articles) and
moved  through  CP/M 
to  MS-DOS  along  with 
the  rest  of  you.  I’ve 
seen  enough  ma¬ 
chines  and  nifty  soft¬ 
ware  to  realize  that 
short  of  having  a  base¬ 
ment  with  a  Cray  and 
about  a  dozen  other 
computers  hooked  up 
to  a  magic  terminal, 
there’s  simply  no  way  I’ll  ever  be 
satisfied  with  a  single  machine. 

Which  brings  us,  in  a  not  entirely 
circuitous  manner,  to  this  month’s 
lead  article.  Rather  than  putting  off 
Rich Brown and Steve Ligett’s piece
on  extending  the  AppleTalk  network 
until  January's  "annual  68K  issue,” 
we’re  printing  it  now  for  those  of 
you  who  (like  me)  don’t  want  to  wait 
for  a  good  article.  As  you  might 
expect  from  my  comment  on  num¬ 
bering,  this  is  hardly  the  last  Mac 
article  you’ll  be  seeing  in  DDJ.  In  the 
future,  who  knows?  A  column  on 
Mac  topics?  We’ll  see. 

One  last  thing.  No  Running  Light 
would  be  complete  without  solicit¬ 
ing  reader  feedback  and  articles. 
What  are  you  interested  in  reading 
in  1988?  We’re  looking  for  more  arti¬ 
cles  on  debugging,  object-oriented 
programming,  and  real-time  program¬ 
ming  for  the  first  quarter.  Call  me  at 
(415)  366-3600  or  catch  me  on  line 
and  tell  me  what  you’d  like  to  see 
in  the  magazine. 


editor 


ARCHIVES 


Ten  Years  Ago  In  DDJ 

“When  faced  with  the  choice  of  paying 
$100  or  more  for  a  piece  of  commercial 
software  without  source  code,  or  a  freebie 
with  source  code,  most  people  will  try 
the  freebie  first.  If  it  doesn't  work  out, 
they  can  always  buy  the  commercial 
product. 

"Personally,  I  wouldn't  want  a  piece  of 
sourceless  software  as  a  gift,  let  alone  pay 
money  for  it.  Sooner  or  later,  I  want  to 
make  modifications.  Without  the  source 
code  this  can  be  a  real  hassle.  It  can  be  a 
hassle  even  with  the  source  code  and  full 
documentation. . . . 

"By  refusing  to  sell  source  code, 
vendors  are  cutting  down  their  sales 
potential.  The  practice  can’t  prevent 
anyone  from  making  a  tape  copy  for  his 
buddy  down  the  street,  nor  can  it  keep 
anyone  from  using  a  disassembler  to  see 
what  makes  a  program  tick.  Yet  some 
vendors  are  so  secretive  they  won’t  even 
sell  you  a  Basic  manual  unless  you  buy 
the  whole  package  and  sign  a  non¬ 
disclosure  agreement.  Who  needs  that 
kind  of  trip?"  Jim  Day,  letter  to  the  editor, 
DDJ,  October  1977. 

Elegant  Efficiency 

"Computer  programming  is  a  form  of 
art,  far  from  being  a  discipline  of  science 
or  engineering.  For  any  specified  pro¬ 
gramming  problem,  there  are  an  infinite 
number  of  solutions,  entirely  dependent 
upon  the  programmer  as  an  artisan.  We 
can,  however,  rate  a  solution  by  its 
correctness,  its  memory  requirement,  its 
execution  speed,  and  other  qualities. 

"For  some  applications,  best  is  what¬ 
ever  is  shortest  and  fastest.  The  only  way 
to  achieve  this  goal  is  to  use  the 
computer  with  an  instruction  set  opti¬ 
mized  for  the  problem.  Optimization  of 
the  computer  hardware  is  clearly  imprac¬ 
tical  because  of  the  excessive  costs. 

“Thus,  one  would  have  to  compromise 
by  using  a  fixed,  general  purpose 
instruction  set  offered  by  a  real  computer 
or  a  language  compiler.  To  solve  a 
problem  with  a  fixed  instruction  set,  one 
has  to  write  programs  to  circumvent  the 
shortcomings  of  the  instruction  set. 

“The  solution  in  FORTH  is  not  achieved 
by  writing  programs  but  by  creating  a 
new  instruction  set  in  the  FORTH  virtual 
computer.  The  new  instruction  set  in 
essence  becomes  'the'  solution  to  the 
programming  problem.”  C.  H.  Ting, 
"Formal  Definition  of  FORTH,"  DDJ, 
February  1982. 


Dr. Dobb's Journal of Computer Calisthenics & Orthodontia

Running Light Without Overbyte




FORUM 


LETTERS 


A-to-D Conversion

Dear  DDJ, 

I  have  some  comments  on  John  Mus- 
selman's  article  on  page  22  of  the 
June  issue.  All  but  the  first  are  Com¬ 
plaining  Songs,  but  remember  the 
first  is  first  because  it's  important.  I 
would  not  complain  about  details  if  I 
didn't  like  what  the  author  was 
doing. 

1.  The  circuit  in  Figure  1  and  the  in¬ 
sight  that  it  suffices  to  digitize  volt¬ 
ages  are  delightful.  I  can  only  gasp 
''Nifty!”  and  "Why  didn't  I  think  of 
that?” 

2.  The  proof  of  a  pudding,  however, 
is  in  the  eating.  I  want  to  see  calibra¬ 
tion  data  before  I  believe  it's  so  easy. 
Some  of  the  problems  I  antici¬ 
pate  are  saturation  effects 
when  the  unknown  voltage 
gets  close  to  the  limits  (for  ex¬ 
ample,  0  or  +5  volts)  and  in¬ 
terrupt  latency  jitter.  I  don't 
know  how  your  TMS7000 
works,  but  the  specs  for  my 
Z80A  seem  to  say  that  the  de¬ 
lay  from  the  interrupt  signal 
to  start  of  the  interrupt  rou¬ 
tine  can  vary  by  5.8  microsec¬ 
onds,  depending  on  what  the 
CPU  is  doing  when  the  inter¬ 
rupt  comes.  I  guess  it  could  be 
even  longer  if  other  inter¬ 
rupts  were  active.  The  aver¬ 
aging  process  would  tend  to 
wash  out  any  effect,  but  I 
would  try  some  worst-case 
tests  to  see  if  it  causes  any 
trouble. 

3.  Whenever  the  input  volt¬ 
ages  at  the  comparator  are  not 
(nearly)  equal,  the  Out  bit 
stream will be solid 1s or 0s
until the voltages become
equal. These bits don't belong in your
average,  so  you  must — if  there  is  any 
chance  that  an  average  will  be  asked 
for  before  comparator  voltages  get 
equalized — delay  accumulation  of 
the average until at least one transition
0<==>1 comes down the bit
stream.  I  don't  see  this  safety  feature 
in  the  software,  but  it  would  be  need¬ 
ed,  for  example,  for  a  time  multiplex 
application,  in  which  a  sequence  of 
voltages  was  measured  one  after  the 
other.  When  John  Musselman  pro¬ 
poses  a  digitizer  using  only  one  I/O 
bit,  I  am  impressed,  but  when  I  see  it 
needs  more  than  a  second  per  chan¬ 
nel  in  time  multiplex,  I  start  to  nod 
off. 

Why  not  go  at  it  like  this?  Let  the 
time  constant,  RC,  be  much  longer 
than  the  period,  P,  between  inter¬ 
rupts  (for  example,  1  millisecond).  As¬ 
sume  you  know  the  voltage,  V(T),  on 
the  capacitor  at  time  T,  when  an 
Out=1 period begins. By Ohm's law
the current into C is I(T) = (H - V(T))/R,
where H is the logic 1 voltage. Because
P << RC, the total charge flow
during this interrupt period is very
nearly P(H - V(T))/R, and the voltage
at T+P is, to first order in P,

V(T+P) = V(T) + P(H - V(T))/(RC) = V(T)(1 - P/(RC)) + HP/(RC).

If, instead, a 0 is sent out,

V(T+P) = V(T)(1 - P/(RC)) + LP/(RC),

where L is the logic 0 level.

In  words,  the  voltage  on  C  at  the 
end  of  a  period  is  the  average  of  the 
voltage  at  the  start  of  the  period, 
weighted by 1 - P/(RC), and the voltage
Out, weighted by P/(RC). Repeating
the procedure, you can calculate
V(T+2P)  and  so  on. 

The “old” voltage is always multiplied
by (1 - P/(RC)). This is a “forgetting
factor” because it is close to, but
rigorously less than, 1. (The
arithmetic  needs  some  thought  lest 
truncation  error  obscure  this  fact.)  So, 
even  if  you  use  a  wrong  number  for 
V(T),  the  error  gets  squashed  down 
by  one  forgetting  factor  at  each  inter¬ 
rupt;  V(T)  and  the  actual  voltage  on  C 
get  ever  closer.  But  you  don't  need  to 
guess  V(T):  initialize  Out  to  logic  0  be¬ 
fore  loading  the  program,  and  start 
interrupts  at  T=0  with  the  (excellent) 
assumption  that  V(0)=L. 
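
The recursion described above can be tried out numerically. The following is only a sketch under stated assumptions, not code from the original article or letter: the function names, the 5-volt and 0-volt logic levels, and the value P/(RC) = 0.01 are all illustrative, and `measure` assumes that the comparator output is fed straight back as the next Out bit.

```python
def track_voltage(bits, p_over_rc, high=5.0, low=0.0, v0=None):
    """Apply V(T+P) = V(T)(1 - P/(RC)) + Out*P/(RC) once per Out bit."""
    v = low if v0 is None else v0      # the letter's choice: V(0) = L
    history = []
    for bit in bits:
        out = high if bit else low     # logic level driven during this period
        v = v * (1.0 - p_over_rc) + out * p_over_rc
        history.append(v)
    return history


def measure(unknown, p_over_rc=0.01, high=5.0, low=0.0, max_periods=10000):
    """Return the tracked V(T) at the first 0<==>1 transition, or None."""
    v = low
    previous_bit = None
    for _ in range(max_periods):
        bit = 1 if v < unknown else 0  # assumed comparator feedback
        if previous_bit is not None and bit != previous_bit:
            return v                   # crossover: V(T) is within one step
        out = high if bit else low
        v = v * (1.0 - p_over_rc) + out * p_over_rc
        previous_bit = bit
    return None                        # no transition seen (saturation)
```

Running `track_voltage` twice on the same bit stream, once with a deliberately wrong `v0`, shows the initial error shrinking by one forgetting factor per period. `measure` returns `None` when the unknown voltage sits at a logic level, which is the saturation case raised in point 2.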

Meanwhile  it’s  still  true  that  a  bit 
stream transition 0<==>1 marks a
voltage  crossover  at  the  comparator 
inputs,  so  the  current  value  of  V(T)  is 
already  a  good  estimate  of  the  un¬ 
known  voltage;  nothing  is  gained  by 
averaging  for  another  second. 

Thus,  in  a  given  setup,  John 
Musselman's method (usually
called  the  boxcar  average)  and 
the  forgetting-factor  average 
converge  to  the  same  result 
on  a  DC  voltage.  But  when  the 
boxcar  algorithm  is  ready  to 
start  measuring,  the  forget¬ 
ting-factor  algorithm  already 
has  the  answer. 

The  pudding  is  still  pie  in 
the  sky.  I  will  not  be  able  to 
run  the  necessary  tests  soon, 
but  I  hope  that  someone  else 
will  do  so  and  send  me  the 
conclusions. 

R.  W.  Hartung 
408  Orchard 
East  Lansing,  MI  48823 

Curses 

Dear  DDJ, 

Thanks  to  Allen  Holub  for  his 
article on curses [C Chest, July
1987]. Last year I, too, needed
a curses package to assist in
porting code between my PC and
Unix,  so  I  ended  up  writing  my  own. 
Since  then,  my  package  has  been  ex¬ 
panded  significantly  (it  now  provides 
almost  full  Unix  System  5  compatibil¬ 
ity).  After  receiving  several  requests 
for  it,  I  decided  to  distribute  my  pack¬ 
age  (now  called  PC  Curses)  as 
shareware.  The  package  is  compati¬ 
ble  with  Microsoft  C  4.0  and  includes 
small-  and  large-model  libraries.  If 
any  DDJ  readers  are  interested  in  ob¬ 
taining  a  copy,  they  can  do  so  by 
sending me one 5¼-inch floppy along
with  a  mailer,  return  label,  and  post¬ 
age  (or  $5).  Readers  outside  the  U.S. 
should  write  for  information. 

Just a few comments/clarifications
about  the  article.  After  complaining 
about  the  missing  vsprintf  function, 
Holub  may  be  happy  to  hear  that 
vsprintf  is  beginning  to  appear  on 
various  Unix  systems  (I  know  that 
Unix  System  5  has  it).  His  comment 
about using nl (instead of nonl) seems
logical,  but  Unix  curses  encourages 
the  latter;  moreover,  some  imple¬ 
mentations  of  Unix  curses  fail  to 
work properly unless nonl is specified.
The comment that “... the
refresh() command only works under
the real curses if all windows are subwindows
of stdscr” is incorrect if it is
referring to subwindows created by
subwin(); I suggest that readers forget
they  ever  read  that  sentence.  Also, 
Holub  made  several  other  comments 
about  subwindows  that  are  incorrect 
or  confusing,  and  I  suggest  that  any¬ 
one  using  other  versions  of  curses 
consult  the  documentation  (or  code) 
for  more  information. 

Jeff  Dean 

710  Chimalus 

Palo  Alto,  CA  94306 

Educatin’ 

Programmers 

Dear  DDJ, 

I  wish  to  comment  on  the  letter 
from  Neil  Pignatano  published  in 
your  July  1987  issue. 

I  agree  with  Mr.  Pignatano's  asser¬ 
tion  that  mathematical  reasoning  is 
the  best  paradigm  for  passing  on  crit¬ 
ical  analytical  skills  to  students  of 
computer programming and is certainly
superior in this regard to classical
languages, as suggested by Allen
Holub in his Viewpoint of April 1987.

I  most  stringently  disagree,  howev¬ 
er,  with  the  point  Mr.  Pignatano 
makes  next — that  there  are  enough 
"text  editors,  database  managers,  and 
so  on."  I  believe  that  the  era  of  truly 
productive  productivity  programs  is 
just  beginning  to  emerge  and  that  it  is 
precisely  this  field  that  will  offer  the 
most  challenges  and  opportunities 
for  programmers  in  the  future. 

Mr.  Pignatano's  views  on  "elec¬ 
tronic  desk"  software  are  well 
known  (to  his  coworkers),  but  I  feel 
that  he  is  missing  out  on  the  possibili¬ 
ties  that  are  continually  offered  by 
new  developments  in  hardware 
technologies;  Neil  may  be  content  us¬ 
ing  TECO  for  his  text  editor,  but  I  cer¬ 
tainly  am  not! 

Mr.  Pignatano,  as  a  physicist  and  a 
scientific  programmer  by  profession, 
perhaps  does  not  appreciate  the  de¬ 
gree  to  which  difficulty  of  use  and 
obtrusiveness (I might say “pig-headedness”)
of most currently available
software tools inhibit the application
of computer solutions to many
tasks  in  business,  education,  and  even 
technical  environments.  Unless  text 
editors,  database  managers,  and 
spreadsheet  software  begin  to  re¬ 
spond  in  consistent  and  predictable 
(to  the  new  and  occasional  user) 
ways,  microcomputers  will  fail  to 
reach  a  significant  fraction  of  their 
potential  markets. 

Naturally,  this  belief  of  mine  leads 
me  to  conclude  differently  from  Mr. 
Pignatano  on  the  place  of  scientific 
programming  in  the  education  of 
programmers.  Rather  than  the 
strong  emphasis  that  Mr.  Pignatano 
would  have  it  given,  I  feel  that  the 
overhead  intrinsic  in  the  explanation 
of  scientific  problems  for  computer 
solution makes them more appropriate
for only occasional use in computer
science curricula. I maintain, rather,
that scientists of all sorts should be
given  a  strong  background  in  com¬ 
puter  programming  techniques  that 
they  will  be  able  to  apply  to  the  en¬ 
deavors  of  their  own  disciplines — a 
point  on  which  I  sense  that  Neil  Pig¬ 
natano  and  I  can  agree. 

Martin  Veneroso 

1646  Latham,  #10 

Mountain  View,  CA  94041 


VM/RUN  Update 

Dear  DDJ, 

Your  readers  may  be  interested  in 
the  following  updates  to  Richard 
Relph's  review  of  VM/RUN  from 
Softguard  Systems  [see  the  July  1987 
issue].  The  February  version,  which 
he  reviewed,  has  been  superseded  by 
Release  1.10,  which  is  now  available. 

The  7  to  15  percent  penalty  in  exe¬ 
cution  time  has  been  totally  eliminat¬ 
ed.  The  problem  was  caused  entirely 
by  diagnostic  code  that  kept  the 
"global  exact”  flag  on  at  all  times. 
This  resulted  in  instruction  fetch  deg¬ 
radation.  Subsequent  benchmarks 
(with  test  problems  provided  by  your 
reviewer)  have  shown  comparable 
results  between  the  VM/RUN  envi¬ 
ronment,  which  runs  at  CPL  3,  and 
Phar  Lap’s,  which  runs  at  CPL  0. 

Full  directory  path  support  is  now 
available.  Support  files  no  longer 
have  to  be  in  the  current  directory, 
and  the  number  of  files  has  been 
reduced. 

Significant  improvements  have 
been  made  in  application  load  times. 
Large-scale  applications  of  %  mega¬ 
byte  in  size  are  typically  loaded  in 
about  5  seconds  on  a  Compaq  Desk- 
pro  386. 

Finally,  let  me  say  that,  with  the 
addition  of  a  symbolic  debugger  and 
new  features,  such  as  the  capability 
of  calling  8086  code  from  a  386  pro¬ 
gram  and  vice  versa,  the  VM/RUN  en¬ 
vironment  offers  software  develop¬ 
ers  a  powerful  tool  for  the  creation  of 
386  applications. 

Ken  Williams 

Softguard  Systems  Inc. 

2840  San  Tomas  Expy.,  Ste.  201 

Santa  Clara,  CA  95051 

The  Problem  Is 
Architecture 

Dear  DDJ, 

Frank  Albe’s  letter  in  Ray  Duncan's 
July  1987 16-Bit  Software  Toolbox  col¬ 
umn  doesn’t  get  to  the  point  about 
the  computer  language  controversy. 
The  problem  is  not  with  the  lan¬ 
guages  but  with  the  computer  archi¬ 
tectures  that  (do  not)  support  them. 

Computers  are  integer  manipula¬ 
tors  with  simple  decision-making  ca¬ 
pabilities.  But  the  world  is  not  com- 
(continued  on  page  144) 




ARTICLES 


Async  AppleTalk 


For most users AppleTalk appears to be
nothing  more  than  an 
expensive  cable.  Actually,  it’s 
much  more.  It's  a  complete 
family  of  data  communica¬ 
tion  protocols  designed  to  al¬ 
low  Macintoshes  and  other 
computers  to  share  periph¬ 
erals  and  resources.  Although  the  initial  AppleTalk  service 
provided  a  printing  connection  between  a  Macintosh  and 
an  Apple  LaserWriter,  these  days  there  are  literally  dozens 
of  products,  including  other  high-quality  output  devices 
and  disk  or  file  servers,  that  take  advantage  of  this  230.4- 
kbps  link  between  devices. 

Async  AppleTalk,  a  software  application  developed  at 
Dartmouth  College,  was  implemented  to  expand  the  po¬ 
tential  of  this  network  by  allowing  AppleTalk  devices  to  be 
connected  over  an  asynchronous  (RS-232)  link.  The  soft¬ 
ware  provides  low-cost,  remote  access  to  AppleTalk  de¬ 
vices  and  works  with  nearly  all  AppleTalk  applications. 

The  first  step  in  designing  Async  AppleTalk  was  to  de¬ 
fine  the  necessary  protocols.  A  protocol  is  a  set  of  rules  that 
two  or  more  parties  (human  or  computer)  use  to  conduct 
orderly  discourse.  A  protocol  might,  for  example,  involve 
two  computers  negotiating  to  send  a  file  between  them. 
The  conversation  could  be  as  simple  as  that  illustrated  in 
Figure  1,  page  20.  A  real  protocol  would  have  to  specify 
what would happen if computer B couldn’t take a file that
big, or if it already had a file
named FRED, or if it wasn’t
saved correctly at the end.
But you get the picture: a
few more rules could make
the scheme watertight.

Richard E. Brown and Steve Ligett, Kiewit Computer Ctr.,
Dartmouth College, Hanover, NH 03755. Rich is manager of
special projects at Dartmouth. He has been writing communications
software for seven years and is a member of the
IEEE and the ACM. Steve is an engineer in the special projects
group of the computing services department at Dartmouth.
He writes software for personal computers and designs
and builds hardware for Dartmouth's local-area
network.

If  you  were  designing  the 
program  to  implement  this 
protocol,  you  would  like  the 
code  to  be  about  as  simple  as  the  problem  definition.  The 
top-level  code  should  implement  the  rules  mentioned 
earlier:  it  shouldn't  have  to  worry  about  each  little  mis¬ 
hap  that  might  occur  anywhere  in  the  transmission 
chain.  Consequently,  you  would  design  this  simple  file 
transfer  protocol  as  a  network  layer  and  delegate  to  an¬ 
other  layer  the  responsibility  for  detecting  errors,  re¬ 
transmitting  corrupted  messages,  or  sending  messages 
around  a  network. 

Network  layering  brings  to  data  communications  the 
simplicity that structured programming brings to software.
Using layers, designers can isolate issues and design
solutions  for  each  of  the  separated  areas — for  example, 
code  that  detects  transmission  errors  solves  a  very  differ¬ 
ent  problem  from  code  that  implements  the  file  transfer 
protocol  discussed  earlier. 

Layering  also  allows  substitution  of  equivalent  services 
by  a  different  implementation.  The  link  access  protocol 
(LAP),  for  example,  is  generally  responsible  for  sending 
sequences  of  bytes  (frames)  on  a  wire.  Different  LAP  soft¬ 
ware  and/or  hardware  can  use  different  kinds  of  wire. 
The  standard  AppleTalk  Link  Access  Protocol  (ALAP)  lay¬ 
er  uses  twisted-pair  wire.  Async  AppleTalk  replaces  that 
wire  with  a  modem  or  RS-232  link.  Another  LAP  replace¬ 
ment  is  an  AppleTalk  to  Ethernet  converter — it  uses  the 
10-Mbps  link  to  deliver  AppleTalk  datagrams.  All  three 
LAP  layers  have  their  place;  none  of  them  could  have 
been  done  if  the  highest-level  code  had  to  deal  with  all  the 
successive  details  of  data  transmission. 
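
To make the substitution idea concrete, here is a minimal sketch in which two interchangeable link layers sit behind one `send_frame` call. Everything in it is an assumption for illustration: the class names are hypothetical, and the two-byte length prefix on the serial link is a placeholder, not the framing Async AppleTalk actually uses.

```python
class TwistedPairLink:
    """Stand-in for the standard ALAP wire (hypothetical class)."""

    def __init__(self):
        self.frames = []

    def send_frame(self, frame: bytes) -> None:
        self.frames.append(frame)      # the bus delivers whole frames


class AsyncSerialLink:
    """Stand-in for an RS-232 or modem link (hypothetical class)."""

    def __init__(self):
        self.byte_stream = bytearray()

    def send_frame(self, frame: bytes) -> None:
        # A serial line is a plain byte stream, so this sketch marks each
        # frame with an assumed two-byte length prefix.
        self.byte_stream += len(frame).to_bytes(2, "big") + frame


def send_datagram(link, datagram: bytes) -> None:
    """Higher-layer code: it never asks what kind of wire is underneath."""
    link.send_frame(datagram)
```

`send_datagram` works unchanged with either link object, which is the property the paragraph above attributes to layering.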

by Richard E. Brown and Steve Ligett

The design and
implementation of an
RS-232 enhancement
to the AppleTalk
protocols

Each network layer provides a well-defined service to a
higher-level client. The client then communicates with a
peer  across  the  network.  The  effect  is  that  peers  appear  to 
be  talking  directly  to  each  other,  as  if  the  lower  layers 
weren’t  there  at  all.  Each  layer  specifies  the  means  of 
communicating  with  its  peer — the  protocol — and  the  ser¬ 
vice  it  will  provide  to  its  clients. 

Peers  communicate  by  asking  a  lower  layer  to  perform 
a  service  for  them.  The  lower  layer  can,  in  turn,  invoke 
still  lower  layers  to  do  more  work.  The  lowest  layer — the 
physical  layer — is  the  electronics  that  sends  and  receives 
the  bits  across  a  link.  At  the  other  end,  each  layer,  starting 
at  the  lowest,  passes  the  received  data  up  to  its  client  until 
the  messages  reach  the  peer  of  the  originator. 

Another  concept  in  networking  is  the  onion-skin  prin¬ 
ciple.  When  a  layer  has  data  to  communicate  with  its 
peer,  it  passes  the  data  to  the  next  lower  layer.  That  layer 
adds  its  own  information  (called  a  wrapper)  and  passes 
the message to the next lower layer, and so on. When the
message  gets  to  the  bottom  (physical)  layer,  it  is  actually 
transmitted.  Each  peer  at  the  other  end  verifies  any  infor¬ 
mation  contained  in  the  header,  removes  the  correspond¬ 
ing  wrapper,  and  passes  the  resulting  data  up  to  its  cli¬ 
ent — thus,  the  analogy  to  the  layers  of  an  onion.  In  the 
earlier  file  transfer  example,  the  file  transfer  layer  might 
send  data  to  a  layer  that  guarantees  delivery  of  the  infor¬ 
mation  it  was  given.  This  layer  could  wrap  routing  infor¬ 
mation  around  the  message  and  pass  it  on  to  the  LAP 
layer  to  be  sent  out  on  the  wire.  Figure  2,  page  20,  shows 
these  transformations. 
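
The onion-skin principle can be sketched in a few lines. This is an illustration only: the layer names and the text-style headers are invented for the example and do not reflect the real AppleTalk wrapper formats.

```python
# Hypothetical layers, listed from the highest to the lowest.
LAYERS = ["delivery", "lap"]


def wrap(payload: bytes, layers=LAYERS) -> bytes:
    """Pass data down the stack; each layer adds its own wrapper."""
    for name in layers:
        payload = name.encode("ascii") + b":" + payload
    return payload


def unwrap(frame: bytes, layers=LAYERS) -> bytes:
    """Pass a frame up the stack; each peer checks and strips its wrapper."""
    for name in reversed(layers):
        header = name.encode("ascii") + b":"
        if not frame.startswith(header):
            raise ValueError("missing wrapper for layer " + name)
        frame = frame[len(header):]
    return frame
```

The lowest layer's wrapper ends up outermost, so `wrap(b"FRED")` yields `b"lap:delivery:FRED"`, and `unwrap` peels the wrappers off in the opposite order to recover `b"FRED"`.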

There  is  an  interesting  symmetry  in  the  transmit  and 
receive  algorithms.  Data  to  be  transmitted  is  passed  down 
through  subroutine  calls  from  one  layer  to  another  until 
it  is  finally  transmitted  on  the  link.  Just  the  opposite  se¬ 
quence  occurs  when  a  frame  arrives  in  a  computer.  The 
receiver  accepts  characters  until  it  gets  a  full  frame.  If  the 
frame  is  good  (CRC  OK,  proper  length,  and  so  on),  the 
receiver  makes  a  subroutine  call  to  the  next  layer  with 
the  data  as  a  parameter.  Depending  on  how  the  software 
was set up, that layer may make further subroutine calls
to its next higher layer until the message arrives at the
final  recipient. 

For a detailed and quite readable explanation of networking,
take a look at Tanenbaum's Computer Networks.

The  AppleTalk  Protocols 

Each protocol of the AppleTalk family performs a specific
function. The AppleTalk Transaction Protocol (ATP), for example,
specifies how multiple blocks of data (a transaction of, say,
eight disk blocks) can be reliably transferred between nodes
(that is, Macs) on a network. The Name Binding Protocol (NBP)
specifies how to convert a (text) name to a (numeric) network
address. The full details of the AppleTalk protocol family are
available in Inside AppleTalk.

Both ATP and NBP rely on the ability to send a single message
through the network, possibly directing (routing) the message
through multiple links. The Datagram Delivery Protocol (DDP)
specifies how these messages are routed. Another protocol, the
Routing Table Maintenance Protocol (RTMP), describes how to
maintain tables of routing information.

All AppleTalk protocols rely on the ability to send a frame of
data (an orderly sequence of 8-bit bytes) to a neighbor node.
The AppleTalk Link Access Protocol (ALAP) specifies the rules
for access to the twisted-pair bus, which connects devices on
AppleTalk. Just as with a telephone party line, AppleTalk
devices must cooperate to keep their messages from interfering.
ALAP uses a three-way exchange to minimize the time wasted by
collisions. The initiator of the frame listens to see if the bus
is in use; if so, it waits a random amount of time and retries.
If the bus is idle, the initiator sends a request-to-send (RTS)
frame to the intended recipient. If the recipient is prepared to
handle the data, it responds with a clear-to-send (CTS) frame.
If the initiator receives the CTS, it then sends the data.

If  two  devices  begin  to  transmit  simultaneously,  their 


Dr.  Dobb's  Journal,  October  1987 


RTS frames will collide, garbling both. Their recipients will
not receive the RTS correctly and will not respond. Each
initiator will fail to receive a CTS, wait a random amount of
time, and then retry the transmission sequence. Because the
waits are random, one of the initiators will retry first and
will probably succeed.
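A toy simulation makes the effect of the random wait visible.
The sketch below is a deliberate simplification and not the ALAP
algorithm itself: time is divided into rounds, each waiting
initiator draws a random delay, and a tie for the shortest delay
stands in for colliding RTS frames. The function name, the slot
count, and the round limit are all assumptions made for the
example.

```python
import random

def contend(initiators, rng, max_rounds=100):
    """Return the order in which initiators get their data through.

    Each round, every waiting initiator picks a random delay. The one
    with the uniquely smallest delay sends its RTS first, receives a
    CTS, and transmits. A tie for smallest models colliding RTS
    frames: nobody gets a CTS, and everyone backs off again.
    """
    waiting = list(initiators)
    done = []
    for _ in range(max_rounds):
        if not waiting:
            break
        delays = {name: rng.randrange(4) for name in waiting}
        earliest = min(delays.values())
        winners = [n for n, d in delays.items() if d == earliest]
        if len(winners) == 1:        # RTS heard cleanly, CTS returned
            done.append(winners[0])
            waiting.remove(winners[0])
        # otherwise the RTS frames collided; all retry next round
    return done

print(contend(["Mac A", "Mac B"], random.Random(1)))
```

With two initiators and four possible delays, a given round ends
in a tie only one time in four, so both frames normally get
through within a few rounds.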

A final (and very important) point: notice that ALAP does not
guarantee correct delivery of a frame. A transmission error
could garble a frame; when that occurs, the recipient simply
discards the frame as if it had never arrived. How does the data
ever get through? Higher layers (ATP or NBP) establish rules for
acknowledging messages that must be delivered.

An  AppleTalk  Implementation 

In late 1983, Dartmouth College recommended that the freshmen
entering in the fall of 1984 purchase Macintosh computers. It
also decided to expand its campuswide data network to support
communications in the students' rooms. At that time, Dartmouth
had a network of 1,800 terminal ports (asynchronous, RS-232
hard-wired and dial-up ports) served by approximately 40 network
nodes (minicomputers) that were located in the basements of
academic and administrative buildings. A user at any terminal
could connect to any host computer as well as to the library's
on-line card catalog, a campus events calendar, or other
off-campus networks.


Over the summer of 1984, AppleTalk support was added to the
network. Immediately, the number of nodes was more than doubled
(to a total of 95) to support the dormitories. The dorms were
wired with 2,600 AppleTalk outlets, one plug per pillow. We
decided not to retrofit all the async ports to AppleTalk as it
would have been prohibitively expensive.

We also developed a terminal emulation program, called
DarTerminal, which works across this AppleTalk network to all
the host computers. DarTerminal acts as a VT-100 terminal, a
TEK10 (4012) graphics terminal, and a distributed screen editor,
in which the Macintosh acts as a screen manager for a back-end
host editor (now available on the DCTS, VAX/VMS, and Unix
systems). DarTerminal also offers simultaneous terminal
sessions, cut-and-paste transfers between sessions, and file
transfers of entire Macintosh documents.

Unfortunately, in that one stroke, we rendered obsolete the
1,800 asynchronous ports located in the college's academic and
administrative offices. Of course, the ports still worked, but
over time, as people in those offices bought Macs, they couldn't
get the benefits of AppleTalk that we had already provided to
the students. This caused some real frustration because faculty
couldn't use the same tools as their students. Staff wanted to
use a terminal program that was tuned to the students' computing
environment. As network maintainers, we wanted to use the
multiple sessions to troubleshoot network problems from home
(instead of driving to the computer center on cold winter
nights...).

There was no off-the-shelf way to make the async ports of the
network deal with AppleTalk frames. Yet we felt it was important
to let people use DarTerminal over their RS-232 ports. One
solution would have been to make an "Async DarTerminal" that ran
directly on an async port or modem. This might have worked, but
it would have been troubled by flow control (does XOFF mean stop
sending, or is it just another data character?) and we would
have had problems maintaining two versions of the program.
Furthermore, it would only have solved the specific problem of
DarTerminal; other AppleTalk applications such as printing and
file sharing wouldn't have been able to exploit that solution.

Instead, we chose to send AppleTalk frames over an async link.
To distinguish our protocol from Apple's standard link access
protocol, we named it Async AppleTalk Link Access Protocol
(AALAP). AALAP frames contain all the information conveyed in
the 230.4-kbps frames. Furthermore, AALAP provides exactly the
same software interface to the higher AppleTalk layers (DDP,
NBP, and so on) in the Macintosh. In this manner, the layers
above AALAP are "fooled" into thinking there is a 230.4-kbps
link in place.


Computer A                                      Computer B

Take a file called FRED of length 123456   ->
                                           <-   OK
Here it comes.                             ->
<much data flows>                          ->
That's all. Save it.                       ->
                                           <-   OK
Thanks.                                    ->

Figure 1: Two computers negotiating to send a file between them


File transfer layer
    | File transfer info: Take FRED length 123456 |

Reliable delivery layer
    | Reliable delivery info | File transfer info: Take FRED
      length 123456 |

Routing layer
    | Routing info | Reliable delivery info | File transfer info:
      Take FRED length 123456 |

LAP layer
    | LAP info | Routing info | Reliable delivery info | File
      transfer info: Take FRED length 123456 | CRC |

Figure 2: The layer onion skin


Installing  the  Async  AppleTalk  Driver 

Conventionally, a Macintosh application needing AppleTalk
services checks to see if the AppleTalk driver is installed,
and if it is, opens it. This offers us two options for
installing Async AppleTalk on a disk: providing a way to open
Async AppleTalk before an application checks to see if
AppleTalk is open, or replacing the standard AppleTalk driver so
that an application will open ours instead. The simpler method
is to replace the standard AppleTalk driver with our driver (in
the system file). Then, when an application opens AppleTalk,
Async AppleTalk comes up.

There are a few problems with using this scheme, however.
Ordinary AppleTalk networks are permanently wired, whereas Async
AppleTalk connections are usually temporary, over hard-wired or
dial-up lines. The user may need to connect some cables, turn on
a modem, or dial a phone to initiate a connection. In addition,
Async AppleTalk must provide a way to reestablish a connection
if a line is disconnected and to hang up the phone when
finished. Finally, we wanted users to be able to use the same
disk with standard AppleTalk and Async AppleTalk, and it isn't
possible to have two AppleTalk drivers in the same system file.

Instead, we chose to write an Async AppleTalk installer as a
desk accessory. Desk accessories are simple to write, and they
can also run concurrently with applications. This lets users
start up Async AppleTalk within MacWrite (for example, to access
a LaserWriter) rather than having to remember to start it
beforehand or forcing users to quit MacWrite, start up Async
AppleTalk, and then reenter MacWrite.


The  User  Interface 

A user runs the Async AppleTalk installer by selecting it from
the Apple menu. The installer presents a window (see Figure 3)
containing several controls and a status message display. First
is a set of speeds, from 1,200 to 19,200 bits per second. To the
right of this are four buttons that the user can click: Start,
which loads and initializes Async AppleTalk (and can even dial
the phone if necessary); Cancel, which terminates the desk
accessory without doing anything; Hangup, which disconnects the
phone and closes any AppleTalk driver; and Help, which describes
what the installer is and how to use it.


Async AppleTalk Installer

Speed:   ( ) 1200         [ Start  ]
         (*) 2400         [ Cancel ]
         ( ) 4800         [ Hangup ]
         ( ) 9600         [ Help   ]
         ( ) 19200

Async AppleTalk Installed

[ ] Auto-Dial: T96436300

Figure 3: The Async AppleTalk installer window

Below those controls is a status area that displays the state of
AppleTalk (installed, not installed, and so on) or messages from
the installer. At the bottom of the window are controls for the
auto-dialer. If the Auto-Dial check box is checked, the
installer runs a program to dial the phone, using the phone
number in the box, before making a connection. The user can
enter a phone number before clicking the Start button.

A user starts up Async AppleTalk by selecting the installer from
the Apple menu. When the dialog window appears, the user chooses
a speed, checks Auto-Dial and enters a phone number if desired,
and clicks the Start button. The installer then loads and opens
the Async AppleTalk driver, dials the phone if requested, and
negotiates for a network address. If all is well, the installer
displays the message "Async AppleTalk Installed," saves the
speed and phone settings, and terminates. If errors occur in
loading the driver, the desk accessory displays a message and
allows the user to correct the problem and try again. Common
problems are loose cables, a modem that is turned off, an
incorrect telephone number, or a remote end that is down. The
user can exit from the desk accessory by clicking the Cancel
button if the problem can't be corrected.

Because the installer desk accessory (DA) saves the settings of
successful connections, the user usually only needs to select
the desk accessory and click the Start button to start Async
AppleTalk.

Explanation  of  the  Code 

On the Macintosh, AppleTalk is implemented as a set of drivers.
A driver is a group of routines that isolates an application
from an I/O device by presenting a standard software interface.
Because drivers aren't directly linked to an application, an
alternative driver can easily be loaded in place of the standard
code.

Each of the AppleTalk protocols is implemented as a group of
subroutines (for ATP, NBP, DDP, and LAP). When an AppleTalk
application requests an AppleTalk service, such as an ATP
transaction, the ATP code prepares one or more messages. For
each message to be sent, the ATP code calls a routine (let's
call it DDPWrite) to prepare a datagram. The DDPWrite routine
finally calls a routine (LAPWrite) that actually sends the
message out on the wire. Now the Async AppleTalk code comes into
play. We took the source for standard AppleTalk and rewrote the
LAPWrite routine to send the characters out on an async link
instead of at 230.4 kbps. This produced a driver that was
completely transparent to application programs. Our terminal
emulation program, file sharing, printing, and other network
services all operate unchanged over Async AppleTalk.
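The reason this substitution is invisible to the upper layers
can be shown with a small model. The sketch below is not the
driver code (which is 68000 assembly); the class and function
names are hypothetical, and the "DDP" prefix is a placeholder
rather than a real datagram header. It only illustrates that the
upper layers call the same entry point whichever link sits
underneath.

```python
class StandardLink:
    """Stand-in for the standard 230.4-kbps LAPWrite."""
    def __init__(self):
        self.sent = []

    def lap_write(self, frame: bytes) -> None:
        self.sent.append(("bus", frame))


class AsyncLink:
    """Stand-in for the rewritten LAPWrite that uses a serial line."""
    def __init__(self):
        self.sent = []

    def lap_write(self, frame: bytes) -> None:
        self.sent.append(("serial", frame))


def ddp_write(link, datagram: bytes) -> None:
    """DDP neither knows nor cares which link object it was given."""
    link.lap_write(b"DDP" + datagram)


def atp_send(link, blocks) -> None:
    """ATP hands each message to DDP, one datagram at a time."""
    for block in blocks:
        ddp_write(link, block)


link = AsyncLink()
atp_send(link, [b"block-1", b"block-2"])
print(link.sent)
```

Swapping AsyncLink for StandardLink changes nothing in ddp_write
or atp_send, which mirrors the way the applications run
unchanged over either driver.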

AALAP  Environment 

A low-memory variable (ABusVars) at address $2D8 points to the
local variables that Async AppleTalk needs. By convention, the
AppleTalk code places this value in register A2 so that
variables are at offsets from A2.

The Macintosh uses the (Zilog or AMD) 8530 Serial Communications
Controller (SCC). This chip is capable of both async and
synchronous transmission. Other serial chips (8250, 1602, and so
on) also have the functions required for Async AppleTalk.

There are five important sections to the listings: sending a
frame (LAPWrite and the transmit interrupt handler), receiving a
frame (the receive interrupt handler), the CRC algorithm in
68000 assembly language and Pascal, the port A polling
procedure, and miscellaneous support routines. (Listing One
begins on page 60.)

Sending  a  Frame 

Each time a frame is to be sent, the driver calls the LAPWrite
routine to set up certain variables and send the first character
of the frame. After that, the transmission is interrupt-driven:
each time a character has been completely transmitted, the SCC
interrupts the CPU. The interrupt handler sends another
character and returns to the application.

LAPWrite is complicated by the fact that it must send frames
"asynchronously." This allows the initiator of the frame to
continue without waiting for a long I/O operation, which is
important because many seconds may elapse between the start and
end of a frame. The device manager is responsible for queuing
operations for drivers. It doles out tasks (say, to transmit a
frame) one at a time. If an operation can be completed
immediately, or if an error is detected, control returns
immediately to the device manager.

When an asynchronous operation, such as transmitting a frame,
cannot be completed immediately, control returns to the code
that originally called the device manager. When the operation
completes, the driver returns control to the device manager,
which may initiate another operation. For more information about
this scheme, refer to Inside Macintosh, Volume II.

LAPWrite uses a description of the frame called a write data
structure (WDS). The WDS contains one or more segments of data
to send as a single frame. Its format is a series of
length-pointer pairs that describe each segment to be sent (see
Figure 4). The length is a (16-bit) word, and the pointer is a
(32-bit) pointer to the first character of the segment. Several
length-pointer pairs can be concatenated in the WDS, with the
final segment being followed by a word of 0.
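To make the layout concrete, here is a small Python model of
walking a WDS. It is an illustration under stated assumptions:
memory is modeled as a byte string, pointers are plain offsets
into it, and values are big-endian as on the 68000. The function
name gather_wds and the addresses used are invented for the
example.

```python
import struct

def gather_wds(memory: bytes, wds_addr: int) -> bytes:
    """Concatenate the segments described by a WDS stored in memory."""
    frame = bytearray()
    while True:
        (length,) = struct.unpack_from(">H", memory, wds_addr)
        if length == 0:                    # terminating word of 0
            return bytes(frame)
        (ptr,) = struct.unpack_from(">I", memory, wds_addr + 2)
        frame += memory[ptr:ptr + length]
        wds_addr += 6                      # 2-byte length + 4-byte pointer

# Toy memory image: two segments, then the WDS that points at them.
seg1, seg2 = b"HEADER", b"payload"
mem = bytearray(64)
mem[0:6] = seg1                            # segment 1 at address 0
mem[16:23] = seg2                          # segment 2 at address 16
wds = struct.pack(">HIHIH", len(seg1), 0, len(seg2), 16, 0)
mem[32:32 + len(wds)] = wds                # the WDS itself at address 32

print(gather_wds(bytes(mem), 32))          # b'HEADERpayload'
```

The driver does not copy the segments together like this; it
walks the same structure one character at a time as the frame is
transmitted. The gather step here only shows how the pairs are
read.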

Before  sending  a  frame,  LAPWrite  first  validates  the 
frame  by  checking  that  the  length  is  less  than  600  data 
bytes. 

Next, LAPWrite checks to see if there is currently a frame being
transmitted. If so, it checks that it is an IM or UR frame (it
returns an error otherwise), saves the WDS of the new frame in
qWDSptr (the "queued WDS"), and returns (see the accompanying
article, "The Async AppleTalk Link Access Protocol"). If a frame
is not being transmitted, LAPWrite updates the PollProc pointer
(described later).

Finally, it ensures that the link is still up by checking that a
valid frame has been received within 30 seconds. If not,
LAPWrite sends an IM frame. A good response (a UR frame) updates
the last-valid-frame timer, and the frame is sent as normal. The
lack of a response after 10 seconds results in an alert to the
user, who can use the desk accessory to restart the link.
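The link check reduces to a small decision function. The sketch
below restates the 30-second and 10-second rules from the text;
the function name, its arguments, and the returned strings are
assumptions made for illustration, and the real driver keeps its
timers in its own variables.

```python
LINK_IDLE_LIMIT = 30.0   # seconds without a valid frame before probing
PROBE_TIMEOUT = 10.0     # seconds to wait for a UR after sending an IM

def link_action(now, last_valid_frame, probe_sent_at=None):
    """Decide what to do before sending a data frame.

    Returns 'send'  (link recently heard from, transmit normally),
            'probe' (send an IM frame to force a UR response),
            'wait'  (an IM is outstanding, keep waiting), or
            'alert' (no UR arrived in time, tell the user).
    """
    if now - last_valid_frame <= LINK_IDLE_LIMIT:
        return "send"
    if probe_sent_at is None:
        return "probe"
    if now - probe_sent_at < PROBE_TIMEOUT:
        return "wait"
    return "alert"

print(link_action(now=100.0, last_valid_frame=80.0))
print(link_action(now=100.0, last_valid_frame=60.0))
print(link_action(now=111.0, last_valid_frame=60.0, probe_sent_at=100.0))
```

A UR response counts as a valid frame, so it would update
last_valid_frame and the next call would return 'send'.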

Assuming the link is up, LAPWrite calls SendWDSptr, which saves
a pointer to the current WDS in a variable tWDSPtr and then
initializes several variables (EscOut, nCRC, nFrmChr) that will
be used during the frame transmission. SendWDSptr also points
the LAPFetch pointer at the first byte to send and sets TxCount
to the length of the first segment. Finally, it sends the first
character of the frame ($A5, the framing character) to the SCC.
The transmit interrupt handler sends the remainder of the frame.

The transmit interrupt handler (TIntHnd) is an interrupt routine
that is called each time the SCC finishes transmitting a
character. This routine sends the next character (by calling
TxNextCh) and cleans up before returning from the interrupt. If
no character was actually sent (the previous character was the
last one of the frame), then the transmit pending bit is reset
so that no more interrupts will arrive. In any case, TIntHnd
resets the highest interrupt under service (IUS) so that other
interrupts can come in.

TxNextCh picks the next character of the frame to send, advances
LAPFetch (the pointer to the next character to send), and
decrements the segment length counter (TxCount). It also
accumulates the CRC for the frame (by calling NextCRC). If
escaping is necessary, TxNextCh sets the EscOut variable to
true, leaves LAPFetch pointing at the character to be escaped,
and sends a DLE ($10). If the EscOut flag is set, it
exclusive-ORs the data character (at LAPFetch) with $40,
increments LAPFetch, clears the EscOut flag, and sends the
character. Notice that escaping always follows CRC accumulation
on the transmit side. We will see that unescaping is done before
CRC computations on the receive side.
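The ordering of CRC accumulation and escaping is easy to get
wrong, so a compact model may help. The Python sketch below
builds a whole frame at once instead of one character per
interrupt. It makes two assumptions that the text does not
state: the CRC is the common reflected CRC-16 (polynomial $A001,
initial value 0), and the two CRC bytes are escaped like data.
The set of escaped characters comes from the accompanying
article.

```python
FRAME, DLE = 0xA5, 0x10
SPECIAL = {0x10, 0x11, 0x13, 0x91, 0x93, 0xA5}   # characters that get escaped

def next_crc(crc: int, byte: int) -> int:
    """Fold one byte into the CRC (assumed polynomial: reflected $A001)."""
    crc ^= byte
    for _ in range(8):
        crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc

def encode_frame(data: bytes) -> bytes:
    """Return the characters that would go out on the wire for one frame."""
    out = bytearray([FRAME])             # opening framing character
    crc = 0

    def emit(b: int) -> None:
        if b in SPECIAL:
            out.extend((DLE, b ^ 0x40))  # DLE, then the character XOR $40
        else:
            out.append(b)

    for b in data:
        crc = next_crc(crc, b)           # the CRC sees the original byte...
        emit(b)                          # ...escaping happens afterwards
    emit(crc & 0xFF)                     # CRC, least significant byte first
    emit(crc >> 8)
    out.append(FRAME)                    # closing framing character
    return bytes(out)

print(encode_frame(b"\x13").hex(" "))    # begins a5 10 53: XOFF was escaped
```

Because every special character is escaped, $A5 can appear only
as the first and last character of the encoded frame.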

There are several special cases for TxNextCh. When the last
character of a segment of a WDS has been sent (TxCount = 0), it
moves to the next length-pointer pair, updates LAPFetch and
TxCount, and continues sending. After all the segments of a
frame have been sent, TxNextCh checks nCRC to see if it needs to
send the CRC. If so, it swaps the bytes of the CRC (so that it
will be sent the least significant byte first) and saves them in
CRCBuf, points LAPFetch at CRCBuf, sets TxCount to 2, and clears
the nCRC flag. Finally, after the CRC has been sent, if the
nFrmChr flag is set, a closing framing character will be sent
and the nFrmChr flag cleared. After the entire frame has been
sent, TxNextCh looks to see if a queued WDS is waiting to be
sent. If so, it resumes transmission with the new frame;
otherwise, it informs the device manager that a frame is
complete.


    length of first segment
    pointer to first segment
    length of second segment
    pointer to second segment
    zero (end of WDS)

Figure 4: Write data structure (WDS)


The Async AppleTalk Link Access Protocol

This section describes the important features of AALAP. The
full protocol definition, the Asynchronous AppleTalk Link Access
Protocol, Version 1.0, is included on the Async AppleTalk
distribution disk.

• AALAP transmits frames of arbitrary 8-bit data to another node
across an RS-232 link. It uses 8-bit, no parity transmission
with one stop bit. AALAP sends data bytes whose value is the
same as certain special characters (that is, flow control
characters) by escaping those characters; it sends a DLE ($10)
and then the desired data character exclusive-ORed with $40.
Thus the data byte $13 (XOFF) becomes the two-character sequence
$10 $53; to send a data byte of $10 (DLE), AALAP sends the
sequence $10 $50. The receiver always exclusive-ORs the
character after a DLE with $40 to recover the correct data byte.
The characters treated specially are $10, $11, $13, $91, $93,
and $A5.

• AALAP marks the beginning and end of each frame with a special
"framing" character. This framing character never occurs in the
middle of a frame; if AALAP needs to send it, it will be
escaped. $A5 is a good choice because it is unlikely to occur in
most data streams and therefore will not have to be escaped
often. Placing this character at the start and end of each frame
allows easy resynchronization if one framing character has been
garbled; the receiver simply waits for two adjacent framing
characters, which indicate the start of another frame. Figure 5
shows the format of an AALAP frame.

• AALAP detects transmission errors using a cyclic redundancy
check (CRC). AALAP computes the CRC (a 16-bit value that depends
on all the bits of every character in the frame) and sends this
value at the end of the frame. The receiver also computes the
CRC on its received data and compares its computed value with
the value it received. If the two values are different, an error
occurred, and the receiver discards the frame. If the values are
the same, then the frame is presumed to have been received
correctly.

• AALAP links are full-duplex, point-to-point channels, so there
is no contention for the link; standard ALAP has to arbitrate
access to the bus to prevent two stations' messages from
colliding.

• AALAP negotiates the network and node number with the device
at the other end of the link at start-up time. The AppleTalk
address for a particular device consists of a (16-bit) network
number and an (8-bit) node number (for the device itself).

• AALAP uses XON/XOFF flow control. This was a requirement for
dealing with host computers that cannot accept large blocks of
data at high speed. If AALAP receives an XOFF, it stops sending.
It resumes output after receiving an XON. Similarly, AALAP may
send an XOFF if it cannot process all the characters that have
arrived. Some host computers send flow control characters of
either parity (XOFF may be $13 or $93); AALAP must honor either.

• AALAP must detect and recover from link failures (such as loss
of carrier). There are actually two problems: reestablishing the
link and regaining the previous network address. On the
Macintosh, a desk accessory (DA) accomplishes both. The DA will
dial a selected telephone number, if necessary, and establish
the link. If the link fails during a session, the user gets an
alert and can then use the DA to restart the link.

• To start the link, the Async AppleTalk installer makes a link
(by dialing the phone number and so on) and then gives a command
to the AALAP driver to obtain a network and node number (NNN).
This is a two-step process. Each end sends an IM (for "I aM")
LAP frame that contains its suggested network address; the other
end responds with a UR (for "yoU aRe") frame that either
confirms that address or makes a different suggestion (see
Figure 6). The default values for network and node numbers are
$0000 and $00, respectively. When zero values arrive in an IM
frame, the recipient suggests different (possibly random) values
in a UR response. If the recipient of a UR can accommodate the
suggested value(s), it switches to use those numbers. When each
end has sent an IM and received a UR with the same address, they
declare the link to be up.

• Notice that this scheme allows a node to regain its previous
network address after being disconnected. When restarting, the
node suggests its previous address in the initial IM; if the
other end can accommodate that address, it replies with the same
values in the UR.

• Whenever a node wants to send a data frame, it checks to see
if it has heard from the other end within the last 30 seconds.
If not, it sends an IM frame with its current address (to force
an immediate UR response). If no response returns, the link can
be assumed to be down, and the user gets a warning. If a
response arrives, then the original data frame can be sent.

Figure 5: Generic AALAP frame

Figure 6: IM/UR frames


Receiving  a  Frame 

The receive interrupt handler (RIntHnd) processes arriving
characters. Every time a "read data available" interrupt occurs,
RIntHnd gets a character from the SCC, processes the character
as described later, and checks for another character. When there
are no more characters, RIntHnd returns to the interrupted
application.

The normal state of RIntHnd is waiting for a frame to begin. The
variable inMsg remains false until a $A5 character arrives;
until then, all other input characters are discarded. Once a
frame has started, each arriving character is checked for
escaping. If it is a DLE, the EscIn flag is set and the DLE
discarded. If a character arrives when the EscIn flag is set,
the data character is exclusive-ORed with $40 to get the correct
value, and the EscIn flag is cleared. RIntHnd then updates the
input CRC, stores the data character in the input buffer, and
updates the pointers (LAPStash points at the next free location;
RcvdLen is the total length of the input buffer).

When a closing framing character arrives, the inMsg variable is
set false and the frame's received CRC is checked. A nonzero
value means that a transmission error occurred, and the data is
ignored. A CRC of zero implies that the frame is correct;
RIntHnd remembers the frame's arrival time and passes control to
a higher layer in the AppleTalk protocol for processing.
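The receive logic is a small state machine, modeled below in
Python. It is a sketch, not the interrupt handler: it leaves out
the length limits, flow control, and the busy buffer, and it
assumes the reflected CRC-16 polynomial $A001 with initial value
0, which the text does not specify. The attribute names echo
inMsg and EscIn; everything else is invented for the example.

```python
FRAME, DLE = 0xA5, 0x10
SPECIAL = {0x10, 0x11, 0x13, 0x91, 0x93, 0xA5}

def next_crc(crc: int, byte: int) -> int:
    """Fold one byte into the CRC (assumed polynomial: reflected $A001)."""
    crc ^= byte
    for _ in range(8):
        crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc

class Receiver:
    def __init__(self):
        self.in_msg = False     # inMsg: are we inside a frame?
        self.esc_in = False     # EscIn: was the previous character a DLE?
        self.crc = 0
        self.buf = bytearray()
        self.frames = []        # good frames handed up to the next layer

    def feed(self, ch: int) -> None:
        if ch == FRAME:
            if self.in_msg and self.buf:
                # Closing character: a CRC of zero means a good frame.
                if self.crc == 0 and len(self.buf) >= 2:
                    self.frames.append(bytes(self.buf[:-2]))
                self.in_msg = False
            else:
                # Opening character (or the second of two adjacent ones).
                self.in_msg, self.esc_in, self.crc = True, False, 0
                self.buf.clear()
            return
        if not self.in_msg:
            return              # characters between frames are discarded
        if ch == DLE and not self.esc_in:
            self.esc_in = True  # swallow the DLE, fix up the next character
            return
        if self.esc_in:
            ch ^= 0x40          # unescape before the CRC is updated
            self.esc_in = False
        self.crc = next_crc(self.crc, ch)
        self.buf.append(ch)

# Build a test frame by hand: payload, CRC (low byte first), escaping.
payload = b"\x01\x13\x02"
crc = 0
for b in payload:
    crc = next_crc(crc, b)
wire = [FRAME]
for b in payload + bytes([crc & 0xFF, crc >> 8]):
    wire += [DLE, b ^ 0x40] if b in SPECIAL else [b]
wire.append(FRAME)

rx = Receiver()
for ch in [0x55] + wire:        # the leading noise byte is ignored
    rx.feed(ch)
print(rx.frames)
```

Running the CRC over the data and the two trailing CRC bytes
gives zero for an undamaged frame, so the receiver never has to
compare two separate values.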

RIntHnd has several special tests. First, it discards frames
that contain either fewer than 3 or more than 600 data
characters (as required by the AppleTalk specification). RIntHnd
also checks for flow control characters arriving from the other
side and stops and resumes transmission as necessary. Finally,
RIntHnd has some tricky code to keep up with arriving data while
the processor is busy.

Remember that RIntHnd runs as an interrupt routine. No other
interrupts will normally be processed until RIntHnd exits.
Unfortunately, several of the higher protocol layers can take
quite a while (3 to 4 msec) to process a long frame. At 9,600
bps, a character arrives roughly every millisecond, so
interrupts would be disabled for three or four character times
and characters would frequently be overrun, corrupting the
frame.

To avoid dropping characters while a frame is being processed by
a higher layer, RIntHnd uses the stillBusy flag to indicate that
the previous frame is still being processed. When a complete
frame arrives, RIntHnd sets stillBusy and reenables interrupts
before passing control to the higher layer's code. Characters
that arrive when stillBusy is true are placed in a buffer
(BusyBuff) of 16 characters. Once the higher layer returns,
RIntHnd clears stillBusy and exits.

There is one final complication in these routines: the Macintosh
doesn't have enough speed to service the disk and its two serial
ports simultaneously. This means that incoming characters may be
dropped when the disk is spinning. The disk driver installs a
"serial port A polling procedure," or PollProc, which is invoked
whenever the disk driver turns off interrupts for more than 100
microseconds. During disk operations, PollProc stashes
characters from port A into a special buffer for processing
after the disk stops.

Unfortunately, the Async AppleTalk driver (on port B) drops
characters with this scheme. To circumvent this, we install a
small piece of code to be executed before the standard PollProc
runs. This code has one function: to send an XOFF flow control
character to stop the other end from sending. A while later,
AALAP sends an XON to resume the data (the vertical blanking
task controls this).

CRC  Algorithm 

The two CRC routines shown in Listings Two and Three,
next month, implement the same algorithm. Listing Two
is in Pascal and is easier to follow than the actual
assembly-language code used (in Listing Three). Each
NextCRC routine receives the CRC accumulated so far
and the next data character to accumulate, and returns
the updated 16-bit CRC. The CRC-16 algorithm has the
special property that when the computed CRC is appended
(least significant byte first) to a string of m data bytes
and a CRC is then computed over all m+2 bytes, the
result is exactly 0. This makes it easy for the receiver to
check a frame: it runs every received byte, including the
two CRC bytes, through NextCRC and accepts the frame
only if the final value is 0.
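The zero-result property can be checked in a few lines. The sketch below is an illustration only: it assumes the common CRC-16 polynomial x^16 + x^15 + x^2 + 1, processed least significant bit first with a zero starting value, because the text does not say which CRC-16 variant AALAP uses. The names next_crc and crc16 are hypothetical and are not the routines from the listings.

```python
def next_crc(crc: int, ch: int) -> int:
    """Fold one data byte into a 16-bit CRC (assumed reflected polynomial 0xA001)."""
    crc ^= ch & 0xFF
    for _ in range(8):
        if crc & 1:
            crc = (crc >> 1) ^ 0xA001
        else:
            crc >>= 1
    return crc


def crc16(data: bytes, crc: int = 0) -> int:
    """Accumulate the CRC over a whole byte string."""
    for ch in data:
        crc = next_crc(crc, ch)
    return crc


frame = b"AALAP test frame"
check = crc16(frame)

# Sender: append the CRC, least significant byte first.
sent = frame + bytes([check & 0xFF, check >> 8])

# Receiver: run the CRC over data plus the two CRC bytes.
residue = crc16(sent)
print("frame accepted" if residue == 0 else "frame rejected")
```

The receiver never has to separate the data from the trailing CRC bytes; it accumulates everything and compares the final value with zero.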

The  Polling  Procedure  (PollProc) 

Our PollProc gets control when the disk driver turns off
interrupts. It first tests two things: whether it has already
sent an XOFF, and whether we're receiving a frame. If the
answers are no and yes, respectively, it stashes any
characters from the SCC in BusyBuff and then sends an
XOFF to the other end.

In  any  case,  our  PollProc  then  restores  the  state  and 
passes  control  to  the  real  PollProc  (if  any). 

The  AALAP  PollProc  only  sends  an  XOFF  when  actually 
receiving  a  frame  (when  inMsg  is  true).  This  prevents  a 
continuous  dribble  of  XOFF  and  XON  characters  as  the 
disk  spins. 
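The decision described above can be modelled as follows. This is a rough sketch, not the driver's code: the Link class, the transmitted list that stands in for the SCC, and the XOFF value 0x13 (the conventional ASCII code) are all assumptions made for the illustration.

```python
from dataclasses import dataclass, field

XOFF = 0x13  # conventional ASCII XOFF; assumed, the article gives no value


@dataclass
class Link:
    in_msg: bool = False       # True while a frame is being received
    sent_xoff: bool = False    # True once the far end has been told to pause
    busy_buff: list = field(default_factory=list)
    transmitted: list = field(default_factory=list)  # stands in for the SCC


def poll_proc(link, scc_chars, real_poll_proc=None):
    """Model of the code that runs when the disk driver turns interrupts off."""
    if not link.sent_xoff and link.in_msg:
        link.busy_buff.extend(scc_chars)   # stash anything already received
        link.transmitted.append(XOFF)      # ask the other end to stop
        link.sent_xoff = True
    if real_poll_proc is not None:
        real_poll_proc()                   # chain to the original PollProc


idle = Link(in_msg=False)
poll_proc(idle, [])                # no frame in progress: nothing is sent

busy = Link(in_msg=True)
poll_proc(busy, [0x41, 0x42])      # mid-frame: stash, then send one XOFF
poll_proc(busy, [])                # XOFF already sent: no second XOFF
```

Because an XOFF goes out only while in_msg is true, and only once, the link stays quiet when the disk spins between frames.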

Support  Routines 

Macintosh programs can receive periodic action with a
vertical blanking (VBL) task. Async AppleTalk uses a VBL
task to check that the data flow hasn't stopped without
reason. The VBL code runs every third of a second and
either sends an XON if the transmitter has sent an XOFF to
stop the other side or experimentally sends the next
character (by calling TxNextCh) if no character has been sent
for a full second. Each of these actions may reinvoke flow
control, but they guarantee that things will become
"unstuck" if, say, an XON was dropped.
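The watchdog behavior of the VBL task can be sketched like this. The Watchdog class, its counters, and the action log are hypothetical; only the one-third-second period, the XON rule, and the one-second idle rule come from the text.

```python
TICKS_PER_SECOND = 3  # the task runs every third of a second


class Watchdog:
    def __init__(self):
        self.sent_xoff = False   # we told the other side to stop
        self.idle_ticks = 0      # task runs since a character was last sent
        self.actions = []        # log of what the task did

    def char_sent(self):
        """Called by the transmitter whenever a character goes out."""
        self.idle_ticks = 0

    def vbl_task(self):
        if self.sent_xoff:
            self.actions.append("XON")        # let the other side resume
            self.sent_xoff = False
            self.idle_ticks = 0               # the XON itself was a send
            return
        self.idle_ticks += 1
        if self.idle_ticks >= TICKS_PER_SECOND:
            self.actions.append("TxNextCh")   # nudge a possibly stuck sender
            self.idle_ticks = 0


w = Watchdog()
w.sent_xoff = True
for _ in range(4):
    w.vbl_task()
print(w.actions)
```

In this model the first run sends the XON, and three further idle runs (one full second) trigger the experimental transmit.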

PutChar sends a character synchronously; that is, it
waits until the character can be sent. If the SCC cannot
accept the character within one-half second, PutChar
returns an error code. A character of $FFFF will send a
break on the link.

SetBaud  changes  the  baud  rate  generator  (BRG)  in  the 
SCC  to  run  at  the  desired  speed.  It  is  careful  to  disable  the 
BRG  before  changing  it  and  restores  the  state  of  the  SCC 
before  returning. 

Get_NNNN does the network and node number
negotiation (NNNN). It sends an IM with the current network
address up to four times. A UR response will be processed
asynchronously by the receive interrupt handler. If the
address in the UR matches that of the IM, then
Get_NNNN declares the link up. If the AALAPup variable
is true before the fourth IM time-out, Get_NNNN returns
a good status; otherwise, it returns a bad status to indicate
that the link could not be started.
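The retry loop can be sketched as below. This is a simplified model under stated assumptions: the three callables and the FakePeer test double are hypothetical, and the real routine depends on the receive interrupt handler to set AALAPup while it waits.

```python
MAX_TRIES = 4  # from the text: the IM is sent up to four times


def get_nnnn(send_im, wait_for_timeout, link_is_up):
    """Return True (good status) if the link comes up within MAX_TRIES."""
    for _ in range(MAX_TRIES):
        send_im()
        wait_for_timeout()     # a UR, if any, is handled during this wait
        if link_is_up():
            return True
    return False               # bad status: the link could not be started


class FakePeer:
    """Test double that answers the Nth IM with a matching UR."""

    def __init__(self, answer_on):
        self.answer_on = answer_on
        self.ims_seen = 0
        self.aalap_up = False

    def send_im(self):
        self.ims_seen += 1

    def wait(self):
        if self.ims_seen == self.answer_on:
            self.aalap_up = True   # what the interrupt handler would set

    def is_up(self):
        return self.aalap_up


peer = FakePeer(answer_on=3)
status = get_nnnn(peer.send_im, peer.wait, peer.is_up)

silent = FakePeer(answer_on=99)
failed = get_nnnn(silent.send_im, silent.wait, silent.is_up)
```

The status returned here is the value a caller such as the warning routine would use to decide whether to alert the user.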

DoWarn uses the return status from Get_NNNN to
display a dialog box with a warning to the user that there is
trouble with the Async AppleTalk link. Two possibilities
are likely: either there is no response from the other end
or the two ends of the link cannot agree on a network
address. DoWarn will fail if the disk that originally
contained the Async AppleTalk desk accessory file is not
currently mounted.

Who's Using Async AppleTalk?

At Dartmouth, more than 1,000 Macintoshes in
administrative and academic offices can use Async AppleTalk
through the Kiewit network. We have successfully used
many applications, including DarTerminal, printing to
LaserWriters, and various file servers and electronic-mail
programs, with Async AppleTalk.

At the time of this writing (May 1987), Async AppleTalk
is only useful on the Dartmouth network. Fortunately,
commercial products that use Async AppleTalk should
soon reach the market. First is the Reactor, from Sand Hill
Engineering in Geneva, Florida ([305] 349-5960). The
Reactor is a port switcher that can interconnect RS-232 ports
internally or between several Reactors, which allows
multiple users to share a modem, printer, or whatever,
without switching cables. The Reactor also acts as an
Async AppleTalk bridge, routing AALAP frames between
the appropriate ports.

Solana Electronics, of San Diego, California ([619]
566-1701), is also working on an Async AppleTalk product that
will connect to a 230.4-kbps bus.

Future  Projects 

Although the current implementation of Async AppleTalk
is quite usable, several enhancements could be
made. Perhaps the simplest would be to make an INIT
resource that installed Async AppleTalk at system boot
time. Each time you booted the system, the INIT
mechanism of the Mac would load in the Async AppleTalk
driver, dial the telephone (if necessary), and establish the link.
Errors would be reported in alerts as they are currently.

Another project is a "half-bridge." This is most easily
described as a stand-alone Macintosh that has Async
AppleTalk running over the phone port (port A) and
standard (230.4-kbps) AppleTalk running on the printer
port (port B). Frames that arrived on port A would be sent
out port B, and vice versa. This is a bit tricky because not
all frames need to be sent out the other port: the
software in the half-bridge should forward frames only if the
destination is out the other side. This would be useful for
a couple of purposes: a half-bridge can join two separate
230.4-kbps AppleTalk networks, only passing frames
destined for the other side; a half-bridge also gives a dial-up
port for someone using Async AppleTalk, say, from
home. (This isn't as wasteful as it seems: it is quite
reasonable to leave a Mac in the office running as a half-bridge
after people leave. After all, many people turn the Mac
off when they're not there.)

Availability 

Although you may distribute it freely, Async AppleTalk is
not in the public domain. It bears a copyright notice of the
Trustees of Dartmouth College. We distribute it at a
nominal cost, and others may redistribute it as long as it is not
sold for profit and as long as the copyright notice is
maintained. The full distribution policy is included with the
distribution disk. If you plan to use Async AppleTalk in a
commercial product, we ask that you send us a letter
describing your plans and your agreement to abide by our
policy. (Historical note: this is fundamentally the same
arrangement Columbia University uses for its Kermit
package.)

Dartmouth  College  distributes  the  following  modules: 

•  A  desk  accessory  for  the  Macintosh. 

•  Sources  for  the  desk  accessory  and  auto-dialer. 

•  Sources for most, but not all, of the AALAP driver.
AALAP was developed from the original AppleTalk source
code, which we licensed from Apple under a
nondisclosure agreement. Consequently, we cannot distribute the
entire source package. We do show most of the software,
including interrupt handlers, so that these can be ported
to other machines.

All source code for articles in this issue is available on a
single disk. To order, send $14.95 to Dr. Dobb's Journal,
501 Galveston Dr., Redwood City, CA 94063, or call (415)
366-3600, ext. 216. Please specify issue number and format
(MS-DOS, Macintosh, Kaypro).

Bibliography 

Apple Computer. Inside AppleTalk. Cupertino, Calif.:
Apple Computer, 1986.

Apple Computer. Inside Macintosh, vols. I-IV. Reading,
Mass.: Addison-Wesley, 1985.

Dartmouth  College.  Asynchronous  AppleTalk  Link  Access 
Protocol,  Version  1.0.  Hanover,  N.H.:  Kiewit  Computation 
Center,  1986. 

Tanenbaum,  Andrew  S.  Computer  Networks.  Englewood 
Cliffs,  N.J.:  Prentice-Hall,  1981. 

DDJ 

(Listing  begins  on  page  60.) 



Dr.  Dobb's  Journal,  October  1987 


ARTICLES 


A Fast Forth for the 68000

by  Lori  Chavez 


Interactive languages such as Forth and BASIC traditionally
force programmers to sacrifice program execution speed for the
powerful debugging facilities of an interactive programming
environment. This article takes a look at a Forth implementation
scheme that allows an interactive Forth system to generate
code that executes at speeds that rival the execution speeds
of code generated using compiled languages.

Mach2, created for the Macintosh and other 68000
environments, is a Forth system that has discarded the
elaborate pointer-threading schemes commonly used for Forth
in favor of a simple "subroutine-threaded" approach. In a
subroutine-threaded Forth, the Forth compiler generates
machine code. Both the pointer-threaded Forth pseudocode
and the inner interpreter required to execute the pseudocode
are eliminated; the microprocessor can directly execute
subroutine-threaded Forth code.

In benchmark tests performed on a Macintosh SE, the Sieve
(Forth version) benchmark compiled by Mach2 executed at
70 percent of the execution speed of a Sieve program
compiled using a high-performance Macintosh C compiler. The
Sieve (Forth version) benchmark compiled using a traditional
pointer-threaded Forth system executed at 17 percent of the
speed of the compiled C version.

Saying good-bye to the inner interpreter

Lori Chavez, Palo Alto Shipping Company, P.O. Box 7430,
Menlo Park, CA 94026. Lori is a codeveloper of the Mach2
Forth Development System and is also a software consultant.

A Short Description of Forth

The Forth language is composed of approximately 200-300
routines. In Forth terminology these routines are referred to
as words, or definitions. The data structure that holds the
code for the routines and information about them, and that
links all the routines together, is called the dictionary. Each
Forth word has its own entry in the dictionary.

New Forth definitions are created by compiling references to
several previously compiled Forth words and assigning a name
to those compiled references. The new Forth definition, like
any other Forth word, can then be executed interactively by
the Forth interpreter. Execution of the new definition will
cause all words referenced by the definition to be executed
sequentially.

To illustrate this process, let's define a Forth word called
SHIP that, when executed, will perform the steps required to
ship a product from a warehouse:

: SHIP
    GET_QUANTITY
    PREPARE_SHIPMENT
    SEND_PACKAGE
;

GET_QUANTITY fetches the desired quantity value from a
variable and passes the value to PREPARE_SHIPMENT.
PREPARE_SHIPMENT places the desired number of items in
the package and SEND_PACKAGE puts the package in the
mail.

Subroutine vs. Pointer Threading

Figure 1 shows the code generated when SHIP is compiled in
a subroutine-threaded system. Each word that SHIP
references is a subroutine that ends in an assembly-language
return-from-subroutine (RTS) instruction. The compiler
generates a 4-byte, PC-relative, jump-to-subroutine (JSR)
reference for each of the three routines. Because SHIP itself
ends with an RTS instruction, it too can be referenced by
another word in the same manner.

Figure 2 shows how the dictionary entry for SHIP would
appear if it were compiled in a pointer-threaded Forth
implementation. A pointer-threaded Forth does not generate
directly executable machine code. Instead, it generates lists
of addresses, offsets, or tokens that indirectly "point" to the
referenced word. This means a Forth interpreter that
understands this "pseudocode" generated by the
pointer-threaded compiler must be used to execute the "code."
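The difference between the two schemes can be modelled in a few lines. The sketch below is only an analogy in Python, not Forth or 68000 code: ship() stands in for a subroutine-threaded definition, THREADS holds pointer-threaded definitions as lists of references, and inner_interpreter is a hypothetical minimal interpreter. SHIP_TWICE is an invented word used to exercise nested definitions.

```python
log = []

# Stand-ins for the three words SHIP references.
PRIMITIVES = {
    "GET_QUANTITY": lambda: log.append("GET_QUANTITY"),
    "PREPARE_SHIPMENT": lambda: log.append("PREPARE_SHIPMENT"),
    "SEND_PACKAGE": lambda: log.append("SEND_PACKAGE"),
}


# Subroutine threading: SHIP is itself executable code that calls each word.
def ship():
    PRIMITIVES["GET_QUANTITY"]()
    PRIMITIVES["PREPARE_SHIPMENT"]()
    PRIMITIVES["SEND_PACKAGE"]()


# Pointer threading: SHIP is only a list of references (data, not code).
THREADS = {
    "SHIP": ["GET_QUANTITY", "PREPARE_SHIPMENT", "SEND_PACKAGE"],
    "SHIP_TWICE": ["SHIP", "SHIP"],
}


def inner_interpreter(word):
    """Walk a thread with a simulated instruction pointer and return stack."""
    return_stack = []
    thread, ip = THREADS[word], 0
    while True:
        if ip == len(thread):              # end of this definition
            if not return_stack:
                return
            thread, ip = return_stack.pop()
        else:
            name = thread[ip]
            ip += 1
            if name in THREADS:            # nested definition: simulate a call
                return_stack.append((thread, ip))
                thread, ip = THREADS[name], 0
            else:
                PRIMITIVES[name]()         # primitive: run it directly


ship()
direct = list(log)
log.clear()
inner_interpreter("SHIP")
threaded = list(log)
```

Both paths perform the same three steps. The second one pays for the extra loop, lookup, and simulated return stack on every reference, which is the overhead the inner interpreter adds.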

Why Pseudocode?

Forth was originally developed in the days of the 8- and
16-bit processors. Typically, only one of the registers in
these processors (the system stack pointer register) could
take advantage of the fast processor stack manipulation
operators (push and pop).

In a pointer-threaded Forth, the processor system stack is
used as the data stack. Because microprocessors use the
system stack for their subroutine calling mechanism,
pointer-threaded Forth implementations were forced to develop
their own definition execution mechanism. The scheme
devised involved a simulation of the microprocessor program
counter, instruction pointer, and even the instruction set. For
those interested, Ronald Greene provides a listing of the 8088
assembly-language code required to implement a generic
inner interpreter in his 1984 Byte article on reducing
overhead in threaded interpretive languages. The inner
interpreter is naturally slower than the microprocessor's
subroutine calling mechanism and requires more registers to
implement.

It is this simulation of the natural functions of the
microprocessor that became the bottleneck in the effort to
achieve fast Forth execution times. At the time, given the
architecture of the available processors, the Forth
implementors' decision made sense. The data stack is heavily
used in a Forth program. By using the system stack as the
data stack, fast push and pop stack manipulation instructions
could be used to optimize parameter passing. And, in those
days of limited memory, the use of subroutine call
instructions would have required more memory for each
compiled reference in a Forth word.

Today, however, processors are much more flexible. The
Motorola 68000 microprocessor has 16 general-purpose
registers (D0-D7 and A0-A7), 8 of which (A0-A7) can be
used as stack pointer registers. This means that a 68000 Forth
can take advantage of the 68000's subroutine calling
mechanism, which affects the system stack pointer register,
and still use another register for the parameter stack pointer
with no speed penalty.

Optimization Techniques

As Figure 1 demonstrated, a subroutine-threaded Forth
compiler generates machine code. Although this may seem
only natural to those familiar with other language compilers
and assemblers, it is not a common characteristic of Forth.
This change from pseudocode to machine-code generation
has had two positive effects on the Forth language. First, as
the previously mentioned Sieve results testify, subroutine
threading makes Forth code execute significantly faster.
Second, the code and speed optimization techniques
traditional compilers and assemblers have used for many
years can finally be applied to Forth code.

The rest of this article discusses how macro substitution (a
speed improvement technique commonly used by
assemblers) and edge-macro (peephole) optimization (a code
optimization technique commonly used by traditional
language compilers) can be used by a subroutine-threaded
Forth compiler to generate faster and more compact code.

Optimization  for  Speed 

To obtain speed improvements in an assembly-language
program, programmers generally avoid subroutine calls and
place as much code as possible directly into the routine being
optimized. If this technique is used too often in a program,
the source listing can become quite long and unreadable.
Therefore, most assemblers include some sort of macro
facility that allows many instructions to be represented by
one word. When the macro word is encountered during the
assembly process, all instructions that make up the macro
word will be assembled.

"Mach" words are the Mach2 equivalent of a traditional
assembler's macro words. In Mach2, one bit in the length
byte of the dictionary header is used as the Mach bit. If this
bit is set, the corresponding word is a Mach word that is
treated as a macro by the compiler. Whenever a Mach word
is referenced in a definition, the compiler lays a copy of all
instructions from the start of the Mach word up to (but not
including) the first RTS encountered into the definition being
compiled. Because the compiler in a subroutine-threaded
Forth already generates machine code, teaching it to produce
in-line code is a simple task.
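A toy compiler shows how little is needed to lay Mach words in-line. The sketch below is a model, not Mach2's compiler: WORDS, its boolean Mach flag, and compile_definition are hypothetical, the instruction strings for QUANTITY and @ are the ones the article gives in Figure 3, and "..." marks word bodies the article does not show.

```python
RTS = "RTS"

# name -> (Mach bit, machine code of the word as a list of instructions)
WORDS = {
    "QUANTITY": (True, ["LEA XX(A5),A0", "MOVE.L A0,-(A6)", RTS]),
    "@": (True, ["MOVE.L (A6)+,A0", "MOVE.L (A0),-(A6)", RTS]),
    "PREPARE_SHIPMENT": (False, ["...", RTS]),   # body not shown in the article
    "SEND_PACKAGE": (False, ["...", RTS]),       # body not shown in the article
}


def compile_definition(names):
    """Compile a colon definition given the words it references."""
    code = []
    for name in names:
        is_mach, body = WORDS[name]
        if is_mach:
            # Macro: copy everything before the first RTS.
            code.extend(body[:body.index(RTS)])
        else:
            # Ordinary word: compile a subroutine call.
            code.append("JSR " + name)
    code.append(RTS)   # the closing semicolon
    return code


ship = compile_definition(
    ["QUANTITY", "@", "PREPARE_SHIPMENT", "SEND_PACKAGE"]
)
for instruction in ship:
    print(instruction)
```

The only change from plain subroutine threading is the branch on the Mach bit; everything else in the compile loop stays the same.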

To decide which Forth words should be laid in-line, the
overhead of the subroutine call instruction must be weighed
against the size of the subroutine being called. The 68000
PC-relative JSR instruction requires 4 bytes of memory and
18 processor clock cycles. The 68000 RTS instruction
requires 2 bytes of memory and 16 processor clock cycles.
The total overhead for a complete subroutine call, for both
the JSR and RTS instructions, is therefore 6 bytes and 34
clock cycles. For approximately 75 percent of the words in
the Forth kernel, the overhead of the subroutine calling
mechanism is small in comparison to the size and time
required to execute the word itself. For the remaining 25
percent of the words, however, the overhead involved in
calling the words is much greater than the execution time
and memory requirements of the word being called.

    : SHIP
        GET_QUANTITY        ->  JSR GET_QUANTITY
        PREPARE_SHIPMENT    ->  JSR PREPARE_SHIPMENT
        SEND_PACKAGE        ->  JSR SEND_PACKAGE
    ;                       ->  RTS

Figure 1: Subroutine-threaded code for SHIP

    : SHIP                  ->  OFFSET TO COLON
        GET_QUANTITY        ->  OFFSET TO GET_QUANTITY
        PREPARE_SHIPMENT    ->  OFFSET TO PREPARE_SHIPMENT
        SEND_PACKAGE        ->  OFFSET TO SEND_PACKAGE
    ;                       ->  OFFSET TO SEMICOLON

Figure 2: Pointer-threaded code for SHIP

An ideal candidate for a Mach word is a word such as DUP,
which is composed of a single 68000 instruction
(MOVE.L (A6),-(A6)) that is both smaller in size than the JSR
instruction (2 bytes vs. 4 bytes) and faster in execution time
than a subroutine reference (20 processor cycles for DUP
vs. 54 processor cycles for JSR + DUP + RTS).
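The figures quoted for the call overhead and for DUP can be worked through directly. The snippet below uses only the byte and cycle counts stated in the article; the variable names are made up for the illustration.

```python
# Costs stated in the article for the 68000.
JSR_BYTES, JSR_CYCLES = 4, 18
RTS_BYTES, RTS_CYCLES = 2, 16
DUP_BYTES, DUP_CYCLES = 2, 20      # MOVE.L (A6),-(A6)

# Overhead of one complete subroutine call.
call_bytes = JSR_BYTES + RTS_BYTES
call_cycles = JSR_CYCLES + RTS_CYCLES

# DUP reached through a call versus DUP laid in-line.
called_cycles = JSR_CYCLES + DUP_CYCLES + RTS_CYCLES
inline_cycles = DUP_CYCLES
time_saved = 1 - inline_cycles / called_cycles

# At the reference site, the in-line copy is also smaller than the JSR.
bytes_saved_per_reference = JSR_BYTES - DUP_BYTES

print(call_bytes, call_cycles)           # overhead per call
print(called_cycles, inline_cycles)      # DUP by call vs. in-line
print(round(time_saved * 100), "percent fewer cycles")
```

For DUP the call overhead (34 cycles) exceeds the work done (20 cycles), so in-lining removes roughly 63 percent of the cycles while also shrinking each reference by 2 bytes.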

In Mach2, all of the simple stack, arithmetic, comparison,
and memory operators and all variable and constant
references have been declared as Mach words. Fifty percent
of these words generate code that is smaller than or equal in
size to the JSR instruction they replace. An additional 25
percent of the words weigh in at 6 bytes, only 2 bytes larger
than the JSR instruction. In each case, the replacement of
the JSR instruction with in-line code results in at least a 50
percent speed improvement. When used thoughtfully and
selectively, macro substitution can yield a large increase in
program execution speed with only a small increase in
program size.

One example, the definition of GET_QUANTITY, is shown
below. QUANTITY is the name of a 4-byte variable storage
area. When executed it will return the address of its storage
area. The @ operator (fetch) is used to fetch the 4-byte
contents of the QUANTITY variable.

: GET_QUANTITY ( -- n )   \ n means a value is returned on the stack.
    QUANTITY              \ Return variable address
    @ ;                   \ Fetch 4-byte value from variable

Both the variable reference and @ are Mach words. By
placing QUANTITY and @ directly into SHIP, you can
eliminate the overhead of the GET_QUANTITY subroutine
call and experience the benefits of macro substitution.

Figure 3, below, shows the code generated for SHIP when
the code for the QUANTITY variable reference and the @
memory access are treated as assembly-language macros
and laid in-line into SHIP's code area. As you study Figure 3,
note that Mach2 variables are located relative to the address
in the A5 register and that the A6 register is used to maintain
the parameter stack.

With macro substitution, the 4-byte JSR QUANTITY is
replaced with the 4 bytes of assembly-language code that
perform the variable reference. Likewise, the 4-byte JSR @
is replaced with the 4 bytes of assembly-language code that
perform the memory access. SHIP doesn't increase in size
but the QUANTITY @ part of SHIP experiences a 57 percent
speed improvement.

Code  Optimization 

Because Forth words use a stack for parameter passing, it is
common for the first instruction in a Forth word to pull a
parameter from the stack (MOVE.L (A6)+,A0) and for the
last instruction to put a parameter on the stack
(MOVE.L A0,-(A6)). When two Forth Mach words that have
these common beginning and ending instructions are butted
together, as in Figure 3, the result is the following redundant
and inefficient code sequence:

    MOVE.L A0,-(A6)
    MOVE.L (A6)+,A0


Fortunately, it is not hard for the compiler to watch for
these edge macro conditions and eliminate unnecessary code
if possible (see Figure 4, below). With the use of this edge
macro optimization, the SHIP code becomes faster and more
compact. Two instructions, 4 bytes, can be removed from
the QUANTITY @ code and the required clock cycles then
drop from 52 cycles to 28 cycles.
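The edge macro check amounts to a one-instruction lookback. The sketch below is an illustrative model rather than Mach2's optimizer: peephole is a hypothetical function, and the per-instruction cycle counts in CYCLES are assumed values taken from standard 68000 timing tables, not from the article. They are used here because they add up to the article's totals of 52 and 28 cycles.

```python
PUSH_A0 = "MOVE.L A0,-(A6)"
POP_A0 = "MOVE.L (A6)+,A0"

# Assumed 68000 cycle counts (not given per instruction in the article).
CYCLES = {
    "LEA XX(A5),A0": 8,
    PUSH_A0: 12,
    POP_A0: 12,
    "MOVE.L (A0),-(A6)": 20,
}


def peephole(code):
    """Drop each push of A0 that is immediately followed by a pop into A0."""
    out = []
    for instruction in code:
        if instruction == POP_A0 and out and out[-1] == PUSH_A0:
            out.pop()            # remove the push; the pop is skipped too
        else:
            out.append(instruction)
    return out


# QUANTITY @ after macro substitution (the right-hand column of Figure 3).
before = ["LEA XX(A5),A0", PUSH_A0, POP_A0, "MOVE.L (A0),-(A6)"]
after = peephole(before)

cycles_before = sum(CYCLES[i] for i in before)
cycles_after = sum(CYCLES[i] for i in after)
print(after, cycles_before, cycles_after)
```

Removing the pair is safe in this model because the push followed by the pop leaves both A0 and the stack pointer exactly as they were.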

    Forth source          Without macro substitution    With macro substitution

    : SHIP
        QUANTITY          JSR QUANTITY                  LEA    XX(A5),A0
                                                        MOVE.L A0,-(A6)
        @                 JSR @                         MOVE.L (A6)+,A0
                                                        MOVE.L (A0),-(A6)
        PREPARE_SHIPMENT  JSR PREPARE_SHIPMENT          JSR PREPARE_SHIPMENT
        SEND_PACKAGE      JSR SEND_PACKAGE              JSR SEND_PACKAGE
    ;                     RTS                           RTS

Figure 3: Subroutine-threaded code for SHIP

    : SHIP
        QUANTITY @        LEA    XX(A5),A0
                          MOVE.L (A0),-(A6)
        PREPARE_SHIPMENT  JSR PREPARE_SHIPMENT
        SEND_PACKAGE      JSR SEND_PACKAGE
    ;                     RTS

Figure 4: Subroutine-threaded code for SHIP (with macro substitution and
edge macro optimization)

Conclusions

The advantages of subroutine threading are many. For example:

1. Speed: By removing the inner interpreter and using code
and speed optimization techniques, subroutine-threaded Forth
code executes three to four times faster than pointer-threaded
Forth code.

2. Sharable code: Linking to other languages is simpler.
Because all languages involved use only machine code
instructions, only register saving and setup are required.

3. Conventional: The implementation is easy for non-Forth
programmers to understand and use.

4. Relocatable code: The use of PC-relative subroutine
references means the code generated is relocatable.

5. Well-suited to systems-level programming: It is easy for a
subroutine-threaded compiler to generate stand-alone
assembly-language routines, such as those required for
device drivers.

Subroutine threading has disadvantages in the areas of size
of generated code and the ease of disassembly.

Regarding size, with the inclusion of a subroutine call
instruction with each reference and the use of macro
substitution, the code generated by a subroutine-threaded
Forth will be 50-100 percent larger than code generated by a
pointer-threaded Forth.

The ease of disassembly is, happily, not a problem every
software developer will face. Subroutine-threaded code is
easier to disassemble than pointer-threaded code. Therefore
it is correspondingly harder to disguise proprietary algorithms
written in a subroutine-threaded Forth.

Bibliography 

Greene, Ronald. "Faster Forth: Reducing Overhead in
Threaded Interpretive Languages." Byte (June 1984): 127.

Loeliger, R. G. Threaded Interpretive Languages.
Peterborough, N.H.: Byte Books, 1981.

DDJ 



ARTICLES 


A Forth Standard Prelude


by  Martin  Tracy 


Can the Forth-83 Standard dialect be used to write
substantial programs? The answer is a qualified "yes." The
qualification is that the program must be preceded by a
software "prelude." The purpose of this prelude is to provide
the program with those facilities that cannot be found within
the Standard. The good news is that the prelude can be
surprisingly short.

Two years ago, I began editing programs for the Forth Model
Library, which is sold by the Forth Interest Group (FIG). The
programs had been submitted by different authors and were
written in various implementations of the Forth-83 Standard
dialect. Each program consisted of 70 to 120 screens of
source code. My task was to convert these programs to forms
that would run equally well under several available Forth-83
implementations.

I found that a program written in one Forth-83
implementation would not run correctly under another. The
reason for this was simple. The authors needed software
tools that could not be provided by the Standard but that
were instead provided by the implementation. The solution
to the problem was also simple. I wrote a software prelude,
one for each implementation, that provided the missing
functions. The user of a particular Forth-83 package then
simply loaded these screens before loading the program.

Martin is DDJ's Forth columnist. He can be reached at Forth
Inc., 111 N. Sepulveda, Manhattan Beach, CA 90266.

When I say "could not be provided by the Standard," I do not
mean words that do not happen to appear in the Standard but
that can be written using Standard words. Take, for example,
the words NIP and TUCK, which can be found in the
Laxen/Perry F83 implementation:

: NIP   ( n1 n2 -- n2 )        SWAP DROP ;
: TUCK  ( n1 n2 -- n2 n1 n2 )  SWAP OVER ;

As you can see, these words can be defined using the
Forth-83 Standard words SWAP, DROP, and OVER. Thus,
although NIP and TUCK are not mentioned in the Standard,
they can be written with Standard words and used in
Standard programs. It is up to the author (or the hapless
editor) to make sure these words are defined before use.
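The stack effects of NIP and TUCK can be traced with a small model. The sketch below is Python rather than Forth and treats a list as the data stack with the top at the end; the function names simply mirror the Forth words and are not part of any Forth system.

```python
# The three Standard words, modelled on a Python list (top of stack last).
def swap(stack):
    stack[-1], stack[-2] = stack[-2], stack[-1]


def drop(stack):
    stack.pop()


def over(stack):
    stack.append(stack[-2])


# The two non-Standard words, built only from the ones above.
def nip(stack):      # ( n1 n2 -- n2 )
    swap(stack)
    drop(stack)


def tuck(stack):     # ( n1 n2 -- n2 n1 n2 )
    swap(stack)
    over(stack)


a = [1, 2]
nip(a)

b = [1, 2]
tuck(b)

print(a, b)
```

Tracing TUCK by hand: SWAP turns 1 2 into 2 1, and OVER copies the second item to the top, giving 2 1 2.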

Recursion 

Consider, on the other hand, recursive definitions. The
Forth-83 Standard does not mention recursion in its Required
Word Set, although it defines RECURSE in the Controlled
Reference Word Set:

    RECURSE      --                              C,I,83
                 -- (compiling)
    "Compile the compilation address of the definition being
    compiled to cause the definition to later be executed
    recursively."

Controlled reference words are word definitions that
"although not required, cannot be present with a
non-standard definition in the vocabulary FORTH of a
Standard System."

You would know what RECURSE meant if you saw it in a
Standard program. This does not mean, however, that
programs using recursion must do so with RECURSE. Some
implementations prefer the word MYSELF. Others, like
Laxen/Perry F83, provide a different mechanism entirely:
words whose definitions begin with the command
RECURSIVE can refer to themselves.

In other words, the Standard says "a word by this name has
this function and syntax" rather than "this function is
performed by a word of this name and syntax." The confusion
of form and function is apparent in other places as well.
Consider the definition of >NAME in the "Experimental
Proposal: Definition Field Address Conversion Operators":

    >NAME        addr1 -- addr2                  "to-name"
    "addr2 is the name field address corresponding to the
    compilation address addr1."

The form or syntax of this word is the same as its function:
to take you from the compilation address to the name field
address. Because you don't know the structure of the name
field, however, you can't do anything with it. The underlying
function you really want is to display the name of a word,
given its compilation address.

Is recursion important? You bet. Forth recurses faster and
more naturally than any other popular high-level
programming language, including C and LISP. Why? First of
all, because Forth words keep the majority of their
arguments on the stack, which is a naturally recursive
structure. Second, because Forth generally does not use local
variables and so has no stack frame or other lexical structure
to build each time it recurses.

A popular belief, even among Forth programmers, is that
recursion should be avoided because the return stack is
small. The Standard guarantees a return stack of only 48
bytes. But this only means that "tail-recursive" problems,
such as those that visit each item on a list, should be
unravelled into iterative structures, such as a DO loop.
(Modern LISP compilers do this automatically.) There is an
equally rich set of "head-recursive" problems, however, that
recurse only to the depth of the problem and not to its
breadth. These are problems such as shape-filling algorithms,
tree traversal, and natural language parsing, which seldom
recurse any deeper than eight or nine levels.

Fortunately, RECURSE can be defined easily even in implementations that
do not include it. For example:

: RECURSE  [COMPILE] MYSELF ; IMMEDIATE

This definition is likely to be a "one-liner" even in implementations
that support some other recursion mechanism entirely. Why? Because the
concept of recursion is simple and natural to the Forth language.
Simplicity and harmony are the guidelines for selecting words for the
Standard prelude.

Compiler  Words 

The  Forth-83  Standard  provides  a  set 
of  words  to  support  the  compiler, 
such  as  [COMPILE]  and  IMMEDIATE. 
These  words  can  also  be  used  to  ex¬ 
tend  the  compiler.  They  are,  in  fact, 
the  building  blocks  for  new  compiler 
words.  Compiler  words  are  used  in 
Forth  to  build  flow-of-control  struc¬ 
tures  and  to  "hide  and  provide”  in¬ 
line  data  in  colon  definitions.  The  ex¬ 
tensible  compiler  is  one  of  the  true 
strengths  of  the  Forth  language. 

Compiler words generally have two functions: they must compile both a
run-time operator and the in-line data upon which it operates. The
run-time operator also has two functions: it must operate meaningfully
on the in-line data, and it must adjust the Forth instruction pointer
(which I will call I) to skip over this data.

Unfortunately, the Standard does not provide for the creation of new
run-time words. A run-time word has no Standard way of finding or
skipping over the following in-line data. Consider how the word LITERAL
might be defined:

: lit  ( -- n)  R> DUP @ SWAP 2+ >R ;

: LITERAL  COMPILE lit , ; IMMEDIATE

LITERAL compiles the run-time word lit, followed by the number on the
stack. Because lit is a colon definition, you might expect R> to move
lit's return address to the data stack. Furthermore, the return address
of lit should be the address of the in-line data that follows, right?
Actually, this definition of lit will work correctly on a great many
Forth implementations. Not all Forths increment the instruction pointer
I to point to the next address, though. Some increment it only on
demand, reasoning that the increment is wasted when it precedes a
branch. Others compromise and only increment it to point to the next
byte. In these cases, R> points near to but not directly at the in-line
data and must be adjusted. The adjustment can be hidden in the way
shown in Example 1, below. Experience has shown that this solves the
problem in almost all cases. I> (I-from) and >I (to-I) have even been
written for a Forth with a 16-bit data stack width and a 32-bit return
stack width and have worked correctly.

Alignment 

Now consider an often requested function: the in-line string compiler.
Usually called simply " (double quote), it might be implemented in this
way:

: (")  ( -- addr n)  I> @ COUNT 2DUP + >I ;

: "  COMPILE (") 34 WORD
   DUP C@ ( n ) 1+ >R
   COUNT HERE SWAP CMOVE R>
   ALLOT ; IMMEDIATE

: EXAMPLE
   " EXAMPLE prints a string." TYPE ;

This definition of " works, in principle, on strings with an odd number
of characters. Otherwise, on a byte-addressed machine with even address
or "cell" alignment, the definition fails. On some Motorola 68000 Forth
implementations, the failure is fatal.

You can easily define a pair of words to hide address alignment,
provided you are able to make one simplifying assumption: Forths that
use cell alignment keep the dictionary in an aligned state at all times.
Even if the dictionary is only aligned while compiling a definition, you
can define " with ALIGN and REALIGN as shown in Example 2.

: I>  COMPILE R> ; IMMEDIATE               ( no offset )
: >I  COMPILE >R ; IMMEDIATE               ( no offset )

: I>  R> R> 1+ SWAP >R ; IMMEDIATE         ( one-byte offset )
: >I  R> SWAP 1- >R >R ; IMMEDIATE         ( one-byte offset )

: I>  R> R> 2+ SWAP >R ; IMMEDIATE         ( two-byte offset )
: >I  R> SWAP 2- >R >R ; IMMEDIATE         ( two-byte offset )

: lit  ( -- n)  I> DUP @ SWAP 2+ >I ;
: LITERAL  COMPILE lit , ; IMMEDIATE

Example 1: A way to hide the adjustment of R>

: ALIGN  ; IMMEDIATE                       ( no alignment )
: REALIGN  ;                               ( no alignment )

: ALIGN  HERE 1 AND ALLOT ; IMMEDIATE      ( cell-alignment )
: REALIGN  ( a -- a' )  DUP 1 AND + ;      ( cell-alignment )

: (")  ( -- addr n)  I> @ COUNT 2DUP + REALIGN >I ;
: "  COMPILE (") 34 WORD DUP C@ ( n ) 1+ >R
   COUNT HERE SWAP CMOVE R> ALLOT ALIGN ; IMMEDIATE

Example 2: A definition of " when the dictionary is aligned

You  might  argue  that  the  word  " 
should  be  in  the  Standard  prelude  in¬ 
stead  of  the  ALIGN  and  REALIGN 
pair.  The  trouble  with  "  is  that  it’s  not 
simple  enough.  It  leaves  you  with  no 
way  to  compile  a  sequence  of  non- 
printable  control  characters  or  to 
align  a  CREATE .  . .  DOES>  structure. 
Furthermore,  there  is  no  general 
agreement  among  implementors  as 
to  whether  "  should  return  the  one- 
argument  address  of  a  counted  string 
or  the  two-argument  address  of  the 
first  character  and  its  length — a  form 
suitable  for  TYPE.  By  redefining  "  in 
the  Standard  prelude,  you  can  guar¬ 
antee  its  syntax. 

Interpreting  a  String 

The  Forth-83  Standard  describes  the 
terminal  input  buffer  in  such  a  way 
that  you  might  think  that  if  you 
CMOVE  a  string  into  TIB,  store  its 
length  into  #TIB,  and  set  BLK  and 
>IN  to  0,  then  you  will  force  the 
Forth  text  interpreter  to  interpret  the 
string.  Why  would  you  want  to  do 
that?  Well,  it  is  handy  to  be  able  to 
compile  a  word  such  as  FIND  or  FOR¬ 
GET  in  a  colon  definition  along  with 
its  argument. 

Unfortunately,  these  words  are  de¬ 
fined  to  read  their  arguments  from 
the  input  stream.  How  nice  if  you 
could  compile  the  input  stream  as 
well: 

: ME  ( Initialize system, then . . . )
   " FORGET ME" EVAL ;

I  am  assuming  that  EVAL  does 
the  work  of  moving  the  string 
into  TIB  and  so  on.  In  the  same 
manner,  you  could  use  FIND  to  see 
if  a  particular  word  is  present  in 
the  dictionary: 

:  WELL?  "  FIND  ME"  EVAL  IF  .  .  . 

You  could  also  create  words  from 
within  other  words  and  reference 
words  before  they  are  defined.  The 
fundamental  right  to  treat  (string) 


data  as  an  executable  program  is 
guaranteed  by  the  Von  Neumann 
architecture. 

I  have  borrowed  the  word  EVAL 
from  the  LISP  function  by  the  same 
name.  The  problem  with  EVAL  is  that 
it’s  not  simple  enough.  The  Standard 
already  lets  you  set  up  TIB  #TIB  >IN 
and  BLK.  But  it  provides  you  with  no 
way  to  invoke  the  text  interpreter  to 
interpret  it.  The  function  you  are 
missing  is  INTERPRET.  Like  RE¬ 
CURSE,  it  is  found  in  the  Controlled 
Reference  Word  Set: 

INTERPRET  --  M,83
"Begin text interpretation at the character indexed by the contents of
>IN relative to the block number contained in BLK, continuing until the
input stream is exhausted. If BLK contains zero, interpret characters
from the text input buffer."

Given  INTERPRET,  you  can  now 
define  a  simple  EVAL: 

: EVAL  ( a n)  DUP >R  TIB SWAP CMOVE  R@ #TIB !
   0 >IN !  0 BLK !  INTERPRET  R> >IN ! ;

The  sequence  R>  >IN  !  marks  the 
input  stream  as  exhausted.  I  have 
chosen  the  two-argument  string 
form.  With  INTERPRET,  you  can  im¬ 
plement  either  string  form. 

Screen  Display 

The six words RECURSE, INTERPRET, I>, >I, ALIGN, and REALIGN supply the
most often requested nonstandard functions. The most often requested
extension, however, is for video screen control. Virtually all available
Forth-83 implementations allow the programmer to control the appearance
and cursor position of the video display. A smart presentation can mean
more to a program than a string stack or floating-point math.

Although  video  displays  vary 
widely,  modern  displays  have  been 
standardized  to  a  gratifying  extent. 
You  can,  in  fact,  add  a  fairly  good 
screen  display  extension  to  the  Stand¬ 
ard  prelude  if  you  follow  a  few  sim¬ 
plifying  assumptions  and  rules: 

1.  Assume  the  screen  is  at  least  80 
characters  wide  and  24  lines  high. 

2. Define the word PAGE to clear the entire screen and home the cursor
to the upper left-hand corner.

3. Define the word TAB ( x y ) to position the cursor at the given x
(character) and y (line) coordinates. Coordinate pair (0,0) is in the
upper left-hand corner.

4. Define the word MARK ( a n ) to print the given two-argument
character string in highlight or inverse video.

5.  Never  write  into  or  over  column 
79.  Never  issue  a  carriage  return 
from  row  23.  In  other  words,  you  do 
not  face  the  issues  of  wrapping  the 
line  or  scrolling  the  display. 

It  turns  out  that  these  five  restrictions 
are  sufficient  to  generate  some  really 
nice  displays. 

Cell Addressing

The  Forth-83  Standard  describes  only 
implementations  on  byte-addressed 
CPUs  with  a  64K  address  space.  Fu¬ 
ture  Forth  standards  are  likely  to  con¬ 
sider  much  larger  address  spaces. 
There  are  already  several  32-bit 
Forth  implementations. 

A Standard Forth implementation assumes that there are 2 bytes per cell,
and Standard programs are filled with 2+s and 2*s accordingly. On a
byte-addressed 32-bit Motorola 68000 implementation, however, there are
4 bytes per cell. On a cell-addressed Novix 4016 or Texas Instruments
TMS32020, there may be only 1 byte per cell. The number of bytes per
cell is used mostly to specify how much dictionary memory to allocate or
how to skip to the next cell of a data structure.

This number can be hidden from an application with CELL, CELLS, and
CELL+ as shown in Example 3, page 45.

Byte  order  within  a  cell  is  normally 
not  a  problem.  Bytes  that  are  written 
by  byte  operators  should  be  read  by 
byte  operators.  Be  careful  when  you 
define  byte  operators  that  are  based 
on  cell  operators  to  make  them  inde¬ 
pendent  of  the  byte  order. 

An  Experimental  Proposal 

The Forth community has long been searching for a solution to a classic
programming problem: the string search. When you search for characters
in a string, you generally use a DO loop. You leave the loop in one of
two circumstances:

1.  The  search  is  successful.  You  leave 
the  loop  immediately. 

2.  The  search  is  unsuccessful.  You 
leave  the  loop  because  it  is  exhausted. 

The  problem  is  that  once  you  have 
left  the  loop,  how  do  you  know  if  the 
search  was  successful? 

One  solution  is  to  maintain  a  flag  on 
the  stack: 

: SEARCH  . . .  0 ( flag )  ROT ROT
   DO  DROP  . . .  ( compare strings )  =
      IF  . . .  -1 LEAVE  THEN  0 ( flag )
   LOOP ;

If  the  search  is  successful,  the  flag 
will  be  true. 

Leo  Brodie,  Wil  Baden,  and  others 
point  out  that  a  much  better  ap¬ 
proach  is  to  leave  the  loop  and  the 
word  that  contains  it  when  the 
search  is  successful: 

: SEARCH  . . .
   DO  . . .  ( compare strings )  =
      IF  . . .  -1 LEAP ( leave word entirely )  THEN
   LOOP  0 ;

LEAP  is  Leo  Brodie's  suggestion,  but  it 
has  the  usual  problem:  it's  not  simple 
enough. 

Wil  Baden’s  solution  looks  like  this: 

: SEARCH  . . .
   DO  . . .  ( compare strings )  =
      IF  . . .  -1 UNDO EXIT ( leave word entirely )  THEN
   LOOP  0 ;

The  command  UNDO  “undoes”  the 
loop  by  discarding  the  index,  limit, 
and  any  other  loop  items  on  the  re¬ 
turn  stack  before  leaving  the  word 
with  EXIT.  UNDO  has  the  additional 
charm  that  it  can  leave  a  word  from  a 
nested  loop,  as  in  UNDO  UNDO  EXIT. 

UNDO  could  be  defined  as  a  colon 
definition  in  this  way: 

: UNDO  I> R> R> 2DROP >I ;            ( discard 2 items )

: UNDO  I> R> R> 2DROP R> DROP >I ;    ( discard 3 items )

Actually, UNDO is more easily defined as a CODE definition and in some
implementations is only one instruction.

In  Summary 

The Forth Standard prelude described in this article (see Listing One,
page 90) has proven to be an effective way to write substantial programs
that run unchanged on several different implementations of the Forth-83
Standard. The prelude can hide differences in byte addressing and cell
alignment from the application programmer. As more experience is gained,
it may be possible to extend the prelude to hide ROMability and other
implementation dependencies.

Availability 

All  the  source  code 
for  articles  in  this  is¬ 
sue  is  available  on  a 
single  disk.  To  order, 
send  $14.95  to  Dr. 

Dobb’s  Journal,  501 
Galveston  Dr.,  Red¬ 
wood  City,  CA  94063, 
or  call  (415)366-3600. 


Bibliography 

Baden,  Wil.  “Escaping  Forth.”  1986 
FORML  Proceedings.  Available  from 
the  Forth  Interest  Group  ([408]  277- 
0668). 

Brodie,  Leo.  Thinking  Forth.  Engle¬ 
wood  Cliffs,  N.J.:  Prentice-Hall,  1984. 

DDJ 

(Listing  begins  on  page  90.) 



2 CONSTANT CELL     ( byte-addressed 16-bit cells )
: CELLS  2* ;       ( size of cell area in bytes )
: CELL+  2+ ;       ( skip to the next cell address )

4 CONSTANT CELL     ( byte-addressed 32-bit cells )
: CELLS  4 * ;      ( size of cell area in bytes )
: CELL+  4 + ;      ( skip to the next cell address )

1 CONSTANT CELL     ( cell-addressed one-byte-per-cell )
: CELLS  ;          ( size of cell area in bytes )
: CELL+  1+ ;       ( skip to the next cell address )

CELL ALLOT          ( allocate a cell )
10 CELLS ALLOT      ( allocate 10 cells )
( addr ) DO  . . .  CELL+ ( skip to next cell )  LOOP


Example  3:  A  way  to  hide  the  number  of  bytes  per  cell 




ARTICLES 


Pattern  Matching  Using 
Finite  State  Machines 

by  Charles  F.  Bowman 


If  you’re  a  programmer,  the 
chances  are  you’ve  had  to  deal 
with  the  problem  of  pattern 
matching.  It  can  be  a  simple  problem 
solved  with  a  trivial  string  compari¬ 
son  utility,  or  it  can  be  so  complex  as 
to  require  the  use  of  a  lexical  analyz¬ 
er. 

Most  people  deal  with  pattern 
matching  every  day:  as  part  of  a 
search-and-replace  operation  in  a 
text  editor,  as  a  means  of  data  retriev¬ 
al  in  a  database  application,  as  unique 
identifiers  in  program  source  code, 
and  so  on.  Because  the  task  is  so  com¬ 
mon,  it’s  worthwhile  to  examine 
ways  to  perform  it  as  efficiently  as 
possible.  This  article  describes  how 
state  machines  can  be  useful  in  solv¬ 
ing  this  type  of  problem. 

To  demonstrate  the  use  and  effec¬ 
tiveness  of  state  machines,  this  article 
provides  the  source  code  for  a  pro¬ 
gram  called  findcmd.  The  program 
does  just  what  the  name  implies — it 
finds  commands.  When  invoked,  the 
program  searches  each  component 
of  the  user's  path  variable  for  all  pro¬ 
grams  (files)  that  match  the  supplied 
pattern  arguments.  The  pattern 
string  can  contain  wildcard  charac¬ 
ters  just  like  those  the  DOS  command 
shell  accepts.  The  program  uses  an 
extension  of  the  Knuth-Morris-Pratt 
(KMP)  algorithm,  which  implements 
pattern  matching  using  a  finite  state 
machine — more  on  this  later. 

State  Machines 

A complete and formal discussion of Turing machines, automata theory,
and so on is beyond the scope of this article. For a more detailed
discussion of the topic, see the bibliography.

Charles F. Bowman, 24 Jacques Ave., Staten Island, NY 10306. Charles is
a consultant and is currently writing a textbook on data structures. He
holds an M.S. degree from New York University.

In  general,  a  state  machine  has  the 
following  attributes: 

•  A  finite  set  of  states,  including  an  ini¬ 
tial,  or  start,  state  and  a  stopping,  or 
end,  state. 

•  A  finite  set  of  state  transitions  (a  col¬ 
lection  of  moves  the  machine  can 
make).  Generally,  the  transitions  are 
represented  as  a  two-dimensional  ar¬ 
ray,  indexed  on  one  axis  by  the  cur¬ 
rent  state  number  and  on  the  other 
axis  by  the  current  input  token. 

•  A set of input symbols (the alphabet) on which the state transitions
will occur (for my purposes, I define this as the file names that will
be compared with the pattern arguments).

•  A  read  head  that  points  to  the  next 
available  input  character. 

The basic operation of a state machine can be described as follows. Upon
invocation, the machine is initialized to the start state. It then
iteratively examines the current input symbol and, using the set of
state transitions, moves to the next state, advancing the read head as
required. It continues this process until it encounters either an error
condition or an accepting state.


The  actual  transition  decisions  are  ac¬ 
complished  by  what  is  effectively  a 
table  lookup.  Both  the  current  state 
and  current  input  token  are  used  to 
index  into  the  transition  table  to  de¬ 
termine  the  resulting  action  of  the 
machine.  The  range  of  actions  in¬ 
cludes  stop  (error),  accept,  and  move 
(to  new  state). 

The  stop  action  occurs  when  the 
current  input  token  is  invalid  with  re¬ 
spect  to  the  current  state.  This  situa¬ 
tion  raises  an  error  condition  that  usu¬ 
ally  results  in  the  machine  halting  (in 
a  compiler,  for  example,  this  would 
typically  be  the  point  at  which  you 
would  receive  a  message  such  as  "syn¬ 
tax  error  line  16”).  The  accept  action 
occurs  only  when  the  machine  is  in  a 
valid  halt  state  and  has  exhausted  the 
input  stream  (the  read  head  points  to 
the  end  of  a  file).  (In  a  compiler,  this 
would  mean  your  source  code  was 
syntactically  correct.)  The  intermedi¬ 
ate  transition  states  comprise  the  re¬ 
mainder  of  the  table;  they  move  the 
machine  from  state  to  state  based  on 
the  input  stream. 
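The table lookup described above can be sketched in C. This is a minimal illustration, not code from findcmd: the language it recognizes (an optional sign followed by one or more digits) and every name in it (fsm_accepts, classify, transition, the S_ and T_ constants) are assumptions chosen for the example.

```c
#include <ctype.h>

/* Hypothetical machine: accepts an optional sign followed by digits. */
enum { S_START, S_SIGN, S_DIGITS, N_STATES };        /* states        */
enum { T_SIGN, T_DIGIT, T_OTHER, T_END, N_TOKENS };  /* input classes */
enum { STOP = -1, ACCEPT = -2 };  /* actions; values >= 0 mean "move" */

/* Transition table indexed by [current state][current input token]. */
static const int transition[N_STATES][N_TOKENS] = {
    /*              sign     digit      other  end    */
    /* START  */ {  S_SIGN,  S_DIGITS,  STOP,  STOP   },
    /* SIGN   */ {  STOP,    S_DIGITS,  STOP,  STOP   },
    /* DIGITS */ {  STOP,    S_DIGITS,  STOP,  ACCEPT },
};

/* Map a character to its input class; '\0' marks end of input. */
static int classify(char c)
{
    if (c == '\0') return T_END;
    if (c == '+' || c == '-') return T_SIGN;
    if (isdigit((unsigned char)c)) return T_DIGIT;
    return T_OTHER;
}

/* Returns 1 if the machine accepts the input, 0 if it stops on error. */
int fsm_accepts(const char *input)
{
    int state = S_START;

    for (;;) {
        int action = transition[state][classify(*input)];
        if (action == ACCEPT) return 1;   /* valid halt state, input used up */
        if (action == STOP)   return 0;   /* token invalid in this state     */
        state = action;                   /* move to the new state           */
        input++;                          /* advance the read head           */
    }
}
```

Because the end-of-input class always maps to stop or accept, the read head never advances past the terminating NUL. Calling fsm_accepts("-42") takes the machine through START, SIGN, and DIGITS before it accepts.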

There are two important points to note here. One is that the machine is
not required to absorb an input symbol (advance the read head) with each
state transition. It is perfectly acceptable for the machine to
accomplish multiple state transitions with the same input symbol
remaining current. The other point is that, by the very virtue of the
state transitions, the machine always "knows" what it has seen
previously. In other words, you could say, "If the machine is in state
X, then the last N characters have to be the following . . . ." This is
an extremely important characteristic of a state machine because it
affords it the luxury of discarding input symbols once it has used them.
(If need be, the machine can reconstruct the input stream for the last N
characters just from knowing the current state.)

Uses  of  State  Machines 

State  machines  have  many  uses.  The 
most  common,  as  mentioned  earlier, 
is  in  compiler  writing,  where  they 
are  typically  used  in  the  lexical  anal¬ 
ysis  and  the  parsing  phases  of  compi¬ 
lation.  There  are  also  programs  (most 
notably  YACC)  that  produce  a  state 
machine  from  a  formal  definition  of 
a  language.  Database  management 
systems  (query  languages)  also  rely 
heavily  upon  state  machines. 

I  have  used  state  machines  on  nu¬ 
merous  occasions,  most  recently 
when  I  was  asked  to  write  a  program 
that  would  extract  information  selec¬ 
tively  from  a  continuous,  on-line  data 
stream.  The  input  was  being  pro¬ 
duced  by  a  PBX  that  generated  call- 
usage  and  call-summary  reports.  The 
application  required  that  only  select¬ 
ed  data,  from  a  subset  of  the  reports, 
be  extracted  and  stored  for  further 
processing.  The  obvious  difficulty 
was  remembering,  at  any  given 
point,  which  of  the  many  reports 
were  being  read  and  what  data  to  ex¬ 
tract.  I  was  able  to  write  such  a  pro¬ 
gram  quickly  and  efficiently  by  mod¬ 
eling  the  events  within  the 
framework  of  a  state  machine. 

KMP  Algorithm 

The  Knuth-Morris-Pratt  (KMP)  algo¬ 
rithm  accomplishes  pattern  match¬ 
ing  through  the  use  of  a  state  ma¬ 
chine.  Using  this  technique,  you  can 
efficiently  construct  a  finite  automa¬ 
ton  for  a  given  pattern  string.  More¬ 
over,  you  can  then  use  the  machine 
to  test  quickly  for  an  occurrence  of 
the  pattern  in  subject  strings.  The  al¬ 
gorithm  is  really  divided  into  two  sec¬ 
tions:  the  first  produces  the  state  ma¬ 
chine  (transition  table)  derived  from 
its  pattern  arguments;  the  second 
compares  the  compiled  pattern  with 
subject  strings. 

The  transition  table  is  constructed  in 
a  straightforward  manner.  It  has  an 
initial  or  start  state,  followed  by  one  or 
more  transition  states  that  are  derived 
directly  from  the  pattern  string.  A  one- 
to-one  correspondence  exists  between 
the  pattern  characters  and  the  gener¬ 
ated  machine  states.  These  are  then 
followed  by  an  accept  state.  Refer  to 
the  function  inits(  )  in  Listing  Three, 


page  106,  for  an  example. 

The  second  part  of  the  algorithm 
uses  the  transition  table  to  make  com¬ 
parisons  with  input  strings.  It  begins 
processing  in  the  start  state  and  then 
iteratively  compares  the  current 
state  information  with  the  corre¬ 
sponding  subject  string  character.  If 
the  comparison  shows  them  to  be 
equal,  the  machine  moves  to  the  next 
state;  if  they  are  not  equal,  the  strings 
are  not  identical  and  the  machine 
halts.  If  the  machine  reaches  the  ac¬ 
cept  state  at  the  same  time  as  it  ex¬ 
hausts  the  subject  string,  it  halts  and 
accepts  the  input  (the  pattern  and 
subject  match). 

Let's take a close look at the operation of the algorithm. For this
example, assume a pattern string $P = p_1 p_2 p_3 \ldots p_n$ and an
input string $I = i_1 i_2 i_3 \ldots i_n$. The machine begins in state 0
with its read head pointing at $i_1$.

If the first input token, $i_1$, is not equal to $p_1$, then the machine
remains in state 0. If $i_1 = p_1$, then the machine advances to state
1. In both cases, the read head is advanced and the current input token
is discarded. The machine continues in this fashion as long as the
current input symbol matches the pattern character of the current state.

To generalize, suppose that, after having read the input symbols
$i_1 i_2 i_3 \ldots i_k$, the machine is in state $j$. That means that
the last $j$ tokens of the input stream are equal to
$p_1 p_2 p_3 \ldots p_j$, the first $j$ tokens of the pattern string. If
$i_{k+1} = p_{j+1}$, then the machine enters state $j+1$ and advances
the read head. If $i_{k+1} \neq p_{j+1}$, then the machine must recover;
that is, it must begin to look for an occurrence of the pattern string
at the next logical input position. It cannot, however, just blindly
enter state 0 and resume processing with the next (current) input token;
it could miss an occurrence of the pattern beginning at locations
$i_{(k-j)+1}$ or $i_{(k-j)+2}$. And, although possible (as stated
previously), it is extremely inefficient to have the machine reconstruct
and reprocess portions of the input stream. What the machine must do,
therefore, is "shift" the pattern forward so that it lines up with some
portion of the input stream already processed.

Figure 1: Before the comparison

Figure 1, below, shows an example of how this works. If the next subject
character, X, is not an A, then the state machine would move the pattern
forward as in Figure 2, below. Consequently, it saves the expense of
having to compare the pattern beginning at the first B in the subject
string. Also, because the states effectively "remember" the last n
symbols, the machine is not forced to backtrack over the input stream.
(As mentioned earlier, this feature lets the machine discard input
tokens once it has read them.)

To accomplish this algorithmically, the machine employs a failure
function, $f(j)$, which is defined as returning the largest $s$ (smaller
than $j$) such that $p_1 p_2 p_3 \ldots p_s$ is a suffix of
$p_1 p_2 p_3 \ldots p_j$. That is,
$p_1 p_2 p_3 \ldots p_s = p_{j-s+1} p_{j-s+2} p_{j-s+3} \ldots p_j$.

Before  I  demonstrate  how  to  com¬ 
pute  the  failure  function,  I'll  explain 
how  it  will  be  used.  Given  the  pattern 
string  P=aabbaab,  the  values  of  the 
failure  function  will  be  as  shown  in 
Figure  3,  below. 

Suppose that the machine is again in state $j$, having read
$i_1 i_2 i_3 \ldots i_k$. Further, suppose that $i_{k+1} \neq p_{j+1}$.
The machine will apply the failure function in the following manner:


Figure  2:  After  the  comparison 


STATE     | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
F(STATE)  | 0 | 1 | 0 | 0 | 1 | 2 | 3 |

Figure 3: Values of the failure function for the pattern string P=aabbaab


Figure 1:

Pattern:          |ABCABC| A . . .
Subject:    . . . |ABCABC| X . . .

Figure 2:

Pattern:              |ABC| ABCA . . .
Subject:    . . . ABC |ABC| X . . .



Step 1: Let $u = f(j)$. If $i_{k+1} = p_{u+1}$, the machine enters
state $u+1$ and advances the read head.

Step 2: Otherwise, if $f(j) \neq 0$, set $j = f(j)$ and repeat step 1.

Step 3: If $f(j) = 0$ and $i_{k+1} \neq p_1$, then the state is reset to
0 and the read head is advanced.

For example, given a pattern P = aabbcc and an input I = aabbaabbcc, the
machine would undergo the sequence of transitions shown in Figure 4,
right. Notice that in step 5 the pattern was shifted to the third a of
the input stream, where the search was resumed.
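Steps 1 through 3 can be sketched as a short C routine. This is an illustration under stated assumptions rather than the code from the listings: the name kmp_search and its calling convention are hypothetical, and the failure table is passed in as fail[1..m] (fail[0] unused) so that it lines up with the state numbers used in the text.

```c
#include <string.h>

/*
 * Hypothetical helper: returns the 0-based offset in text at which pat
 * first occurs, or -1 if it does not occur. fail[j] holds f(j) for
 * states 1..m; fail[0] is unused.
 */
int kmp_search(const char *pat, const int *fail, const char *text)
{
    int m = (int)strlen(pat);
    int j = 0;      /* current state: number of pattern characters matched */
    int k;          /* read head: index of the current input token         */

    if (m == 0)
        return 0;

    for (k = 0; text[k] != '\0'; k++) {
        /* Steps 1 and 2: on a mismatch, fall back through f() until the
           current token fits or the machine is back in state 0. */
        while (j > 0 && text[k] != pat[j])
            j = fail[j];

        /* Enter the next state on a match; otherwise (step 3) the state
           stays at 0. Either way the loop advances the read head. */
        if (text[k] == pat[j])
            j++;

        if (j == m)                 /* accept state reached */
            return k - m + 1;
    }
    return -1;
}
```

With the pattern aabbcc, whose failure values are 0, 1, 0, 0, 0, 0, and the input aabbaabbcc, the routine falls from state 4 to state 0 on the third a, moves to state 1 on that same token, and goes on to report a match at offset 4. The read head never moves backward.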

The failure function is implemented as a table that is created in a
manner analogous to its use. It begins with $f(1) = 0$, by definition.
The next steps are easiest to explain by way of an example.

Suppose you have computed $f(1), f(2), f(3), \ldots, f(j)$ and that
$f(j) = i$. To compute $f(j+1)$, you compare $p_{j+1}$ with $p_{i+1}$.
If they are equal, then $f(j+1) = i + 1$ (that is, $f(j) + 1$ on the
first comparison). This is because
$p_1 p_2 p_3 \ldots p_i p_{i+1} = p_{j-i+1} p_{j-i+2} p_{j-i+3} \ldots p_j p_{j+1}$.
If $p_{j+1} \neq p_{i+1}$, set $i = f(i)$ and repeat the previous step.
Continue in this manner until $p_{j+1} = p_{i+1}$ or $i = 0$; if $i = 0$
and $p_{j+1} \neq p_1$, then $f(j+1) = 0$. Example 1, page 50, contains
a pseudocode description of the algorithm.
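The construction just described can also be written as a few lines of C. The following is a sketch, not the author's code; the function name compute_failure and the 1-based table layout (fail[0] unused) are assumptions made so that the indices match the text.

```c
#include <string.h>

/*
 * Hypothetical helper: fills fail[1..m] with the failure function of
 * pat, where m = strlen(pat). The caller supplies room for m + 1 ints;
 * fail[0] is left untouched.
 */
void compute_failure(const char *pat, int *fail)
{
    int m = (int)strlen(pat);
    int i = 0;          /* i = f(j): length of the current prefix match */
    int j;

    if (m == 0)
        return;

    fail[1] = 0;        /* by definition */

    for (j = 1; j < m; j++) {           /* computes f(j+1) */
        /* pat[j] is p_{j+1} and pat[i] is p_{i+1} in the text's
           1-based notation. On a mismatch, retry with i = f(i). */
        while (i > 0 && pat[j] != pat[i])
            i = fail[i];

        if (pat[j] == pat[i])
            i++;                        /* the prefix grows by one */

        fail[j + 1] = i;
    }
}
```

For the pattern aabbaab this produces the values 0, 1, 0, 0, 1, 2, 3 for states 1 through 7, which agree with Figure 3.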

Modifications  to  the 
KMP  Algorithm 

I  wanted  to  use  wildcard  characters 
in  my  findcmd  program,  and  be¬ 
cause  I  could  assume  a  fixed-length 
subject  string  (the  length  of  DOS  file 
names),  I  dispensed  with  the  failure 
states.  (In  applications  in  which  the 
subject  strings  are  lengthy,  failure 
states  greatly  increase  the  efficiency 
of  the  algorithm  and  should  be  im¬ 
plemented.)  I  also  needed  to  use  a 
backtracking  facility  to  implement 
the  asterisk  operator — I  will  ex¬ 
plain  more  about  this  when  I  dis¬ 
cuss  the  program  itself. 

I  also  took  the  liberty  of  changing 
the  interpretation  of  the  wildcard 
characters  and  the  DOS  notion  of  the 
dot  (.)  file  name  extension.  As  in  DOS, 
a  question  mark  (?)  in  a  pattern  posi¬ 
tionally  matches  any  one  character 
in  a  subject  string.  The  asterisk  (*), 


however,  functions  as  a  true  regular- 
expression  operator  (a  la  Unix), 
matching  zero  or  more  characters. 
For  example,  the  pattern  m*a*y 
matches  may,  maay,  and  myayy, 
whereas  the  pattern  m?a?y  matches 
only  myayy.  Note  that  you  can  use 
more  than  one  asterisk  in  a  pattern 
(as  long  as  they  are  not  juxtaposed). 
Finally,  a  dot  (.)  in  a  file  name  is  not 
treated  specially;  it  is  handled  in  the 
same  manner  as  is  any  other  valid  file 
name  character. 
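The wildcard rules above can be illustrated with a small backtracking matcher in C. This is a sketch under stated assumptions and is not the state-machine code in state.c: the function name wild_match is hypothetical, the match is case-sensitive, and the whole subject must match the whole pattern.

```c
#include <stddef.h>

/*
 * Hypothetical matcher: '?' matches exactly one character, '*' matches
 * zero or more characters, and '.' is an ordinary character.
 * Returns 1 on a match, 0 otherwise.
 */
int wild_match(const char *pat, const char *subj)
{
    const char *star = NULL;    /* pattern position just after the last '*' */
    const char *retry = NULL;   /* subject position to resume from          */

    while (*subj != '\0') {
        if (*pat == '*') {
            star = ++pat;       /* first try '*' matching zero characters */
            retry = subj;
        } else if (*pat == '?' || *pat == *subj) {
            pat++;              /* one-for-one match */
            subj++;
        } else if (star != NULL) {
            pat = star;         /* backtrack: '*' absorbs one more character */
            subj = ++retry;
        } else {
            return 0;           /* mismatch with nothing to back up to */
        }
    }

    while (*pat == '*')         /* trailing asterisks match nothing */
        pat++;

    return *pat == '\0';
}
```

With this sketch, wild_match("m*a*y", "maay") succeeds because the second asterisk absorbs the extra a, while wild_match("m?a?y", "may") fails because each question mark must consume exactly one character.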


Implementation 

I  have  divided  the  source  code  for 
findcmd  into  three  modules,  both  as 
an  aid  to  presentation  and  to  simplify 
development.  The  file  main.c  (Listing 
One,  page  92)  contains  the  driving 
code  of  the  program.  It  loops  through 
each  pattern  (parameter)  supplied  on 
the  command  line;  compiles  that  pat¬ 
tern  into  a  state  machine;  then  steps 
through  each  component  of  the  path, 
comparing  the  compiled  pattern  with 
every  file  contained  in  that  directory. 

Step   Input   Curr State   Trans State   Notes
  0      -         0            -         Initial
  1      a         0            1
  2      a         1            2
  3      b         2            3
  4      b         3            4
  5      a         4            1         Failure
  6      a         1            2
  7      b         2            3
  8      b         3            4
  9      c         4            5
 10      c         5            6
 11      -         H            -         Accept

Figure 4: The sequence of transitions given a pattern P=aabbcc and an
input I=aabbaabbcc


Dr.  Dobb’s  Journal,  October  1987 




PATTERN  MATCHING 


It then displays every file that successfully matched the pattern
string in a manner analogous to Unix's ls -l command.

There are two important points to note. The program inserts the
current directory (.) at the beginning of the path string to mimic the
order in which DOS searches for commands. You should omit this in Unix
ports of the program. The program also tests for a trailing backslash
in several places in the code. This is largely for aesthetic reasons,
but it also mitigates DOS' tendency to "cough" when presented with
double backslashes in its directory search functions.
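Both points can be sketched in a few lines of C. The helper names
below (prepend_current_dir, join_dos_path) are hypothetical and are not
taken from main.c; they only illustrate the idea of prepending "." and
of adding a backslash only when one is missing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: builds ".;<path>" so that the current directory
 * is searched first. Returns 0 on success, -1 if `out` is too small. */
int prepend_current_dir(char *out, size_t outsize, const char *path)
{
    size_t total = strlen(".;") + strlen(path) + 1;

    assert(out != NULL);
    if (total > outsize)
        return -1;
    strcpy(out, ".;");
    strcat(out, path);
    return 0;
}

/* Hypothetical helper: writes "dir\name" into `out`, inserting a
 * backslash only when `dir` does not already end in one, so DOS never
 * sees a doubled separator. Returns 0 on success, -1 if it won't fit. */
int join_dos_path(char *out, size_t outsize,
                  const char *dir, const char *name)
{
    size_t dlen = strlen(dir);
    int need_sep = (dlen > 0 && dir[dlen - 1] != '\\');
    size_t total = dlen + (need_sep ? 1 : 0) + strlen(name) + 1;

    assert(out != NULL);
    if (total > outsize)
        return -1;
    strcpy(out, dir);
    if (need_sep)
        strcat(out, "\\");
    strcat(out, name);
    return 0;
}
```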

Main.c also contains several ancillary routines. The function
nextdir() parses each directory segment of the path variable into
individual strings; putfile() prints, in a formatted manner, all the
pertinent information about each successfully matched file; and
fdosdte() converts an internal DOS date into a formatted ASCII string.
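As an illustration of the date conversion, the sketch below unpacks the
16-bit date word that DOS stores for each file (year since 1980 in bits
15-9, month in bits 8-5, day in bits 4-0). The function names and the
MM-DD-YYYY output format are assumptions; fdosdte() in Listing One may
format the date differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct dos_date {
    unsigned int year;   /* full year, e.g. 1987 */
    unsigned int month;  /* 1..12 */
    unsigned int day;    /* 1..31 */
};

/* Unpacks a DOS directory date word into its three fields. */
struct dos_date decode_dos_date(unsigned int packed)
{
    struct dos_date d;

    d.day   = packed & 0x1Fu;
    d.month = (packed >> 5) & 0x0Fu;
    d.year  = 1980u + ((packed >> 9) & 0x7Fu);
    return d;
}

/* Formats the date as MM-DD-YYYY (a format chosen for this sketch).
 * Returns the number of characters written, as snprintf does. */
int format_dos_date(unsigned int packed, char *out, size_t outsize)
{
    struct dos_date d = decode_dos_date(packed);

    assert(out != NULL && outsize >= 11);
    return snprintf(out, outsize, "%02u-%02u-%04u", d.month, d.day, d.year);
}
```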

The second module, dos.c (Listing Two, page 94), contains the two
DOS-dependent functions firstf() and nextf(), which are used to access
DOS Find First (4EH) and Find Next (4FH) system calls. In addition,
firstf() sets the disk transfer address to point to a C structure using
the DOS system call Set Disk Transfer Address (1AH). In order to ensure
the validity of the pattern matching operation, you need to gain access
to every file in a directory. You therefore need to set the DOS search
pattern to *.* (match all) and set the CX register to request all types
of files (system, hidden, and so on). This module contains the only
compiler-dependent piece of code in the program.
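For readers who have not used these calls, the record that DOS fills in
at the disk transfer address after Find First or Find Next can be
described with a packed C structure. The structure and constant names
below are hypothetical (dos.c may declare them differently), and the
#pragma pack directive is itself compiler dependent; the field layout
is the standard 43-byte DOS find record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* File attribute bits used in the CX search mask. */
enum {
    ATTR_HIDDEN    = 0x02,
    ATTR_SYSTEM    = 0x04,
    ATTR_DIRECTORY = 0x10,
    /* Ordinary files are always returned; these bits add the rest. */
    FIND_ALL_FILES = ATTR_HIDDEN | ATTR_SYSTEM | ATTR_DIRECTORY
};

#pragma pack(push, 1)            /* the DOS record has no padding */
struct dos_find_record {
    uint8_t  reserved[21];       /* used by DOS to continue the search */
    uint8_t  attrib;             /* attribute byte of the matched file */
    uint16_t time;               /* packed time of last write */
    uint16_t date;               /* packed date of last write */
    uint32_t size;               /* file size in bytes */
    char     name[13];           /* "NAME.EXT" plus terminating NUL */
};
#pragma pack(pop)

/* Returns nonzero when the matched entry is a subdirectory. */
int is_directory(const struct dos_find_record *r)
{
    assert(r != NULL);
    return (r->attrib & ATTR_DIRECTORY) != 0;
}
```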

The third file, state.c (Listing Three), contains the code for the
routines inits() and state(). The first function, inits(), compiles the
pattern strings into a state-machine format. It functions in a
straightforward manner and is simple to understand. The other function,
state(), is slightly more subtle. The basic operation of the routine is
simple: while in each state, it compares the search string with the
pattern and, if they are equal, effects a transition to the next state;
if the search string and pattern are not equal, it halts and returns an
indicative value. If it makes transitions as far as the end state and
exhausts the search string, it returns a value indicating a match.

This procedure works quite well for patterns that do not contain any
metacharacters (regular-expression operators). The addition of the
question mark (?) operator (match any one character) is also
straightforward because it really functions as a placeholder. It is
only with the asterisk (*) operator (match zero or more repetitions of
any character) that a problem arises.

proc findfail()
begin
    f[0] := 0;
    for (j := 2 to N)
    do
        i := f[j-1];
        while (p[j] <> p[i+1] AND i > 0)
        do
            i := f[i];
        end while;
        if (p[j] <> p[i+1] AND i = 0)
        then
            f[j] := 0;
        else
            f[j] := i + 1;
        end if;
    end for;
end proc;

Example 1. Pseudocode description of the failure function

Consider the pattern m*ax. The basic algorithm would be able to match
this pattern successfully with the search strings max or mmrax. But
what would happen with the search string maaaax? If you are not
careful, state() will return a "no-match" indication when it compares
maa with max. If you provide the function with a backtracking
capability, however, when it encounters a no-match condition, it will
be able to restore the environment to a previously saved state
(actually one subject string character beyond the saved position) and
resume the search.

I implemented the backtracking feature using a stack that lets the
function store (and recall) state information for patterns containing
one or more asterisks. Each time it encounters the wildcard operator
(the case for the asterisk), state() saves (pushes) both the current
state and a pointer into the subject string onto the stack. Then, if it
should encounter a no-match condition (default:), it can check the
contents of the stack and, if it is nonempty, restore (pop) a saved
state and continue processing.
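The stack discipline described above can be sketched as follows. This
is not the state() function from Listing Three: it works directly on
the pattern text instead of a compiled state table, and the names
bt_match, bt_frame, and BT_MAX are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BT_MAX 32   /* hypothetical limit on saved backtrack points */

struct bt_frame {
    const char *pat;   /* pattern position just after the '*' */
    const char *sub;   /* subject position at which to resume */
};

bool bt_match(const char *pat, const char *sub)
{
    struct bt_frame stack[BT_MAX];
    size_t top = 0;

    assert(pat != NULL && sub != NULL);
    for (;;) {
        if (*pat == '*') {
            /* Wildcard: push the current state and subject pointer. */
            assert(top < BT_MAX);
            pat++;
            stack[top].pat = pat;
            stack[top].sub = sub;
            top++;
        } else if (*pat == '\0' && *sub == '\0') {
            return true;                 /* end state, subject used up */
        } else if (*sub != '\0' && (*pat == '?' || *pat == *sub)) {
            pat++;                       /* ordinary transition */
            sub++;
        } else {
            /* No match: discard saved states that cannot advance. */
            while (top > 0 && *stack[top - 1].sub == '\0')
                top--;
            if (top == 0)
                return false;            /* stack empty: give up */
            /* Resume one subject character beyond the saved position. */
            stack[top - 1].sub++;
            pat = stack[top - 1].pat;
            sub = stack[top - 1].sub;
        }
    }
}
```

With the pattern m*ax and the subject maaaax, the first attempt fails
when x is compared with a; the saved state is then retried one
character further along until the final ax lines up.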

The file findcmd.h (Listing Four, page 107) contains all the global
definitions, declarations, and macros referenced in the other three
files.

Invoking the program is simple: just type findcmd followed by a list of
patterns (files) for which you want it to search. Patterns can be as
simple or complex as you wish. For example, if you want to find the
directory that contains your favorite editor, just type findcmd ed.exe,
and the program will search each directory component of your path
variable for the file ed.exe. If you cannot remember whether the file
is in .com or .exe format, type findcmd ed.*, findcmd ed*, or findcmd
ed????. The latter two examples highlight the fact that findcmd, unlike
COMMAND.COM, treats the period as an ordinary character in the file
name. That is, a dot does not delimit the scope of the asterisk
operator.

Portability 

The program was written under MS-DOS using Version 3.0 of the Mark
Williams C compiler. If you use another compiler, the only nonportable
code is contained in the file dos.c. This file should not be too
difficult to deal with, however. Most popular C compilers have a method
of accessing DOS system facilities; just modify the two C functions to
work within your compilation environment.

For those who wish to port the program to a Unix system, the same
caveat applies: just modify the functions in dos.c. You will have to
completely rewrite the routines but, again, this should not be
difficult. A good description of how to access a Unix directory is in
Brian Kernighan and Rob Pike's The Unix Programming Environment (see
bibliography).
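As a rough idea of what a Unix replacement for the two DOS functions
might look like, the sketch below walks a directory with the POSIX
opendir() and readdir() calls. It is an assumption-laden illustration:
the for_each_entry name and callback shape are hypothetical, and the
POSIX interface is used here in place of whatever technique the cited
book describes.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/* Hypothetical callback type: receives each entry name in turn. */
typedef void (*entry_fn)(const char *name, void *ctx);

/* Calls visit(name, ctx) for every entry in `dir`. Returns the number
 * of entries seen, or -1 if the directory cannot be opened. */
long for_each_entry(const char *dir, entry_fn visit, void *ctx)
{
    DIR *dp;
    struct dirent *ent;
    long count = 0;

    assert(dir != NULL);
    dp = opendir(dir);
    if (dp == NULL)
        return -1;
    while ((ent = readdir(dp)) != NULL) {
        if (visit != NULL)
            visit(ent->d_name, ctx);   /* e.g. test name against pattern */
        count++;
    }
    closedir(dp);
    return count;
}
```

The callback is where a ported findcmd would compare each name against
the compiled pattern.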

Extensions 

This program can accommodate several extensions. One that I like (I
have implemented it on both my PC and Unix versions) is the addition of
a switch on the command line to direct the program to access an
alternate environment variable in lieu of path. This is useful if you
tend to keep application source files scattered all over your disk. You
could set this alternate variable (in the same format as you would your
path) to point to all your source code directories. Then, by setting
one switch on the command line, findcmd can search an alternate set of
directories for you.
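One way such a switch might be handled is sketched below. The switch
letter -e, the function names, and the fallback to PATH when the
alternate variable is unset are all assumptions made for illustration;
the article does not say how the author's version behaves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical: scans argv for "-e NAME" and returns NAME, or NULL if
 * the switch is absent or has no argument. */
const char *alt_var_from_args(int argc, char **argv)
{
    int i;

    assert(argv != NULL);
    for (i = 1; i + 1 < argc; i++) {
        if (argv[i][0] == '-' && argv[i][1] == 'e' && argv[i][2] == '\0')
            return argv[i + 1];
    }
    return NULL;
}

/* Returns the directory list to search: the value of `varname` when it
 * is given and set, otherwise the value of PATH (which may be NULL). */
const char *search_path_value(const char *varname)
{
    const char *value = NULL;

    if (varname != NULL && *varname != '\0')
        value = getenv(varname);
    if (value == NULL)
        value = getenv("PATH");
    return value;
}
```

A call such as findcmd -e SRCDIRS *.c would then search the
directories named in the (hypothetical) SRCDIRS variable.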

You can modify the functions contained in state.c for independent use
by incorporating the routine inits() into a separate program. This
program's only function would be to compile pattern strings and store
the results in a file. Then, at execution time, a program performing
the actual comparisons would not incur the expense of "compiling" the
patterns; it would just read them from disk. This would operate in a
manner similar to the Unix utilities regcomp() and regex().

Additional regular-expression operators could be incorporated into the
algorithm to add more power and flexibility to search patterns. I have
found, however, that the operators now included in the program are
adequate for everyday use (and there are better algorithms than KMP for
recognizing complex regular expressions).

Finally, consider how simple the changes would be to transform this
program into an ls command (as if you needed another one!).

Conclusions 

In any development environment, no single programming tool is
sufficient to satisfy every need. Any instrument or methodology that
can spare us from the tedium of a routine chore, or enable us to become
more productive, should be embraced with open arms. State machines are
just such a tool and can benefit all programmers.

Availability 

All the source code for articles in this issue is available on a single
disk. To order, send $14.95 to Dr. Dobb's Journal, 501 Galveston Dr.,
Redwood City, CA 94063, or call (415) 366-3600, ext. 216. Please
specify the issue number and format (MS-DOS, Macintosh, Kaypro). If you
would rather not have to retype the entire program, send me a check for
$6 and I will mail you an MS-DOS floppy (360K format) containing the
program and source code.

Bibliography 

Aho, Alfred V.; Hopcroft, John E.; and Ullman, Jeffrey D. The Design
and Analysis of Computer Algorithms. Reading, Mass.: Addison-Wesley,
1974.

Aho, Alfred V.; and Ullman, Jeffrey D. Principles of Compiler Design.
Reading, Mass.: Addison-Wesley, 1979.

Baase, Sarah. Computer Algorithms. Reading, Mass.: Addison-Wesley,
1979.

Barrett, William A.; and Couch, John D. Compiler Construction: Theory
and Practice. Chicago, Ill.: Science Research Associates Inc., 1979.

Kernighan, Brian; and Pike, Rob. The Unix Programming Environment.
Englewood Cliffs, N.J.: Prentice-Hall, 1984: 208-219.

DDJ 

(Listings  begin  on  page  92.) 



ASYNC  APPLETALK 


Listing  One  (Text  begins  on  page  18.) 


************************************** 


* 

* 

* 


THIS FILE CONTAINS EXCERPTS FROM THE APPLETALK SOURCES,

VERSION  39,  AUGUST  1985,  AS  MODIFIED  BY  DARTMOUTH  COLLEGE 
TO  PRODUCE  THE  ASYNC  APPLETALK  DRIVER  (.BPP)  VERSION  1.2 
(ASYNC  APPLETALK  INSTALLER  VERSION  2.1)  OF  MAY  1987. 

THESE  EXCERPTS  CONTAIN  INFORMATION  OF  TWO  TYPES: 

1)  CODE  WRITTEN  ENTIRELY  AT  DARTMOUTH  COLLEGE; 

2)  CODE  WHICH  IS  FUNDAMENTALLY  SIMILAR  TO  THE 

PRELIMINARY  APPLETALK  SOFTWARE  DISTRIBUTED 
WITHOUT  RESTRICTION  AT  THE  APPLEBUS  DEVELOPER'S 
CONFERENCE IN CUPERTINO, CA IN MAY, 1984.

PORTIONS  OF  THIS  CODE  ARE  COPYRIGHT  OF  THE  TRUSTEES  OF 
DARTMOUTH  COLLEGE  OR  APPLE  COMPUTER  INC. 

THESE  CODE  SEGMENTS  ARE  PROVIDED  FOR  INFORMATION  ONLY.  NO 
GUARANTEE  OF  CORRECT  OPERATION  IS  PROVIDED. 

FOR  MORE  INFORMATION  ABOUT  THIS  CODE,  CONTACT: 

Rich  Brown 

Manager  of  Special  Projects 
Dartmouth  College 
Kiewit  Computer  Center 
Hanover,  NH  03755 
603/646-3648 


************************************** 
***** file ALAPDEFS.A *****

; AALAPdefs.a contains all the special definitions which were not
; needed for .MPP.
;
; .BPP et al should now use unmodified versions of:
;       {AIncludes}atalkequ.a
;       lapdefs.a
;       vardefs.a
;
; Created 31 Mar 87 reb

; AALAP constant defs

MaxLAPFrmLen  EQU     586+13+3+2      ; DDP data + DDP hdr + LAP hdr + CRC
FrameChar     EQU     $A5             ; the Framing Char
qFrmChar      EQU     -91             ; for moveq instructions
DLE           EQU     $10
Xoff          EQU     $13
Xon           EQU     $11
lapIM         EQU     $86             ; I aM
lapUR         EQU     $87             ; yoU aRe (sorry for these names...)
qlapIM        EQU     -122
qlapUR        EQU     -121
noansalrt     EQU     -15998
portncalrt    EQU     -15997

; Added constant return value for AALAP

noAnswer      EQU     -95             ; same as excessCollsns in real AtalkEqu
AAOfst        EQU     10000           ; 0 or 10000 (for final version)

;+ MPP (Status calls to NBP, DDP and AALAP)

GetStats      EQU     400             ; (ABLAP) get the statistics
GetMyName     EQU     AAOfst+255      ; get the name of the ATalk driver (AALAP)
GetChar       EQU     AAOfst+254      ; get the most recently received char (AALAP)
GetLAPStatus  EQU     AAOfst+253      ; return AALAP status (AALAP)

;+ MPP (Control calls to NBP, DDP, and AALAP)

FirstAPP      EQU     AAOfst+237      ; First APP control call
DoWarnings    EQU     AAOfst+237      ; Put up the specified alerts (AALAP)
PutChar       EQU     AAOfst+238      ; Loop 'til TBMT, then output the char (AALAP)



ReInitAALAP   EQU     AAOfst+239      ; Reinitialize the AALAP variables & SCC (AALAP)
GetNNNN       EQU     AAOfst+240      ; Do NNNN using SysNetNum and sysLAPAddr (AALAP)
SetBaud       EQU     AAOfst+241      ; Set the baud rate of the SCC (AALAP)
LastAPP       EQU     AAOfst+241

              EJECT

; LAP variables

WDSPtr        EQU     MPPVarsEnd      ; (4) WDS pointer saved here on writes
LAPWrtRtn     EQU     WDSptr+4        ; (4) return adrs of LAPWrite caller
SaveA45       EQU     LAPWrtRtn+4     ; (8) A4 and A5 saved here on interrupt
SaveDskRtn    EQU     SaveA45+8       ; (4) DskRtnAdr saved here for PollProc
SavePS        EQU     SaveDskRtn+4    ; (4) in AALAP, the real PollProc's address
SaveBIn       EQU     SavePS+4        ; (4) .BIN DCE saved here (for close)
SaveBOut      EQU     SaveBIn+4       ; (4) .BOUT DCE saved here (for close)
SaveVects     EQU     SaveBOut+4      ; (16) SCC interrupt vectors saved here
SaveRegs      EQU     SaveVects+16    ; (20) Registers saved here across PollProc

; Variables for Lisa/Mac hardware differences

VAVBufA       EQU     SaveRegs+20     ; Pointer to VIA or a $FF word
STLth         EQU     6               ; Size of STData area
VSTData       EQU     VAVBufA+4       ; Data string to SCC after send
VDisTxRTS     EQU     VSTData+1       ; This is the DisTxRTS byte
EndOrigStuff  EQU     VSTData+STLth

; AALAP variables

tWDSptr       EQU     EndOrigStuff+2  ; (4) WDS ptr of frame being tx
qWDSptr       EQU     tWDSptr+4       ; (4) WDS of a queued DevMgr frame
LastXmit      EQU     qWDSptr+4       ; (4) Ticks at time of last char sent
LastRcv       EQU     LastXmit+4      ; (4) Ticks at time of last good received frame
LAPStash      EQU     LastRcv+4       ; (4) Pointer to next received char's place
LAPFetch      EQU     LAPStash+4      ; (4) Pointer to next char to xmit
LAPInBuf      EQU     LapFetch+4      ; (4) Pointer to the LAP input buffer
IMURwds       EQU     LAPInBUf+4      ; (8) WDS for IM or UR frames
BusyBuf       EQU     IMURwds+8       ; (16) Holds up to 16 chars rcvd while doingRead
BusyStash     EQU     BusyBuf+16      ; (4) pointer to next space in BusyBuf
BusyFetch     EQU     BusyStash+4     ; (4) pointer to next char to remove
IMURbuf       EQU     BusyFetch+4     ; (8) Holds IM or UR (starting at odd adrs)
InputCRC      EQU     IMURBuf+8       ; (2) CRC for the receiver
OutputCRC     EQU     InputCRC+2      ; (2) CRC for the transmit side
RcvdLen       EQU     OutputCRC+2     ; (2) Number of chars received
TxCount       EQU     RcvdLen+2       ; (2) Number of chars transmitted
CRCBuf        EQU     TxCount+2       ; (2) Two bytes for the CRC for xmission
RandomSeed    EQU     CRCBuf+2        ; (2) Seed for random number generator
LastRxCh      EQU     RandomSeed+2    ; (2) Lsbyte is last rcvd char, else $FFFF
AALAPbaud     EQU     LastRxCh+2      ; (2) Current baud rate of the LAP
SentChar      EQU     AALAPbaud+2     ; (1) True if TxNextCh sent a char
nFrmChr       EQU     SentChar+1      ; (1) True if we must send a FrameChar
nCRC          EQU     nFrmChr+1       ; (1) True if we must send the CRC
EscIn         EQU     nCRC+1          ; (1) Escaping flag for the receiver
EscOut        EQU     EscIn+1         ; (1) Transmitter is sending an escaped char
RcvdXoff      EQU     EscOut+1        ; (1) We received Xoff
AALAPup       EQU     RcvdXoff+1      ; (1) true if we've handshook IM & UR
AALAPstuck    EQU     AALAPup+1       ; (1) true if we have NNNN conflict
InpState      EQU     AALAPstuck+1    ; (1) 0 = idle; <> 0 = in a frame
stillBusy     EQU     InpState+1      ; (1) true if still processing a read
nXon          EQU     stillBusy+1     ; (1) true if we sent Xoff
SendingIMUR   EQU     nXon+1          ; (1) true if sending AALAP control frame

              _AssumeEq (InpState+1),stillBusy            ; tst.w InpState(A4) in
              _AssumeEq (InpState**$FFFFFFFE),InpState    ; myPollProc fails otherwise

              IF debug THEN           ; doing statistics


XmitCount     EQU     SendingIMUR+1
XOFFTOcount   EQU     XmitCount+4
OVRcount      EQU     XOFFTOcount+4
RcvIntCount   EQU     OVRcount+4
XOFFcount     EQU     RcvIntCount+4
XONcount      EQU     XOFFcount+4
LongFrame     EQU     XONcount+4
ShortFrame    EQU     LongFrame+4
FrmCount      EQU     ShortFrame+4
NoHandCnt     EQU     FrmCount+4
CRCCount      EQU     NoHandCnt+4
LenErrCnt     EQU     CRCCount+4
BadDDP        EQU     LenErrCnt+4
PPCount       EQU     BadDDP+4
PPXoffCnt     EQU     PPCount+4
DeferXmit     EQU     PPXoffCnt+4
ABVarsEnd     EQU     DeferXmit+4
              ELSE
ABVarsEnd     EQU     SendingIMUR+1   ; end of AALAP variables
              ENDIF

***** file MPP.A *****

... section removed ...

; SCCConfig - set up the SCC for AppleBus

SCCConfig     LEA     OpenTbl,A0      ; A0 -> (common) open table
              CMP.B   #$FF,MacTypeByte ; Mac or Lisa?
              BNE.S   @10             ; Branch if Mac - configure it
              BSR     ToSCC           ; Configure SCC to major settings
              LEA     LOpenTbl,A0     ; A0 -> Lisa open table
@10           BRA     ToSCC           ; Configure SCC and return

... section removed ...

ToSCC         MOVE.L  SCCWr,A3        ; Point to SCC port B write registers
              IF PortA THEN
              ADDQ    #ACtl,A3        ; Add in port A offset
              ENDIF
@10           MOVE    (A0)+,D0        ; Get next register number / control word
              BEQ.S   CloseRTS        ; Zero is terminator
              MOVE.B  D0,(A3)         ; Put out register number
              ROR     #8,D0           ; Pickup control word
              MOVE.B  D0,(A3)         ; Set to SCC
              BRA.S   @10             ; And keep going

... section removed ...

; Initialization tables
;
; SCC Initialization table - common between Mac and Lisa
; Entry format: .BYTE control-value, control-reg-number
; Taken from the Zilog SCC Application note, 00-2957-02

OpenTbl       DC.B    ResetOurPort,9  ; ($40 or $80) Reset port
              DC.B    $44,4           ; x16 clock, 1 stop, no parity
              DC.B    $0,2            ; Interrupt vector = $00
              DC.B    $C0,3           ; Rx is 8 bits, disable Rx
              DC.B    $E2,5           ; Tx is 8 bits. Disable Tx; DTR, RTS on
              DC.B    $0,6            ; No address
              DC.B    $0,7            ; No Flag character
              DC.B    $0,10           ; NRZ
              DC.B    $56,11          ; Tx & Rx clock from BRG
              DC.B    $2,14           ; BRG source = PCLK, BRG off
; enables
              DC.B    $3,14           ; BRG on
              DC.B    $C1,3           ; Rx on
              DC.B    $EA,5           ; Tx on
; Interrupt controls
              DC.B    MouseInts,15    ; enable DCD ints (for mouse)
              DC.B    $10,0           ; reset external ints
              DC.B    $10,0           ; reset external ints (twice)
              DC.B    $13,1           ; Tx, Rx, Ext int enable
              DC.B    MIE,9           ; Master Interrupt Enable
              DC.W    0               ; *** End of table ***

              IF RAM THEN             ; Only need Lisa table if RAM-based

; SCC initialization table for Lisa
; Port A uses PCLK (@4.0 MHz TTL)

              IF PortA THEN           ; configuration for Port A
LOpenTbl      DC.B    $00,14          ; turn off BRG
              DC.B    $6A,5           ; enable TX, RTS; DTR low
              DC.B    $56,11          ; TTL clock, tx and rx use BRG
              DC.B    $02,14          ; Use PCLK to feed BRG, BRG off
              DC.B    $03,14          ; BRG on
              DC.W    0
              ELSE                    ; configuration for PortB
LOpenTbl      DC.B    $00,14          ; turn off BRG
              DC.B    $6A,5           ; enable TX, RTS; DTR low
              DC.B    $D6,11          ; Crystal clock, tx and rx use BRG
              DC.B    $00,14          ; Use crystal to feed BRG, BRG off
              DC.B    $01,14          ; BRG on
              DC.W    0
              ENDIF

; Mac initialization data (first

MacInitData   DC.B    5,MDisTxRTS     ; ($60) Turn off drivers
              DC.B    14,ResetClks    ; ($41) Reset missing clocks flag
              DC.B    3,EnbRxSlv      ; ($DD) Enable receiver
              DC.W    $2100           ; SR to enable SCC interrupts
              DC.B    AbortDelay,0    ; Delay to send out abort bits (3.2B)

              IF RAM THEN

; Lisa initialization data

LisaInitData  DC.B    5,LDisTxRTS     ; ($E2) Turn off drivers
              DC.B    14,ResetClks    ; ($41) Reset missing clocks flag
              DC.B    3,EnbRxSlv      ; ($DD) Enable receiver
              DC.W    $2500           ; SR to enable SCC interrupts
              DC.B    34,0            ; Just delay this much on Lisa (3.2B)
              ENDIF


***** file LAP.A *****

; LAP.TEXT - the LAP part of AALAP
;
; April-August, 1984
; Alan Oppenheimer and Larry Kenyon
;
; Rich Brown, Dartmouth College
; May 1987
;
; Version 1.2a6  Created qWDSptr to point at queued WDS  21 May 87 reb
; Version 1.2a5  Always check that TBMT is true before sending  19 May 87 reb
; Version 1.2a4  TintHnd, VBLHnd, RintHnd now call TxNextCh; only TintHnd
;                clears interrupts (as it should be)  14 May 87 reb
; Version 1.2a3  Prefetching warning dialogs doesn't work; backed out  10 May 87 reb
; Version 1.2a2  tWDSptr now determines whether we're sending a frame;
;                DoWarn now doesn't read the resource file  8 May 87 reb
; Version 1.2a1  Removed queueing from LAPWrite. LAPWrite no longer
;                allocates memory, so it won't fail if called from
;                interrupt handling.  19 Apr 87 reb
; Version 1.1b2  Changed noAnswer to -95 (so it can be handled like excessCollsns)
;                LAPWrite returns noAnswer if AALAP not up; (30 Mar 87)
;                GetNNNN returns noAnswer or PortNotCF;
;                Changed LAPWrite to return ddpLenErr if too long
; Version 1.1b1  Fixed PollProc to be more aggressive about sucking chars
;                from the SCC; added -1 SendChar value (sends Break);
;                fix Initcursor bug in Dowarn  16 & 30 Dec 86 reb
; Version 1.1a1  Output an Xoff if called by PollProc during an input message
;                3 Nov 86 reb
; Version 1.0b2  Changed to set up SCC properly for Lisa  15 Oct 86
;                (still has intermittent hangups, tho - not diagnosed)
; Version 1.0b1  Changed last_valid_frame timer to 30 seconds; always send
;                UR, even after un-matchable IM address
; Version 1.0a3  Fixed Status return buffer bug; SetBaud now takes actual
;                baud rate; added GetLAPStatus call; copy entire message
; Version 1.0A2  Added alerts for NoAnswer, PortNotCf (17 Jul 86 reb)
; Version 4.2    Int handlers now do IUS etc. more carefully (4 Jul 86)
; Version 4.1    Now escapes either parity Xon and Xoff (21 Apr 86)
; Version 4.0    First cut at AALAP (26 Oct-14 Dec 85)

... section removed ...

; COPYRIGHT (C) 1984 APPLE COMPUTER

... section removed ...


; MReInit - Control call to reinit AALAP and the SCC

MReInit       bsr.s   AALAPWarm       ; Warm start ourselves
              BRA     AbusExit        ; and return

; AALAPCold - cold start for AALAP; called only once

AALAPCold

; Allocate the input buffer (This should be alloc above BufPtr, not sysheap)

              move.l  #maxLAPFrmLen,D0 ; get an AALAP input buffer
              _newptr ,SYS            ; from the system heap
              bne.s   WarmRTS         ; exit if bad
              move.l  A0,LAPInBuf(a2) ; otherwise save its pointer

; Clear out LAP variables

              clr.l   WDSPtr(a2)
              clr.l   tWDSptr(A2)
              clr.l   LAPWrtRtn(A2)
              clr.w   SysNetNum(a2)
              clr.b   SysLAPAddr(a2)
              clr.l   SavePS(A2)
              sf      AALAPup(a2)
              sf      AALAPstuck(a2)

; Setup SCC for AALAP

              BSR     SCCConfig       ; Configure the SCC for Async AppleTalk
              move    #9600,D0        ; and set up for 9600 baud
              bsr     Set_Baud

; Reset all the LAP variables which don't irrevocably change the
; state of the driver. This routine can be called any time, only
; killing the current message(s) in progress.

AALAPWarm
              move.l  Ticks,D0
              move.l  D0,LastXmit(a2)
              move.l  D0,LastRcv(a2)
              lea     BusyBuf(a2),A0
              move.l  A0,BusyStash(a2)
              move.l  A0,BusyFetch(a2)
              move.w  #$FFFF,LastRxCh(a2)
              clr.l   tWDSptr(A2)
              clr.l   qWDSptr(A2)
              clr.w   TxCount(a2)
              sf      RcvdXoff(a2)
              sf      InpState(a2)
              sf      EscIn(a2)
              sf      SendingIMUR(A2)
              sf      stillBusy(a2)
              sf      nFrmChr(a2)
              sf      nCRC(a2)
              sf      nXon(A2)
WarmRTS       rts

              EJECT

; Status - handle driver status request



              _SUBR                   ; no one better call this...
Status        MOVE.L  MPPVars,A2      ; A2 -> our variables
              MOVEQ   #StatusErr,D0   ; Assume a status error
              lea     CSParam(A0),A1  ; point at the CSParam buffer
              move.w  CSCode(A0),D1   ; and get the CScode

              IF Stats THEN
              CMP.W   #GetStats,D1    ; Clear stats command?
              BNE.S   @1              ; check for "What's my Name?" if not
              move.w  CSParam(A0),A1  ; CSParam contains a pointer to buffer
              MOVE    SR,-(SP)
              MOVE    #SCCLockout,SR  ; exclude interrupts to keep stats clean
              ADD     #StatsStart,A2  ; point to stats we keep
              MOVEQ   #(StatsLgCnt-1),D0
              MOVEQ   #0,D1           ; zero for faster clearing
@0            MOVE.L  (A2),(A1)+      ; return current value
              MOVE.L  D1,(A2)+        ; then zero count
              DBRA    D0,@0
              MOVE    (SP)+,SR
              bra.s   AbusExit
              ENDIF

@1            cmp.w   #GetMyName,D1   ; is this a "what's my name" call?
              bne.s   @2              ; go if not
              move.l  MPP+18,(A1)+    ; move Pascal string from front of driver
              move.b  MPP+22,(A1)     ; to beginning the buffer (5 chars)
              bra.s   @4              ; and exit with good status

@2            cmp.w   #GetChar,D1     ; is this a "get last char" call?
              bne.s   @3              ; go if not
              move.w  LastRxCh(a2),(A1)    ; copy the character (word)
              move.w  #$FFFF,LastRxCh(a2)  ; and flag the character
              bra.s   @4

@3            cmp.w   #GetLAPStatus,D1 ; is this a "get LAP status" call?
              bne.s   AbusExit        ; go if not
              move.b  AALAPup(A2),(A1)+    ; AALAPup?
              move.b  AALAPstuck(A2),(A1)+ ; AALAPstuck?
              move.w  AALAPbaud(A2),(A1)   ; What's the baud rate?

@4            clr.l   D0              ; return good status

AbusExit      MOVE.L  MPPDCE,A1       ; Make sure A1 has DCE address
AbusExA1      MOVE.L  JIODone,-(SP)   ; This is how we exit (Prime, Control, Status)
AbusRTS       RTS
              _SUBEND 'MYSTATUS'      ; this marks the AbusExit

Prime         BCLR    #DrvrActive,DCtlFlags+1(A1) ; *** V2.0C Fix Mac ROM bug ***
              RTS                     ; *** V2.0C Fix Mac ROM bug ***

              EJECT

; MGetNNNN - Do the NNNN, using the current values of SysLAPAddr and
;            SysNetNum. Return bad status if it didn't work.
;
; On entry: A2 -> BPP variables
; On exit:  D0 = noErr (0) if we succeeded,
;                PortNotCF (-98) or
;                noAnswer (-95) if not

MGetNNNN      bsr.s   Get_NNNN        ; Use them just as they are
              bra.s   AbusExit        ; return from the control calls

tries         EQU     -2              ; counter for the tries
endtime       EQU     -6              ; end time

Get_NNNN      _SUBR   6
              move.w  Ticks+2,RandomSeed(a2) ; randomize things
              move.b  SysLAPAddr(a2),D0 ; Node number in D0
              move    SysNetNum(a2),D1 ; Net number in D1
              sf      AALAPup(a2)     ; we're not up yet
              sf      AALAPstuck(a2)  ; and we're not in trouble either
              move    #4,tries(a6)    ; tries counter (4 tries)
@10           move.l  Ticks,D2
              add.l   #30,D2          ; set endtime to the current time+30
              move.l  D2,endtime(a6)  ; remember the ending time
              moveq   #qlapIM,D2      ; get the lap type
              move.w  SysNetNum(a2),D1 ; get the Net number
              move.b  SysLAPAddr(a2),D0 ; and the node number
              bsr     SendIMUR        ; and send it
@20           clr     D0              ; good status if things are OK
              tst.b   AALAPup(a2)     ; did the magic work?
              bne.s   NNNNexit        ; go if so
              tst.b   AALAPstuck(a2)  ; is there an irreconcilable difference?
              bne.s   NNNNstuck
              move.l  endtime(a6),D2
              cmp.l   Ticks,D2        ; otherwise, check the timer
              bpl.s   @20             ; loop if not timed out
              sub     #1,tries(a6)    ; decr the counter
              bgt.s   @10             ; loop if non-zero
              moveq   #noAnswer,D0    ; They don't want to talk
              bra.s   NNNNexit
NNNNstuck     moveq   #PortNotCF,D0   ; They talk but say bad things
NNNNexit      tst.w   D0              ; set CC
              _SUBEND 'GETNNNN '      ; and return

              EJECT

; MPutChar - output and send the char pointed to in the control call
;
; Entry: A0 -> IOQelement
; Exit:  Return status is 0000 if noErr,
;        BadIO if timed out waiting for TBMT

SendBrk       equ     $12             ; Sends Break (w/RTS) when sent to WR5

MPutChar      bsr.s   Put_Char
              bra     AbusExit        ; and exit

Put_Char      _SUBR
              move.w  CSParam(a0),D0  ; get the character (in an integer)
              bmi.s   @10             ; if it's 0..255,
              bsr     SendChar        ; output the character
              bra.s   @20             ; and quit
@10           lea     BreakTbl,A0
              move.b  #SendBrk,(A0)   ; set the break bit in WR5
              bsr     ToScc
              move.l  #10,A0          ; wait 10 ticks
              _delay
              move    #$EA,D0         ; Enable Tx, DTR, RTS
              CMPI.B  #$FF,MacTypeByte ; Mac or Lisa?
              BNE.S   @15             ; Branch if Mac (PortA & PortB are same)
              move    #$6A,D0         ; Lisa doesn't assert DTR
@15           lea     BreakTbl,A0
              move.b  D0,(A0)         ; and turn the Break off
              bsr     ToSCC
              clr     D0
@20           _SUBEND 'MPUTCHAR'

breaktbl      dc.b    0,5             ; THIS WON'T MAKE ROMMABLE CODE
              dc.w    0000

MSetBaud  —  send 

the  (integer)  value  in  the  CSParamblk  to  SCC  as  its  Baud 

Rate 

Entry: 

AO  ->  IOQelement 

Exit: 

noErr  if 

aok 

-1  if 

requesting  19,200  baud  on  a  Lisa,  port  A  (cannot  be  done) 

THIS  WON 

T  MAKE  ROMMABLE  CODE! 

BaudConsts 

DC.B 

2,14 

turn  off  BRG  (so  it  doesn't  count  for 

a  while) 

IsBaudVal 

DC.B 

0,12 

LSByte  of  BRG 

msBaudVal 

DC.B 

0,13 

MSByte  of  BRG 

BaudSrc 

DC.B 

0,14 

turn  it  on  again,  with  proper  baud 

source 

DC.W 

0000 

end  of  constant  string 

BaudTable 

DC.W 

1200,94,102 

1200  baud,  Mac&LisaB  ,  LisaA  BRG 

constants 

DC.W 

2400,46,50 

2400  baud 

DC.W 

4800,22,24 

4800  baud 

DC.W 

9600,10,11 

9600  baud 

(continued  on  ne\t  page) 

Dr.  Dobb's  Journal ,  October  1987 

776 


69 


ASYNC  APPLETALK 


Listing  One  (Listing  continued ,  text  begins  on  page  18.) 


            DC.W    19200,4,-1          ; 19200 baud (but not for Lisa Port A)
BaudTblEnd  DC.W    -1                  ; sentinel

MSetBaud    move    CSParam(a0),D0      ; get the (integer) baud rate
            bsr.s   SetBaud
            bra     AbusExit

SetBaud     SUBR                        ; D0 contains the actual desired baud rate
            move.w  D0,AALAPbaud(A2)    ; save the current baud rate
            lea     BaudTable,A0        ; point at the table
@10         cmp.w   (a0),D0             ; does it match?
            beq.s   @30                 ; go if so
            addq.l  #6,A0
            tst.w   (A0)                ; are we done?
            bpl.s   @10                 ; loop if we didn't hit the sentinel
            moveq   #-1,D0
            bra.s   @50                 ; and bail out
@30         moveq   #3,D1               ; set up for Mac port A/B (PCLK/BRG on)
            CMPI.B  #$FF,MacTypeByte    ; Mac or Lisa?
            BNE.S   @40                 ; Branch if Mac (PortA & PortB are same)
            IF PortA THEN               ; Lisa ports A/B differ; Macs don't
            addq.l  #2,A0               ; bump to Lisa PortA column
            ELSE
            moveq   #1,D1               ; Lisa portB works from Xtal, not PCLK
            ENDIF
@40         move.w  2(a0),D0            ; get value BRG (-1 if 19,200 on Lisa)
            bmi.s   @50                 ; exit if negative
; D0 now contains the value for the BRG
            lea     BaudConsts,a0       ; point at the constants
            move.b  d0,lsBaudVal-BaudConsts(a0) ; save the LSByte of the BRG
            ror     #8,d0
            move.b  d0,msBaudVal-BaudConsts(a0) ; and the MSByte of the BRG
            move.b  d1,BaudSrc-BaudConsts(a0)   ; and the source for BRG
            bsr     toSCC               ; and output it
            clr     d0
@50         SUBEND  'MSETBAUD'
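The BRG constants in BaudTable appear to follow the usual Z8530 time-constant formula, TC = clock / (2 x clock mode x baud) - 2. The Python sketch below is not part of the listing; it simply recomputes the table. The clock frequencies (3.672 MHz behind the Mac and Lisa port B column, 4 MHz behind the Lisa port A column) and the x16 clock mode are assumptions chosen because they reproduce the printed values, and `brg_time_constant` is a hypothetical helper name.

```python
# Sketch only: recompute the BaudTable constants from assumed SCC clocks.
MAC_CLOCK_HZ = 3_672_000      # assumption: clock behind the "Mac & LisaB" column
LISA_A_CLOCK_HZ = 4_000_000   # assumption: clock behind the "LisaA" column
CLOCK_MODE = 16               # assumption: SCC running in x16 clock mode


def brg_time_constant(clock_hz, baud, clock_mode=CLOCK_MODE):
    """Return (time_constant, relative_error) for the requested baud rate."""
    ideal = clock_hz / (2 * clock_mode * baud)    # divisor before the -2 offset
    divisor = int(ideal + 0.5)                    # nearest integer divisor
    actual_baud = clock_hz / (2 * clock_mode * divisor)
    return divisor - 2, abs(actual_baud - baud) / baud


if __name__ == "__main__":
    for baud in (1200, 2400, 4800, 9600, 19200):
        mac_tc, mac_err = brg_time_constant(MAC_CLOCK_HZ, baud)
        lisa_tc, lisa_err = brg_time_constant(LISA_A_CLOCK_HZ, baud)
        print(baud, mac_tc, f"{mac_err:.2%}", lisa_tc, f"{lisa_err:.2%}")
```

Under these assumptions the nearest divisor for 19,200 baud on Lisa port A is off by roughly 7 percent, which is consistent with the table marking that entry -1.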


EJECT 

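; MWriteLAP - write out a LAP packet
;
;   Call:
;       A0 -> IO queue element
;       A1 -> WDS.  First entry must start:
;               +------------------+
;               | Destination addr |
;               +------------------+
;               | [for source addr]|
;               +------------------+
;               | LAP type code    |
;               +------------------+
;       A2 -> local variables
;   Return:
;       D0 - error code
;
;   NOTE: for MPP, first two data bytes must be length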

MWriteLAP   MOVE.L  2(A1),A0            ; A0 -> first WDS entry
            MOVEQ   #LAPProtErr,D0      ; Assume an error (2.3F)
            TST.B   LAPType(A0)         ; Make sure protocol is a valid one
            ble.s   MWRLAPex            ; Return error if not
            MOVE.B  LAPDstAdr(A0),D2    ; D2 = destination address
            bsr.s   LAPWrite            ; Write out the packet
MWRLAPex    bra     AbusExit

            EJECT

; LAPWrite - send a packet out an Async port.  Called both by MWriteLAP
; and DDPWrite.
;
; Call:
;   A1 -> WDS (first entry must start as in MWriteLAP above)
;   A2 -> local variables

as  follows: 


[  for  source  addr  ] 




;   D2 = LAP destination address
; Return:
;   D0 = noErr or the error code
; Uses D1-D3,A0,A1,A3
;
; Save the WDS passed in
; If AALAP isn't up, return noAnswer
; Next, check the length of the frame for <= 603 bytes; return error if bad
; If we're currently sending a frame:
;   if it's an IM/UR, simply return (WDS will be sent when done)
;   if it's not, then stop (somehow we got two frames to send from DevMgr)
; If interrupts are on
;   Update PollProc pointer if it needs it
;   Check that the AALAP is still working, sending IM/UR if necessary.
; Start sending the frame
;
; This code relies on the Device Manager for queuing.  Here's how it works:
;
; General Rule #1:  All operations initiated by the device manager
; ultimately return to the DevMgr through jIODone.
;
; General Rule #2:  All async operations which cannot complete immediately
; return thru a RTS.  When the operation does complete, the (interrupt)
; routine can go thru jIODone.
;
; Specific AppleTalk Rule #1:  All callers of LAPWrite have bra AbusExit
; code right after the call to LAPWrite.  This eventually jumps to jIODone.
;
; Specific AppleTalk Rule #2:  Since they've taken care of the details,
; LAPWrite only has to remember two things:  If we finish, we can return
; to our original caller (by jumping thru LAPWrtRtn to go to the device
; manager);  If we don't finish, we should return to the caller's caller
; (which called the device manager in the first place).  Whew!


LAPWrite    move.l  (SP)+,LAPWrtRtn(A2) ; save the caller's adrs
            move.l  A1,WDSptr(A2)       ; and the frame we're asked to send
            move.w  #noAnswer,D0
            tst.b   AALAPup(a2)         ; is the AALAP up?
            beq     LAPWexit            ; exit if bad

; Next compute the length of the WDS - exit if it's bad

            move.l  A1,A0               ; get the WDS pointer
            clr.l   D2                  ; D2 = number of data bytes in frame
            clr.l   D1                  ; D1 = number of segments in WDS
            cmp.w   #2,(a0)             ; is first segment too short?
            ble.s   LAPWexit            ; go if it is
@20         tst.w   (a0)                ; is WDS length = 0?
            beq.s   @30                 ; go if so
            add.w   (a0),d2             ; add in this length
            addq    #1,d1               ; incr the segment counter
            addq.l  #6,A0               ; bump the WDS pointer
            bra.s   @20

; D2 is the length of the message we've been asked to send
; D1 is the number of segments we've been presented with
; (A1 still has WDSptr)

@30         moveq   #LAPProtErr,D0
            tst.l   d1                  ; is D1 (number of segments) < 1?
            ble.s   LAPWexit            ; go if so (error)
            moveq   #ddpLenErr,D0
            cmp.w   #603,D2             ; is the length > 603 (3 LAP + 600 data)
            bgt.s   LAPWexit            ; go if it's bad

; we can try to send WDS in A1 - are we currently sending a frame?

            tst.l   tWDSptr(a2)         ; are we presently sending a frame?
            beq.s   @40                 ; go if not
            tst.b   SendingIMUR(A2)     ; is it an IM or UR?
            beq.s   @35                 ; go if not
            tst.l   qWDSptr(A2)         ; is one already queued?
            bne.s   @35                 ; go if so (stop)
            move.l  A1,qWDSptr(A2)      ; save the (queued) WDS pointer
            _statcount DeferXmit
            rts


@35         pea     AALAP2in1           ; point at the string
            DC.W    $ABFF               ; and trap 'em (in lieu of $A9FF)

; WDS in A1 is OK to send now: if interrupts enabled,
; update PollProc and check time since last good frame

@40         move    SR,D0
            and     #$70,D0             ; is the interrupt mask <> 0?
            bne.s   SendWDSptr          ; just send it

; Update our local PollProc pointer

            move    SR,-(A7)            ; save the state
            move    #SCCLockout,SR      ; turn off interrupts
            lea     myPollProc,A1       ; A1 -> our PollProc
            move.l  PollProc,D0         ; get the current PollProc address
            cmp.l   D0,A1               ; have we already updated it?
            beq.s   @50                 ; go if we have
            move.l  D0,SavePS(A2)       ; else update our saved copy
            move.l  A1,PollProc         ; and point the real PollProc at us
@50         move    (A7)+,SR            ; and re-enable

; check for (Ticks - LastRcv) > 1800 - see if they're still there

            move.l  Ticks,D0            ; have we received a frame recently?
            sub.l   LastRcv(a2),D0
            cmp.l   #1800,D0            ; (ticks - LastRcv) > 1800 (30 sec)?
            bmi.s   SendWDSptr          ; go if not (send it)
            bsr     GetNNNN             ; do the IM/UR stuff
            beq.s   SendWDSptr          ; go if it worked
            move.w  D0,-(SP)            ; otherwise, save the status
            bsr     DoWarn              ; else, warn them
            move.w  (SP)+,D0            ; and return bad status

; Come here if we need to return immediately (status is in D0)

LAPWexit    move.l  LAPWrtRtn(A2),A0    ; this'll get 'em to IOdone
            jmp     (A0)                ; sooner or later

AALAP2in1   dc.b    24
            dc.b    'AALAP - TWO MSGS AT ONCE'
            align
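The length checks at the top of LAPWrite can be sketched in Python as follows. This is an illustration of the walk over the WDS, not a translation of the driver: the WDS is modelled as a plain list of byte strings, `check_wds` is a hypothetical name, and the status strings are symbolic (the name used for a too-short first segment is invented here; the listing simply exits through LAPWexit in that case).

```python
# Sketch only: mirrors the WDS checks described in the LAPWrite comments.
MAX_FRAME_LEN = 603   # 3 LAP header bytes + 600 data bytes, per the listing


def check_wds(segments):
    """Validate a WDS given as a list of bytes objects, one per segment.

    An empty segment ends the list, like the zero length word in a real WDS.
    Returns (status, total_length, segment_count).
    """
    if not segments or len(segments[0]) <= 2:
        return ("firstSegmentTooShort", 0, 0)    # hypothetical status name
    total = 0
    count = 0
    for segment in segments:
        if len(segment) == 0:                    # zero length terminates the WDS
            break
        total += len(segment)
        count += 1
    if total > MAX_FRAME_LEN:
        return ("ddpLenErr", total, count)
    return ("noErr", total, count)
```

A call such as `check_wds([header, payload])` returns the status together with the total length and the number of segments, the two quantities the listing keeps in D2 and D1.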

EJECT 

; SendFrame - Starts off transmission of a frame
;
;   A0 points to the WDS of the frame to send
;
; SendFrame sets all the pointers, etc. and then sends the FrameChar
; ($A5).  The Transmit Interrupt Handler ships all the remaining bytes
; as they are needed.

SendWDSptr  move.l  WDSptr(A2),A0       ; get the WDS to send
SendFrame   move.w  (a0)+,D0            ; D0 = the length of the 1st segment
            move.l  (a0)+,a1            ; a1 -> the first byte of 1st segment
            move.l  a0,tWDSPtr(a2)      ; and save the pointer to rest of WDS
            subq    #2,D0               ; Finagle the length and address
            move    D0,TxCount(a2)      ; of the segment (AALAP doesn't
            addq.l  #2,A1               ; send dest and source node)
            move.l  a1,LAPFetch(a2)
            st      nCRC(a2)            ; we'll need to send a CRC
            st      nFrmChr(a2)         ; and a closing FrameChar
            sf      EscOut(a2)          ; clear the Escape flag
            clr     OutputCRC(a2)       ; and the CRC
            moveq   #qFrmChar,D0        ; load a FrameChar
            bra.s   SendSCC             ; and kick off the frame

            EJECT

; LAPSend - send the next byte in the LAP frame
;
; This routine checks to see if we're flow-controlled; if not, it
; gets the next char, accumulates the CRC, generates DLE's as required,
; and calls the routine to place the byte in the SCC.
;
; It works from LAPFetch(a2), and advances it (and decrements TxCount)
; as necessary.



If  we  sent  a  char,  then  we  set  SentChar(A2)  to  true 


LAPSend     tst.b   RcvdXoff(a2)        ; are we flow controlled?
            bne.s   LAPSendRTS          ; go if so
            move    TxCount(a2),D3      ; get the remaining length
            ble.s   LAPBadCount         ; go if zero or negative
            cmp.w   #maxLAPFrmLen,D3    ; check its length
            bgt.s   LAPBadCount         ; go if too big
            move.l  Ticks,LastXmit(a2)  ; remember when we last sent a char
            subq    #1,D3               ; decr the count
            move.l  LAPFetch(a2),a0
            move.b  (a0)+,D0            ; and fetch the character, bumping the pointer
            tst.b   EscOut(a2)          ; are we escaping this char?
            bne.s   @15                 ; go if yes - it's already in CRC
            lea     OutputCRC(a2),a3    ; point at the output CRC Accumulator
            bsr     NextCRC             ; accumulate the un-processed char
            cmp.b   #DLE,D0             ; test for DLE, Xon, Xoff, FrameChar
            beq.s   @10                 ; go if it's a special one
            cmp.b   #FrameChar,D0
            beq.s   @10
            move.b  D0,D1
            and.b   #$7F,D1             ; is it a XON or XOFF (either parity)?
            cmp.b   #Xoff,D1
            beq.s   @10
            cmp.b   #Xon,D1
            bne.s   @20                 ; go if it's just a normal character
@10         st      EscOut(a2)          ; remember that we're escaping
            moveq   #DLE,D0             ; data to send is a DLE
            bra.s   SendSCC             ; (and don't update the pointer/len)
@15         eor     #$40,D0             ; come here if we're escaping this char
@20         move.l  a0,LAPFetch(a2)     ; update the pointer
            move    D3,TxCount(a2)      ; and the remaining length
            sf      EscOut(a2)          ; D0 has the next char to send
            bra.s   SendSCC             ; and send the character

; SendSCC - sends D0 to the SCC Write Data Register
;   Assumes that SCC is ready (TBMT is true)
;   Returns D0 = 0
;   uses A1

SendSCC     st      SentChar(A2)        ; remember we sent a char
            move.l  SCCWr,a1            ; point at the SCC Write Control
            IF PortA THEN
            addq.l  #ACtl,a1            ; add in the offset for Port A
            ENDIF
            move.b  D0,SCCData(a1)      ; output the character
            moveq   #0,D0               ; clear the return status
LAPSendRTS  rts                         ; and return

LAPBadCount pea     BadCntStr
            DC.W    $ABFF               ; Trap 'em (not $A9FF)
            rts

BadCntStr   DC.B    10
            DC.B    'Bad length'
            align

            EJECT
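The byte stuffing that LAPSend performs one character at a time can be seen more easily on a whole buffer. The Python sketch below is an illustration only. The FrameChar value ($A5) comes from the listing; the values for DLE, Xon and Xoff are the standard ASCII codes and are assumed here, since their equates are not shown in this part of the listing. `needs_escape` and `escape` are hypothetical names.

```python
# Sketch only: byte stuffing as described for LAPSend.
FRAME_CHAR = 0xA5   # from the listing ("get $A5")
DLE = 0x10          # assumption: ASCII DLE
XON = 0x11          # assumption: ASCII DC1 (control-Q)
XOFF = 0x13         # assumption: ASCII DC3 (control-S)


def needs_escape(byte):
    """True for DLE, FrameChar, and Xon/Xoff with either parity."""
    return byte in (DLE, FRAME_CHAR) or (byte & 0x7F) in (XON, XOFF)


def escape(data):
    """Return data with each special byte sent as DLE followed by byte XOR $40."""
    out = bytearray()
    for byte in data:
        if needs_escape(byte):
            out.append(DLE)
            out.append(byte ^ 0x40)
        else:
            out.append(byte)
    return bytes(out)
```

Masking with $7F before the Xon/Xoff comparison is what makes the test cover both parities, as the listing's comment notes.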

; SendChar - Synchronously wait for TBMT and send another character
;   Use Ticks to watch for 1/2 sec timeout, so we don't hang forever
;
;   Entry:  D0 = char to send
;   Exit:   D0 = 0000 if OK
;           D0 = BadTBMT if we timed out (-3110)
;           A0, A1, D2 changed

SendChar    SUBR
            move.l  Ticks,D2            ; fail-safe counter
            add.l   #30,D2              ; bump by 1/2 second
@10         bsr.s   TestTBMT            ; look to see if we can send it
            bne.s   @20                 ; go if we can
            cmp.l   Ticks,D2            ; did we time out?



            bpl.s   @10                 ; go if not
            move    #-3110,D0           ; BadTBMT return code
            bra.s   @40
@20         bsr.s   SendSCC             ; else send it
@40         _SUBEND 'SENDCHAR'
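SendChar is a poll-with-deadline loop: compute a deadline half a second ahead, poll TBMT, and give up once the tick counter passes the deadline. A minimal Python sketch of that pattern follows. The three callables stand in for Ticks, TestTBMT and SendSCC and are hypothetical; the status value is the one printed in the listing.

```python
# Sketch only: poll-with-deadline in the style of SendChar.
TIMEOUT_TICKS = 30    # half a second at 60 ticks per second, per the listing
BAD_TBMT = -3110      # status value as printed in the listing


def send_char(char, ticks, tbmt_ready, send):
    """Send char once the transmitter is empty, or time out.

    ticks, tbmt_ready and send are caller-supplied callables (hypothetical
    stand-ins for the Ticks global, TestTBMT and SendSCC).
    """
    deadline = ticks() + TIMEOUT_TICKS
    while True:
        if tbmt_ready():
            send(char)
            return 0
        if deadline - ticks() < 0:     # same sign test as cmp.l Ticks,D2 / bpl
            return BAD_TBMT
```

Passing the clock in as a callable keeps the loop testable without real hardware or real time.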

; Check state of TBMT - sets CCR to state of TBMT
; Uses A0

TestTBMT    movem.l SCCRd,A0            ; point at the SCC
            IF PortA THEN
            addq.l  #Actl,A0
            ENDIF
            btst    #TxEmptyBit,(a0)    ; is the TBMT set?
            rts                         ; return

            EJECT

; TIntHnd - this code catches the Tx Buffer Empty interrupts from
; the SCC and tries to send another character.  If it could not
; send a character, it clears the Tx Pending bit, so that the SCC
; will not interrupt again.  Finally (in any case) it also resets
; the highest interrupt under service (IUS) in the SCC to clear
; the interrupt before returning.
;
; On entry, A0/A1 point to the SCC control read/write registers.
; Like a normal interrupt handler, it must preserve D4-D7 and A4-A7

TIntHnd     move.l  MPPVars,a2          ; point at the MPP Variables
            _statcount XmitCount
            sf      SentChar(A2)
            bsr.s   TxNextCh            ; try to send another char
            tst.b   SentChar(A2)        ; did we?
            bne.s   TIntIUS             ; go if so
            move.l  SCCWr,A1            ; otherwise reset TxPend
            IF PortA THEN
            addq.l  #Actl,A1
            ENDIF
            move.b  #$28,(A1)
TIntIUS     bra     DoIUS               ; and reset the highest IUS

; TxNextCh - try to send (in this order)
;       the next character of the segment, or
;       the next segment, or
;       the CRC, or
;       the trailing FrameChar.
;
; If a complete frame which was initiated by the device manager has
; been sent, we should jump thru IODone (asking the DevMgr for more
; to do).  Otherwise, (it was an IM or UR) we look to see if there
; is a frame from the DevMgr queued (in WDSptr).  If so, we start
; sending it, otherwise, we simply RTS.

TxNextCh    move.l  tWDSPtr(a2),D0      ; D0 -> WDS in progress
            beq.s   TxNextRTS           ; if nil, just exit (no message)
            tst.w   TxCount(A2)         ; is there more of the segment to send
            bne     LAPSend             ; if so, send next character

@5          move.l  D0,A0               ; otherwise, point at the WDS
            tst     (a0)                ; check the next length
            beq.s   @10                 ; go if it's zero (end of the frame)
            move    (a0)+,TxCount(a2)   ; otherwise, update TxCount
            move.l  (a0)+,LAPFetch(a2)  ; and LAPFetch
            move.l  a0,tWDSptr(a2)      ; and update the tWDSPtr
            bra     LAPSend             ; and send it off

; Now send the CRC

@10         tst.b   nCRC(a2)            ; do we need to send a CRC?
            beq.s   @20                 ; go if not
            sf      nCRC(a2)            ; don't need one now
            move    outputCRC(a2),D0    ; get the two CRC bytes
            ror.w   #8,D0               ; swap them



            lea     CRCBuf(a2),a0       ; point at the CRC Tx Buffer
            move    D0,(a0)             ; save the CRC bytes
            move.l  a0,LAPFetch(a2)     ; and save the fetch pointer
            move    #2,TxCount(a2)      ; save the length, too
            bra     LAPSend             ; and send them off

; We've sent the CRC, now send the closing FrameChar

@20         tst.b   nFrmChr(a2)         ; do we need to send a FrameChar?
            beq.s   @30                 ; go if not
            sf      nFrmChr(a2)
            moveq   #qFrmChar,D0        ; get $A5
            bra     SendSCC             ; send it and exit

; We've sent a full frame, now clean up

@30         clr.w   TxCount(a2)         ; clear the TxCount
            clr.l   tWDSPtr(a2)         ; clear the tWDSptr (no longer sending)

; Now decide whether to return, wakeup the Dev. Mgr, or start a queued frame

            tst.b   SendingIMUR(A2)     ; were we sending an IM or UR?
            beq.s   NotIMUR             ; go if not
            sf      SendingIMUR(A2)     ; well, we're not anymore
            move.l  qWDSptr(A2),D0      ; is there a queued frame?
            beq.s   TxNextRTS           ; go if not
            move.l  D0,A0
            bra     SendFrame           ; otherwise, start sending it
TxNextRTS   rts                         ; otherwise, return (RTS)

; We weren't sending IM/UR so we must have finished a msg from the
; device mgr.  Therefore, we should return to the Device Manager.

NotIMUR     clr.l   qWDSptr(A2)         ; clear out the WDS
            moveq   #0,D0               ; good return status
            bra     LAPWexit            ; and go thru LAPWrtRtn to IOdone

            EJECT

; RandomWord - generate a random number
;
;   Call:   RandomSeed(A2) - seed
;   Return: D0 = random number (CCR set to it)

RandomWord  MOVE    RandomSeed(A2),D0   ; D0 = current seed
            MULU    #773,D0             ; Times 773
            ADDQ    #1,D0               ; Plus 1
            MOVE    D0,-(SP)            ; Save high byte on stack
            LSL     #8,D0               ; Put low byte into high byte
            MOVE.B  (SP)+,D0            ; And high byte into low byte
            MOVE    D0,RandomSeed(A2)   ; Set back in seed
            RTS
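Read as arithmetic, RandomWord multiplies the 16-bit seed by 773, adds 1, keeps the low word, and swaps its two bytes. A small Python sketch of one step (the function name `random_word` is hypothetical):

```python
def random_word(seed):
    """Next 16-bit value: (seed * 773 + 1) truncated to a word, bytes swapped."""
    word = (seed * 773 + 1) & 0xFFFF
    return ((word << 8) | (word >> 8)) & 0xFFFF
```

Feeding each result back in as the next seed reproduces the sequence the routine would store in RandomSeed(A2).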

EJECT 

; VBL handler - come here every VBLtimer ticks.  Used to check for long
; output pauses; if we stop for > 1 second, we experimentally send
; the next character.
;
;   A0 -> VBL queue element

VBLHnd      MOVE    #VBLtimer,VBLCount(A0) ; Better re-init VBL count
            MOVE.L  MPPVars,A2          ; A2 -> local variables

; Have we sent an Xoff (did we set nXon)?  If so, try to send an Xon

            tst.b   nXon(A2)            ; do we need an Xon?
            beq.s   @20                 ; go if not
            bsr     TestTBMT            ; try to send it to the SCC
            beq.s   VBLHndRTS           ; quit if we couldn't send it
            moveq   #Xon,D0
            bsr     SendSCC             ; send an Xon
            sf      nXon(A2)            ; and clear the flag
            bra.s   VBLHndRTS           ; and quit

; Check for long pause during transmit




@20         tst.l   tWDSptr(A2)         ; do we have anything to send?
            beq.s   VBLHndRTS           ; return if not
            move.l  Ticks,D0            ; if (ticks - LastXmit) > 60 then
            sub.l   LastXmit(a2),D0     ; let's try to send another char
            cmp     #60,D0
            bmi.s   VBLHndRTS
            bsr     TestTBMT            ; is TBMT set (can we send another char?)
            beq.s   VBLHndRTS           ; go if not
            sf      RcvdXoff(a2)
            _statcount XOFFTOcount
            MOVE    #SCCLockout,SR      ; exclude SCC interrupts (VIA priority < SCC)
            bsr     TxNextCh            ; otherwise, do another character
VBLHndRTS   rts                         ; this'll restore SR et al
            eject


; myPollProc - AALAP PollProc addendum (predendum?):
;
; The AALAP needs a bit of a PollProc, since it will lose characters
; whenever the disk spins.  Of course, all good Macintosh programmers
; know that the Printer Port (PortB) isn't polled by the disk driver
; since there's just not enough horsepower to go around.
;
; The PollProc is called by the disk driver to poll PortA.  We
; execute a snippet of code before the real PollProc, and send an
; Xoff to the other end if we're receiving or processing a message
; while the disk is spinning.  Then we transfer to the real PollProc.
;
; This routine preserves all regs except the SR.  It does this by
; reserving a longword on the stack, and then stuffing the SavePS
; value in it.  If it's zero, then there wasn't a PollProc, and we
; pop that value off the stack and return to the disk driver.  If
; that value wasn't zero, then the real PollProc's address will be
; on the top of the stack, and we go there.  The disk driver's return
; address will be left on the stack, allowing the PollProc to return
; normally.
;
; InpState and stillBusy must both be in the same word.  The
; tst.w InpState(A2) below fails otherwise.


myPollProc  subq    #4,A7               ; save space for a return adrs
            move.l  A2,-(SP)            ; and save A2
            move.l  MPPVars,A2          ; point at the MPP locals
            tst.b   nXon(A2)            ; have we already sent an Xoff?
            bne.s   myPPexit            ; go if so
            tst.w   InpState(A2)        ; are we receiving or processing a message?
            beq.s   myPPexit            ; go if not
            movem.l A0/A1/D0,-(SP)      ; save regs
@10         bsr     StashSCCch          ; grab a char from the SCC, save it
            bne.s   @10                 ; loop 'til it's empty
            _statcount PPCount
            bsr     TestTBMT            ; is it OK to send the Xoff?
            beq.s   @30                 ; go if not
            moveq   #Xoff,D0            ; send Xoff
            bsr     SendSCC
            st      nXon(A2)            ; and remember we need Xon
            _statcount PPXoffCnt
@30         movem.l (SP)+,A0/A1/D0      ; restore the regs
myPPexit    move.l  SavePS(A2),4(SP)    ; move address onto stack (sets CC)
            movea.l (SP)+,A2            ; restore A2
            bne.s   @20                 ; go if PollProc adrs <> 0 (use it)
            addq.l  #4,SP               ; else pop the (nil) adrs
@20         rts                         ; and go there


; ExtIntHnd - catch the External or Status Interrupts from the SCC
;
; Checks for mouse interrupt, passes control if it is one, else resets
; the external/status SCC interrupts.

ExtIntHnd   btst    #DCDbit,D1          ; did the DCD bit change (mouse moved)
            beq.s   @10                 ; go if not
            move.l  MouseVector,A3      ; else, point at the mouse handler
            jmp     (A3)                ; and go there




@10         move.b  #$10,(a1)           ; reset ext interrupts
            move.b  #$10,(a1)           ; (twice)
            move.b  #ResetIUS,(a1)      ; Reset Highest IUS in SCC (to WR0)
            rts

            EJECT

; RIntHnd - SCC receive interrupt handler
;
;   Called: A0 -> SCC control read register
;           A1 -> SCC control write register
;
; This code is structured differently from the ABLAP code, since
; the arrival rate of the chars is so much slower for AALAP.  Normal
; ABLAP routines call ReadPacket and ReadRest to get pieces or the rest
; of the frame as they arrive in real time.  With AALAP, the character
; arrival rate is so slow that we copy the entire frame into an
; interrupt-time buffer.
;
; When we receive a good frame, we then pass control to the appropriate
; protocol handler, which then makes calls on ReadPacket and ReadRest to
; dole out the characters as necessary.
;
; Like all Mac interrupt handlers, it must preserve D4-D7 and A4-A7,
; and return with a RTS instruction.
;
; Since the default DDP socket listener is quite slow (3-4 msec to process
; a newly received message) we set up a buffer to contain characters
; which arrive during the time the socket listener is in control.  We
; set a flag (stillBusy) to indicate that we're still busy, and save the
; chars in BusyBuf.
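The per-character decisions the receive handler makes (flow control first, then framing, then un-escaping) can be sketched as a small Python state machine. This is an illustration under stated assumptions, not the driver's logic in full: the CRC check, the maximum-length check and the BusyBuf handling are left out, the DLE/Xon/Xoff values are the standard ASCII codes (assumed), and the class and attribute names are hypothetical.

```python
# Sketch only: the receive-side framing and un-escaping that RIntHnd describes.
FRAME_CHAR = 0xA5   # from the listing
DLE = 0x10          # assumption: ASCII DLE
XON = 0x11          # assumption: ASCII control-Q
XOFF = 0x13         # assumption: ASCII control-S


class FrameReceiver:
    def __init__(self):
        self.in_frame = False      # plays the role of InpState
        self.escaping = False      # plays the role of EscIn
        self.peer_xoff = False     # plays the role of rcvdXoff
        self.buffer = bytearray()
        self.frames = []           # completed frames (CRC check omitted here)

    def feed(self, byte):
        low7 = byte & 0x7F         # compare Xon/Xoff with either parity
        if low7 == XOFF:
            self.peer_xoff = True
        elif low7 == XON:
            self.peer_xoff = False
        elif byte == FRAME_CHAR:
            if self.in_frame and len(self.buffer) > 2:
                self.frames.append(bytes(self.buffer))   # closing FrameChar
                self.in_frame = False
            else:
                self.in_frame = True                     # start (or restart) a frame
            self.buffer.clear()
            self.escaping = False
        elif not self.in_frame:
            pass                                         # ignore bytes between frames
        elif byte == DLE:
            self.escaping = True
        else:
            if self.escaping:
                byte ^= 0x40
                self.escaping = False
            self.buffer.append(byte)
```

Feeding the bytes `A5 01 10 E5 02 03 A5` to a fresh receiver leaves one frame, `01 A5 02 03`, in `frames`.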

SpIntHnd
RIntHnd     move.l  MPPVars,A2          ; A2 -> driver variables
            _statcount RcvIntCount      ; remember the number of Rcv interrupts
RIntHnd10   bsr     NextChar            ; handle next char (from BusyBuf or SCC)
            beq     RIntRTS             ; quit if no data
            and     #$00FF,D0           ; use only eight bits
            move.w  D0,LastRxCh(a2)     ; remember the char

; Check for flow control from other side

@15         move.b  D0,D1               ; check for either parity Xon/Xoff
            and.b   #$7F,D1
            cmp.b   #Xoff,D1            ; is it a control-S?
            bne.s   @20                 ; go if not
            _statcount XOFFcount        ; count it
            st      rcvdXoff(a2)        ; and remember we received Xoff
            bra.s   RIntHnd10           ; loop for another char
@20         cmp.b   #Xon,D1             ; or is it a control-Q?
            bne.s   @30                 ; go if not
            _statcount XONcount
            sf      rcvdXoff(a2)
            bsr     TestTBMT            ; is the tx empty?
            beq.s   RIntHnd10           ; loop if not
            bsr     TxNextCh            ; otherwise, start up Tx side again
            bra.s   RIntHnd10           ; loop for another char

; Watch out for framing characters

@30         cmp.b   #FrameChar,D0       ; is it a framing character?
            beq.s   GotFrmCh            ; go if so
            tst.b   InpState(a2)        ; are we in a frame?
            beq.s   RIntHnd10           ; loop for another char

            EJECT

; Maybe this is a data char - check the frame length

            cmp     #MaxLAPFrmLen,rcvdlen(a2) ; is the frame too long?
            bls.s   @50                 ; go if it's OK
            _statcount LongFrame        ; remember the long frame
            sf      InpState(a2)        ; go idle
            bra.s   RIntHnd10           ; loop for another char

; We have a real char - un-escape it


@50         cmp.b   #DLE,D0             ; is it a DLE?
            bne.s   @90                 ; go if not
            st      EscIn(a2)           ; remember we've seen an escape
            bra.s   RIntHnd10

; This is a data char - complete any escaping, accumulate the CRC

@90         tst.b   EscIn(a2)           ; should we escape it?
            beq.s   @100                ; go if not
            eor     #$40,D0             ; xor with $40
            sf      EscIn(a2)           ; and clear the escape flag

; now we've got a good char

@100        lea     inputCRC(a2),a3     ; point at the CRC accumulator
            bsr     NextCRC             ; update the CRC accum using byte in D0
            move.l  LAPStash(a2),a0     ; point at the next free char in buffer
            move.b  D0,(a0)+            ; save the char in the buffer, bump the pointer
            addq    #1,rcvdlen(a2)      ; increment the bytes-read counter
            cmp     #3,rcvdlen(a2)      ; have we read in exactly three chars?
            bne.s   @110                ; go if not
            move.l  LAPInBuf(a2),a0     ; otherwise point at the LAPInBuf
@110        move.l  a0,LAPStash(a2)     ; and update the pointer
            bra     RIntHnd10           ; loop for another char

RIntRTS     bra     DoIUS               ; reset Highest IUS and return

; We've discovered a FrameChar - check if we're done or just starting

GotFrmCh    tst.b   InpState(a2)        ; are we in a frame?
            beq.s   FrmStart            ; go if not (we will be)
FrmEnd      cmp     #2,rcvdlen(a2)      ; found closing char
            bhi.s   CheckCRC            ; go if frame is long enough
            _statcount ShortFrame       ; else, flag that we got a short frame
                                        ; and fall into FrameStrt

; We're in a frame now!

FrmStart    lea     toRHA(a2),a3        ; a3 -> RHA (holds 1st 5 bytes)
            move.b  sysLAPAddr(a2),(a3)+ ; copy the node number
            move.b  sysABridge(a2),(a3)+ ; and the bridge address
            move.l  a3,LAPStash(a2)     ; remember where next byte goes
            st      InpState(a2)        ; change the InpState to in_msg
            sf      EscIn(a2)           ; and we're not escaping data
            clr     InputCRC(a2)        ; no CRC yet
            clr     rcvdlen(a2)         ; no data, either
            bra.s   RIntRTS

            EJECT

; We received a complete frame - check the CRC

CheckCRC    sf      InpState(a2)        ; we're not in a frame now
            tst     InputCRC(a2)        ; is the CRC zero?
            beq.s   LAPDemux            ; go if it is OK
            _statcount CRCCount         ; save the statistic
            bra.s   RIntRTS             ; and exit
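CheckCRC accepts a frame when the accumulator is zero after the data and the two CRC bytes have all passed through it. NextCRC is not shown in this part of the listing, so the Python sketch below assumes a CRC-CCITT style update (polynomial $1021, zero initial value, high bit first) purely to illustrate the zero-residue property; the driver's actual polynomial and byte order may differ. `crc16` and `append_crc` are hypothetical names.

```python
# Sketch only: zero-residue check with an ASSUMED CRC (poly $1021, init 0).
def crc16(data, crc=0):
    """Bitwise CRC over data, most significant bit first."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def append_crc(data):
    """Return data followed by its CRC, high byte first."""
    return bytes(data) + crc16(data).to_bytes(2, "big")
```

With this kind of CRC, running the accumulator over `append_crc(data)` always ends at zero, so the receiver only needs a single test against zero rather than a comparison of two stored values.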

; Come here on receipt of a good frame.  We've cleared the InpState
; to indicate we're out of a frame.

LAPDemux    _statcount FrmCount         ; log another good frame
            move.l  Ticks,LastRcv(a2)   ; remember this frame's arrival time
            lea     2+toRHA(a2),a3      ; a3 -> LAP type byte
            MOVE.B  (A3)+,D0            ; Get the LAPtype, bump pointer
            tst.b   D0
            BMI     LAPIn               ; If minus, it's a LAP packet

; Got a data packet - look for a protocol handler

            tst.b   AALAPup(a2)         ; but first, is the AALAP up?
            beq.s   @60                 ; go if it's not up
            MOVEQ   #(LAPTblSz-1),D2    ; D2 = index into active protocols list
@30         CMP.B   Protocols(A2,D2),D0 ; Match?



            DBEQ    D2,@30              ; (If none, D2 is negative - 3.1F)
            LSL.W   #2,D2               ; Make D2 a longword index into Handlers

; Got a protocol handler - Compute the desired length of the message in D1

            move.b  (a3)+,D1            ; Get MSByte of the length into D1
            and     #3,D1               ; mask for two lsbits
            LSL     #8,D1               ; Move to proper position
            MOVE.B  (a3)+,D1            ; D1 = total length
            move    rcvdlen(a2),D0      ; D0 = total chars received (DDP + LAP + CRC)
            subq    #3,D0               ; disregard LAP type and CRC
            cmp     D1,D0               ; are they equal?
            beq.s   @40                 ; go if so
            _statcount LenErrCnt        ; save the stats
            bra     RIntRTS             ; and exit
@40         SUBQ    #2,D1               ; Subtract 2 for length bytes
            move    d1,RcvdLen(a2)      ; and remember the number of unread chars

            EJECT

; At this point, Handlers(A2,D2) points to the address of the protocol
; handler for this packet's protocol (or D2 is negative if there is
; none - 3.1F).  JMP to it with the following:
;
;   A0,A1 - SCC read/write addresses
;   A2 - ptr to driver locals
;   A3 - ptr into the RHA (first 5 bytes loaded)
;   A4 will be the address of our read packet routine
;   A5 will be saved for handler's usage (until packet's all in or error)
;   D1 - length of packet still left to read (from header)
;
; The protocol handler must obey the following conventions:
;   1) It must preserve, across the call, A0-A2, A4 and D1
;   2) A6 and D4-D7 must be saved and restored if used.
;   3) It must JSR to the routine at (A4) or 2(A4) with registers as defined
;      there, for the purpose of reading more of the packet and eventually
;      resetting the SCC for the next interrupt.


            TST     D2                  ; Is there a protocol handler? (3.1F)
            BMI.S   @60                 ; Branch if not
            bsr     DoIUS               ; reset Highest IUS
            MOVEM.L A4/A5,SaveA45(A2)   ; Save A4 and A5 (may be free time now)
            move.l  LAPInBuf(a2),a4     ; point at the next char of the msg
            move.l  A4,LAPStash(a2)     ; (we can snatch A4 for a few instrs)
            MOVE.L  Handlers(A2,D2),A5  ; A5 -> protocol handler
            LEA     ReadPacket,A4       ; A4 -> ReadPacket
            st      stillBusy(a2)       ; remember we're processing a frame
            move.w  VSCCEnable(A2),SR   ; re-enable so we can catch more chars (!)
            JSR     (A5)                ; Call the protocol handler
            move.l  MPPVars,A2          ; point at our variables
            cmpa.l  SaveA45(A2),A4      ; paranoia land - make sure they've left
            bne.s   @45                 ; things as they should be
            cmpa.l  (SaveA45+4)(A2),A5
            beq.s   @50
@45         pea     BadA4A5
            DC.W    $ABFF               ; print the text (in lieu of $A9FF)
@50         sf      stillBusy(A2)       ; and now we're not in a frame
            rts                         ; exit the interrupt handler

; No handler, just log the error

@60         _StatCount NoHandCnt        ; Count packets without a handler
            bra     RIntRTS             ; and exit

BadA4A5     DC.B    17                  ; debugging only
            DC.B    'AALAP - Bad A4/A5'
            align   2
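Two small pieces of LAPDemux lend themselves to a worked example: the 10-bit length taken from the first two data bytes, and the backwards table scan that leaves a negative index when no protocol matches. The Python sketch below illustrates both; the function names are hypothetical and the tables are plain lists rather than the driver's Protocols and Handlers arrays.

```python
# Sketch only: length decoding and protocol lookup in the style of LAPDemux.
def header_length(msb, lsb):
    """10-bit length: two low bits of the first byte, then the second byte."""
    return ((msb & 0x03) << 8) | lsb


def length_ok(msb, lsb, rcvdlen):
    """rcvdlen counts the LAP type byte, the data and the 2 CRC bytes."""
    return rcvdlen - 3 == header_length(msb, lsb)


def find_handler(protocols, lap_type):
    """Scan from the last table slot down; return -1 when nothing matches."""
    index = len(protocols) - 1
    while index >= 0 and protocols[index] != lap_type:
        index -= 1
    return index
```

Returning -1 for "not found" mirrors the way DBEQ leaves D2 negative when the loop runs off the front of the table, which the later TST D2 / BMI.S pair then detects.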
EJECT 


; NextChar - Handle the next char
;
; This routine does two things:  If we're awaiting a full message, then
; it gets the next character.  That char may have arrived from the SCC,
; or it may be a char left in the BusyBuf.  (Chars in the BusyBuf take
; precedence.)


; If we're still processing the previous message (stillBusy set true),
; then all characters which arrive will be placed in BusyBuf, and the
; associated pointers updated.  (Note: myPollProc also inserts data
; into the BusyBuf, but it doesn't set stillBusy.)
;
; Uses A0,A1,D0
; Assumes A2 -> MPPVars
; Returns Z if no character
;         NZ if char present (char is in 8 lsbits of D0)

NextChar SUBR
        tst.b   stillBusy(A2)           ; are we still processing the prev. frame?
        bne.s   @30                     ; go if we are
        bsr.s   GetBusyChar             ; else, look for a char from BusyBuf
        bne.s   @50                     ; quit if we got one
        bsr.s   GetSCCchar              ; else check the SCC
        bra.s   @50                     ; and quit
@30     bsr.s   StashSCCch              ; stash a char from SCC into BusyBuf
        bne.s   @30                     ; go back and look for more
@50     SUBEND  'NEXTCHAR'

        assumeEq BusyStash,BusyBuf+16   ; otherwise cmpa.l A0,A1 (above) fails

GetBusyChar SUBR                        ; get a char from the BusyBuf
        move.l  BusyFetch(A2),D0        ; get the fetch pointer
        cmp.l   BusyStash(A2),D0        ; is it the same as the stash pointer
        bne.s   @10                     ; go if not (more chars to do)
        lea     BusyBuf(a2),a0          ; point at the busy buffer
        move.l  a0,BusyStash(A2)        ; and save it in the BusyStash
        move.l  a0,BusyFetch(A2)        ; and BusyFetch
        moveq   #0,D0                   ; clear the CC
        bra.s   @20
@10     move.l  D0,A0                   ; there's still more to take
        move.b  (A0)+,D0                ; get the byte
        move.l  A0,BusyFetch(A2)        ; update the pointer
        or.w    #$100,D0                ; make CC <> Z (must preserve 8 lsbits)
@20     _SUBEND 'GETBUSYC'
        EJECT


; GetSCCchar and StashSCCch both are called by RintHnd and myPollProc
; BOTH ROUTINES MAY ONLY USE A0, A1, AND D0!!!!! (A2 will -> MPPVars)
;
; GetSCCchar looks at RCA on the proper channel, and returns the char
; in D0 if there was one (with CC set <> Z); else it returns CC = Z.

GetSCCchar
        movem.l SCCRd,A0/A1             ; forces A0/A1 to point at SCC
        IF PortA THEN
        addq.l  #Actl,A0
        addq.l  #Actl,A1
        ENDIF
        btst    #RCAbit,(A0)            ; is there a char?
        beq.s   @20                     ; go if not
        move.b  #1,(a1)                 ; point at the error bits from RR1
        nop
        move.b  (a0),D0                 ; get them (Overrun,Framing) in D0
        and     #$70,D0                 ; any error bits?
        beq.s   @10                     ; go if not
        move.b  #ResetErr,(a1)          ; else send Error Reset to WR0
        nop
        move.b  #1,(a1)                 ; point at WR1
        nop
        move.b  #$13,(a1)               ; and set up for int on all rx chars
        nop
        _StatCount OVRcount             ; count 'em
@10     move.b  SCCData(a0),D0          ; and get the data (EVEN IF ERROR!)
        or.w    #$100,D0                ; set the SR (to NZ - there's a char)
@20     rts

; StashSCCch - take a char from SCC, save in BusyBuf if there's space
; Return Z if no char or no space; NZ otherwise

StashSCCch
        bsr.s   GetSCCchar              ; look for a char in the SCC
        beq.s   @50                     ; go if none
        lea     BusyStash(a2),A1        ; point at the BusyStash pointer
        move.l  (a1),A0                 ; and get it
        cmpa.l  A0,A1                   ; will this be too many chars?
        beq.s   @50                     ; yes, simply exit (and ignore the char)
        move.b  D0,(a0)+                ; save the char, and bump the pointer
        move.l  A0,BusyStash(A2)        ; and update the pointer
        or.w    #$100,D0                ; set the CC <> Z ('cause we took one)
@50     rts                             ; and return
        EJECT
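
The interplay of NextChar, GetBusyChar and StashSCCch is easier to follow in a high-level model. The following Python sketch is only an illustration, not part of the original listing: the class and function names are hypothetical, indices stand in for the BusyFetch and BusyStash pointers, and the 16-byte size is taken from the assumeEq line (BusyStash sits at BusyBuf+16). The callable scc_read is an assumed stand-in for GetSCCchar that returns a byte or None.

```python
BUSY_SIZE = 16  # from the listing's assumeEq: BusyStash sits at BusyBuf+16


class BusyBuf:
    """Linear (non-circular) holding buffer modelled on the listing."""

    def __init__(self):
        self.buf = bytearray(BUSY_SIZE)
        self.fetch = 0  # index of the next byte to hand out (BusyFetch)
        self.stash = 0  # index of the next free slot (BusyStash)

    def stash_char(self, ch):
        """Like StashSCCch: True if the byte was stored, False if no room."""
        if self.stash == BUSY_SIZE:  # pointer has run off the end
            return False             # the character is dropped
        self.buf[self.stash] = ch & 0xFF
        self.stash += 1
        return True

    def get_char(self):
        """Like GetBusyChar: a byte, or None once the buffer is drained."""
        if self.fetch == self.stash:  # nothing left: rewind both pointers
            self.fetch = self.stash = 0
            return None
        ch = self.buf[self.fetch]
        self.fetch += 1
        return ch


def next_char(busy, still_busy, scc_read):
    """Like NextChar: buffered bytes take precedence over fresh ones."""
    if still_busy:
        # Previous frame still being processed: park everything that arrives.
        while True:
            ch = scc_read()
            if ch is None or not busy.stash_char(ch):
                return None
    ch = busy.get_char()
    return ch if ch is not None else scc_read()
```

Because the buffer is only rewound when it has been completely drained, it behaves as a one-shot overflow area rather than a ring; a byte that arrives while it is full is read from the chip and discarded, as in the assembly.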

; DoIUS - reset Highest IUS

DoIUS   SUBR
        move.l  SCCWr,A1                ; point at the SCC write regs
        IF PortA THEN
        addq.l  #Actl,A1
        ENDIF
        move.b  #ResetIUS,(a1)          ; Reset Highest IUS in SCC (to WR0)
        SUBEND  'DOIUS'
        EJECT

; LAPIn - it's a LAP control packet.
;       D0 = LAP type
;       A3 -> remainder of the frame
; Note: for IM/UR frames, the net number (2 bytes) is at (a3), but the
;       node number (1 byte) is the first byte in LAPInBuf

; Check for IM
LAPIn   move    (a3),D1                 ; D1 = Net number (a3 sb even)
        move.l  LAPInBuf(a2),A0         ; point at first char in input buf
        move.b  (a0),D2                 ; D2 = node number
        cmp.b   #lapIM,D0               ; is it an IM?
        bne.s   @60                     ; go if not
        move    D2,D0                   ; D0 = node number
        sf      RcvdXoff(A2)            ; so we can start sending
        bsr.s   CheckIM                 ; figure out the net and node to send
        bsr.s   SendIMUR                ; send 'em
        bra.s   @80

; Check for UR
@60     cmp.b   #lapUR,D0               ; is it a UR?
        bne.s   @80                     ; go if not
        move    D2,D0                   ; D0 = Node number (D1 = Net number)
        bsr.s   CheckUR                 ; check these values, return <> 0 if OK
        sne     AALAPup(a2)             ; if non-zero, then we're up
@80     rts                             ; and return

        _AssumeEq lapENQ,$81            ; (1)
        _AssumeEq lapRTS,lapENQ+3       ; (2)
        _AssumeEq lapCTS,lapRTS+1       ; (3)
        EJECT

; CheckIM - check the received IM frame, compute UR response
; Entry: D0 = their node number
;        D1 = their network number
; Exit:  D0,D1 = node, net number for the UR
;        D2 = qlapUR
;        Changes A0,A1,A3,D0-D3

CheckIM move.l  #0,A0                   ; return nil sometimes
        move.w  SysNetNum(a2),D2        ; D2 = our Net number
        beq.s   @10                     ; go if so - check the node numbers
        move    D2,D1                   ; else, use our net number
@10     move.b  SysLAPAddr(a2),D3       ; D3 = our node number
@15     tst.b   D0                      ; while (theirnode <> 0)
        beq.s   @18                     ;   & (theirnode <> mynode)
        cmp.b   D3,D0                   ; have we both chosen the same value?
        bne.s   @20                     ; go if not - return their value
@18     bsr     RandomWord              ; choose a random value
        and     #$7F,D0                 ; mask to 7 bits
        bra.s   @15                     ; loop to insure they're different
@20     move.b  D0,sysABridge(a2)       ; remember their node number
        moveq   #qlapUR,D2              ; D2 = LAP type
        rts
        EJECT
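
The decision CheckIM makes can be restated compactly. The Python sketch below is an illustrative model, not a transcription: the function name and the injectable rand parameter are hypothetical, and rand stands in for the listing's RandomWord routine. It returns the node and net numbers that would go into the UR reply.

```python
import random


def check_im(their_node, their_net, our_node, our_net, rand=None):
    """Pick the (node, net) pair to send back in the UR frame."""
    rand = rand or (lambda: random.getrandbits(16))  # stand-in for RandomWord
    # Our own net number wins whenever it is non-zero.
    net = our_net if our_net != 0 else their_net
    node = their_node & 0xFF
    # Re-roll while the proposed node is 0 or collides with our own node.
    while node == 0 or node == our_node:
        node = rand() & 0x7F  # mask to 7 bits, as the listing does
    return node, net
```

For example, `check_im(5, 7, 9, 3)` keeps the caller's node number 5 but answers with our net number 3.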

; CheckUR - check the received UR frame
; Entry: D0 = node number
;        D1 = network number
; Exit:  D0 = 0 if net/node didn't match
;           <> 0 if they matched right off

CheckUR SUBR
        cmp     SysNetNum(a2),D1        ; Network numbers match?
        bne.s   @10                     ; go if not
        cmp.b   SysLAPAddr(a2),D0       ; Node number match?
        bne.s   @10                     ; go if not
        moveq   #-1,D0                  ; make D0 non-zero (it's OK)
        bra.s   CkURRTS                 ; and exit
@10     tst     SysNetNum(a2)           ; is our network number 0000?
        bne.s   @50                     ; go if not (we cannot resolve this)
        move    D1,SysNetNum(a2)        ; save their Net/Node suggestions
        move.b  D0,SysLAPAddr(a2)
        bra.s   @60
@50     st      AALAPstuck(a2)          ; we're really bad off - NNNN conflict
@60     clr     D0                      ; we didn't match
CkURRTS SUBEND  'CHECKUR'
        EJECT

; SendIMUR - This routine fills and sends an IM or UR frame. This is
; a bit dicey, since a UR may be required as a result of receiving
; an IM. Since it's difficult to abort a frame already in progress,
; we finesse the problem by not sending the IM/UR frame at all if a
; transmission is already under way. Here's why it works:
;
; A UR response is only necessary in two cases:
;   a) we're trying to bring the link up, and the other guy said "IM";
;   b) he hasn't heard from us, and he wants to make sure we're here.
;
; For a), we shouldn't be talking, but he'll ask again anyway;
; for b), the IM is trying to force us to send a good frame.
; If the frame in transit makes it, OK. If not, he'll
; still ask again.
;
; Entry: A0 -> master pointer of this hdlblk
;        A2 -> MPPVars
;        D0 = node number
;        D1 = Net number
;        D2 = LAP type
; Exit:  A0,A1,A3,D0-D3 changed


SendIMUR SUBR
        tst.l   tWDSptr(A2)             ; are we sending?
        bne.s   SndIMUR1                ; yes, just return
        lea     IMURbuf+1(A2),A1        ; A1 points at IMURbuf (odd adrs)
        move.b  D2,2(a1)                ; save the LAPtype (IM or UR)
        move.w  D1,3(a1)                ; and the Net number
        move.b  D0,5(a1)                ; and the Node number
        lea     IMURwds(A2),A0          ; A0 points at the WDS
        move.w  #6,(A0)                 ; save the length
        move.l  A1,2(A0)                ; and the pointer to the data
        clr.w   6(A0)
        st      SendingIMUR(A2)         ; remember this!
        bsr     SendFrame               ; and send it
SndIMUR1 SUBEND 'SENDIMUR'
        EJECT

; ReadPacket - read in the specified number of bytes into the specified
;       buffer. It is an error to request more bytes than have been received.
; ReadRest - read in the rest of the packet, putting the specified number
;       of bytes into the specified buffer. Error if packet longer than buffer.
;
; Call:
;       A0,A1,A2 = SCC read and write addresses and local variables
;       A3 -> buffer to read into
;       A4 -> start of ReadPacket
;       D3 = byte count to read (word)
;
; Return:
;       D0 changed
;       D1 number of chars still unread (ReadPacket); modified (ReadRest)
;       D2 saved
;       D3 = 0 if exact number of bytes requested were read
;          > 0 indicates number of bytes requested but not read
;              (packet smaller than requested maximum)
;          < 0 indicates number of extra bytes read but not returned
;              (packet larger than requested maximum)
;       A0,A1 preserved by ReadPacket, modified by ReadRest
;       A3 -> one past where last character went
;       A4,A5 saved (until packet's all in or error)
;
; NOTE: CRC bytes not included in counts


ReadPacket
        BRA.S   DoRP                    ; Need this for two entry points
ReadRest
        movem.l a0/a1/D2,-(sp)          ; save some regs
        move    RcvdLen(a2),D1          ; get the number of remaining chars in D1
        move    D1,D0                   ; we expect to copy D1 bytes
        move    #0,-(sp)                ; and expect good return status
        sub     D1,D3                   ; compute (D3 - D1)
        bpl.s   @1                      ; go if we should copy D1 bytes (it fits)
        add     D3,D0                   ; otherwise, copy D3 bytes (d1 + (d3-d1))
        move    #-1,(sp)                ; and set error return status
@1      movem.l SaveA45(a2),a4/a5       ; restore A4 and A5
        bra.s   DoCopy                  ; and go to the common code

DoRP    movem.l a0/a1/D2,-(sp)          ; push some regs
        move    RcvdLen(a2),D1          ; get the number of remaining chars in D1
        move    D1,D0                   ; assume we'll copy them all
        move    #-1,-(sp)               ; and that there's an error
        sub     D3,D1                   ; update D1 (remaining bytes in buf)
        bmi.s   DoCopy                  ; go if it's negative (error)
        move    D3,D0                   ; we'll read what they asked for (D3)
        clr     (sp)                    ; and remember that it's exactly right
        clr     D3

DoCopy  move.l  LAPStash(A2),a0         ; point at the source data
        ext.l   D0                      ; belt and suspenders (D0 = actual length)
        add.l   D0,LAPStash(A2)         ; and update the LAPStash value
        sub     D0,RcvdLen(a2)          ; and the num chars remaining
        move.l  A3,A1                   ; point at the dest buffer
        lea     0(A3,D0),A3             ; update the return pointer
        _BlockMove                      ; Do it
        move    RcvdLen(a2),D1          ; return number of unread chars
        move    (sp)+,d0                ; get the return status back
        movem.l (sp)+,a0/a1/D2          ; get the other registers
        tst     D0                      ; set the CCR
        rts
        EJECT

; NextCRC - compute a CRC on the word pointed at by A3 and the char in D0
;
; This routine computes a CRC-16 on a stream of bytes. It uses a
; table lookup scheme to implement a x^16 + x^15 + x^2 + 1 polynomial.
; The interested reader is referred to McNamara's Technical Aspects
; of Data Communications, second edition, pps 110-122 for an obliquely
; related discussion.
;
; This routine takes the storage short cut of looking up two four-bit
; values in a 16-entry table instead of one eight-bit value in a 256
; word table. This saves a considerable amount of space (32 bytes vs.
; 512 bytes for the table).
;
; One pass thru this routine (one character) is about 262 cycles, or
; 33.45 usec on a Mac. This is a data rate of ~29,900 char/sec,
; or plenty fast to keep up with a 9600 baud link.
;
; Entry: A3 -> CRC accumulator
;        D0 LSbyte is the data char (already masked to 8 bits)
; Exit:  D1,D2 changed
;        Other regs unchanged


NextCRC SUBR    0                       ; for macsbug
        move    (a3),D2                 ; D2 is the temp accumulator
        move    D0,D1                   ; make a copy of the input character

; work on the least significant nibble
        eor     D2,D1                   ; xor the accumulator with the data char
        and     #$0F,D1                 ; to get an index into the CRCTable
        add     D1,D1                   ; to make a word index
        lsr     #4,D2                   ; shift the CRC right four bits
        move    CRCTable(D1),D1
        eor     D1,D2                   ; and mask it with the approp. table entry

        move    D0,D1
        lsr     #4,D1                   ; shift the data char right four bits

; and do it again for the high nibble
        eor     D2,D1                   ; xor the accumulator with the data char
        and     #$0F,D1                 ; to get an index into the CRCTable
        add     D1,D1                   ; to make a word index
        lsr     #4,D2                   ; shift the CRC right four bits
        move    CRCTable(D1),D1
        eor     D1,D2                   ; and mask it with the approp. table entry
        move    D2,(a3)                 ; remember this CRC for next time
        SUBEND  'NEXTCRC'

CRCTable DC.W   $0000,$CC01,$D801,$1400
        DC.W    $F001,$3C00,$2800,$E401
        DC.W    $A001,$6C00,$7800,$B401
        DC.W    $5000,$9C01,$8801,$4400
        EJECT


(Listing  One  to  be  continued  next  month. 

Listings  Two  and  Three  will  also  appear  next  month.) 
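
The two-nibble table lookup used by NextCRC can be checked against a plain bit-at-a-time CRC. The Python sketch below is an illustration under stated assumptions: the table values are copied from CRCTable in the listing, the function names are hypothetical, and the accumulator is assumed to start at zero (the listing leaves the initial value to the caller). The bitwise version uses $A001, the bit-reversed form of x^16 + x^15 + x^2 + 1.

```python
# Nibble table copied from CRCTable in the listing.
CRC_TABLE = [
    0x0000, 0xCC01, 0xD801, 0x1400,
    0xF001, 0x3C00, 0x2800, 0xE401,
    0xA001, 0x6C00, 0x7800, 0xB401,
    0x5000, 0x9C01, 0x8801, 0x4400,
]


def next_crc(crc, byte):
    """Fold one byte into the CRC, low nibble first, then high nibble."""
    crc = (crc >> 4) ^ CRC_TABLE[(crc ^ byte) & 0x0F]
    crc = (crc >> 4) ^ CRC_TABLE[(crc ^ (byte >> 4)) & 0x0F]
    return crc


def crc16(data, crc=0):
    """CRC over a byte string using the 16-entry table."""
    for b in data:
        crc = next_crc(crc, b)
    return crc


def crc16_bitwise(data, crc=0):
    """Reference version: one shift and conditional XOR per bit."""
    for b in data:
        crc ^= b
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc
```

The two functions agree on any input, which is a convenient way to validate a retyped table: a single wrong digit in the sixteen entries makes the comparison fail.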




FORTH  PRELUDE 


Listing  One  (Text  begins  on  page  40. ) 

Listing 1

SCR # 0
( A Forth Standard Prelude )

This file defines additional functions and extensions which
cannot be provided in the Forth-83 Standard.

Select the appropriate prelude for the particular Forth you are
using and load it before loading a Standard program using these
functions.

Copyright 1986 1987 by Martin J. Tracy.

SCR # 1
( Select a FORTH-83 implementation)
FORTH DECIMAL
(  2 LOAD       ( *** model ****)
(  3 4 THRU     ( Lab Microsystems, Inc. PC/FORTH Ver. 3.0)
(  5 LOAD       ( Laxen-Perry F83 Ver. 2.1)
(  6 LOAD       ( MicroMotion MasterForth Ver. 1.2)
(  7 LOAD       ( ZEN Ver. 0.0)
(  8 LOAD       ( FORTH, Inc. PolyFORTH II ISD-4 MS-DOS)

SCR # 2
( FORTH-83 functions - typical definitions)
( Note: functions already provided need not be redefined.)

: RECURSE  ' [COMPILE] RECURSE ; IMMEDIATE
: INTERPRET  INTERPRET ;

: I>  ( - 'data )  COMPILE R> ; IMMEDIATE
: >I  ( - 'data )  COMPILE >R ; IMMEDIATE

( Used for alignment: )
: ALIGN  HERE 1 AND ALLOT ;
: REALIGN  ( a - a' )  DUP 1 AND + ;

( Note: defined here for a dumb terminal.)
: TAB  ( x y )  CR 2DROP ;
: PAGE  25 0 DO CR LOOP 0 0 TAB ;
: MARK  ( a n)  TYPE ;

SCR # 3
( Lab Microsystems, Inc. PC/FORTH Ver. 3.0)
( RECURSE and INTERPRET are provided.)

( Used by hi-level run-time words to find in-line data: )
: I>  ( - 'data )  COMPILE R> ; IMMEDIATE
: >I  ( - 'data )  COMPILE >R ; IMMEDIATE

( PC/FORTH already has a word ALIGN which works differently.)
( The following two definitions must be in the order shown: )
: REALIGN  ( addr - addr' )  ALIGN ;
( 8086/80286 run-time I realignment.)
: ALIGN  ( - )  EVEN ;
( 8086/80286 compilation address alignment.)

SCR # 4
( Lab Microsystems, Inc. PC/FORTH Ver. 3.0)

: TAB  GOTOXY ;                   ( x y - )
: PAGE  CLS ;                     ( - ; clear screen and home cursor )
: MARK  REVERSE TYPE REVERSE ;    ( a n - ; display string in inverse video )

SCR # 5
( Laxen-Perry No Visible Support F83 Ver. 2.1)
( RECURSE and INTERPRET are provided.)

( Used by hi-level run-time words to find in-line data: )
: I>  ( - 'data )  COMPILE R> ; IMMEDIATE
: >I  ( - 'data )  COMPILE >R ; IMMEDIATE

( Used for alignment: )
: ALIGN ;   : REALIGN ;

: PAGE  ( - )  DARK ;       \ clear screen and home cursor.
: TAB  ( x y - )  AT ;      \ move cursor to given coordinate.
: MARK  ( a n - )  TYPE ;   \ the MARK function is not provided in F83.

SCR # 6
( MicroMotion MasterForth Ver. 1.2.4)
( RECURSE is provided.)

: INTERPRET  TIB #TIB 0 EVAL ;

: >I  COMPILE >R ; IMMEDIATE
: I>  COMPILE R> ; IMMEDIATE

NEED ALIGN \IF : ALIGN ;
NEED REALIGN \IF : REALIGN ;

( PAGE is provided.)
: TAB  ( x y)  AT ;
: MARK  ( a n)  +INVERSE TYPE -INVERSE ;


SCR # 7
( ZEN 0.0)
( RECURSE and INTERPRET are provided.)
( >I and I> are provided.)

NEED ALIGN \IF : ALIGN ;
NEED REALIGN \IF : REALIGN ;

( PAGE TAB and MARK are provided.)

SCR # 8
( FORTH, Inc. polyFORTH MS-DOS ISD)  HEX 1F1F WIDTH !
( Forth-83 Compatibility layer must be loaded first.)

: RECURSE  LAST 002+ COUNT + , ; IMMEDIATE
( INTERPRET is provided.)

( Used by hi-level run-time words to find in-line data: )
: I>  ( - 'data )  COMPILE R> ; IMMEDIATE
: >I  ( - 'data )  COMPILE >R ; IMMEDIATE

( Used for alignment: )
: ALIGN ;   : REALIGN ;

( PAGE and MARK are provided.)
: TAB  ( x y )  SWAP TAB ;   ( move cursor to given coordinate.)

End Listing




PATTERN  MATCHING 


Listing  One  (Text  begins  on  page  46.) 


/*
 *
 *  FINDCMD
 *
 *  MODULE: MAIN.C
 *
 *  COPYRIGHT (C) 1987 by Charles F. Bowman
 *  All Rights Reserved.
 *
 *  This module contains the driving loop of the program.
 */

#include <stdio.h>
#include "findcmd.h"

struct attribs atts[] = {               /* For display */
        DIRECT, 'd',
        RDONLY, 'r',
        HIDDEN, 'h',
        SYSTEM, 's',
        ARCHIV, 'a'
};

char *month[] = {
        "Jan", "Feb", "Mar",
        "Apr", "May", "Jun",
        "Jul", "Aug", "Sep",
        "Oct", "Nov", "Dec"
};


main( ac, av )
int     ac;
char    *av[];
{
        int     i, len;
        char    dir[ MAXPATH ];
        char    path[ MAXPATH ];
        char    *tptr, tpath[ MAXPATH ];
        struct  dirent dfile;

        if( ac == 1 ){
                fprintf( stderr, "usage: fc pat1 pat2 ... patn\n" );
                exit( 1 );
        }

        printf( "FINDCMD - COPYRIGHT (C) 1987, Charles F. Bowman. " );
        printf( "All Rights Reserved.\n\n" );

        sprintf( path, ".;%s", getenv("PATH") );        /* Get path */

        for( i = 1; i < ac; i++ ){
                /*
                 * For each supplied pattern argument
                 */
                strcpy( tpath, path );
                tptr = tpath;
                inits( av[i] );         /* set-up transition table */

                while( (len = nextdir(tptr)) > 0 ){
                        /*
                         * For each directory component
                         */
                        tptr[ len ] = NIL;
                        if( tptr[len-1] == '\\' ){
                                /*
                                 * No double backslashes!
                                 */
                                sprintf( dir, "%s*.*", tptr );
                        } else {
                                sprintf( dir, "%s\\*.*", tptr );
                        }

                        /*
                         * Compare each entry in directory
                         */
                        if( firstf(dir, &dfile) ){      /* Get first */
                                /*
                                 * Error - bad dir in path
                                 */
                                fprintf( stderr,
                                        "BAD DIRECTORY SEGMENT: [%s]\n",
                                        dir );
                                break;
                        }
                        do{
                                if( state(slcase(dfile.dname)) ){
                                        /*
                                         * Print if a match is found!
                                         */
                                        putfile( &dfile, tptr );
                                }
                        } while( nextf(&dfile) );       /* Get next */
                        tptr += len + 1;
                }
        }
        exit( 0 );
}

/*
 * NEXTDIR: return the next directory component of 'PATH'
 */
nextdir( cp )
char    *cp;
{
        register count;

        count = 0;
        if( *cp == NIL ){               /* End of list */
                return( -1 );
        }
        while( *cp != NIL && *cp != FLDSEP ){
                cp++;
                count++;
        }
        return( count );
}

/*
 * SLCASE: convert a string to lower case
 */
char *
slcase( cp )
char    *cp;
{
        register char *ptr;

        ptr = cp;
        while( *ptr ){
                *ptr = tolower( *ptr );
                ptr++;
        }
        return( cp );
}

/*
 * PUTFILE: display directory info for each matched file
 */
putfile( dfile, dir )
struct  dirent  *dfile;
char    *dir;
{
        int     i;
        char    datestr[ 50 ];

        /*
         * Print attribute characters
         */
        for( i = 0; i < SA_SIZE(atts); i++ ){
                if( atts[i].val & dfile->dattr ){
                        putchar( atts[i].chr );
                } else {
                        putchar( ' ' ); /* filler char illegible in the printed listing; a space is assumed */
                }
        }

        fdosdte( datestr, &dfile->ddate );      /* DOS -> ASCII */
        if( dir[strlen(dir)-1] == '\\' ){
                /*
                 * No double backslashes!
                 */
                printf( " %8D %s %s%s\n",
                        dfile->dsize,
                        datestr,
                        slcase(dir),
                        dfile->dname
                );
        } else {
                printf( " %8D %s %s\\%s\n",
                        dfile->dsize,
                        datestr,
                        slcase(dir),
                        dfile->dname
                );
        }
        return( 0 );
}


/*
 * FDOSDATE: convert a DOS format date to an ASCII string
 */
fdosdte( where, dptr )
char    *where;
struct  dosdate *dptr;
{
        sprintf( where, "%s %2d %02d:%02d:%02d %4d",
                month[ dptr->month-1 ],
                dptr->day,
                dptr->hour,
                dptr->min,
                dptr->sec * 2,
                dptr->year + 1980
        );
        return( 0 );
}

End Listing One
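
The field arithmetic in fdosdte (seconds stored in two-second units, years counted from 1980) follows from the packed DOS directory timestamp. The Python sketch below is an illustration only; the function name is hypothetical and it takes the two 16-bit words directly instead of the listing's bit-field structure. It assumes the usual DOS layout: seconds/2, minutes and hours in the time word, and day, month and year-1980 in the date word.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def dos_datetime(time_word, date_word):
    """Format a packed DOS time/date pair the way fdosdte does."""
    sec = (time_word & 0x1F) * 2           # bits 0-4: seconds / 2
    minute = (time_word >> 5) & 0x3F       # bits 5-10: minutes
    hour = (time_word >> 11) & 0x1F        # bits 11-15: hours
    day = date_word & 0x1F                 # bits 0-4: day of month
    month = (date_word >> 5) & 0x0F        # bits 5-8: month, 1-12
    year = ((date_word >> 9) & 0x7F) + 1980  # bits 9-15: years since 1980
    return "%s %2d %02d:%02d:%02d %4d" % (
        MONTHS[month - 1], day, hour, minute, sec, year)
```

Because only five bits are available for seconds, odd seconds cannot be represented; a file stamped at 12:34:57 is recorded as 12:34:56.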

Listing  Two 

/*
 *
 *  FINDCMD
 *
 *  MODULE: DOS.C
 *
 *  COPYRIGHT (C) 1987 by Charles F. Bowman
 *  All Rights Reserved.
 *
 *  This module contains the DOS dependent functions to
 *  access file names in directories. This is non-portable code.
 */

#include <dos.h>
#include <stdio.h>
#include "findcmd.h"

struct reg regs;                        /* set & retrieve regs */
static struct dirent lfile;             /* DOS disk trans addr */

/*
 * FIRSTF: initiate DOS environment and return first file name
 */
firstf( dirpath, dfile )
char *dirpath;
struct dirent *dfile;
{
        /*
         * Set disk transfer address
         */
        regs.r_ax = SETDTA;
        ptoreg( dsreg, regs.r_dx, regs.r_ds, &lfile );
        intcall( &regs, &regs, DOSINT );

        /*
         * Find first
         */
        regs.r_ax = NFFIRST;
        regs.r_cx = HIDDEN | SYSTEM | DIRECT | RDONLY | ARCHIV; /* All! */
        ptoreg( dsreg, regs.r_dx, regs.r_ds, dirpath );
        intcall( &regs, &regs, DOSINT );

        if( regs.r_flags & F_CF ){
                return( 1 );
        }
        *dfile = lfile;
        return( 0 );
}

/*
 * NEXTF: return all subsequent files in directory
 */
nextf( dfile )
struct dirent *dfile;
{
        /*
         * Call DOS: find next
         */
        regs.r_ax = NFNEXT;
        regs.r_cx = HIDDEN | SYSTEM | DIRECT | RDONLY | ARCHIV;
        intcall( &regs, &regs, DOSINT );
        if( regs.r_flags & F_CF ){
                /*
                 * Error!
                 */
                return( 0 );
        }
        *dfile = lfile;
        return( 1 );
}


End  listing  Two 

(Listing  Three  begins  on  page  106.) 




Listing  Three  (Text  begins  on  page  46.) 


/*
 *
 *  FINDCMD
 *
 *  MODULE: STATE.C
 *
 *  COPYRIGHT (C) 1987 by Charles F. Bowman
 *  All Rights Reserved.
 *
 *  This module contains the routines necessary
 *  to implement the state machine
 */

#include <stdio.h>
#include "findcmd.h"

static int      tos = -1;
static int      pat[ 100 ];
static struct   stk stk[ 100 ]; /* size illegible in the printed listing; 100 assumed to match pat[] */

/*
 * INITS: initialize the state machine (transition table)
 */
inits( p )
char *p;
{
        register int i;

        i = 1;
        pat[0] = PATBEG;
        while( *p != NIL ){
                /*
                 * Add each char in pattern to state array
                 */
                switch( *p ){
                case '?':
                        pat[i] = QUEST;
                        break;
                case '*':
                        pat[i] = ASTER;
                        break;
                default:
                        pat[i] = *p;
                        break;
                }
                i++;            /* advance steps missing from the printed listing; */
                p++;            /* restored because the loop cannot end without them */
        }
        pat[i] = PATEND;
        return( 0 );
}

/*
 * STATE: driving routine for the state machine;
 * performs the actual pattern matching.
 */
state( n )
char *n;
{
        register int state;
        char *ptr;

        ptr = n;
        tos = -1;
        state = 0;
        for(;;){                                /* Forever */
                switch( pat[state] ){
                case PATBEG:                    /* Begin state */
                        break;

                case ASTER:                     /* Wild card */
                        if( *(ptr+1) != NIL ){
                                /*
                                 * Save machine state
                                 */
                                PUSH( state-1, ptr+1 );
                        }
                        while( (*ptr != pat[state+1]) && (*ptr != NIL) ){
                                /*
                                 * Skip non-matching chars
                                 * up to end of string
                                 */
                                ptr++;
                        }
                        break;

                case PATEND:                    /* End state */
                        if( *ptr == NIL ){
                                return( 1 );
                        } else if( TOS ){
                                /*
                                 * No match - restore saved state
                                 */
                                POP( state, ptr );
                        } else {
                                /*
                                 * No match!
                                 */
                                return( 0 );
                        }
                        break;

                case QUEST:                     /* Any 1 character */
                        ptr++;
                        break;

                default:
                        if( *ptr != pat[state] ){
                                if( TOS ){
                                        /*
                                         * Restore saved state
                                         */
                                        POP( state, ptr );
                                } else {
                                        /*
                                         * Fail - no match!
                                         */
                                        return( 0 );
                                }
                        } else {
                                /*
                                 * Equal - move on
                                 */
                                ptr++;
                        }
                        break;
                }
                state++;                        /* Next state */
        }
}
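
The matching idea in Listing Three (advance through the pattern, and on failure fall back to a position saved at a wildcard) can be shown in a few lines. The Python sketch below is not a transcription of the state machine above: it is the conventional single-backtrack-point formulation of '?' and '*' matching, with a hypothetical function name, offered only to illustrate the technique.

```python
def wild_match(pattern, name):
    """True if name matches pattern ('?' = any one char, '*' = any run)."""
    p = n = 0
    star_p = star_n = -1  # position of the last '*' and the name index tried
    while n < len(name):
        if p < len(pattern) and pattern[p] == '*':
            star_p, star_n = p, n  # remember where to resume on failure
            p += 1
        elif p < len(pattern) and (pattern[p] == '?' or pattern[p] == name[n]):
            p += 1
            n += 1
        elif star_p >= 0:
            star_n += 1            # let the last '*' absorb one more char
            p, n = star_p + 1, star_n
        else:
            return False
    while p < len(pattern) and pattern[p] == '*':
        p += 1                     # trailing stars match the empty string
    return p == len(pattern)
```

Only the most recent '*' needs to be remembered, because extending an earlier star can never succeed where extending the latest one fails; this is why a single saved pair suffices here, where the listing keeps a stack.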


End Listing Three

Listing Four

/*
 *
 *  MODULE: FINDCMD.H
 *
 *  COPYRIGHT (C) 1987 by Charles F. Bowman
 *  All Rights Reserved.
 *
 *  Header file for FINDCMD.C
 */

#define NIL     '\0'
#define FLDSEP  ';'             /* dir separator in path */
#define MAXPATH 250

/*
 * Machine states
 */
#define ASTER   128
#define QUEST   129
#define PATBEG  130
#define PATEND  131

/*
 * DOS file attribute bits
 * (values illegible in the printed listing; the standard
 *  DOS attribute byte assignments are assumed here)
 */
#define RDONLY  0x01            /* Read only file */
#define HIDDEN  0x02            /* Hidden file */
#define SYSTEM  0x04            /* System file */
#define DIRECT  0x10            /* Directory file */
#define ARCHIV  0x20            /* Archive bit */

#define SA_SIZE(foo)    (sizeof(foo)/sizeof(foo[0]))

/*
 * Macros for stack manipulation
 */
#define TOS             (tos == -1 ? 0 : 1)
#define POP(x,y)        { x = stk[tos].state; y = stk[tos].ptr; tos--; }
#define PUSH(x,y)       { tos++; stk[tos].state = (x); stk[tos].ptr = (y); }

char    *slcase();

/*
 * Internal DOS date structure
 * (bit-field widths illegible in the printed listing; the
 *  standard DOS packed time/date widths are assumed here)
 */
struct dosdate {
        unsigned sec   : 5;     /* Second (intervals of 2) */
        unsigned min   : 6;     /* Minutes */
        unsigned hour  : 5;     /* Hours */
        unsigned day   : 5;     /* Day of month */
        unsigned month : 4;     /* Month of year */
        unsigned year  : 7;     /* Year since 1980 */
};

/*
 * DOS FCB / DIR ENTRY - Set to DTA
 */
struct dirent {
        char    dinfo[ 21 ];
        char    dattr;
        struct  dosdate ddate;
        long    dsize;
        char    dname[ 13 ];
};

/*
 * Used to print attribute bits of files
 * (member types illegible in the printed listing; names taken
 *  from their use in MAIN.C, types assumed)
 */
struct attribs {
        int     val;
        char    chr;
};

/*
 * Used to save and restore machine state
 */
struct stk {
        int     state;
        char    *ptr;
};

End Listings




STRUCTURED  PROGRAMMING 


Listing One (Text begins on page 140.)

Listing  1.  QuickBASIC  library  to  implement  opaque  matrices. 

'  QuickBASIC  implementation  of  an  opaque  numeric  matrix 
'  Matrix  is  stored  as  arrays  of  columns 

'  OPTION  BASE  0  must  be  used,  although  the  row/column  indices 
'  start  at  one . 

SUB InitMat(Mat#(1), Max.Row%, Max.Col%) STATIC
' Initialize matrix
  Mat#(0) = Max.Row% + Max.Col% / 1000
  FOR I% = 1 TO UBOUND(Mat#)
    Mat#(I%) = 0
  NEXT I%
END SUB ' InitMat

SUB StoreElem(Mat#(1), Row%, Col%, Elem#, OK%) STATIC
' Store Elem# in matrix position (Row%,Col%)
' OK% is zero if error has occurred, -1 if operation was done
  STATIC I%, MaxR%, MaxC%
  MaxR% = INT(Mat#(0))
  MaxC% = 1000 * (Mat#(0) - MaxR%)
  IF (MaxR% < Row%) OR (MaxC% < Col%) OR (Row% < 1) OR (Col% < 1) THEN
    OK% = 0 ' Bad row or column numbers.
    EXIT SUB
  END IF
  OK% = -1
  ' Calculate index
  I% = Row% + (Col% - 1) * MaxR%
  ' for the arrays of rows representation use
  ' I% = Col% + (Row% - 1) * MaxC%

  ' Store element
  Mat#(I%) = Elem#
END SUB ' StoreElem

SUB RecallElem(Mat#(1), Row%, Col%, Elem#, OK%) STATIC
' Recall Elem# in matrix position (Row%,Col%)
' OK% is zero if error has occurred, -1 if operation was done
  STATIC I%, MaxR%, MaxC%
  MaxR% = INT(Mat#(0))
  MaxC% = 1000 * (Mat#(0) - MaxR%)
  IF (MaxR% < Row%) OR (MaxC% < Col%) OR (Row% < 1) OR (Col% < 1) THEN
    OK% = 0 ' Bad row or column numbers.
    EXIT SUB
  END IF
  OK% = -1
  ' Calculate index
  I% = Row% + (Col% - 1) * MaxR%
  ' for the arrays of rows representation use
  ' I% = Col% + (Row% - 1) * MaxC%

  ' Recall element
  Elem# = Mat#(I%)
END SUB ' RecallElem

End Listing One
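
The listing hides two conventions that are worth spelling out: both dimensions are packed into element 0 as rows + columns/1000, and elements are stored column by column starting at index 1. The Python sketch below restates them; the function names are hypothetical, and rounding is used when unpacking as a guard against floating-point error (the BASIC code relies on the assignment to an integer variable for the same purpose).

```python
def pack_dims(rows, cols):
    """Encode both dimensions in one number, as Mat#(0) does (cols < 1000)."""
    return rows + cols / 1000.0


def unpack_dims(header):
    """Recover (rows, cols) from the packed header value."""
    rows = int(header)
    cols = round(1000 * (header - rows))
    return rows, cols


def index_of(row, col, max_rows):
    """1-based column-major position of element (row, col)."""
    return row + (col - 1) * max_rows


def init_mat(rows, cols):
    """Flat list with the packed header in slot 0 and zeros elsewhere."""
    mat = [0.0] * (rows * cols + 1)
    mat[0] = pack_dims(rows, cols)
    return mat


def store_elem(mat, row, col, value):
    """Store value at (row, col); False if the indices are out of range."""
    rows, cols = unpack_dims(mat[0])
    if not (1 <= row <= rows and 1 <= col <= cols):
        return False
    mat[index_of(row, col, rows)] = value
    return True
```

For a 3-by-4 matrix the last element (3,4) lands at index 3 + (4 - 1) * 3 = 12, so the backing array needs 13 slots including the header. The packing scheme also limits the column count to 999.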




Listing  Two 

Listing  2.  True  BASIC  module  that  implements  an  array-based  binary  tree. 
MODULE Binary_Tree

! TRUE BASIC module that implements a single binary tree
! Copyright (c) 1987 Namir Clement Shammas

DECLARE DEF NIL, TRUE, FALSE

SHARE Left(1), Right(1), Node_Count, Num_Nodes, Bin_Tree$(1)

! Module initialization
LET Num_Nodes = 0

! local functions
DEF NIL = MAXNUM
DEF TRUE = 1
DEF FALSE = 0

SUB Initialize(Item$)
! Subroutine to initialize the binary tree
   LET Num_Nodes = 1
   LET Tree_Size = 1
   LET Bin_Tree$(1) = Item$
   LET Left(1) = NIL
   LET Right(1) = NIL
END SUB

SUB Search(Item$, Found, Index)
! Search for Item$ and return Index if found.
   LET Found = FALSE
   LET Index = 1
   DO WHILE (Index <> NIL) AND (Found = FALSE)
      IF Bin_Tree$(Index) = Item$ THEN
         LET Found = TRUE
      ELSE
         IF Bin_Tree$(Index) < Item$ THEN
            LET Index = Right(Index)
         ELSE
            LET Index = Left(Index)
         END IF
      END IF
   LOOP
END SUB

SUB Insert(Item$)
! Insert Item$ in the "dynamic" binary tree structure
   LET Num_Nodes = Num_Nodes + 1
   IF Num_Nodes > Tree_Size THEN
      LET Tree_Size = Num_Nodes
      MAT REDIM Bin_Tree$(Tree_Size), Left(Tree_Size), Right(Tree_Size)
   END IF
   LET Index = 1
   LET Found = FALSE
   DO WHILE Index <> NIL
      IF Bin_Tree$(Index) < Item$ THEN
         IF Right(Index) <> NIL THEN
            LET Index = Right(Index)
         ELSE
            LET Right(Index) = Num_Nodes
            LET Index = NIL
         END IF
      ELSE
         IF Left(Index) <> NIL THEN
            LET Index = Left(Index)
         ELSE
            LET Left(Index) = Num_Nodes
            LET Index = NIL
         END IF
      END IF
   LOOP
   LET Bin_Tree$(Num_Nodes) = Item$
   LET Right(Num_Nodes) = NIL
   LET Left(Num_Nodes) = NIL
END SUB

END MODULE

End Listing Two
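
The module above stores the tree in three parallel arrays, with array indices playing the role of pointers. The Python sketch below shows the same layout; the class and method names are hypothetical, and 0 is used as the "no child" sentinel in place of the listing's MAXNUM so that slot 0 can simply be left unused.

```python
NIL = 0  # hypothetical sentinel; the listing uses MAXNUM instead


class ArrayTree:
    """Binary search tree held in parallel lists; node numbers start at 1."""

    def __init__(self, root_item):
        self.items = [None, root_item]  # slot 0 is unused
        self.left = [NIL, NIL]
        self.right = [NIL, NIL]

    def search(self, item):
        """Return the node number holding item, or NIL if absent."""
        index = 1
        while index != NIL:
            if self.items[index] == item:
                return index
            if self.items[index] < item:
                index = self.right[index]
            else:
                index = self.left[index]
        return NIL

    def insert(self, item):
        """Append a new node and link it from its parent; return its number."""
        new = len(self.items)
        self.items.append(item)
        self.left.append(NIL)
        self.right.append(NIL)
        index = 1
        while True:
            # Larger items go right; equal or smaller go left, as in the listing.
            links = self.right if self.items[index] < item else self.left
            if links[index] == NIL:
                links[index] = new
                return new
            index = links[index]
```

Since nodes are only ever appended, the node number doubles as an insertion-order stamp, and growing the tree is just growing the three lists, which is what MAT REDIM does in the True BASIC version.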

Listing  Three 

Listing  3.  Pascal  code  for  emulating  opaque  complex  data  types. 


TYPE
  Opaque_Complex_type = ^Opaque_Complex_type_record;

  { record type is deliberately empty }
  Opaque_Complex_type_record = RECORD
  END;



  Actual_Complex_type = ^Actual_Complex_type_record;

  Actual_Complex_type_record = RECORD
    Reel,
    Imag : REAL;
  END;

  Convert_Complex = RECORD
    CASE BOOLEAN OF
      TRUE  : (Opaque : Opaque_Complex_type);
      FALSE : (Actual : Actual_Complex_type)
  END;


FUNCTION Convert_Opaque_to_Actual(Opaque_Complex : Opaque_Complex_type) :
                                  Actual_Complex_type;
VAR Transfer : Convert_Complex;
BEGIN
  Transfer.Opaque := Opaque_Complex;
  Convert_Opaque_to_Actual := Transfer.Actual
END; { Convert_Opaque_to_Actual }


FUNCTION Convert_Actual_to_Opaque(Actual_Complex : Actual_Complex_type) :
                                  Opaque_Complex_type;
VAR Transfer : Convert_Complex;
BEGIN
  Transfer.Actual := Actual_Complex;
  Convert_Actual_to_Opaque := Transfer.Opaque
END; { Convert_Actual_to_Opaque }


FUNCTION Real_Imag_Complex(Re, Im : REAL) : Opaque_Complex_type;
{ Convert from Real/Imaginary numbers to opaque complex numbers }
VAR Transfer : Actual_Complex_type;
BEGIN
  NEW(Transfer);
  Transfer^.Reel := Re;
  Transfer^.Imag := Im;
  Real_Imag_Complex := Convert_Actual_to_Opaque(Transfer);
END; { Real_Imag_Complex }

FUNCTION Polar_Complex(Angle, Modulus : REAL) : Opaque_Complex_type;
{ Convert from polar coordinates to opaque complex numbers }
VAR Transfer : Actual_Complex_type;
BEGIN
  NEW(Transfer);
  { real part is Modulus * COS(Angle); imaginary part is Modulus * SIN(Angle) }
  Transfer^.Reel := Modulus * COS(Angle);
  Transfer^.Imag := Modulus * SIN(Angle);
  { the result must be assigned to this function's own name }
  Polar_Complex := Convert_Actual_to_Opaque(Transfer);
END; { Polar_Complex }


PROCEDURE Get_Real_Imag(MyComplex : Opaque_Complex_type;
                        VAR Re, Im : REAL { output });
{ Convert opaque complex numbers into Real/Imaginary components }
VAR Transfer : Actual_Complex_type;


BEGIN
  Transfer := Convert_Opaque_to_Actual(MyComplex);
  Re := Transfer^.Reel;
  Im := Transfer^.Imag;
END; { Get_Real_Imag }


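PROCEDURE Get_Polar(MyComplex : Opaque_Complex_type;
                    VAR Angle, Modulus : REAL { output });
{ Convert opaque complex numbers into polar components }
VAR Transfer : Actual_Complex_type;
BEGIN
  Transfer := Convert_Opaque_to_Actual(MyComplex);
  WITH Transfer^ DO BEGIN
    Modulus := SQRT(SQR(Reel) + SQR(Imag));
    { the angle is the arctangent of Imag / Reel, not the ratio itself; }
    { ARCTAN alone is valid only for Reel > 0, so Reel = 0 and the      }
    { left half-plane still need separate handling                     }
    Angle := ARCTAN(Imag / Reel);
  END; { WITH }
END; { Get_Polar }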


FUNCTION Add_Complex(C1, C2 : Opaque_Complex_type) : Opaque_Complex_type;
VAR Transfer : Actual_Complex_type;
    Re, Im   : REAL;
BEGIN
  { Get first complex number }
  Transfer := Convert_Opaque_to_Actual(C1);
  Re := Transfer^.Reel;
  Im := Transfer^.Imag;
  { Get second complex number }
  Transfer := Convert_Opaque_to_Actual(C2);
  Re := Re + Transfer^.Reel;
  Im := Im + Transfer^.Imag;
  { Update result: allocate a fresh record so that C2 is not overwritten }
  NEW(Transfer);
  Transfer^.Reel := Re;
  Transfer^.Imag := Im;
  Add_Complex := Convert_Actual_to_Opaque(Transfer);
END; { Add_Complex }

End Listing Three


Listing  Four 

Listing  4.  Modula-2  code  for  opaque  complex  data  types. 


DEFINITION MODULE Complex;

EXPORT QUALIFIED Complex, RealImagComplex, PolarComplex,
                 GetRealImag, GetPolar, AddComplex;

TYPE Complex; (* opaque type *)

PROCEDURE RealImagComplex(Re, Im : REAL) : Complex;
(* Convert from Real/Imaginary numbers to opaque complex numbers *)

PROCEDURE PolarComplex(Angle, Modulus : REAL) : Complex;
(* Convert from polar coordinates to opaque complex numbers *)

PROCEDURE GetRealImag(MyComplex : Complex; VAR Re, Im : REAL (* output *));
(* Convert opaque complex numbers into Real/Imaginary components *)

PROCEDURE GetPolar(MyComplex : Complex; VAR Angle, Modulus : REAL (* output *));
(* Convert opaque complex numbers into polar components *)

PROCEDURE AddComplex(C1, C2 : Complex) : Complex;

END Complex.



IMPLEMENTATION MODULE Complex;

FROM MathLib0 IMPORT sqrt, sin, cos, arctan;
FROM Storage IMPORT ALLOCATE;  (* NEW is translated into a call to ALLOCATE; *)
                               (* the module name may differ by compiler     *)

TYPE
  ComplexRecord = RECORD
    Reel,
    Imag : REAL;
  END;

  (* opaque type must be a pointer *)
  Complex = POINTER TO ComplexRecord;

PROCEDURE RealImagComplex(Re, Im : REAL) : Complex;
(* Convert from Real/Imaginary numbers to opaque complex numbers *)
VAR C : Complex;
BEGIN
  NEW(C);
  C^.Reel := Re;
  C^.Imag := Im;
  RETURN (C)
END RealImagComplex;

PROCEDURE PolarComplex(Angle, Modulus : REAL) : Complex;
(* Convert from polar coordinates to opaque complex numbers *)
VAR C : Complex;
BEGIN
  NEW(C);
  (* real part is Modulus * cos(Angle); imaginary part is Modulus * sin(Angle) *)
  C^.Reel := Modulus * cos(Angle);
  C^.Imag := Modulus * sin(Angle);
  RETURN (C)
END PolarComplex;

PROCEDURE GetRealImag(MyComplex : Complex; VAR Re, Im : REAL (* output *));
(* Convert opaque complex numbers into Real/Imaginary components *)
BEGIN
  Re := MyComplex^.Reel;
  Im := MyComplex^.Imag;
END GetRealImag;

PROCEDURE GetPolar(MyComplex : Complex; VAR Angle, Modulus : REAL (* output *));
(* Convert opaque complex numbers into polar components *)
BEGIN
  WITH MyComplex^ DO
    Modulus := sqrt(Reel*Reel + Imag*Imag);
    (* arctan alone is valid only for Reel > 0; Reel = 0 and the *)
    (* left half-plane still need separate handling              *)
    Angle := arctan(Imag / Reel);
  END;
END GetPolar;

PROCEDURE AddComplex(C1, C2 : Complex) : Complex;
VAR C : Complex;
    Re, Im : REAL;
BEGIN
  (* Get first complex number *)
  Re := C1^.Reel;
  Im := C1^.Imag;
  (* Get second complex number *)
  Re := Re + C2^.Reel;
  Im := Im + C2^.Imag;
  (* Update result: C must be allocated before it is dereferenced *)
  NEW(C);
  C^.Reel := Re;
  C^.Imag := Im;
  RETURN (C)
END AddComplex;

END Complex.

End Listings
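The opaque-type technique of Listings Three and Four also has a direct C counterpart: a pointer to a structure whose layout is declared only in the implementation file. The following is a minimal sketch under that assumption, not part of the original listings. The names complex_new, complex_get, complex_add, and complex_free are hypothetical, the two halves that would normally live in a header and a .c file are shown together, and the polar routines are left out for brevity.

```c
#include <stdlib.h>

/* Public part: what a header would show to client code. */
typedef struct ComplexRec *Complex;      /* opaque: the layout is hidden */

Complex complex_new(double re, double im);
void    complex_get(Complex c, double *re, double *im);
Complex complex_add(Complex a, Complex b);
void    complex_free(Complex c);

/* Private part: only the implementation file would contain this. */
struct ComplexRec { double re, im; };

Complex complex_new(double re, double im)
{
    Complex c = malloc(sizeof *c);
    if (c != NULL) { c->re = re; c->im = im; }
    return c;                            /* NULL if allocation failed */
}

void complex_get(Complex c, double *re, double *im)
{
    *re = c->re;
    *im = c->im;
}

/* The sum is a freshly allocated value; neither operand is modified. */
Complex complex_add(Complex a, Complex b)
{
    return complex_new(a->re + b->re, a->im + b->im);
}

void complex_free(Complex c) { free(c); }
```

Client code that sees only the public part can create, add, and inspect values but cannot touch the re and im fields directly, which is the property the Pascal variant record emulates and the Modula-2 opaque type provides natively.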




COLUMNS 


C  CHEST 

Language  Wars  Over  Cs 


by Allen Holub

There seems to be a plethora of new C compilers hitting the market at
present. This month I'm going to look at four of them: Datalight's
Optimum-C; Microsoft's Quick C; Microsoft's C compiler, Version 5.0; and
Borland's Turbo C.

I'm going to take a different approach from most reviews and leave out
the traditional tables and benchmarks. I'm doing this for several
reasons. First, tables of features don't usually contain new
information; you can read the advertisements as well as I can, and
there's no point in my duplicating that information here. Next, I've
found that benchmarks aren't a particularly reliable way to evaluate
compilers. That is, a compiler that does poorly in compiling a single
benchmark subroutine often does well when you use it for a large,
complex program and vice versa. Also, raw speed is not the most
important thing in a compiler, at least not to me.

As a programmer I spend most of my day trying to debug computer
programs. Consequently, a good development environment is critical to
me. I frankly don't care if my program takes 50 milliseconds longer to
run or is 512 bytes larger than it could be, provided that I can
minimize the time I spend in debugging and the code quality is adequate.
Most programs are I/O-bound anyway. That is, the fastest execution speed
in the world doesn't speed up the rate at which you can update the
screen or get input from a human being.

Of course, my own criteria may be different from yours. If you're doing
engineering applications where you measure run time in hours rather than
seconds, a few milliseconds here and there can be significant. If you
fall into that category, the following reviews will still be useful but
you should probably look at a review that has benchmarks, too.

All four compilers are quite good from the language perspective. They
all support the complete C language (including bit fields), and they all
incorporate various ANSI extensions, such as function prototypes and
structure assignment. They all support floating point, including in-line
8087 code. They all include a version of the Unix make utility. They all
come with the start-up module source code and have a good set of DOS
interface functions. There are differences, however.
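For readers who have not yet used the language features named above, here is a brief sketch of bit fields, function prototypes, and structure assignment. It is an illustration only and comes from none of the compilers' manuals; the names flags, point, scale, and mirror are hypothetical.

```c
/* Bit fields: members with an explicit width in bits. */
struct flags {
    unsigned ready : 1;     /* one bit                  */
    unsigned mode  : 3;     /* three bits, values 0..7  */
};

struct point { int x, y; };

/* Function prototypes: the parameter types are part of the declaration,
 * so the compiler can check and convert the arguments at every call. */
long         scale(long value, int factor);
struct point mirror(struct point p);    /* structures passed by value */

long scale(long value, int factor)
{
    return value * factor;
}

struct point mirror(struct point p)
{
    struct point q;
    q = p;                  /* structure assignment copies every member */
    q.x = -q.x;
    return q;
}
```

With the prototype in scope, a call such as scale(7, 6) converts the int argument 7 to long automatically; under the older declaration style, which lists no parameter types, that mismatch would go unnoticed by the compiler.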

Optimum-C,  Version  3.0 

Datalight's Optimum-C is a pleasant surprise. First of all, it generates
spectacularly good code, better than any of the other compilers reviewed
here. It not only does the usual optimizations (such as constant
folding, strength reduction, branch optimization, and aliasing) but it
does some very sophisticated stuff too: constant and variable
propagation, dead assignment elimination, automatic register allocation,
global common subexpression elimination, loop invariant removal, and
loop induction variables. The four basic memory models are supported (no
huge or tiny), but mixed-model programming is difficult to do.
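To make a few of these terms concrete, the sketch below shows a loop as it might be written and a hand-transformed equivalent that applies constant folding, loop invariant removal, and strength reduction on an induction variable. It illustrates the general techniques only; it is not output from Optimum-C, and the function names are hypothetical.

```c
/* Straightforward version: everything is recomputed on every pass. */
void fill_naive(int *a, int n, int x, int y)
{
    int i;
    for (i = 0; i < n; i++)
        a[i] = i * 12 + (x * y + 2 * 8);
}

/* Hand-optimized equivalent, mimicking what an optimizer may do. */
void fill_optimized(int *a, int n, int x, int y)
{
    int base = x * y + 16;  /* 2 * 8 folded to 16; x * y hoisted out of the loop */
    int step = 0;           /* running value of i * 12 (induction variable)      */
    int i;
    for (i = 0; i < n; i++) {
        a[i] = step + base;
        step += 12;         /* an addition replaces the multiplication           */
    }
}
```

Both functions store the same values; the second simply does less work per iteration, which is the kind of rewrite an optimizing compiler performs without any change to the source.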

Optimum-C has good ROM support, though it doesn't come with the special
linker that you need to locate specific segments in particular places in
memory. (One of these days I'll write a linker for C Chest.) It provides
an integer-only option that lets you compile without any floating-point
support, and it provides you with a stripped-down start-up module that
helps put together a ROMed environment.

Datalight  includes  a  list  of  known 
bugs  on  the  distribution  disks.  All 
compilers  have  bugs,  but  Datalight  is 
honest  enough  to  admit  it,  thereby 
helping  you  program  around  them. 
This  honesty  should  be  commended. 
It's  a  small  thing,  but  it  helps  a  lot 
when  you  want  to  get  your  program 
running. 

The two drawbacks of Optimum-C are the lack of debugging support and a
library that is too small. You pretty much have to rely on embedded
printf( ) statements to debug your programs. For this reason I don't
really recommend the compiler if you're not already an experienced C
programmer. The run-time library is adequate but just so. All the
essential functions are there, but there's no gravy; for example,
there's a getenv( ) but no putenv( ). The library does include some
useful stuff not found in the other compilers. There's a mouse-interface
library and a set of interrupt-management functions that are useful for
terminate-and-stay-resident utilities (Microsoft C has these too,
however).
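When embedded print statements are the only debugging aid available, it helps to wrap them in a macro that can be compiled out. The sketch below shows one common way to do that. It assumes a compiler with C99 variadic macros, which the compilers reviewed here did not have, and the TRACE macro and sum_to function are hypothetical names used for illustration.

```c
#include <stdio.h>

/* TRACE prints file, line, and a formatted message to stderr.
 * Defining NDEBUG at compile time removes every trace call. */
#ifdef NDEBUG
#define TRACE(...) ((void)0)
#else
#define TRACE(...)                                        \
    (fprintf(stderr, "%s:%d: ", __FILE__, __LINE__),      \
     fprintf(stderr, __VA_ARGS__),                        \
     fputc('\n', stderr))
#endif

/* Hypothetical routine being debugged: sum of the integers 1..n. */
long sum_to(int n)
{
    long total = 0;
    int i;
    for (i = 1; i <= n; i++) {
        total += i;
        TRACE("i=%d total=%ld", i, total);
    }
    return total;
}
```

Writing to stderr keeps the trace separate from the program's normal output, so it can be redirected or discarded without touching the code being debugged.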

The compiler includes all the library sources and an adequate
full-screen editor that might be useful if you don't have one already.

Quick C and Microsoft C, Version 5

I should preface my comments by saying that I'm looking at beta versions
of both Quick C and the C compiler. Consequently, some of the things I'm
saying here may become untrue later. (Both programs should be released
in the final version by the time this article makes it into print.) I'll
report back periodically as I get updates of these products. I'm also
willing to give the people at Microsoft the benefit of the doubt, for
the present. If they say something is going to be fixed, I'll believe
them. Nonetheless, if the final version of the product still has
problems, I'll discuss the problems in depth in this column.

The similarities between Quick C and Turbo C are unavoidable. They both
include a development environment that incorporates an editor, compiler,
and so forth. Neither of the editors is anything to write home about;
both are adequate for fixing occasional errors that might pop up in
debugging. Neither is really adequate for writing an actual program,
however, so you will probably do your initial typing outside the Quick C
or Turbo C environment. In fact, these much-touted user interfaces are
pretty worthless for the most part. I'd as soon use my own editor and
compile with the command-line versions of the compilers. I see little
point in wading through infinite menus when I can do the same job faster
from the command line. How long can it take to learn a few command-line
switches anyway? Fortunately, both Turbo and Quick C provide an
easy-to-use, command-line version of the compiler.

There's a certain amount of overlap between Quick C and the full,
Version 5, compiler, too. In fact, Quick C is included in the full
compiler package. The main differences are code quality and compile
time. The full compiler is slower, but it generates much better, heavily
optimized code. Quick C, on the other hand, is lightning fast but the
output code isn't great. There's also a new, much faster version of LINK
provided with the package.

Both Quick C and the full compiler use the same libraries, literally:
they use the same .lib files on the disk. This is a real boon if you're
doing serious development work. You can use Quick C to develop
individual modules, and once they're debugged, you can recompile with
the full compiler to get better code quality.

Quick C provides a phenomenally good development environment. It has
about three-quarters of CodeView built into it, so it not only finds
syntax errors for you but it also helps you debug. You can use a
command-line version of Quick C to generate CodeView-compatible files,
however, so you can have both fast compile time and the full CodeView if
you need it.

For those of you who aren't familiar with the CodeView debugger, it is a
source-level debugger that really works, and it has become an essential
part of my development environment. You can actually watch program
execution at the source level; you can even see local and global
variables change as they are modified. You can set breakpoints on
variables being modified, on expressions becoming true, even on ranges
of memory being modified. I discussed the debugger in depth in the
November 1986 C Chest. My program development time has decreased by at
least 25 percent since I started using this debugger. I can't live
without it. Quick C is essentially CodeView with a compile instruction.

Quick C handles multiple-module files quite easily. You create a program
list that has an entry for each file or library in the program. Quick C
uses this list to create a makefile automatically and then assembles the
program in a manner similar to the way make does (Turbo C uses a similar
mechanism).

Quick C is really an incremental compiler, not an interpreter. It
compiles the entire module at lightning speed (comparable in all
respects to Turbo C) without stopping. It has a "go to next error"
function key in the editor that lets you find errors painlessly. It
positions you at the line that contains the error, then prints the error
message in a window at the bottom of the screen. A context-sensitive
help feature gives you an on-line reference to all the library routines,
showing you a function prototype and capsule description of any library
function that is highlighted by the cursor.

The Microsoft run-time library, used by both Quick C and the full
compiler, is the best out of those reviewed here. It is very Unix
compatible and is packed with useful routines, including a set of
graphics functions (new to this version) and very good support for
terminate-and-stay-resident utilities (also new). The graphics functions
let you do the basic stuff (lines, circles, ellipses, and area fill) and
support all the IBM video adapters (but, unfortunately, not the Hercules
graphics card). The library documentation is among the best I've seen,
better than that of any of the other compilers I'm reviewing here. It
has one page per function, with the function name across the top of the
page. The functions are listed in strict alphabetical order so that
they're easy to find, and most entries have an example of how to use
them.

Microsoft is finally selling the library source code too, which is
useful if you're either porting to another environment or need to
correct bugs that are bound to show up in the new routines. Microsoft is
also accelerating the release schedule from yearly to semiannually so
that you can get a compiler update with fixed bugs every six months
instead of having to wait a year. This last change makes the library
sources a little less important, but it's still good that they're being
released.

Turbo C 

Last, and least, on the list is Borland's Turbo C. I finally got my copy
of Turbo C in the mail, several months after Borland had not only
started advertising it but also collecting money for it. The cover
letter started ominously with: ". . . right on schedule, here's Turbo
C." In spite of the hype, however, and in spite of the seeming
popularity of the product (at least judging by the sales figures), Turbo
C comes in last when compared to the compilers I've just discussed. It's
not so much that Turbo C isn't a good product but rather that the others
are better.

Like Quick C, there are two interfaces to the compiler itself: a normal
command-line interface and "the standard Borland integrated
environment." The integrated environment is pretty worthless. Were I a
Turbo Pascal user, I might like it better, but I'm not. The integrated
environment does nothing but put extra steps between me and the compiled
program. That is, it forces me to use an editor that I don't like; it
forces me to wade through infinite menus to get anything done; and its
menu system manages to make even simple things complex by adding too
many steps to any process. Borland has made the classic mistake of
confusing "user friendly" with "coddle the novice." The interface is
only "friendly" until you know what you're doing.

The biggest problem with the integrated environment is the complete lack
of debugging support. The product does find syntax errors for you, in a
manner similar to Turbo Pascal. It puts you into the editor at the error
point so that you can change things and finish the compilation. Syntax
errors, however, are the least of it. It takes me a few minutes to get
the syntax errors out of a program, but it can take days to get the
program debugged, and Turbo C gives you absolutely no help with real
debugging. It has nothing like the CodeView-like environment provided in
Quick C.

So, once you get rid of the excess baggage, what you have is an
inexpensive and reasonably good C compiler. The code quality is better
than Microsoft C, Version 4, but not as good as Version 5. It supports
all six memory models and lets you do mixed-model programming. It's easy
to use from the command line and generates reasonably good code. Rumor
has it that it's the Wizard C compiler, and looking at the generated
code, I have no reason to doubt this supposition. The compiler supports
in-line assembly language and has a reasonably good library, the sources
for which are available if you need them. The DOS support is good but
not portable, and there are no graphics functions.

The library has a few problems, however, most of which fall into the
Unix-compatibility and documentation areas. For example, there's no
Unix-compatible signal( ) function, though there is a nonstandard
mechanism to intercept the Ctrl-Break interrupt; there's a function
called ioctl( ) but this function is nothing like the Unix function with
the same name; and so forth. These sorts of incompatibilities always
mystify me. It's easy enough to do it right, so why don't they?
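For comparison, this is roughly what break handling looks like through the Unix-style signal( ) interface. The sketch uses only calls from the standard signal.h header; the function and variable names are hypothetical, and it assumes that the compiler's library delivers a keyboard break as SIGINT, which is a property of the particular library and not something the sketch can guarantee.

```c
#include <signal.h>

/* Set by the handler, polled by the main program. */
static volatile sig_atomic_t interrupted = 0;

static void on_interrupt(int signo)
{
    (void)signo;        /* unused */
    interrupted = 1;    /* only set a flag; do the real work elsewhere */
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_break_handler(void)
{
    return signal(SIGINT, on_interrupt) == SIG_ERR ? -1 : 0;
}

/* Nonzero once an interrupt has been seen. */
int was_interrupted(void)
{
    return interrupted != 0;
}
```

A program would call install_break_handler( ) once at start-up and test was_interrupted( ) at convenient points in its main loop, so that cleanup happens in ordinary code and not inside the handler.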

The library documentation is adequate but not nearly as good as
Microsoft's. There's lots of nonstandard stuff with little or no
explanation of how that stuff works. For example, the nonstandard
ioctl( ) subroutine evidently lets you change file attributes, but the
documentation doesn't tell you how. I assume that the DOS Technical
Reference would help, but Borland's documentation doesn't say one way or
the other. In addition, examples of how to use library routines are few
and far between, and the documentation assumes a lot in terms of
previous knowledge needed to figure out the subroutine description. A
thorough knowledge of DOS interfacing details is assumed throughout.

The manual's layout is poor. Borland has saved space by putting the
documentation for several subroutines on a single page, thereby making
things harder to find (there's no header with the subroutine names for
that page across the top). More often than not, when you look up a
subroutine, the entry refers you somewhere else to get the actual
description. There are a few typos that cause problems too; for example,
the stime( ) function is described as follows: "stime returns a value of
0 is returned."

Conclusion 

Quick C in the stand-alone version is better than Turbo C. It does
everything that Turbo C does, and then some, incorporating very good
debugging support that is totally absent from Turbo C (finding syntax
errors alone is not sufficient). My Quick C is a beta version, so I
can't really compare code quality, and in any event, all of the products
are more or less on par. Nonetheless, Turbo C is, at present, between
Quick C (which is a little poorer) and the full Microsoft compiler
(which is considerably better). Microsoft says that the code quality
will be considerably improved in Quick C's final release.

To my mind, the better debugging environment provided by Quick C far
outweighs any code-quality considerations. If you're doing serious
development, you'll use the full compiler anyway. Quick C's debugging
environment is particularly useful if you're learning the language.

Datalight's Optimum-C gives the best code quality of these four
compilers, and the library, though small, is adequate for most programs.
It also provides the best support for ROMed code, and the library
sources are included for free. The lack of debugging support is a
serious omission, however, and I recommend it primarily to experienced
programmers who are generating code to run outside the DOS environment
or to programmers who need very efficient code and don't care about the
small library. I do recommend it, though, in spite of these
shortcomings. It's a shame that Datalight's C generates
Lattice-compatible assembly language rather than Microsoft-compatible.
Were this not the case, you could use Quick C to do your development and
Optimum-C for the final compilation pass.

The complete Microsoft compiler package is the most powerful, but it's
by far the most expensive, especially when you add in the cost of the
library sources. If cost is not an issue, I think the Microsoft compiler
(which includes Quick C) is the best bet. It combines a very good
compiler with a complete (and Unix-compatible) library and a fantastic
debugging environment. If cost is an issue, or if you're just learning
the language, I'd go with Quick C alone. If you're doing serious
production work, get the full compiler package.

Bug  Report 

There  is  a  bug  on  page  95  of  the  June 
column  listings.  On  line  215,  change 
&item  to  item. 

DDJ 



COLUMNS 


THE  FORTH  COLUMN 


by Martin Tracy

With this issue, DDJ both inaugurates a new column and welcomes a new
columnist to its ranks. Martin is eminently qualified to discuss the
issues and nuances of the Forth language: he is one of the founders of
MicroMotion, a senior programmer at FORTH Inc., and vice-president of
the Forth Interest Group (FIG). (eds.)

Sure, you say, a column on Forth programming makes sense, but what's
this business about news and reviews? Well, a lot has been happening
lately. Forth is being used in everything from digital signal processing
to neural nets. There are several Forth conferences each year (one near
you) and hundreds of publications. You can find Forth in credit-card
phones, cyclotrons, and on board the Titanic. Staying current can be a
major task, but I'll try to keep you well informed and up to date.

There are four sources of Forth information that you should know about:
the Forth Interest Group, The Institute for Applied Forth Research,
various Forth electronic bulletin boards, and (of course) DDJ.

Forth Interest Group

The Forth Interest Group (FIG) brings together more than 4,000 Forth
programmers and hobbyists in more than 80 local chapters. You can order
just about any Forth publication through FIG, or find a job, or purchase
group health insurance. FIG sponsors both the National Forth Convention
and the Forth Modification Laboratory (FORML) conference. Membership is
$30/year and includes a subscription to Forth Dimensions. You can join
by calling the FIG hot line: (408) 277-0668.

The National Forth Convention is where you go to meet your local vendor
and to learn more about FIG. More than 40 vendors attended last year,
many with display booths. You can hear panels of Forth celebrities
speculating about the future of the language. There are two days of
technical presentations and a banquet, too. This year's convention meets
November 14-15 at the Red Lion Inn in San Jose, California. The National
Forth Convention is usually held one or two weeks before FORML so that
visitors from abroad can reasonably attend both.

FORML 

Imagine spending Thanksgiving this year in beautiful Monterey,
California. FORML is the technical Forth conference and is held every
year following Thanksgiving (November 27-29) at Asilomar in nearby
Pacific Grove. Asilomar is a modern conference center situated in a pine
forest overlooking the Pacific Ocean. There are raccoon and deer in
abundance, and wine and cheese parties are held nightly. Although the
technical content of FORML is quite advanced, the atmosphere is relaxed
and informal.

The theme of this year's FORML, the ninth, is "Forth and the 32-bit
Computer." Abstracts (100 words or less) were due September 1, 1987, so
you've just missed that deadline. Completed papers are due November 1,
which means that if you call right away you might still be able to sneak
in on time. You can call the organizers at (408) 277-0668.

The New Forth Dimensions

Actually, it's pretty much the same old Forth Dimensions but with a new
look that includes a glossy cover and a snappier format. Forth
Dimensions is the bimonthly publication of FIG, and a one-year
subscription is included with each annual membership. Some even say that
FIG membership is free when you subscribe to Forth Dimensions. Volume
IX, Number 1, starts the new look with articles on fractal landscapes,
32-bit Forths, and other topics. Marlin Ouverson, the editor, keeps a
careful balance of beginning and advanced material, with tiny
thermometers printed next to each to show you the level of difficulty.

This same issue includes an extended interview with Elizabeth Rather
entitled "Starting FORTH Inc." If you want to know how it all got
started, here's where to find out all about it. You might also be
interested in Ray Duncan's series "Starting Your Own Software House" in
Programmer's Journal. Ray is known to DDJ readers for his 16-bit
Software Toolbox column, but he is also well known to the Forth
community as the founder of Laboratory Microsystems Inc. (LMI) and the
creator of PC/FORTH.

The  Forth  Model  Library 

The Forth Model Library is a series of selected Forth programs published
by FIG and written in the Forth-83 standard dialect. The library is, in
part, an ongoing experiment to determine the ability of the Forth-83
standard dialect to support substantial applications. Each participating
vendor supplies an appropriate "front end" so that the application can
run under its Forth. Needless to say, no machine-code words are used in
the library. The emphasis is on readability and utility rather than on
speed. Of course, you can always tune it more closely to your Forth or
to your needs, hence the name model.

This  library  currently  contains  five 
volumes:  A  Forth  List  Handler,  Vol¬ 
ume  1,  by  Martin  J.  Tracy;  A  Forth 
Spreadsheet,  Volume  2,  by  Craig  A. 
Lindley;  Automatic  Structure  Charts, 
Volume  3,  by  Kim  R.  Harris;  A  Simple 
Inference Engine, Volume 4, by Martin
J. Tracy; and The Math Box, Volume 6,
by Nathaniel Grossman. Volume 5,
A Complete PROLOG, by Lou
Odette,  has  been  delayed  but  should 
appear  shortly. 


The  Forth  Model  Library  is  avail¬ 
able  from  FIG  at  $40  per  volume.  It  is 
available on IBM MS-DOS 5¼-inch
disks  and  runs  under  several  IBM  PC 
Forths,  including  Laxen/Perry  (pub¬ 
lic  domain)  F83;  PC/FORTH,  Version 
3.0 or later; MicroMotion MasterForth,
Version 1.0 or later; and
FORTH  Inc.’s  polyFORTH  II  MS-DOS 
ISD-4. 

Fixed-Point  Math 

The Math Box, Volume 6 of the library,
includes the source code for
extended  double-precision  arithme¬ 
tic;  a  complete  fixed-point  math 
package;  auto-ranging  text  graphics; 
and  utilities  for  rapid  polynomial 
evaluation,  continued  fractions,  and 
Monte  Carlo  factorization. 

Forth  programmers  generally  pre¬ 
fer  fixed-point  to  floating-point 
math.  For  any  given  number  size, 
fixed-point  math  is  faster  and  more 
accurate  than  floating  point  but  it 
lacks  the  convenience  and  larger  dy¬ 
namic  range  of  floating  point.  Real¬ 
time  applications  that  read  transduc¬ 
ers  and  write  to  D/A  converters  or 
stepper  motor  controllers  usually 


have  well-understood  algorithms  of 
limited  dynamic  range.  PID  control¬ 
lers,  digital  filters,  and  computer 
graphics  are  especially  amenable  to 
fixed-point  solutions. 

A  common  Forth  approach  to  im¬ 
plementing  fixed-point  numbers  is  to 
combine  signed  16-bit  integers  and 
signed  14-bit  fractions.  A  14-bit  frac¬ 
tion  is  a  signed  16-bit  number  with 
the  binary  point  two  positions  from 
the  left: 

S#.##  ####  ####  ####

where  S  is  the  sign  bit  and  each  #  is  a 
binary  digit.  Fourteen-bit  fractions 
have  several  charming  properties: 

• They can be added to each other
with no adjustment.

• They can be multiplied by an integer
with no adjustment.

• They can be multiplied together
and adjusted with two left shifts.

• They can exactly represent +1 and
−1, which is especially useful when
working with sines and cosines.

• Because the number of bits is even
and fixed, the square root is exact and
does not require an extra
multiplication.
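
As a rough illustration of the first four properties, here is a minimal Python sketch of 14-bit fractions. It is not taken from Starting Forth or from the Model Library: the names `to_frac`, `frac_add`, `frac_scale`, and `frac_mul` are hypothetical, and Python integers stand in for 16-bit cells, so overflow checking after each operation is left out.

```python
FRAC_BITS = 14
ONE = 1 << FRAC_BITS          # 16384 is the cell value that represents +1.0


def to_frac(x):
    """Convert a float in [-2, 2) to a signed 14-bit fraction."""
    n = round(x * ONE)
    if not -32768 <= n <= 32767:
        raise OverflowError("value does not fit in a 16-bit cell")
    return n


def to_float(n):
    """Convert a 14-bit fraction back to a float (for display only)."""
    return n / ONE


def frac_add(a, b):
    # Same scale on both sides, so plain integer addition works.
    return a + b


def frac_scale(a, k):
    # Multiplying by a plain integer leaves the binary point where it was.
    return a * k


def frac_mul(a, b):
    # The 32-bit product carries 28 fraction bits. Shifting it left twice
    # and keeping the high 16-bit cell leaves 14 fraction bits again.
    return ((a * b) << 2) >> 16


half = to_frac(0.5)
quarter = frac_mul(half, half)        # 0.5 * 0.5
print(to_float(quarter))
```

In a real Forth the same adjustment would be applied to the double-length product on the stack; the shifts here only mirror that arithmetic.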

You  can  read  more  about  14-bit 
fractions  in  Leo  Brodie’s  Starting 
Forth, 2d ed. (Englewood Cliffs, N.J.:
Prentice-Hall,  1987).  Don’t  bother  try¬ 
ing  to  find  the  information  in  the  first 
edition — it  isn’t  there.  By  the  way, 
Starting  Forth  has  sold  more  than 
110,000  copies.  The  second  edition 
has  been  substantially  revised  and 
enlarged.  It  now  identifies  the  differ¬ 
ences  between  Forth  dialects  and  in¬ 
cludes  a  long-awaited  index. 

Unfortunately,  neither  16-bit  inte¬ 
gers  nor  14-bit  fractions  can  represent 
handy  numbers  such  as  3.1415  ....  To 
combine  efficiency  with  conve¬ 
nience,  Dr.  Grossman  has  designed  a 
fixed-point  package  based  on  signed, 
32-bit,  fixed-point  numbers  with  the 
binary  point  in  the  middle: 

####  ####  ####  ####.####  ####  ####  ####

These  numbers  can  range  from  more 
than  9,999  to  less  than  0.0001  with 
roughly 4½ digits on each side of the


decimal  point. 
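
To make the "binary point in the middle" layout concrete, the following Python sketch treats a 32-bit number as 16 integer bits and 16 fraction bits. It is an assumption-laden illustration, not Dr. Grossman's package: the helper names (`to_fixed`, `fixed_mul`, `fixed_div`) are invented here, and Python's unbounded integers replace the double-length Forth arithmetic a real implementation would use.

```python
FRAC_BITS = 16
SCALE = 1 << FRAC_BITS        # 65536 is the value that represents 1.0


def to_fixed(x):
    """Convert a float to signed 32-bit fixed point (16.16)."""
    n = round(x * SCALE)
    if not -(1 << 31) <= n < (1 << 31):
        raise OverflowError("value does not fit in 32 bits")
    return n


def to_float(n):
    """Convert a 16.16 value back to a float (for display only)."""
    return n / SCALE


def fixed_mul(a, b):
    # The raw product has 32 fraction bits; drop 16 to get back to 16.16.
    return (a * b) >> FRAC_BITS


def fixed_div(a, b):
    # Pre-scale the dividend so the quotient keeps 16 fraction bits.
    return (a << FRAC_BITS) // b


PI = to_fixed(3.1415)          # now representable, unlike with 14-bit fractions
area = fixed_mul(PI, fixed_mul(to_fixed(2.0), to_fixed(2.0)))
print(to_float(area))          # close to 3.1415 * 4
```

The smallest step in this layout is 1/65536, which is consistent with the roughly four and a half decimal digits on the fraction side mentioned above.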

The  1987  Rochester 
Forth  Convention 

This  year’s  Rochester  Forth  Conven¬ 
tion  was  an  impressive  gathering  of 
more  than  100  Forth  professionals 
from  industry  and  academia.  Al¬ 
though  it  was  difficult  for  me  to  sit 
through  50  technical  presentations  (I 
admit  I  missed  some),  the  four-day 
conference  sped  along,  accompanied 
by  excellent  food  and  nightly  parties. 
This  year’s  conference,  sponsored  by 
the  Institute  of  Applied  Forth  Re¬ 
search  and  coordinated  by  Larry 
Forsley,  was  a  bit  less  organized  than 
last  year’s,  sadly  because  of  Thea 
Martin’s  disinvolvement.  Thea  is 
leaving  the  institute  to  pursue  a  ca¬ 
reer  in  education. 

This  year’s  official  theme  was 
Forth  hardware  architecture,  and 
the  unofficial  theme  was  artificial  in¬ 
telligence.  Dr.  Dorband  (NASA  God¬ 
dard  Space  Flight  Center,  Maryland) 
talked  about  the  Massively  Parallel 
Processor (MPP). This 128 × 128 array
of  16,384  serial  processors  is  pro¬ 
grammed  in  (can  you  guess?)  Forth. 


Dr.  Ting  (Lockheed)  described  an  NCR 
GAPP  network  with  a  Novix  4016  as 
the  microcode  sequencer. 

On  a  slightly  smaller  scale,  Phil 
Koopman,  Jr.,  demonstrated  his  Writ¬ 
able  Instruction  Set/Stack  Oriented 
Computer  (WISC),  a  32-bit  commercial 
Forth  engine.  Wright  State  University 
(Ohio)  presented  a  stack-frame  com¬ 
puter  designed  for  Forth.  Final  silicon 
is  expected  in  spring  1988.  The  univer¬ 
sity's  stack-frame  architecture  has  a 
shallow  stack  with  all  items  directly 
addressable.  ROT  would  be  faster  on 
this  stack-frame  computer  than  on, 
let's  say,  a  Novix  4016.  Speaking  of  No¬ 
vix  chips,  George  Nicol  had  an  IBM  AT 
stuffed  with  six  Silicon  Composer 
PC4000  boards  running  in  parallel. 
Let's  see,  by  my  count  that's  20  MIPS 
per  board,  for  120  MIPS.  Is  that  Million 
Instructions  Per  Second  or  Misleading 
Information  Provided  by  Sales?  Also 
speaking  of  Novix  chips,  watch  for 
good  things  from  Harris  Semiconduc¬ 
tors,  which  has  added  the  Novix  to  its 
macro  cell  library. 

On  the  AI  front,  the  University  of 
Utah  has  been  using  an  emulator 
written  in  Forth  to  test  its  Common 




LISP  compiler.  Ecole  Polytechnique 
de  Montreal  presented  FUZZY- 
FORTH,  a  real-time  fuzzy-rule  pro¬ 
duction  system.  Henry  Harris  (Jet 
Propulsion  Labs,  California)  gave  a 
too-short talk on conceptual dependency.
William Dress (Oak Ridge National
Labs, Tennessee) brought a
bug!  Complete  with  a  few  hundred 
simulated  neurons  connecting  simu¬ 
lated  sensors  with  simulated  mus¬ 
cles,  this  artificial  neural  network 
"insect" crawled around a CRT,
learning  to  avoid  the  sides  and  to  seek 
the  simulated  food.  There  were  talks 
on  computer-aided  medical  diagno¬ 
sis,  multiprocessor  expert  systems, 
and  several  other  aspects  of  AI. 

The  last  day  of  the  conference  was 
reserved  for  seminars  and  demon¬ 
strations.  The  peripatetic  Ray  Dun¬ 
can  demonstrated  LMI's  UR/FORTH 
for the Microsoft OS/2 operating system.
Again and again (remember Creative
Solutions' MacFORTH?) Forth is
one  of  the  first  languages  to  run  in  a 
new  software  environment. 

If  you  missed  the  conference,  you 
can  still  read  the  proceedings.  They 
will  appear  shortly  as  a  special  issue 
of  the  Journal  of  Forth  Applications 
and Research (JFAR). A one-year subscription
(four issues) to this professional,
refereed publication costs $40.
You  can  order  it  from  JFAR,  P.O.  Box 
27686,  Rochester,  New  York  14627. 

For  a  Good  Time  . . . 

If  you  own  a  modem,  there  are  two 
numbers  you  should  definitely  inves¬ 
tigate:  the  West  Coast  Forth  Board 
([213]  301-0761)  in  Los  Angeles  and  the 
East  Coast  Forth  Board  ([703]  442-8695) 
in  McLean,  Virginia.  These  public- 
spirited  nondenominational  Forth 
bulletin  boards  contain  megabytes  of 
Forth  utilities,  interviews,  contacts, 
and  news.  The  WCFB  sysops  are  Scott 
Squires  (Lucasfilm),  Michael  Ham  (of 
DDJ  fame),  and  the  ubiquitous  Ray 
Duncan.  The  ECFB  sysop  is  Jerry  Shi- 
frin  (MCI). 

Both  boards  are  PCBOARD  Forth- 
only  bulletin-board  systems.  The 
ECFB  is  the  more  active  of  the  two, 
averaging  about  two  dozen  messages 
a  day.  There  are  more  than  1,000 
Forth  files  available  on  this  board! 
You can download the complete
directory listing in FILELIST.ARC.
Many  of  these  files  are  also  available 
on  the  WCFB.  Jerry  Shifrin  has  put 
together  an  excellent  50-page  Guide 
to  the  ECFB,  which  is  available  by 
leaving  him  a  message  on  the  board. 

Product of the Month:
Ashton-Tate's RapidFile

The  following  Forth  success  story 
was  downloaded  from  the  ECFB.  Edi¬ 
torial  comments  are  by  Jerry  Shifrin: 

"Ashton-Tate  has  recently  released 
RapidFile,  a  data  management  pack¬ 
age  for  the  IBM  PC.  The  following  is 
excerpted  from  Ashton-Tate's  3/87  is¬ 
sue  of  Technotes: 

Technotes:  Why  is  RapidFile  writ¬ 
ten  in  FORTH? 

Mark  McDonough  (developer):  I’ll 
tell  you  the  story.  When  I  told  my 
professors  that  I  wanted  to  imple¬ 
ment  the  QBE  [IBM's  Query  By  Exam¬ 
ple]  concept  on  a  PC,  I  was  told  it  was 
not  possible.  Knowing  that  it  couldn't 
be  done,  I  went  out  to  find  a  genius 
who  would  know  how  to  push  the 
PC  to  its  limits.  I  found  that  genius  in 
Tom  Dowling  [of  Miller 
Microsystems]. 

Tom  claimed,  and  has  since  proven 
time  and  again,  that  FORTH  was  as 
easy  to  program  in  as  other  high  level 
languages  and  that  its  programs  run 
almost  as  fast  as  assembler  programs. 
FORTH  programs  also  compile  into  a 
very  small  space.  We  could  not  have 
fit  RapidFile  onto  one  disk  and  in 
256K  of  RAM  if  it  were  written  in  C. 

Technotes:  We  understand  that  the 
flexibility  of  FORTH  really  came  in 
handy  when  designing  RapidFile. 

Steve  Allin  (product  manager):  Yes, 
FORTH  is  very  tailorable.  It  allowed 
us  to  make  substantial  changes  to  the 
design.  It  is  very  close  to  being  a  pro¬ 
totyping  language  where  you  can 
make  major  changes  much  more  eas¬ 
ily  than  in  standard  programming 
languages,  which  would  have  re¬ 
quired  substantial  rewrites.  Rapid- 
File  went  through  several  versions 
before  its  release  and  it  couldn’t  have 
gone  through  all  that  if  it  hadn’t  been 
written  in  FORTH." 


The  ANS  Forth  Effort  Begins 

In  October  1986,  a  small  group  from 
the  Forth  community  met  at  FORTH 
Inc.,  Los  Angeles,  to  submit  a  propos¬ 
al  to  the  American  National  Stan¬ 
dards  Institute  (ANSI)  asking  it  to  un¬ 
dertake  a  Forth  standard.  This  group 
consisted  of  major  vendors  (Elizabeth 
Rather,  FORTH  Inc.;  and  Ray  Duncan, 
Laboratory Microsystems Inc.); major users
(W. B. Dress, Oak Ridge National Laboratory;
Burt Felis, IBM Corp.; and Jerry
Shifrin, MCI Telecommunications
Corp.),  and  interested  experts 
(Charles  Moore,  inventor  of  Forth; 
Greg  Bailey,  Athena  Programming; 
and  yours  truly).  Also  invited  but  un¬ 
able  to  attend  were  Don  Colburn  (Cre¬ 
ative  Solutions),  Dick  Miller  (Miller 
Microcomputer  Services),  and  Kim 
Harris  (Paradise  Systems). 

In  February  1987,  CBEMA  (the  ANSI 
organization  responsible  for  ANS 
FORTRAN  and  the  proposed  ANS  C), 
sent  the  ANS  Forth  proposal  X3J14  to 
its  general  membership  for  a  letter 
ballot.  In  April,  CBEMA  accepted  the 
ANS  Forth  proposal.  The  following  is 
from  a  letter  by  Ms.  Rather  to  mem¬ 
bers  of  the  proposing  group: 

"I  am  happy  to  report  that  the  final 
vote  for  the  establishment  of  X3J14, 
the  Technical  Committee  for  ANS 
Forth, was favorable: 36-1-1, with
2/3 approval required. An organizational
meeting has been scheduled
for August 3-4, 1987, at CBEMA
headquarters,  311  First  St.,  N.W., 
Washington,  D.C.  A  newsletter  to  all 
ANSI  members  and  a  press  release 
will  be  sent  out  May  1. 

"At  the  first  meeting,  the  staff  of 
the  X3  Secretariat  will  present  a  de¬ 
tailed  tutorial  on  X3  and  TC  policies 
and  procedures.  Our  first  assignment 
will  be  to  develop  a  detailed  work 
plan  and  schedule  for  development 
of  a  draft  standard.” 

Anyone  may  attend  this  meeting. 
According  to  ANSI  rules,  however,  a 
voting  member  of  a  technical  com¬ 
mittee  pays  a  yearly  fee  of  $175  and 
must  attend  at  least  two  out  of  three 
meetings  to  retain  voting  status.  Fur¬ 
thermore,  a  technical  committee  (TC) 
generally  meets  two  to  four  times  a 
year  in  fairly  distributed  geographic 
locations.  The  frequency  of  meetings 
is  part  of  the  work  plan  decided  at 
the  first  meeting.  You  can  apparently 
team  up  with  an  alternate  and  there¬ 
by  attend  every  other  meeting  with¬ 


out  losing  your  vote. 

While  the  direction  of  ANS  Forth  is 
yet  to  be  determined  by  the  yet-to-be- 
determined  TC,  the  formal  proposal 
calls  for  an  integration  rather  than  a 
revolution.  The  first  item  in  the  pro¬ 
posed  work  program  is  to  "identify 
and  evaluate  common  existing  prac¬ 
tices  in  the  area  of  the  Forth  pro¬ 
gramming  language.’’  Wil  Baden,  for¬ 
mer  ANS  FORTRAN  committee 
member  and  well-known  Forth  ora¬ 
tor,  calls  this  "the  principle  of  least 
surprise.”  Furthermore,  the  proposal 
continues,  "while  the  Forth-83  Stand¬ 
ard  has  stabilized  the  language  to  a 
great  extent,  it  has  proven  too  restric¬ 
tive  and  machine-dependent.  Assum¬ 
ing  the  ANS  Forth  standard  confines 
itself  to  such  changes  as  are  neces¬ 
sary  to  resolve  the  problems  in  Forth- 
83,  the  effect  on  current  practice  will 
be  modest.”  It  was  generally  agreed 
that  an  ANS  Forth  might  be  a  nice 
place  to  delete  all  references  to  a  16- 
bit  parameter  stack. 

Meanwhile,  Guy  Kelly,  chair  of  the 
Forth  Standards  Team  (FST),  has  sug¬ 
gested  that  FST  might  serve  as  a  clear¬ 
inghouse  for  proposed  extensions  to 
Forth,  such  as  string  operators  and 
floating-point  arithmetic.  Technical 
proposals  should  be  sent  to  the  Forth 
Standards  Team,  P.O.  Box  4545,  Moun¬ 
tain  View,  CA  94040.  There  is  a  form 
provided  at  the  back  of  every  pub¬ 
lished  Forth-83  Standard.  FST  devel¬ 
oped  the  Forth-79  and  Forth-83  stan¬ 
dards  but  has  no  further  meetings 
scheduled  at  this  time. 

At  any  rate,  by  the  time  you  read 
this,  the  first  meeting  will  already 
have  happened.  I  will  try  to  keep  you 
up  to  date  on  this  heroic  and  historic 
event. 

Coming  Next 

Next  time  around?  More  on  the  ANS 
Forth  effort,  a  new  bibliography  on 
Forth,  and  a  few  surprises. 

DDJ 






COLUMNS 


STRUCTURED  PROGRAMMING 


Data  Hiding  and  Its  Variations 


To  trench-level  programmers, 
the  concept  of  data  hiding  is 
likely  to  be  regarded  with  hostility — 
why  deny  full  access  to  data  struc¬ 
tures?  Unfortunately,  data  hiding  is 
part  of  the  baggage  of  modern  soft¬ 
ware  engineering,  playing  an  impor¬ 
tant  role  in  minimizing  delays  in  ap¬ 
plication  development.  Because 
teams  of  programmers  may  spend 
thousands  of  man-hours  writing 
code,  strict  control  procedures  can  be 
vital  to  the  successful  management  of 
complicated  programming  projects. 

For  example,  when  using  either 
Ada  or  Modula-2,  the  first  step  in  soft¬ 
ware  coding  begins  with  writing  the 
definition  modules,  or  packages. 
These  modules  list  the  exported  data 
types,  constants,  variables,  and  rou¬ 
tines.  They  will  form  the  foundation 
upon  which  each  team  will  develop 
the  actual  detailed  modules  and 
packages. 

Care must be exercised
in the planning phase both to
define and to confine software development
objectives. Each team should be
able to work as independently as possible,
thus yielding maximum productivity
with minimum confusion.
In  the  absence  of  data  hiding,  all  ex¬ 
ported  data  types  and  their  variables 
have  visible  structures;  this  gives  any 
team  the  ability  to  write  their  own 
routines  and  to  manipulate  the  ex¬ 
ported  data  structures. 

by  Namir  Clement 
Shammas 

But  what  happens  if  the  team  re¬ 
sponsible  for  implementing  a  specif¬ 
ic  module  and  its  data  structures  dis¬ 
covers  a  more  efficient  equivalent 
structure  or  decides  that  it  absolutely 


has  to  modify  the  current  one?  The 
result  is  a  domino  effect  of  recoding 
by  all  the  teams  that  used  those  modi¬ 
fied  data  types.  Keep  in  mind  that 
this,  too,  is  likely  to  create  an  entirely 
new  set  of  problems! 

To  prevent  this  disastrous  domino 
effect,  opaque  data  types  can  be  used. 
This  gives  the  module  or  package  im¬ 
plementor  the  freedom  to  alter  the 
data  structure  without  passing  the 
negative  effects  on  to  other  program¬ 
mers.  So  what's  the  price  for  this? 

Opaque  data  types  must  be  accom¬ 
panied  by  an  adequate  set  of  export¬ 
ed  basic  routines  to  manipulate  them 
because  client  modules  and  pro¬ 
grams  cannot  contain  their  own  rou¬ 
tines  to  access  the  components  of  an 
opaque  type.  This  means  that  in¬ 
formed  choices  must  be  made  in  spe¬ 
cifying  which  routines  accompany 
the  opaque  types. 

Data  hiding  is  used  with  data  struc¬ 
tures  that  have  alternate  representa¬ 
tions,  such  as: 

•  complex  numbers  represented  by 
rectangular  or  polar  components 

•  stacks  implemented  using  arrays,  a 
set  of  scalar  variables  (for  small 
stacks),  or  a  linked  list 

•  lists  implemented  using  pointers  to 
dynamic  data  or  to  arrays 

•  binary  trees  implemented  using 
pointers  to  dynamic  data  or  to  arrays 

•  two-dimensional  matrices  repre¬ 
sented  by  arrays  of  rows  or  arrays  of 


columns 

Although  data  hiding  is  formally 
implemented  in  languages  such  as 
Ada  and  Modula-2,  it  can  also  prove 
useful  to  emulate  this  feature  fully, 
or  even  partially,  in  other  languages. 
This  article  looks  at  example  applica¬ 
tions  in  BASIC  and  Pascal  and  com¬ 
pares  these  implementations  of  data 
hiding  to  those  in  Ada  and  Modula-2. 

The  extent  to  which  you  can  emu¬ 
late  data  hiding  in  a  particular  lan¬ 
guage/implementation  is  dependent 
on  two  main  ingredients:  making  the 
detailed  data  structure  opaque,  and 
denying  the  programmer  access  to 
the  routines  that  manipulate  the 
opaque  data  types.  The  first  require¬ 
ment  is  easily  met  by  disguising  the 
data  structure;  thus  the  trick  is  to 
hide  the  supporting  code. 

BASIC  Examples 

The  QuickBASIC  implementation 
produces  a  compiled  user  library 
that  meets  the  more  critical  code-hid¬ 
ing  requirement.  You  can  use  either 
arrays  or  large  strings  (QuickBASIC 
supports  strings  of  up  to  32K).  Arrays 
are  probably  more  suitable  for  tack¬ 
ling  purely  numeric  data  structures, 
whereas  strings  are  more  suitable  for 
managing  structures  with  numeric 
and  alphanumeric  data. 

QuickBASIC  supports  packing  of 
the  basic  numeric  data  types  into 
fixed-length strings. Strictly speaking,
QuickBASIC still enables you to
access  the  components  of  the  not-so- 
opaque  data  structure,  but  the 
schemes  used  would  make  such  ac¬ 
cess  a  wasted  effort.  I  call  this  type 
“logically  opaque." 

Listing One, page 110, shows the
QuickBASIC  library  that  implements 




a  version  of  the  logically  opaque  nu¬ 
meric  matrix.  The  matrix  is  made  up 
of  arrays  of  columns.  Three  proce¬ 
dures  are  used:  one  to  initialize  the 
matrix,  one  to  store  an  element,  and 
one  to  recall  an  element. 

The  scheme  works  with  OPTION 
BASE 0 to enable Mat#(0) to store the
maximum number of rows and columns
in the matrix structure. The
opaque matrix should be declared as
a one-dimensional array with
Max.Row% * Max.Col% as its upper array
bound. The routines are written
such  that  the  row  and  column  indi¬ 
ces  of  the  opaque  matrix  start  at  1. 
The  routines  for  storing  and  recalling 
matrix  elements  contain  commented 
code  lines  for  implementing  the  ar¬ 
rays  of  rows.  In  this  case,  the  changes 
are  very  simple. 

The  QuickBASIC  code  shows  how 
you  are  still  able  to  access  the  basic 
components  of  the  data  structure 
(that  is,  the  individual  array  ele¬ 
ments).  With  the  compiled  library 
version,  however,  the  exact  mapping 


of  the  matrix  elements  is  invisible. 

A second method that can be implemented
in both QuickBASIC and
Turbo BASIC revolves around static
local strings or arrays. Both implementations
support static declarations
that can be used with local variables,
which permits the BASIC
subroutines to retain the data between
subroutine calls. Example 1,
below, shows the general scheme
and how a single subroutine
is used to perform multiple tasks,
selected by a menu-choice argument.
Because every operation lives inside
that one routine, the BASIC subroutine
monopolizes the opaque data structure.
Other BASIC code segments can
use the opaque data but are denied
direct access. The same method can
be used with C and PL/I because
they support static variables.
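
For readers more at home outside BASIC, the static-local scheme can be approximated in Python with a closure. This is a loose analogy under stated assumptions rather than a port: `make_stack_routine` and the menu numbering are invented for the example, and a small stack stands in for whatever opaque structure the routine owns.

```python
def make_stack_routine():
    """Return one routine that owns a hidden stack and dispatches on a menu choice."""
    items = []                         # plays the role of the STATIC local data

    def jekyll_and_hyde(menu_choice, value=None):
        if menu_choice == 1:           # push a value
            items.append(value)
            return None
        elif menu_choice == 2:         # pop the most recent value
            return items.pop()
        elif menu_choice == 3:         # report the current depth
            return len(items)
        else:
            raise ValueError("unknown menu choice")

    return jekyll_and_hyde


stack = make_stack_routine()
stack(1, 10)
stack(1, 20)
print(stack(3))                        # depth after two pushes
print(stack(2))                        # most recently pushed value
```

The hidden list survives between calls just as a static local does, and callers can reach it only through the menu choices. Python's hiding is by convention (the closure can still be inspected), so this is "logically opaque" in the same sense as the BASIC version.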

True  BASIC  is  another  BASIC  imple¬ 
mentation  that  supports  a  limited  lev¬ 
el  of  data  hiding.  True  BASIC  modules 
support  an  extension  to  ANSI  BASIC 
that  implements  SHARE  variables, 
which  are  characterized  by  their  stat¬ 
ic  nature  and  by  the  fact  that  they  are 
accessible  by  all  the  module  routines. 
These  features  enable  programmers 
to  perform  partial  data  hiding. 

Listing  Two,  page  111,  shows  an 
implementation  of  a  binary  tree  in 
True  BASIC.  The  module  implements 
one  instance  of  a  binary  tree.  Three 
SHAREd  arrays  are  used  to  manage 
the binary tree: Bin_Tree$( ) stores
the tree node data, Left( ) is the array
of left pointers, and Right( ) is the array
of right pointers. None of the
three  arrays  are  accessible  by  the  ap¬ 
plication  program  using  the  binary 
tree,  which  makes  the  binary  tree 
structure  completely  opaque. 


SUB Jekyll.and.Hyde (<argument list>, Menu.Choice) STATIC
  ' Static locals keep their values between calls
  STATIC <list of scalars and arrays used to implement opaque structure>

  ' Menu.Choice selects the operation to perform on the hidden data
  SELECT CASE Menu.Choice
    CASE 1
      <sequence of statements>
    CASE 2
      <sequence of statements>
    CASE 3
      <sequence of statements>
    CASE ELSE
      <sequence of statements>
  END SELECT
END SUB


Example  1:  General  scheme  for 
using  static  local  variables  in 
QuickBASIC  and  Turbo  BASIC 




Pascal  Examples 

Although  Pascal  is  similar  to  Modula- 
2,  it  does  not  support  opaque  types. 
Nevertheless,  it  is  possible  to  emulate 
opaque  types  in  Pascal  by  using 
pointers. Schneider (see Note 1) suggests the
following method:

1.  Declare  an  opaque  record  type  and 
its  pointer.  This  record  declaration 
should  be  empty. 

2.  Declare  a  record  type  with  the  ac¬ 
tual  data  structure  used.  Also  declare 
a  pointer  type  accompanying  this  re¬ 
cord  type. 

3.  Declare  a  variant  record  of  the 
form: 

TYPE Convert = RECORD
       CASE boolean OF
         TRUE  : (Opaque : <pointer to opaque record>);
         FALSE : (Actual : <pointer to actual structure>)
     END;

This  variant  record  enables  you  to 
pass  the  addresses  of  pointers  from 
one  record  pointer  type  to  the  other. 

4.  Provide  two  functions  for  two-way 
conversion  between  the  opaque  type 
and  the  actual  data  structure. 

5.  Write  a  set  of  routines  that  provide 
the  required  manipulation  of  the 
opaque  data  type. 

Listing  Three,  page  113,  shows  Tur¬ 
bo  Pascal  data  types  and  routines  to 
implement  opaque  complex  types. 
Five  routines  are  provided  for  dem¬ 
onstration.  The  first  two  create  com¬ 
plex  numbers  from  rectangular  or 
polar  coordinates,  and  the  following 
two perform the conversion in reverse.
Function Add_Complex is a
sample routine to perform a mathematical
operation on a pair of complex
numbers.

Notice  how  the  input  parameters 
that  represent  opaque  complex  num¬ 
bers  are  first  converted  into  the  actu¬ 
al  structures.  True  addition  is  per¬ 
formed  using  rectangular  coordi¬ 
nates,  and  the  results  are  converted 
into  opaque  complex  numbers.  The 
Pascal  code  prohibits  the  program¬ 
mer  from  accessing  the  actual  record 
type.  This  is  enforced  even  further 
when  using  compiled  library  UNITS 
because  the  actual  record  structure  is 
confined  to  the  implementation  part 
of  the  library  UNIT. 

Compare  the  Pascal  code  with  the 
Modula-2  version  in  Listing  Four, 


page  116.  Notice  how  simple  and  ele¬ 
gant  the  Modula-2  version  looks  com¬ 
pared  to  the  Pascal  version.  In  both 
Pascal  and  Modula-2,  you  can  use  the 
polar  coordinates  as  the  actual  data 
structures  without  changing  the  pa¬ 
rameters  of  the  routines  involved. 
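
The shape of such an opaque complex type can be sketched in Python. The sketch below is an assumption-based analogue, not a rendering of Listing Three or Four: all names are invented, and Python cannot truly forbid access to the fields, so the leading underscores only signal that client code should stay out.

```python
import math


class Complex:
    """Opaque handle; client code should use only the functions below."""
    __slots__ = ("_re", "_im")         # actual structure: rectangular components


def _make(re, im):
    z = Complex()
    z._re, z._im = float(re), float(im)
    return z


def from_rect(re, im):
    return _make(re, im)


def from_polar(r, theta):
    return _make(r * math.cos(theta), r * math.sin(theta))


def to_rect(z):
    return (z._re, z._im)


def to_polar(z):
    return (math.hypot(z._re, z._im), math.atan2(z._im, z._re))


def add_complex(a, b):
    # Addition is done on rectangular components, then wrapped again.
    return _make(a._re + b._re, a._im + b._im)


total = add_complex(from_rect(1, 2), from_rect(3, -1))
print(to_rect(total))
```

If the hidden fields were changed to hold a modulus and an angle, only the bodies of these functions would change; their parameters, and therefore every caller, would stay the same.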

Availability 

All  the  source  code  for  articles  in  this 
issue  is  available  on  a  single  disk.  To 
order  send  $14.95  to  Dr.  Dobb’s  Jour¬ 
nal,  501  Galveston  Dr.,  Redwood  City, 
CA  94063,  or  call  (415)  366-3600,  ext. 
216.  Please  specify  the  issue  number 
and  format  (MS-DOS,  Macintosh, 
Kaypro).


Note

1.  M.  Schneider,  "Pascal  Report,” 
Journal  of  Pascal,  Ada  and  Modula-2, 
vol. 5, no. 4 (July/August 1986).

DDJ 

(Listings  begin  on  page  110.) 





LETTERS 

(continued  from  page  14) 


posed  of  integers,  unless  you’re  a 
throwback  to  the  Pythagoreans.  Be¬ 
cause  of  the  nature  of  the  problems 
people  need  to  solve  today,  modern 
computer  languages  use  sophisticat¬ 
ed  variables  or  object  definitions, 
noninteger  arithmetic  operations, 
and  advanced  programming  control 
techniques.  Because  computers  do 
not  support  these  functions,  they 
must  be  emulated  somehow  by  lan¬ 
guages  or  programmers  or  both,  and 
emulation  is  just  a  fancy  word  to  de¬ 
scribe  a  case  in  which  a  particular 
machine  is  made  to  do  something  it 
was  not  meant  to  do. 

The  exception  to  this  is  assembly 
languages,  which  are  fully  supported 
by  the  computers  they  are  written 
for,  but  as  anyone  knows  who’s  done 
enough  "real  programming,”  assem¬ 
bly  language  isn’t  the  easiest  lan¬ 
guage  to  program  in,  document,  or 
maintain. 

Therein  lies  the  problem  with 
computers  and  the  reason  why  there 
are  no  good  computer  languages  to¬ 
day  and  certainly  no  good  general- 
purpose  computer  languages.  Trying 
to  tailor  a  machine  that  does  one  par¬ 
ticular  job  well  to  do  another  job  al¬ 
ways  leads  to  inefficiency,  but  in¬ 
stead  of  changing  the  approach,  most 
people  try  and  improve  on  just  one 
aspect  of  the  problem — the  software. 

We  will  not  see  any  significant  ad¬ 
vance  in  the  way  we  program  com¬ 
puters  until  a  computer  architecture 
appears  that  supports  a  particular 
high-level  language  so  fully  that  it 
would  take  more  time  and  be  less 
readable  and  maintainable  to  pro¬ 
gram  in  assembly  language  than  to 
use  the  HLL.  This  architecture  should 
be  geared  toward  a  particular  type  of 
task,  such  as  object-oriented  pro¬ 
gramming,  real-time  systems,  or 
arithmetic  number  crunching,  and 
should  not  support  any  operations 
that  are  not  needed.  Thus,  we’ll  see 
two  or  three  different  types  of  com¬ 
puters,  all  with  architectures  sup¬ 
ported  by  the  right  language  to  make 
the  machines  easy  to  program  and  as 
swift  as  possible. 

People  invent  new  languages  to 
make  the  job  easier,  looking  at  the 
problem  strictly  from  a  software  as¬ 
pect.  Instead,  they  should  take  a  sys¬ 


tems  approach  and  think  about  the 
whole  system — hardware  and  soft¬ 
ware — and  produce  a  machine  that 
gives  the  optimal  solution  to  a  partic¬ 
ular  type  of  problem.  The  new  gen¬ 
eration  of  computers  should  clear 
away  the  vast  jungle  of  computer 
languages  by  introducing  machines 
that  are  so  tailored  to  one  particular 
language  that  to  use  another  one 
would  be  grossly  inefficient.  These 
new  machines  would  not  necessarily 
run  any  of  the  languages  that  are  cur¬ 
rently  available.  In  fact,  they  proba¬ 
bly  won’t.  This  doesn’t  mean  that  all 
computers  will  look  like  clones  of 
three  or  four  basic  types.  There  will 
still  be  a  lot  of  latitude  in  what  type  of 
I/O  devices  a  particular  system  will 
support  for  a  particular  job  or  indi¬ 
vidual.  But  the  computers  will  be  ef¬ 
ficient  machines  especially  tailored 
to  the  task  to  be  done,  instead  of  Fords 
or  Chevys  passing  off  as  Mercedes  or 
BMWs. 

There  are  microprocessors  geared 
to  run  a  particular  language,  but  not 
everybody  is  convinced  that  pro¬ 
gramming  in  Forth  or  Modula-2  is  the 
way  to  go,  and  in  any  case  the  prob¬ 
lem  should  be  approached  from  a 
system  standpoint  instead  of  just  opti¬ 
mizing  the  microprocessor. 

David  Nakamoto 

280  S.  Euclid  Ave.,  #315 

Pasadena,  CA  91101 

Polytron  Responds 

Dear  DDJ, 

In  the  C  Chest  column  in  the  May 
1987 issue, Allen Holub briefly discussed
the relative merits of Polytron's
PolyMake and Lattice's LMK
make  utilities.  He  mentioned  two 
problems  that  he  was  having  with 
PolyMake — one  relating  to  files  exist¬ 
ing  in  different  directories  and  the 
other  relating  to  memory  usage. 

I  was  unable  to  determine  the  ex¬ 
act  nature  of  the  directory  problem 
that  Mr.  Holub  mentioned.  I  can  say, 
however,  that  we  have  literally  thou¬ 
sands  of  users  many  of  whom  struc¬ 
ture  their  projects  hierarchically  and 
apparently  have  no  problem  doing 
so  with  PolyMake.  We  use  the  same 
strategy  internally  at  Polytron. 

As regards Mr. Holub's comment
about  excessive  memory  usage,  Poly¬ 


Make  has  a  feature  that  allows  the 
user  to  determine  how  much  memo¬ 
ry  it  will  consume.  Use  of  this  feature 
may  have  solved  his  problem. 

Our customer support staff is very
responsive to questions such as these
that surface from time to time and
would have been more than happy
to help Mr. Holub resolve these
"problems." Often all it takes is a simple
phone call to resolve a perceived
problem. When a genuine bug is discovered,
we respond very quickly
with a solution.

Incidentally, the current version of
PolyMake is Version 2.1, and it contains
many new and unique features
in keeping with our market leadership
position. These features include
conditional constructs in both dependencies
and operations, internal
commands, shell control, some very
powerful new macros, initialization
operations, and the ability to determine
time stamps of components of
Polytron Version Control System
(PVCS) archives and PolyLibrarian
object module libraries. Also, Version
2.1 is about ten times faster and uses
about 60 percent of the data space
used by PolyMake versions prior
to Version 2.0.

Donald  K.  Kinzer 

Polytron  Corp. 

1815 NW 169th Pl., Ste. 2110

Beaverton,  OR  97006 

Allen  Holub  replies: 

I  did  call  Polytron  a  year  ago  when  I 
was  having  the  directory  problems 
discussed, and the staff couldn't
come up with a solution then. Moreover,
when I recently asked about
the space-related problems, I was
told that the new version (2.1) takes
up even more memory than the version
that I had. Mr. Kinzer's letter
mentions that Version 2.1 uses 60 percent
less data space, but he doesn't
mention that it uses considerably
more code space. Be that as it may,
the  added  features  may  well  make 
up  for  the  amount  of  memory  used. 
Polytron  is  sending  me  a  copy  of  the 
new  version,  and  I’ll  discuss  it  in  a 
future  C  Chest. 

DDJ




Dr.  Dobb's  Journal,  October  1987 


PROGRAMMER'S  SERVICES 

OF  INTEREST 


Languages 

Language  Processors  has  released  a 
line  of  compilers  for  386  machines 
running MS-DOS, Version 3.2. The languages
currently available include
LPI-COBOL, LPI-FORTRAN, LPI-RPG II,
LPI-PL/I, LPI-PASCAL, and LPI-BASIC.
Each product supports the 386 DOS-Extender
from Phar Lap Software,
which lets users run MS-DOS applications
while facilitating access to the
processing power of the 32-bit 386. Notable
benefits include surmounting
the 640K barrier, increasing overall
performance, and enabling access to
up to 4 Gbytes of memory.

Users who have developed applications
using previous releases of LPI
compilers will be able to transport
those programs to 80386 machines
running MS-DOS by recompiling
them with the 386 version. Each compiler
is priced at $695. Reader Service
No. 16.

Language  Processors  Inc. 

400-1  Totten  Pond  Rd. 

Waltham,  MA  02154 
(617)  890-1155 

JForth  from  Delta  Research  is  based 
on  the  Forth-83  Standard  and  runs  on 
the  Commodore  Amiga.  Of  particular 
interest  is  that  high-level  programs 
compile  directly  into  machine  code 
as  opposed  to  interpreting  tokens  at 
run time. To cut development time,
the compiler environment is interactive
and allows incremental compiling.
Utilities include a 68000 assembler
and disassembler, search and
sort routines, local variables, and
floating point.

Amiga structures and constants are
predefined in .j files, which correspond
to the .h files used in C. Amiga
library routines are called by name,
and an object-oriented development
environment is provided. Example
programs demonstrate graphics,
HAM mode, speech synthesis, and
pull-down menus. Source code is furnished.
JForth sells for $99.95. Reader
Service No. 17.

Delta  Research 
201  D  Street,  Ste.  15 
San  Rafael,  CA  94901 
(415)  485-6867 

Microware Systems Corp. is now
shipping its OS-9/68020 C compiler for
the 32-bit Motorola 68020 microprocessor.
Based on the Kernighan & Ritchie
standard, the compiler includes
extensions for the OS-9 operating system
(for compact disc-interactive new
media technology). All compiler/assembler/linker
options are controlled
by an "intelligent executive" that governs
compiler options and module-calling
sequences.

Extensions  to  the  OS-9  operating 
system  environment  include  library 
functions  for  memory  management 
and  systems  events  as  well  as  several 
library functions that provide compatibility
with the proposed ANSI
standard. The compiler uses the
MC68881 math coprocessor for high-speed
execution of complex math
functions and can generate in-line
floating-point instructions, link to
MC68881 math libraries, or trap to a
shared systemwide MC68881 math
package. It can also take advantage of
the MC68020's 32-bit arithmetic instructions
and special addressing
modes. The OS-9/68020 C compiler
includes  both  MC68000  and  MC68020 
code  generation  packages  and  sells 
for  $750.  Reader  Service  No.  18. 
Microware  Systems  Corp. 

1900  N.W.  114th  St. 

Des  Moines,  IA  50322 
(515)  224-1929 

Oxxi's Benchmark Modula-2 for the
Amiga implements the entire Modula-2
language as defined by Niklaus
Wirth. Execution speed is enhanced
because the compiler resides in memory.
The compiler is activated directly
from the editor by a hot key, so the
time it would normally take to load an
overlay from disk is eliminated. The
editor contains more than 125 commands
for dealing with multiple files,
windows, and buffers. Demo programs
include a freehand paint program,
a desktop calculator, and a directory
maintenance program.
Programs written in Benchmark Modula-2
can be distributed without further
licensing requirements from
Oxxi.

Available add-on products include
a C Language Standard Library,
which allows programs written in C
to be moved easily into the Modula-2
programming environment; Simplified
Amiga Libraries, which includes
functions for screen, window, sound,
and device handling; Interchange
File Format (IFF) Libraries; and
Graphic Image Resource Management,
consisting of both IFF libraries
and the full documentation of the IFF
format. The compiler sells for $199;
the add-on products are available for
$99 each. Reader Service No. 19.

Oxxi  Inc. 

1835-A  Dawns  Way 
Fullerton,  CA  92631 
(714)  999-6710 

Tools 

Jasik Designs is now shipping MacNosy,
Version 2, and The Debugger for
the Mac II. For those of you not familiar
with the product, MacNosy is a
global, interactive decompiler that
lets you recover the source code of
any Macintosh application. Standard
MacNosy features include on-line access
to the Macintosh system structures
and/or current values and to
Macintosh trap names (system calls)
and their parameter lists, as well as
disassembly of all 680x0 instructions
(including the 68881 FPU and 68851
MMU).

Added features of this Mac II edition
include new system structures to handle
additions to ROM, including Color
QuickDraw;  the  ability  to  disassemble 
the  Mac  II  ROM;  the  identification  of 
more  than  600  internal  procedures  in 
ROM;  and  an  increase  in  maximum 
text  file  size  from  32K  to  64K. 

The Debugger monitors the execution
of programs, allowing them to be
arbitrarily stopped to trace execution.
At this point you can view variables
and structures as well as memory locations.
The Mac II version includes
the ability to run in single- or multiscreen
mode and the addition of a display
of floating-point registers.

The  Mac  II  (universal)  version  of 
MacNosy  and  The  Debugger  are  sold 
only in combination for $350. Versions
for the Mac Plus and the Mac SE
sell  for  $170.  Reader  Service  No.  20. 
Jasik  Designs 
343  Trenton  Way 
Menlo  Park,  CA  94024 
(415)  322-1386 

Solution Systems has introduced
BRIEF 2.0, the latest version of its general-purpose
programming editor
for the PC. One especially nifty feature
is an improved "undo" command,
which allows programmers to
reverse the effect of their last 300
commands. The update includes
new documentation with tutorials
on basic editing, regular expressions,
and the internal macro language, which
allows users to customize
their editing environment to meet
individual needs.

BRIEF  2.0  adds  device  drivers  that 
support  the  Enhanced  Graphics 
Adapter,  the  Hercules  Graphics  Plus 
Card,  and  the  Wyse  700  display.  It  is 
now  compatible  with  displays  that 
have up to 127 lines by 255 characters.
BRIEF 2.0 sells for $195; the upgrade
cost for registered users is $60.
Reader  Service  No.  21. 

Solution  Systems 
335  Washington  St. 

Norwell,  MA  02061 
(617)  659-1571 

MicroSolutions Computer Products
has just released Version 2 of Uniform-PC,
a utility that permits the use
of more than 110 non-IBM PC disk formats
on a standard PC/XT and more
than 160 formats on an AT. Users of
IBM PCs and compatibles can directly
read, write, and initialize disks from
most popular CP/M and MS-DOS computers.
CP/M disks can be used just as
though they were PC-DOS disks. New
features include support for Apple
CP/M, NorthStar CP/M hard-sector
formats when used in conjunction
with the new MatchPoint-PC, and MS-DOS
formats from computers that do
not use the IBM standard. Support is
also provided for 48 TPI, 96 TPI, 8-inch,
and 3.5-inch disk drives. Uniform-PC,
Version 2, sells for $69.95.
Reader Service No. 22.

MicroSolutions  Computer  Products 
132  West  Lincoln  Highway 
DeKalb,  IL  60115 
(815)  756-3411 

Solutions International has just released
SuperGlue, a graphics utility
for the Macintosh that can be installed
as a desk accessory and allows for font
substitutions and text extraction from
captured images. Images can be reduced
or enlarged before they are
printed to disk, and graphics can be
automatically saved to a scrapbook
file. The product has three parts: SuperImage
Saver, a printer driver that
prints text and graphics to disk as images;
SuperViewer, a utility that allows
users to open those images and
print or copy them for use in other
applications; and SuperViewerDA, a
desk accessory version of SuperViewer.
SuperGlue sells for $89.95.
Reader Service No. 23.

Solutions  International 
29  Main  St. 

P.O.  Box  989 
Montpelier,  VT  05602 
(802)  229-9146 

Hardware 

ROMulator from Grammar Engine
is, as its name implies, a ROM emulator.
This hardware/software combination
includes an in-circuit emulation
module with associated cables
and adapters along with the requisite
host software. It emulates standard
(JEDEC) 24- and 28-pin ROMs,
PROMs, and EPROMs; handles 8-, 16-, and
32-bit-word ROM modes; and supports
daisy-chained modules of up to 8
ROMs. ROMulator software allows
ROM software in Intel hex or Motorola
S-record format to be downloaded
from any host via an RS-232 interface.
When loaded, the software is
immediately available for access by
the target system. Prices for the
ROMulator (Model S256) start at $400.
Reader Service No. 24.

Grammar  Engine,  Inc. 

1021  Tipton  Ct. 

Westerville,  OH  43081 
(614)  882-6366 

Applied Physics' BusMate PC/XT
and BusMate AT cards offer hardware
hackers or technicians effective
tools for performing diagnosis on
the PC bus system. Each card has a
labeled gold pin for each signal line
present on the bus, and connections
can be made to oscilloscopes, logic
analyzers, or other test equipment
that uses standard test probes or microclips.
Four light-emitting diodes
monitor the system's power supply
voltages. Each card also features a
bus reset button that lets users reboot
the system without powering down
the machine. BusMate PC/XT sells for
$79, and BusMate AT sells for $89.
Reader Service No. 25.

Applied  Physics  Inc. 

P.O.  Box  2368 

West  Lafayette,  IN  47906 

(317)  497-1718 

DDJ 




FORUM 


SWAINE'S  FLAMES 


Doesn’t your heart just go out to
Lotus Development Corporation?
First everybody and his
brother ripped off the Lotus 1-2-3
user interface. Then somebody hit
on the idea of compiling 1-2-3 spreadsheets
into objects that can be manipulated
without 1-2-3, so that one
copy of 1-2-3 may be all a company
needs. And now the latest wrinkle
in the spreadsheet counterpane:
Microsoft has slipped in with a PC
spreadsheet product that reads
1-2-3 files and converts 1-2-3 macros.

It  reminds  me  of  a  story  I  wrote 
once. 

Back  when  I  worked  for  InfoWorld 
and  VisiCalc  and  SuperCalc  were  in 
flower,  I  wrote  a  regular  back-page 
column  that  posed  mathematical 
puzzles  in  the  format  of  mystery 
stories  featuring  a  character  named 
Usasi  and  often  touching  obliquely 
on  computer  industry  issues.  In 
1983,  I  wrote  something  like  this: 

I was sitting at my favorite table
in Mister Bob’s on Polk Street, explaining
a fine forensic point to my
associate Casey Standard and smart-mouthed
lawyer Bette Noire. . . .
Casey  asked  if  we  had  heard  of  a 
programming  case  Mr.  Usasi  had  in 
which  a  PROLOG  expert  system  had 
been  ported  to  a  machine  on  which 
the  values  true  and  false  had  the 
opposite  representation  from  what 
the  programmer  expected.  Every 
true  became  a  false,  every  full 
became  an  empty.  Truth  values  were 
reversed  universally.  It  was  a  real 
mess.  Mr.  Usasi  was  called  in  as  an 
expert  on  expertise  to  advise  them. 

“Simple,”  I  said.  “Just  reverse  the 
interpretation  of  the  output.  You 
know,  there  was  a  clever  database 
program  on  micros  back  in  the  70s 
that  looked  a  lot  like  expert  systems 
of  today,  but  was  much  simpler.  You 
could tell it, ‘Bartholomew’s maiden
aunt’s Abigail,’ and ‘Bartholomew’s
birthday isn’t February 29,’ and ask
questions like ‘What’s Bartholomew’s
favorite color?’ It used Soundex
coding. What was it called?

“Anyway, this all reminds me of
that time in Guadalajara when I
broke up a software smuggling ring
that could have torpedoed the electronic
spreadsheet market.”

“That  doesn't  make  any  sense,” 
Bette  objected.  “You  don’t  have  to 
smuggle  software;  you  can  just — ” 

“I’d insinuated myself into the
smuggling ring,” I went on, “a group
of  Orange  County  numerologists 
who  had  got  it  into  their  heads  that 
balance  sheets  belonged  to  the 
masses  and  who  were  smuggling  in 
spreadsheets  to  undermine  the 
market  leaders,  which  were  Sorcim 
and  VisiCorp  back  then. 

“For deep and subtle numerological
reasons, these Orange County
spreadsheet smugglers had to divide
the gold dust in which they had
been paid into equal shares, with
each smuggler getting exactly as
many shares as there were smugglers
who got shares, and with no
undistributed shares.

“Now I had infiltrated the band
and had studied a little numerology
to beat them at their own game. I
seized on the fact that the smugglers
secretly didn’t want an even
distribution of the spoils and suggested
a scheme that would result
in the same number of shares being
distributed (which mattered to
their numerological principles) and
that would still allow an uneven
distribution of shares (which appealed
to the cupidity of the more
powerful members).

“Give me exactly one share for
each smuggler, I told them, counting
me as one of the smugglers of
course, and give each smuggler a
different even number of shares, assigning
them anywhichway. I gave
them solid numerological reasons
for the plan, and they bought it. I
also stipulated, since I had demonstrated
my value by coming up with
the plan, that nobody get as much
as twice my allotment, and they
bought that, too.”

Casey  frowned  at  me.  “I'm  afraid 
this  is  a  little  far-fetched,  Mickey.” 

"Sure  it  was,”  I  explained  patiently. 
“I was just playing along with them.
I’m not into that numerological stuff.
Anyway, after they solemnly agreed
to the plan and swore their numerological
oaths, they started divvying
up the gold dust and discovered
what I had pulled on them, but by
then it was too late. Although they
didn’t much like it, they had the
integrity to stand by their numerological
principles.” I signaled for the
check.

“That’s  the  end?”  Casey  asked. 
“But  what  happened?  What  had  you 
pulled  on  them?” 

“Do  the  math,”  I  said.  “It’s  all  in 
the  math.” 
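
For readers who would rather let the machine do the math, here is a minimal Python sketch that searches for share-outs consistent with one reading of Mickey's terms: a band of n smugglers has n times n shares to hand out, Mickey takes n of them, every other smuggler receives a different even number of shares, and nobody receives as much as twice Mickey's allotment. The function name share_plans, the band sizes tried, and the decision to let zero count as an even allotment are assumptions made for illustration; the story never says how large the band was.

```python
from itertools import combinations


def share_plans(n):
    """List the share-outs for a band of n smugglers (Mickey included)
    under the assumed reading of the rules described above."""
    total = n * n    # the original plan: n smugglers, n shares apiece
    mickey = n       # "exactly one share for each smuggler"
    # Allotments open to the others: even, and less than twice Mickey's.
    candidates = range(0, 2 * mickey, 2)
    plans = []
    # combinations() yields distinct amounts, so every smuggler differs.
    for combo in combinations(candidates, n - 1):
        if mickey + sum(combo) == total:
            plans.append(combo)
    return plans


if __name__ == "__main__":
    for band_size in range(2, 7):
        print(band_size, share_plans(band_size))
```

Running it for a few band sizes shows how much room for negotiation the agreed terms actually leave.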


You can surely solve Mickey’s little
puzzle, but what about the upside-down
expert system Casey mentioned?
Because of its closed-world
implementation, PROLOG does not
treat true and false complementarily.
What would happen if the base
truth values in a PROLOG program
were reversed? And what was the
name of that database program?
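
One way to see why flipping the output is not as simple as Mickey makes it sound: the sketch below imitates a closed-world query in Python (it is not PROLOG itself). The fact base, the names holds and holds_reversed, and the sample goals are all hypothetical. Under the closed-world assumption a goal is reported false whenever it cannot be found, so "false" lumps together what is known to be untrue and what the system was simply never told.

```python
# A toy fact base; anything not listed here is unknown to the system.
FACTS = {
    ("maiden_aunt", "bartholomew", "abigail"),
}


def holds(goal):
    """Closed-world query: true only if the goal can be found,
    false for everything else, including goals never mentioned."""
    return goal in FACTS


def holds_reversed(goal):
    """The same query with the interpretation of the output flipped."""
    return not holds(goal)


known = ("maiden_aunt", "bartholomew", "abigail")
unknown = ("favorite_color", "bartholomew", "blue")

if __name__ == "__main__":
    print(holds(known), holds(unknown))
    print(holds_reversed(known), holds_reversed(unknown))
```

Reversing the interpretation turns every goal the system was never told about into a truth, which is hardly what the programmer had in mind.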


Michael  Swaine 
editor-in-chief 




#133 NOVEMBER 1987


Dr. Dobb's Journal of

Software Tools

FOR THE PROFESSIONAL PROGRAMMER


$2.95 ($3.95 CANADA)

Special  Graphics  Issue 


Tools  for: 

3-D  Mapping 

Screen 

Management 

Turbo  C  Graphics 


Languages: 

C,  Pascal,  and  Forth 


NOVEMBER  1987 


CONTENTS 


VOLUME 12, ISSUE 11


Mac  graphics  ► 


Turbo C  ► 
Turbo  Pascal  ► 


Make your own TSRs ►

Text  processing  ► 


Unix  time  ► 


OOPstories  ► 


ARTICLES 


3-D Images from Contour Maps 18

by  William  D.  May 

Bill  shows  how  to  warp  scanned  images  of  contour  maps — such 
as  topographical  maps — into  representations  that  can  be 
displayed  and  rotated  in  three  dimensions. 

A  Graphics  Toolbox  for  Turbo  C  (Part  1)  30 

by  Kent  Porter 

Borland's  new  C  compiler  doesn't  come  with  a  lot  of  graphics 
support.  Kent  decided  to  do  something  about  that. 

A  Graphics  Toolkit  for  Turbo  Pascal  38 

by  Hubert  D.  Callihan 

Hugh  uses  several  nonstandard  Turbo  Pascal  routines  to  create  a 
set  of  tools  for  handling  screen  regions. 

Using  EGA  Graphics  Screens  in  Your  Programs  46 

by  J.  Brooks  Breeden 

Using  both  Forth  and  pseudocode  (for  portability),  Brooks  gives 
you  the  tools  for  loading  EGAPaint  files  into  video  memory. 

Automated  Interrupt  Handling  in  C  54 

by  Ron  Miller 

Ron  offers  a  clever  hack  you  can  use  to  develop  your  own  TSR 
utilities. 

An  Alternative  to  Soundex  62 

by  Jim  Howell 

Weight or wait: a faster word-matching algorithm.

Async AppleTalk 104

Richard E. Brown and Steve Ligett's listings continued from last
month


COLUMNS 


C CHEST  116 

by  Allen  Holub 

Allen  shows  how  to  use  the  ANSI  time  functions. 

STRUCTURED  PROGRAMMING  124 

by  Namir  Clement  Shammas 

Namir  presents  several  heuristic  search  techniques. 

ARTIFICIAL  INTELLIGENCE  130 

by  Ernest  R.  Tello 

Ernie talks with users of object-oriented programming tools and
tells why some developers are abandoning or avoiding the
object-oriented approach, and then he tells why he still thinks
it's one of the best paradigms yet devised.


FORUM

EDITORIAL 6

by Michael Swaine

RUNNING LIGHT 8

by Tyler Sperry

ARCHIVES 8

LETTERS 12

by you

SWAINE'S FLAMES 152

by Michael Swaine


PROGRAMMER'S SERVICES

OF INTEREST 144

Products for programmers

ADVERTISER INDEX 145

Where to go for more information on products

About  the  Cover 

As  you  may  have  guessed,  we 
didn't  produce  the  contour  map 
on  this  month’s  cover  on  the  art 
department's  512K  Mac.  Thanks, 
you  folks  at  Dynamic  Graphics 
Inc.  in  Berkeley,  California,  for 
the  use  of  the  software,  Vax, 
Tektronix  terminal,  and  data  to 
make  this  month’s  cover 
happen. 

This  Issue 

Programming  is  getting  more 
like  cinematography  every  day. 
Welcome  to  our  graphics  pro¬ 
gramming  issue,  with  coverage 
ranging  from  how  to  handle 
contour  maps  on  the  Macintosh 
to  how  to  shuffle  screen  regions 
on  the  PC.  There’s  even  a  library 
for  handling  graphics  in  Turbo 
C. 

Next  Issue 

December is our annual operating
systems issue. The lead article
comes from Dave Cortesi (the
Resident Intern, for those of you
who remember his column
from the old days) and concerns
one of the less explored aspects
of OS/2: dynamic linking. Allen
Holub's column will detail a new
multitasking kernel, and we'll
also present some keen-edged
blades for system hacking.


Dr.  Dobb's  Journal,  November  1987 



FORUM 


EDITORIAL 

Mike’s  Survey 

We've just seen the results of our annual reader survey, so we
now know more about you and your interests. We know that you are
educated: most of you have done graduate work. You're forward-looking:
you're more likely to use LISP or PROLOG than COBOL or even Modula-2,
and if you or your company don't already have a 386 machine, you
probably will buy one or more in the next year. You have professional
access to a wide choice of machines, operating systems, and
environments for software development.

Not entirely coincidentally, we're just wrapping up the editorial
calendar for 1988. You can look forward to coverage of a wide range
of programming environments: Unix, OS/2, the Macintosh environment,
Windows, and DOS. We'll publish code in PROLOG, LISP, PostScript, C,
Pascal, assembly languages, Forth, and BASIC. We'll examine the shift
in programming style to object-oriented, event-driven design,
we'll ....

We'll gladly, as always, alter the plan to accommodate new
developments and the treasures of the transom. Which is where you
come in.

What follows is a list of topics that interest us, beyond the
languages and operating systems mentioned above. Survey results
indicate that they also interest you. You can influence us in two
ways. The first is to write us an article on one of the listed topics
(or another of your choosing). The second is to let us know where
you'd like to see us invest our efforts in the coming year. Treat
this page as a ballot and check your favorite topics. Send me your
choices. I promise to read them.

Ada
Algorithms
Common data formats
Consulting
Databases
Device control
Editors
EMS programming
Encryption
Fourth generation languages
Functional programming
Human interface design
Industry news
Libraries
Logic programming
Machine learning
Managing development teams
Mathematics
Modula-2
Multitasking
Music
Nubus and the Mac II
Numerical methods
PS/2 programming
Parallel processing
PostScript
Product news
Product reviews
Scientific applications
Software engineering
Starting a software business
Telecommunications
Unix

Please indicate your choices on this page or a copy of it and send
it to

Michael Swaine
Mike's Survey
M&T Publishing
501 Galveston Drive
Redwood City CA 94063.

Michael Swaine
editor-in-chief

Software  Tools 

FOR  THE  PROFESSIONAL  PROGRAMMER 


Editorial

Editor-in-Chief  Michael Swaine
Editor  Tyler Sperry
Managing Editor  Vince Leone
Associate Editor  Ron Copeland
Assistant Editors  Sara Noah Ruddy
Technical Editors  Allen Holub, Richard Relph
Contributing Editors  Kent Porter, Namir Shammas, Ernest R. Tello
Copy Editor  Rhoda Simmons

Production

Director Art/Production  Larry L. Clay
Art Director  Michael Hollister
Assoc. Art Director  Joe Sikoryak
Technical Illustrator  Frank Pollifrone
Typesetter  Mary Lopez
Cover Photographer  Michael Carr

Circulation

Circulation Director  Maureen Kaminski
Fulfillment Coordinator  Francesca Martin
Book Marketing Mgr.  Jane Sharninghouse
Subscription Supervisor  Kathleen Shay
Newsstand Sales Coordinator  Larry Hupman

Administration

Finance Director  Kate Wheat
Business Manager  Betty Trickett
Accounts Payable Supv.  Mayda Lopez-Quintana
Accts. Receivable Supv.  Laura Di Lazzaro
Advertising Director  Ferris Ferdon (415) 366-3600
Marketing Mgr  Michael Wiener
Trafficking Coordinator  Patricia Albert
Account Managers  see page 145
Associate Publisher  Michael Swaine
Assistant  Sara Noah Ruddy

Dr. Dobb's Journal of Software Tools (USPS 307690)
is published monthly by M&T Publishing Inc., 501
Galveston Dr., Redwood City, CA 94063; (415) 366-3600.
Second-class postage paid at Redwood City and at additional
entry points. DDJ is published under license
from People's Computer Company, 2682 Bishop Dr.,
Suite 107, San Ramon, CA 94583, a nonprofit
corporation.

Article  Submissions:  Send  manuscripts  and  disk 
(with  article  and  listings)  to  the  Editor. 

DDJ  on  CompuServe:  Type  GO  DDJ 
Address Correction Requested: Postmaster: Send
Form 3579 to Dr. Dobb's Journal, P.O. Box 27809, San
Diego, CA 92128. ISSN 0888-3076

Customer Service: For subscription problems call:
outside CA (800) 321-3333; in CA (619) 485-9623 or 566-6947.
For book/software order problems call (415) 366-3600.

Subscriptions:  $29.97  per  1  year;  $56.97  for  2  years. 
Canada  and  Mexico  add  $27  per  year  airmail  or  $10 
per  year  surface.  All  other  countries  add  $27  per  year 
airmail. Foreign subscriptions must be prepaid in U.S.
funds drawn on a U.S. bank. For foreign subscriptions,
TELEX: 752-351.

Foreign  Newsstand  Distributor:  Worldwide  Media 
Service  Inc.,  386  Park  Ave.  South,  New  York,  NY  10016; 
(212)  686-1520  TELEX:  620430  (WUI). 

Entire  contents  copyright  ©  1987  by  M&T 
Publishing  Inc.  unless  otherwise  noted 
on  specific  articles.  All  rights  reserved. 

M&T Publishing Inc.

Chairman  of  the  Board  Otmar  Weber 
Director  C.  F.  von  Quadt 
President  and  Publisher  Laird  Foshay 




FORUM 


RUNNING  LIGHT 


As Mike Swaine mentioned on
page 6, we've just received
the results of our latest reader
survey. After spending
some time poring
through the data, I've
still got a few things
I'm curious about:
items that are important
to producing a
quality magazine but don't often
show up in the statistical abstracts.

For example, we devote a lot of
space to source code listings. The
question is, are you folks really using
those listings? Given there's always a
limit to the number of pages in an
issue, should we keep printing the
listings as they are or print more articles
and arrange for source code to
be distributed some other way?

There's ample historical justification
for printing source listings, of
course.  The  reason  DDJ  was  founded, 
after  all,  was  to  put  tools  such  as  the 
source  code  for  Tiny  BASIC  into  the 
hands  of  as  many  people  as  possible. 
If  you  wanted  your  computer  to  do 
something  back  then,  for  the  most 
part  you  had  to  write  the  program 
yourself.  (Sometimes  you’d  have  to 
write  an  interpreter  or  compiler  first 
and  then  get  back  to  the  original 
problem.)  A  dozen  years  ago,  the 
hacker  ethic  demanded  we  put 
source  code  into  the  hands  of  our 
readers,  and  we  did  it  gladly. 

But  that  was  a  dozen  years  ago, 
and  things  have  changed  a  bit.  These 
days  Bill  Gates  isn’t  paranoid  about 
hobbyists  copying  his  cassette  BASIC; 
instead  he's  justifiably  worried  about 
Philippe  Kahn  stealing  his  languages 
market.  The  economics  of  the  eight 
million  PCs  out  there  (a  conservative 
estimate,  by  the  way)  have  caused 
many  sophisticated  compilers  to  be 
priced  so  that  it  doesn't  make  sense 
to  pirate  the  products,  let  alone  write 
them  yourself. 

Fine, I hear you say, but what has
all this to do with Dr. Dobb's?

Simply put, the field
is changing and DDJ
must grow and adapt
to keep up. There's no
question that Tiny BASIC
ported to the AMD
29000 would be an interesting
exercise in
assembly language,
but surely there are
more useful projects to use as an example
of 29000 programming. (For
those of you coding for the 29000,
yes, that was a hint for article
suggestions.)

The  question  is  not  whether  the 
Doctor's  pride  and  joy  will  change, 
but  how.  To  get  back  to  the  topic  of 
source  listings:  Most  of  you  have  a 
modem  or  easy  access  to  one.  In  a 
time  when  2,400  bps  modems  are 
readily  available,  does  it  make  sense 
to  print  listings  so  that  you  can  enjoy 
hours  of  typing  practice?  Do  you 
want  more  downloading  options  or 
is CompuServe access sufficient? Finally,
do you use the code in binary
form  or  just  scan  the  printed  listings 
to  understand  the  algorithms? 

As always, I'm willing to discuss
this topic with you by phone, if you
can catch me at my desk. I'll certainly
be watching the CompuServe DDJ
Forum to catch the debate, and as the
pile of papers hiding my desk will attest,
we do read all your letters. But if
you really want to make sure I get
your vote/suggestion/threatening
letter, the best bet is to use the mail
systems on CompuServe (#<TK>) or
BIX ('tyler').


Tyler  Sperry 
editor 


ARCHIVES 


Linguistic  Overbyte 

"Programming  languages  appear  to  be 
in trouble. Each successive language incorporates,
with a little cleaning up, all the
features of its predecessors plus a few
more .... The Department of Defense has
current plans for a committee-designed
language standard that could require a
manual as long as 1000 pages. Each new
language claims new and fashionable features,
such as strong typing or structured
control statements, but the plain fact is that
few languages make programming sufficiently
cheaper or more reliable to justify
the cost of producing and learning to use
them ....

For twenty years programming languages
have been steadily progressing toward
their present condition of obesity; as
a result, the study and invention of programming
languages have lost much of
their  excitement.  Instead,  it  is  now  the 
province  of  those  who  prefer  to  work  with 
thick  compendia  of  details  rather  than 
wrestle  with  new  ideas.” — John  Backus, 
1977  Turing  Award  Lecture. 

Ten  Years  Ago  in  DDJ 

"Microsoft’s  BASIC  for  the  8080  and  Z-80  . .  . 
is now generally available on both a single-copy
and OEM basis. The BASIC became the
subject of extended legal dispute which resulted
in the termination of an exclusive
license to MITS, Inc.

"The  BASIC,  best  known  in  the  field  as 
Altair BASIC, has been in use for 2½ years
and  has  a  user  base  of  over  5000.” — news 
release,  DDJ,  November /December  1977. 

"In  his  letter  in  the  DDJ  #18,  Phil  Karn 
mentions problems regarding semi-transparent
paper tape. I wonder if he has tried
using polarized light. Two inexpensive
pieces of polarizing material, one above
the tape and one below, might solve the
problem. Light would still filter through
the tape, but would be de-polarized in the
process and would be rejected by the second
polarizer. Light passing through the
holes  would  still  be  polarized  and  be 
picked  up  by  the  sensors.  I  haven't  tried  it, 
but  it  should  work” — Jim  Day,  letter  to  the 
editor,  DDJ,  November/December  1977. 


[Illustration: early masthead reading “Dr. Dobb’s Journal of Computer Calisthenics & Orthodontia: Running Light Without Overbyte”]


Dr.  Dobb's  Journal,  November  1987 


FORUM 

LETTERS 


8088  Optimization 
Techniques 

Dear  DDJ, 

I  enjoyed  Tom  Disque’s  article  on 
8088  optimization  techniques  (July 
1987)  but  must  point  out  a  simple 
improvement  to  his  Example  1. 

He uses a jcxz jump to guard
against counter cx entering a rep
movsw with a value of 0. Although
this would be a necessary prelude
to an ordinary loop, where cx is
tested at the bottom after decrementation,
it is wasted before a rep
prefix. As explained in The 8088
Book by Russell Rector and George
Alexy (pages 4-46), a check for cx = 0
is the very first step in a rep
instruction, and when it is
true initially, the string primitive
is not executed.

Particularly for anyone
whose use of the computer
reflects the name the French
give to it (L’ordinateur, a machine
for putting things in
order), the string primitive
instructions are among the
8086 family’s most distinct advantages
and such situations
arise frequently. Bearing in
mind this behavior of the rep
prefix and the cx register can
save both time and memory
as well as giving programmers
one less detail to worry
about.

Paul  R.  Emmons 

438  W.  Chestnut  St. 

West  Chester,  PA  19380 


Dear  DDJ, 

I read Tom Disque’s article
in the July issue, on 8088 optimizing
techniques, with great interest.

As  a  Motorola-biased  programmer, 
I  found  it  interesting  to  note  that 
Mr.  Disque’s  68000  code  suffered 
from  an  Intel-type  restriction.  The 
code  itself  was  very  good,  but  it 
limited  the  amount  of  memory  that 
could  be  moved  to  a  maximum  of 
between  64K  and  256K,  depending 
on  the  alignment  of  the  memory. 
Although Mr. Disque used long-size
instructions  when  manipulating  the 
length  registers,  the  dbf  instruction 
could  only  loop  a  maximum  of  64K 
times. 

The dbf instruction can easily be
extended to cope with 32-bit loops
by following it with the instructions:

sub.l  #$10000,<register>
bcc.s  <loop_label>

In Mr. Disque’s routine, the solution
is to replace the branch to exit after
longloop with the lines:

sub.l  #$10000,d0
bcs.s  exit
bra.s  longloop

and to insert the lines:

sub.l  #$10000,d0
bcc.s  lngloop2

after the second dbf. The last dbf
does not need modification as it
only ever moves 14 bytes at most.

With  these  changes  Mr.  Disque’s 
memory-move  routine  can  then 
cope  with  the  complete  addressing 
range  of  all  the  680x0  processors. 
Andrew  Pennell 
The  Old  School,  Greenfield 
Bedford, MK45 5DE
England 

Legacy  of  the  Teletype 

Dear  DDJ, 

In the December 1986 Letters
column there was a letter from a Mr.
Hawkins in which he was complaining
among other things about the
terseness of the C language. He attributed
this “propensity for the
least possible typing” to the authors
of the language being two-fingered
typists. More likely it was what they
were typing on that inspired them
to opt for the least possible typing.
C is an older language than many
realize since although it was born
in the early 1970s along with Unix, it
did not escape from its birthplace,
the Bell Laboratories, until
around 1980. The standard
terminal in use back in the
early seventies was the teletype
machine. In many operating
systems from CP/M to
Unix the abbreviation TTY,
used for things related to the
terminals, makes sense only
if it is realized that the terminals
were originally teletype
machines.

The teletype machine was
an electromechanical device,
not an electronic device, and
was therefore very slow by
today’s standards. Typically
it operated at 110 baud or
ten characters per second.
Furthermore, after a key was
struck you had to wait a full
tenth of a second before striking
the next key or it would
be ignored. Since most
people, especially when thinking
at the keyboard, tend to
(continued  on  page  138) 


Jared’s  wardrobe — like  his  programming — 
isn’t  EGA  compatible. 




ARTICLES 


3-D  Images  from 
Contour  Maps 


Contour maps have intrigued me ever since my introduction
to them in high-school geology. By studying a contour map, I
could learn about inaccessible parts of the world and
discover unexpected possibilities in my backyard. Yet for
all its wealth of information and detail, a contour map is
less than intuitive. You must say to yourself, “these
contours are very close together, so this area must be
very steep,” or “these contours are widely spaced, so
this area is fairly level.”

It  would  be  wonderful  to  be  able  to  feed  a  contour 
map  into  a  machine  that  could  project  images  of  the 
terrain  in  three  dimensions.  Then  you  could  rotate  the 
view,  zoom  in  on  interesting  spots,  change  to  different 
viewpoints,  and  generally  get  an  extremely  tangible 
feeling  for  the  characteristics  of  the  landscape. 

This article is about a set of algorithms that takes a
step toward achieving these objectives. The algorithms
analyze a binary image of a contour map (such as that
produced by an image scanner) and produce a representation
of the image that can be used by a hidden-line
algorithm. The hidden-line algorithm can then produce
3-D views from any mathematically valid viewpoint.
Figures 1 and 2, page 20, show the before and after of
this transformation.

In many respects the means of transforming the
contour map into three dimensions is as interesting as
the transformation itself. Seeing the contours on a
contour map is trivial for the human eye and brain but
complex for a computer. A large part of the problem
stems from bandwidth: the eye/brain seems to see an
entire image, its components, and the relationships among
its components simultaneously, but the computer
“sees” an image one pixel at a time. The low bandwidth
of the computer’s vision makes it extremely difficult for
the computer (that is, its programmer) to identify objects
and to perceive their relationships.

William D. May, 20A-M23, Arthur D. Little Inc., Acorn Pk.,
Cambridge, MA 02140. Bill is a consultant specializing in
system development.

The approach I have taken uses some geometry to
overcome this lack of cognitive ability, first to find and
identify the contours in the image and then to analyze
the contour relationships (nesting). The drawback of
using a geometric solution (as opposed to an “intelligent”
or “learning” solution) is that this approach is
useful only for this particular application. But I think
this drawback is more than offset by the ability to do
something that the eye/brain cannot: use the analysis to
build a 3-D image from a 2-D map.

Of course, contour representation is used in areas
other than geography, such as medicine, chemistry, and
astronomy. The contour-analysis algorithm I have developed
can be used in these situations as well, although
the interpretation of the resulting “image” will be quite
different.

I  should  caution  you  that  the  current  state  of  this 
algorithm  falls  somewhat  short  of  my  original  objective, 
the  major  problem  being  with  the  input  of  the  map.  The 
algorithm  needs  an  image  that  consists  of  topologically 
correct  contours.  This  means  that  continuous,  closed 
contours  on  the  map  must  in  fact  be  continuous  and 
closed  in  the  image  and  that  contours  don’t  cross  or 
merge. 


This contour-analysis algorithm
vividly  demonstrates 
the  gulf  between  human  cognition 
and  an  algorithmic  procedure. 




In  real  life,  such  as  using  a  U.S.  Geological  Survey  map, 
this  requires  that  extraneous  information  be  filtered  out 
of  the  image  and  that  the  contours  themselves  get 
scanned  accurately.  Both  requirements  are  difficult  to 
achieve.  U.S.  Geological  Survey  maps  contain  a  wealth  of 
symbols,  lines,  markings,  grids,  shadings,  and  so  on  that 
interfere  with  the  filtering  process.  The  requirement  for 
continuous  lines  is  also  difficult  to  achieve.  First,  the 
maps  often  place  elevation  information  directly  on  the 
contour  line,  which  is  easy  for  humans  to  read  but 
difficult  for  computers  to  interpret.  Second,  scanners 
have  difficulty  accurately  placing  the  very  fine  lines  used 
for  contours,  so  the  result  sometimes  looks  more  like  a 
fog  of  dots  where  the  contour  is  supposed  to  be  than  a 
continuous  line. 

From  the  Map  to  Three  Dimensions 

A  program  that  transforms  contour  maps  into  3-D  views 
requires  several  steps  to  achieve  its  objective.  I  have 
summarized  these  steps  below.  The  process  draws  on 
diverse  subjects,  such  as  image  processing  and  3-D 
graphics.  The  focus  of  this  article  is  mainly  on  the 
algorithms  for  analyzing  contour  map  information  and 
building  a  3-D  representation  from  the  analysis.  These 
tasks  correspond  to  steps  3  through  5  below. 

1.  Getting  the  map  image  into  the  computer.  I  use 
images  stored  in  Macintosh  MacPaint  format,  which  can 
be  created  using  MacPaint  or  via  scanners  such  as 
Thunderscan.  The  MacPaint  format  is  simple,  consisting 
of  512  bytes  of  header  information  (which  is  usually 
ignored)  and  a  bit  map  in  which  each  bit  represents  a 
single black or white pixel in the image. Apple’s Macintosh
Tech Note #86 explains the MacPaint format in
detail  and  provides  sample  Pascal  code  for  reading  and 
writing  MacPaint  documents. 

2. Cleaning up the image. Extraneous image data (that
is, data that is not part of a contour) must be removed
before trying to analyze the map. In many cases some
human work is needed to overcome imperfections in
the scanning process.

3. Analyzing the contour image. This refers to finding all
the contours in the bit map, “remembering” which are
nested in which, and storing them in a way that can be
used to build a 3-D image. The contour-analysis algorithm
builds two data structures during the analysis.
One is an array with one entry for each contour discovered
in the map. Each array element contains the
coordinates of one point on the contour; the index of
the enclosing contour; and, eventually, the contour’s
elevation. The algorithm also stores every point that is
on a contour as a node in a tree. The key for the node is
the point’s coordinates, and the datum consists of the
array index of the contour the point is on. These two
data structures make it easy to access the contours
directly or to access contour information when given
any point on the contour. This in turn makes the
translation into three dimensions easy.

4.  Assigning  elevations  to  the  contours.  A  restriction  of 
the  contour-analysis  algorithm  is  that  it  cannot  read 
elevations  directly  from  a  digitized  image.  In  some 
applications  this  does  not  matter;  in  others,  such  as  U.S. 
Geological  Survey  maps,  the  ability  would  be  quite 
useful.  That  would  be  a  different  and  very  large  project, 
though,  and  leaving  it  out,  as  I  have  done  here,  does  not 
affect  the  basic  logic  of  the  algorithm.  In  lieu  of  reading 
the  elevations  from  the  map,  I  have  implemented  the 
simple  case  in  which  all  nested  regions  are  assumed  to 
be  ascending.  Given  the  base  elevation  of  the  map,  the 
program  simply  increments  the  elevation  for  each  level 
of  nesting  in  the  image.  The  more  general  solution 
would  allow  users  to  override  elevation  assignments 
manually. 

5. Creating an internal 3-D representation of the terrain.
After the contour analysis, the program builds a rectangular
grid as the transformed representation of the
contour map. The grid is simply an array in which each
cell contains the elevation at a point on the map. The
larger the grid, the more finely it covers the map and the
greater the detail in the resulting image.

6.  Displaying  a  projection  of  the  grid.  The  final  step  is  to 
use  a  hidden-line  algorithm  to  display  the  grid  as  a  3-D 
projection  on  the  screen.  Hidden-line  algorithms  allow 
the  viewer  to  observe  the  scene  from  any  angle  in  a  fairly 
realistic  format. 
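
The two data structures described in step 3 above might be sketched in C as follows. This is only an illustration under stated assumptions: the names contour, point_node, point_insert, and point_lookup are hypothetical, the tree is a plain unbalanced binary search tree keyed on (y, x), and the published listing may organize things differently.

```c
#include <stdlib.h>

/* One entry per contour discovered in the map. */
struct contour {
    int x, y;        /* coordinates of one point on the contour */
    int parent;      /* array index of the enclosing contour */
    int elevation;   /* filled in after the analysis */
};

/* One tree node per pixel that lies on some contour. */
struct point_node {
    int x, y;                          /* key: the point's coordinates */
    int contour;                       /* datum: index into the contour array */
    struct point_node *left, *right;
};

/* Order points by row first, then by column. */
static int key_cmp(int x1, int y1, int x2, int y2)
{
    if (y1 != y2) return (y1 < y2) ? -1 : 1;
    if (x1 != x2) return (x1 < x2) ? -1 : 1;
    return 0;
}

/* Record that (x, y) lies on the given contour.
   Returns 1 on success (or if already present), 0 if out of memory. */
int point_insert(struct point_node **root, int x, int y, int contour)
{
    while (*root != NULL) {
        int c = key_cmp(x, y, (*root)->x, (*root)->y);
        if (c == 0)
            return 1;                  /* point already recorded */
        root = (c < 0) ? &(*root)->left : &(*root)->right;
    }
    *root = malloc(sizeof **root);
    if (*root == NULL)
        return 0;
    (*root)->x = x;
    (*root)->y = y;
    (*root)->contour = contour;
    (*root)->left = (*root)->right = NULL;
    return 1;
}

/* Return the contour index for (x, y), or -1 if the point is unknown. */
int point_lookup(const struct point_node *root, int x, int y)
{
    while (root != NULL) {
        int c = key_cmp(x, y, root->x, root->y);
        if (c == 0)
            return root->contour;
        root = (c < 0) ? root->left : root->right;
    }
    return -1;
}
```

Because contour points tend to arrive in traced order, an unbalanced tree like this one can become lopsided; a balanced tree or a hash table would be a reasonable substitute.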

How  the  Contour  Analysis  Works 

Two  problems  must  be  solved  in  order  to  analyze  a 
contour  map.  The  algorithm  must  find  all  the  contours, 
and  it  must  determine  their  nesting  relationships. 

The contour search begins by using a 2-D search of
the image; that is, starting at the top left and scanning
across each row until it finally reaches the bottom right.
Each time it finds a contour (a black pixel), it determines
if it is a contour that was previously “seen.” If not (that
is, if the program has found a new contour), it adds the
contour to its internal data structures. In either case the
search continues.

Unfortunately, this search pattern is too simple to
fully serve your needs. The problem is that it provides
no easy way to track how contours are nested. In order
to track the nesting relationships you must specify a
contour and have the contour-analysis algorithm find a
single level of contours within it, as shown in Figure 3,
below. The interiors of the contours within the region
should not be searched (yet). This procedure makes the
tracking of nesting easy: the enclosing contour is the
“parent” of the contours discovered within it. Assuming
you have this ability, you then use the top-left to
bottom-right search to find the lowest level of contours
in the image and use your new search algorithm to
search the interior of each of the newly discovered
contours (and then use it again to search the interior of
any contours within them, and so on . . .).

Figure 3: The search within contour 1 will find contours
2 and 3 but not 4. Contour 4 will be found by a search of
contour 3.

This  new  search  hinges  on  being  able  to  differentiate 
the  interior  or  exterior  of  an  enclosed  region.  This 
requires  a  simple  observation:  if  the  program  is  following 
a  contour  in  a  counterclockwise  direction,  and  it  is 
currently  following  a  downward  arc,  then  the  interior 
lies  to  the  right  of  the  current  position,  as  seen  in  Figure 
4,  below.  Alternatively,  if  the  program  is  following  a 
contour  in  a  clockwise  direction,  and  it  is  currently  on  a 
downward  arc,  then  the  exterior  lies  to  the  right  of  the 
current  point. 

Using  this  observation,  the  sequence  of  events  to 
search  the  interior  of  a  region  is  this:  Trace  around  the 
bounding  contour  in  a  counterclockwise  direction,  and 
at  each  point  on  a  downward  arc,  scan  to  the  right  (that 
is,  the  interior)  until  you  find  a  new  contour  or  find  the 
other  side  of  the  bounding  contour.  On  finding  a  new 
curve,  traverse  this  curve  in  a  clockwise  direction,  still 
scanning  to  the  right  on  a  downward  arc,  as  indicated 
in Figure 5, below. Because the traversal is now reversed,
this scan will now cover the exterior of the new contour,
that is, the interior of the original contour but on the
opposite side of the nested contour. Applied to the
bounding curve and all the interior curves, this procedure
scans the interior of the bounding contour but not
the  interior  of  any  contours  nested  within  the  bounding 
contour.  For  example,  in  Figure  3  a  search  of  contour  1 
done  in  this  way  will  encounter  contours  2  and  3  but 
not  contour  4.  Contour  4  will  be  discovered  only  when  a 
search  is  made  of  the  interior  of  contour  3.  In  fact,  the 
entire  search  is  reduced  to  this  process  by  treating  the 
border  of  the  image  itself  as  a  large  contour. 

Following the contour analysis, the elevation assignments
are made by stepping through an array of the
contours found and assigning an elevation equal to the
enclosing contour’s elevation plus the elevation increment.
This is where I use the assumption that all nested
regions are rising.
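
As a rough illustration of this pass, the following C sketch assigns elevations from the nesting information. It assumes that each array entry stores the index of its enclosing contour (with -1 standing for the map border) and that a contour always appears in the array after its parent, which holds if contours are appended in discovery order. The names contour_rec and assign_elevations are hypothetical.

```c
/* Hypothetical contour record; see the assumptions in the text above. */
struct contour_rec {
    int x, y;        /* one point on the contour */
    int parent;      /* index of the enclosing contour, -1 for the border */
    int elevation;   /* computed below */
};

/* Give each contour its parent's elevation plus a fixed increment.
   Contours directly inside the border start from the base elevation.
   Relies on every parent preceding its children in the array. */
void assign_elevations(struct contour_rec *c, int n, int base, int step)
{
    int i;

    for (i = 0; i < n; i++) {
        int outer = (c[i].parent < 0) ? base : c[c[i].parent].elevation;
        c[i].elevation = outer + step;
    }
}
```

With a base of 0 and an increment of 10, a contour nested two levels below the outermost one would receive an elevation of 30.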

Figure 4: The interior of the region is to the right when
traversing a descending arc.

Finally, an array of elevations is developed to be used
as input to the hidden-line algorithm. The array elements
are calculated by traversing the image along the
grid lines. When a traversal begins, it sets its current
elevation to the base elevation of the contour map, and
each time it reaches a grid intersection it assigns the
current elevation to the corresponding element of the
array. The traversal also looks for intersections with
contours on the map. These intersections signal a change
in the current elevation. When it encounters a contour,
the program first retrieves the elevation of the contour
from the contour array described earlier.

At  first  glance  it  may  seem  that  this  new  elevation 
should  become  the  current  elevation.  It’s  not  that  simple. 
If  the  traversal  is  going  into  a  nested  region,  then  the 
new  elevation  is  the  elevation  of  the  region  it  is  now 
entering  and  thus  should  become  the  current  elevation. 
If  the  traversal  is  leaving  a  nested  region,  however,  the 
new  elevation  is  the  elevation  of  the  region  just  left. 

For  example,  imagine  traversing  the  following  simple 
map:  the  base  of  the  map  has  a  zero  elevation,  and  there 
is  a  plateau  in  the  center  with  an  elevation  of  100  feet. 
When  the  traversal  begins,  it  uses  the  base  elevation  (0) 
as the current elevation. When it encounters the contour
around the plateau, because it is entering a nested
region,  it  assigns  100  to  the  current  elevation.  When  it 
reaches  the  other  side  of  the  plateau,  it  once  again 
meets  the  contour  with  an  elevation  of  100.  Because  you 
are  now  leaving  a  nested  region,  however,  this  signals 
that  the  elevation  is  no  longer  100  feet,  and,  in  this  case, 
it  should  now  be  the  base  elevation  of  0. 

There  is  a  data  structure  that  naturally  resolves  this 
problem:  the  stack.  Each  time  the  program  encounters  a 
contour,  it  compares  the  contour’s  elevation  to  the  top 
of  the  stack.  If  the  elevation  is  different,  it  pushes  the 
new  elevation.  If  the  new  elevation  is  the  same  as  the  top 
of  the  stack,  it  pops  the  stack.  In  either  case  the  current 
elevation  is  the  number  on  the  top  of  the  stack. 
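
The push-or-pop rule can be sketched in a few lines of C. This is a minimal illustration, not the published code: the names elev_stack, elev_start, and elev_cross are hypothetical, and the fixed depth limit is an assumption made to keep the example short.

```c
#define MAX_DEPTH 64   /* assumed limit on nesting depth */

struct elev_stack {
    int v[MAX_DEPTH];
    int n;             /* number of entries in use */
};

/* Begin a traversal: the base elevation sits at the bottom of the stack. */
void elev_start(struct elev_stack *s, int base)
{
    s->v[0] = base;
    s->n = 1;
}

/* Call at each contour crossing with that contour's elevation.
   Returns the current elevation after the crossing. */
int elev_cross(struct elev_stack *s, int contour_elev)
{
    if (s->v[s->n - 1] == contour_elev) {
        if (s->n > 1)
            s->n--;                       /* leaving a region: pop */
    } else if (s->n < MAX_DEPTH) {
        s->v[s->n++] = contour_elev;      /* entering a region: push */
    }
    return s->v[s->n - 1];
}
```

For the plateau example, elev_start with a base of 0 followed by two calls to elev_cross with 100 yields 100 and then 0.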

Using  the  Program 

Using the program is quite simple. The main program
sets up a contour array (it must be larger than the
expected number of contours) and a pointer to the tree
of points as global variables. It then calls find_all_contours(),
passing a pointer to the bit map to be analyzed.
On return it calls make_hidlinpix() to assign elevations,
which in turn calls make_grid() to calculate the grid
elevations. The current implementation converts the
grid to a disk file that is then used as input to a
hidden-line algorithm.

Figure 5: The trace for the external curve is counterclockwise;
for internal curves it is clockwise.

The previous quick overview of the contour-analysis
and grid-construction algorithms glosses over most of
the gory details, such as coordinating progress and
tracing curves. I’ll now describe these in more detail.

Find_all_contours()

Find_all_contours(), as its name implies, is responsible
for coordinating the entire contour search. The
progress of the contour-analysis algorithm is based on a
queue (a first-in/first-out data structure). The queue is
find_all_contours()’s means of communicating with
the lower-level function find_contours(). Find_all_contours()
calls a function to enqueue the points on a
contour of interest and then calls find_contours().
Find_contours() then dequeues all points, and those
on a descending arc are used as the starting points for
rightward scans.

The queue is built by tracing contours in a counterclockwise
direction and includes the location of each
point on the contour as well as a chain code for the
point. The chain code is an integer from 0 to 7 that
indicates the direction moved from the previous contour
point to the current contour point, as shown in
Figure 6, below. Thus a chain code of 0 indicates a
move to the right, 2 is up, 4 is left, and 6 is down.
Find_all_contours() initiates the
entire contour analysis by enqueuing the borders of the
image, that is, by treating the border of the image as a
contour itself. Because only the left-hand side of the
image will have chain codes in the range 5 to 7, only the
left side really needs to be enqueued. Find_all_contours()
then calls find_contours(), which in turn uses
the queue to do a single-level contour search. Any
contours found by find_contours() are placed in the
contours array.

Figure 7: Find_contours() searches the interior of a
given contour for nested contours. (Label in figure:
“Trace around the external contour.”)
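
The chain codes can be made concrete with a small lookup table. The sketch below is an illustration only: it assumes that the odd codes are the diagonals between their even neighbors (1 is up and to the right, 5 is down and to the left, and so on) and that y grows downward, as in a raster image. The names chain_dx, chain_dy, chain_code, and is_descending are hypothetical.

```c
/* Pixel offsets for chain codes 0..7. Code 0 is a move to the right and
   the codes increase counterclockwise; y grows downward (assumed). */
static const int chain_dx[8] = { 1,  1,  0, -1, -1, -1, 0, 1 };
static const int chain_dy[8] = { 0, -1, -1, -1,  0,  1, 1, 1 };

/* Return the chain code for a one-pixel move (dx, dy), or -1 if the
   move does not lead to one of the eight neighbors. */
int chain_code(int dx, int dy)
{
    int c;

    for (c = 0; c < 8; c++)
        if (chain_dx[c] == dx && chain_dy[c] == dy)
            return c;
    return -1;
}

/* A point is on a descending arc when the move that reached it went
   down and left, straight down, or down and right (codes 5 to 7). */
int is_descending(int code)
{
    return code >= 5 && code <= 7;
}
```

Walking down the left border of the image produces only code 6, which is consistent with the remark that only the left side needs to be enqueued.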

On return from find_contours(), there will usually be
some contours stored in the contour array. Find_all_contours()
then steps through each unsearched contour
in the array. At each iteration, find_all_contours()
calls a function to enqueue (by tracing) the points on the
contour and once again calls find_contours() to search
the interior of the new region. When the interiors of all
contours in the array have been scanned, the search is
complete.

Find_contours()

Find_contours() is responsible for finding a single level
of contours nested within a given region, which is that
region enqueued by find_all_contours() (see Figure 7,
below). Because the queue was created by tracing the
contour in a counterclockwise direction, the contour
interior will be found by scanning to the right of points
whose chain codes are in the range 5 to 7. For each
chain code in this range, the function search_x() is
used to perform the actual scan of the bit map.

The scan can return either of two results: it can
encounter a known contour (which can be the other
side of the enclosing contour, the edge of the bit map,
or an internal but previously discovered contour), or it
can find a new contour. If it finds a new contour, then
find_contours() calls trace() to traverse the new contour
in a clockwise direction. Once again the trace
enqueues the points on the contour. When
find_contours() reaches these points in the queue, it
will result in the scan starting on the right side of the
new contour and still moving right. The result is that the
interior of the enclosing contour is scanned, except for
the interiors of any nested contours.

There are two problems with this general procedure:
tabletops and peaks (see Figure 8, below, for examples).
Tabletops are horizontal runs to the right. The problem
is that the point on the downward arc will start the
scan. The scan will immediately encounter a point on
the contour (that is, part of the tabletop) and stop,
thinking (incorrectly) that it has reached the other side.
The solution to this problem is to look at the next chain
code as well as the current one. If the current chain
code was 0 and the next chain code is 5-7, then the
program has just reached the end of a tabletop and it
should start to do a scan.

Figure 8: Examples of a tabletop and a peak

A  peak  is  a  single  pixel  at  the  top  of  a  contour.  Peaks 
present  another  problem:  they  are  never  on  a  downward 
arc  and  will  never  initiate  a  rightward  scan.  A  peak  can 
result  in  at  most  a  single  scan  line  being  skipped.  Thus, 
in  the  worst  case,  the  only  contour  that  can  be  missed 
is  a  single  line  of  pixels.  Because  this  is  a  degenerate 
case,  I  did  not  feel  it  needed  to  be  resolved. 
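
One reading of the scan-start rule, including the tabletop case, is sketched below in C. The function name starts_scan is hypothetical, and the sketch assumes that each queued point is examined together with the chain code of the point that follows it. Peaks are left unhandled, as in the text.

```c
/* Decide whether a queued contour point should start a rightward scan.
   cur  = chain code of the move that reached this point
   next = chain code of the move that leaves it */
int starts_scan(int cur, int next)
{
    if (cur >= 5 && cur <= 7)
        return 1;      /* point lies on a descending arc */
    if (cur == 0 && next >= 5 && next <= 7)
        return 1;      /* right-hand end of a tabletop */
    return 0;
}
```

A scan started at the left end of a tabletop stops at once on the tabletop's own pixels, so the second test is what lets the scan resume from the far end of the run.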

Search_x()

Search_x() scans to the right of a given point until it
finds a black pixel or reaches the edge of the bit map, as
shown in Figure 9, below. It then returns a value of 1 if
the pixel is on a new contour or 2 if it is either on a
known contour or at the edge of the bit map. A pixel has
been seen before if it can be found in a tree containing
the pixels lying on contours.
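
A simplified version of such a scan is sketched below. To stay self-contained it makes two assumptions that differ from the description above: the image is stored one byte per pixel rather than one bit, and a parallel seen map stands in for the tree of contour points. The names image, scan_right, NEW_CONTOUR, and KNOWN_OR_EDGE are hypothetical.

```c
#define NEW_CONTOUR   1
#define KNOWN_OR_EDGE 2

/* Hypothetical image: one byte per pixel for clarity (nonzero = black). */
struct image {
    const unsigned char *pix;    /* width * height bytes, row by row */
    const unsigned char *seen;   /* nonzero where a pixel is already recorded */
    int width, height;
};

/* Scan right from (x, y), not counting the starting pixel itself.
   Stores the column where the scan stopped in *hit_x and returns
   NEW_CONTOUR or KNOWN_OR_EDGE. */
int scan_right(const struct image *im, int x, int y, int *hit_x)
{
    for (x = x + 1; x < im->width; x++) {
        if (im->pix[y * im->width + x]) {
            *hit_x = x;
            return im->seen[y * im->width + x] ? KNOWN_OR_EDGE
                                               : NEW_CONTOUR;
        }
    }
    *hit_x = im->width;          /* ran off the right edge */
    return KNOWN_OR_EDGE;
}
```

The caller would hand a NEW_CONTOUR hit to the tracing routine and simply move on after a KNOWN_OR_EDGE result.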

Figure 9: Search_x() starts at a given pixel and searches
to the right until it hits a contour or the edge of the
image.

Figure 10: The search direction and sequence of pixels
searched during a contour trace ensure that the trace
always follows the outermost border of the contour.

The interrogation of individual pixels on an off-screen,
bit-map image can be done in several ways. The Macintosh
Toolbox contains the functions GetPixel() and
GetCPixel() (in the Macintosh II’s Color QuickDraw).
However, I have used an assembly-language function,
called getpixel(), that accomplishes the same task by
accessing the bit map directly. This getpixel() is quite
fast, but it lacks the flexibility and generality of its
Toolbox counterparts. Any machine that supports bit-mapped
graphics will have counterparts to getpixel() or
GetPixel() (the Microsoft C, Version 5.0, library now has
a _getpixel() function).
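
For readers without an assembler at hand, direct bit-map access can also be written in portable C. The sketch below is not the assembly-language getpixel() mentioned above; it assumes one bit per pixel, rows padded to whole bytes, and the leftmost pixel stored in the most significant bit of each byte. The names bitmap and bit_at are hypothetical.

```c
#include <stddef.h>

/* A 1-bit-per-pixel image (layout assumed as described in the text). */
struct bitmap {
    const unsigned char *bits;   /* packed pixel data */
    int width, height;           /* size in pixels */
    int row_bytes;               /* bytes per row, including padding */
};

/* Return 1 for a black (set) pixel, 0 for white or out of bounds. */
int bit_at(const struct bitmap *bm, int x, int y)
{
    size_t offset;

    if (x < 0 || y < 0 || x >= bm->width || y >= bm->height)
        return 0;
    offset = (size_t)y * (size_t)bm->row_bytes + (size_t)(x >> 3);
    return (bm->bits[offset] >> (7 - (x & 7))) & 1;
}
```

Treating out-of-bounds coordinates as white keeps callers from having to range-check every neighbor they probe.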

Trace()

Trace() is the real workhorse of the contour-analysis
algorithm. It traces a given curve in a clockwise or
counterclockwise direction, and it simultaneously updates
both the queue and a tree of points lying on the
contour.

In order to follow a contour, trace() uses the chain
codes described earlier. An example can help explain
the process. Assume that the program will traverse a
contour counterclockwise. The calling function passes a
point on the exterior of the contour to trace(). First, the
trace sets the initial search direction to 6 (down). It then
looks at pixels with chain codes 5, 6, and 7 from the
starting point, that is, “search direction - 1,” “search
direction,” “search direction + 1,” all modulo 8 (see Figure
10, below). Note that the search starts as far toward the
exterior as possible (a pixel at chain code 4 would
mean that the starting pixel is not on the exterior).

The first hit then becomes the new search direction. If there is
no hit at all, trace() adds 2 to the search direction (modulo 8)
and tries again. This process continues for three iterations at
most, to prevent an infinite loop if the contour consists of a
single point.
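The search sequence just described can be sketched in C. This is an illustration rather than the article's listing: the offset tables assume that chain code 0 points right and that codes increase counterclockwise with y growing downward (so that 6 is "down"), and pixel_set() and next_direction() are hypothetical names. Whether "three iterations" counts the first look is also an assumption, so the limit is passed in as a parameter.

```c
#include <stddef.h>

/* Offsets for chain codes 0..7. ASSUMPTION: code 0 points right and the
   codes increase counterclockwise, with y growing downward (6 = down). */
static const int CHAIN_DX[8] = { 1,  1,  0, -1, -1, -1, 0, 1 };
static const int CHAIN_DY[8] = { 0, -1, -1, -1,  0,  1, 1, 1 };

/* Hypothetical pixel test: nonzero if (x, y) lies inside the image
   and the pixel there is set. */
static int pixel_set(const unsigned char *img, int w, int h, int x, int y)
{
    if (x < 0 || y < 0 || x >= w || y >= h)
        return 0;
    return img[(size_t)y * (size_t)w + (size_t)x] != 0;
}

/* Look at dir-1, dir, and dir+1 (all modulo 8). On a miss, add 2 to
   dir and try again. Returns the chain code of the first hit, or -1
   after max_tries unsuccessful rounds (for example, a one-pixel contour). */
int next_direction(const unsigned char *img, int w, int h,
                   int x, int y, int dir, int max_tries)
{
    int attempt, k;

    for (attempt = 0; attempt < max_tries; attempt++) {
        for (k = -1; k <= 1; k++) {
            int d = (dir + k + 8) % 8;
            if (pixel_set(img, w, h, x + CHAIN_DX[d], y + CHAIN_DY[d]))
                return d;
        }
        dir = (dir + 2) % 8;
    }
    return -1;
}
```

Called with dir = 6 and max_tries = 3, the sketch examines codes 5, 6, 7, then 7, 0, 1, then 1, 2, 3, and never looks at code 4.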

Note that if there is a break in the contour, this algorithm will
circle around the end point and return to the starting point. Also
note that trace() always proceeds around the exterior of a
contour. If the contour is more than one pixel thick, trace() will
ignore the interior pixels. These points will be found later while
scanning the interior of the region.

Building  the  3-D  Image 

As indicated in the overview, the contour analysis itself is only
the first step in producing a projection of the contour map. Once
the analysis is complete, it must be converted into a form that a
hidden-line algorithm can use as input. For purposes of
demonstrating the contour analysis, I have used an algorithm
published by L. Ammeraal in Programming Principles in Computer
Graphics. This requires two further steps after the analysis
itself.

First, the data from the analysis must be used to construct a 2-D
array, with the value of each element of the array being the
elevation at that point. This is done by the function make_grid(),
as described in the overview.

Next, the array itself must be converted into a file that can be
used as input to the hidden-line algorithm. This is done by the
function make_hidlinpix(). A small section of this output file is
shown in Table 1, page 28. It starts with the coordinates of the
midpoint of the object to be drawn. Then follows a section that
lists all the points in the array and their spatial coordinates
(x, y, z). Thus point 1 is at spatial coordinates (0.0, 0.0, 0.0),
33 is at (0.0, 1.0, 0.0), and so on.

The key to the hidden-line algorithm is the following section,
beginning with the word Faces:. This is a decomposition of the
grid into triangular faces. Triangles are used because all points
on a triangle in 3-D space are located on the same plane (see
Ammeraal's book for further explanation). Each face is defined by
its three corner points listed in counterclockwise order: for
example, the file shown here indicates that points 1, 34, and 2
form a triangular face. The points 1, 2, 33, and 34 form the
corners of a rectangle. The program has broken this rectangle into
two triangular faces: 1, 34, and 2; and 1, 33, and 34. The minus
sign for point 34 indicates that the line segment from point 1 to
point 34 should not actually be drawn, in this case because it
forms a diagonal across a rectangle.

In fact, the grid faces are actually defined twice in the file:
once as the top side (the normal viewpoint) and once as the under
side; that's the reason for the apparent redundancy in the
description of the faces. This technique allows you to view the
grid both from above and below (underground if this were a
topographic map!). This file is then read and processed by the
program hidlinpix. The result is shown in Figure 2.
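The face pattern in the Table 1 sample can be reproduced with a few lines of C. The sketch below is inferred only from that sample, not taken from make_hidlinpix(): the row width of 32 points is an assumption (point 33 sits directly above point 1), and point_number() and cell_faces() are hypothetical names.

```c
#define GRID_W 32   /* ASSUMPTION: 32 points per row, inferred from Table 1 */

/* Point number of grid position (x, y); numbering starts at 1. */
int point_number(int x, int y)
{
    return y * GRID_W + x + 1;
}

/* Fill faces[4][3] with the four triangles for the grid cell whose
   lowest-numbered corner is (x, y): each half of the rectangle is
   listed once for the top side and once for the under side.
   A negative entry marks the end of an edge that is not drawn
   (the diagonal of the rectangle). */
void cell_faces(int x, int y, int faces[4][3])
{
    int a = point_number(x,     y);      /* (x,   y)   */
    int b = point_number(x + 1, y);      /* (x+1, y)   */
    int c = point_number(x,     y + 1);  /* (x,   y+1) */
    int d = point_number(x + 1, y + 1);  /* (x+1, y+1) */

    faces[0][0] = a; faces[0][1] = -d; faces[0][2] = b;
    faces[1][0] = b; faces[1][1] = d;  faces[1][2] = -a;
    faces[2][0] = a; faces[2][1] = -d; faces[2][2] = c;
    faces[3][0] = c; faces[3][1] = d;  faces[3][2] = -a;
}
```

For the cell at (0, 0) this yields 1 -34 2, 2 34 -1, 1 -34 33, and 33 34 -1, which are the first four face lines of the sample; the cell at (0, 1) begins with 33 -66 34 and 34 66 -33.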

Portability  and  Optimization 

The  contour-analysis  algorithm  was  written  and  tested 
on  a  512K  Macintosh  and  on  a  Macintosh  II  using 
Lightspeed  C.  In  order  to  simplify  the  development  and 
testing  process,  I  modified  a  copy  of  the  Lightspeed 
stdio  library  so  that  the  console  window  opens  to  cover 
the  lower  quarter  of  the  screen  (instead  of  the  entire 
desktop,  as  it  normally  does).  The  program  then  opens 
two  windows:  a  window  that  shows  the  source  image 
and  a  window  that  displays  the  progress  of  the  contour 
analysis  (a  sort  of  debugging  animation).  The  algorithms 
themselves  are  intended  to  be  portable,  but  I  have  not 
actually  tested  the  code  in  other  environments,  so  there 
could  be  surprises  in  store. 


I have used the Macintosh's graphics abilities to animate the
code, so if you find the written explanation of the program
complex, and if you are able to view the animation, you may find
that it makes visual sense. There is some redundancy because of
the need to trace each contour twice (once when first discovered,
once again to search its interior), but the vast majority of
points in the image are visited only once.

    15.500000  12.500000  0.000000
      1   0.000000  0.000000  0.000000
     33   0.000000  1.000000  0.000000
     65   0.000000  2.000000  0.000000
     97   0.000000  3.000000  0.000000
    129   0.000000  4.000000  0.000000
    161   0.000000  5.000000  0.000000
    193   0.000000  6.000000  0.000000
    Faces:
      1  -34    2#
      2   34   -1#
      1  -34   33#
     33   34   -1#
     33  -66   34#
     34   66  -33#

Table 1: A sample of the output file from the contour analysis

The code shown here uses one important optimization: the avoidance
of the memory-management library supplied with Lightspeed C (that
is, it avoids the use of malloc(), free(), and so on). Instead, I
have used a nonstandard function called getmem(). The rationale
for this action is explained in detail in the source code to
getmem() itself. I was reluctant to include getmem() with this
article (it lacks its counterparts, such as freemem()), but the
speed improvement is quite dramatic; without getmem() the program
is slow, even on the Macintosh II.

Miscellaneous  Functions 

The  contour  analysis  depends  on  several  different  data 
structures.  I  have  avoided  reinventing  the  wheel  by 
using  previously  published  code,  particularly  from  some 
of  Allen  Holub’s  columns  in  DDJ.  These  borrowings 
include  the  queue  and  AVL  tree  code  published  in  June 
1985  and  August  1986,  respectively. 

Conclusion 

For the eye and brain, identifying objects and relationships in
the visual world seems effortless. The contour map analysis
algorithm is an example of the complexity of this process when
performed by machine, even when restricted to a simple class of
images. The algorithm vividly demonstrates the wide gulf between
human cognition and an algorithmic procedure. In spite of the
constraints placed on it, though, the machine is able to provide
services that are impossible for the eye/brain; specifically,
restoring a 3-D image from a 2-D representation.

Availability 

All  the  source  code  for  articles  in  this  issue  is  available 
on a single disk. To order, send $14.95 to Dr. Dobb's
Journal,  501  Galveston  Dr.,  Redwood  City,  CA  94063,  or 
call  (415)  366-3600,  ext.  216.  Please  specify  the  issue 
number  and  format  (MS-DOS,  Macintosh,  Kaypro). 

Bibliography 

Ammeraal, L. Programming Principles in Computer Graphics.
Chichester, England: Wiley, 1986.

Apple Computer. Macintosh Technical Support. Macintosh Tech
Notes. Cupertino, Calif.: Apple Computer, 1983-1987.

Holub, Allen. Various C Chest columns. DDJ. 1986-1987.

Pavlidis, Theo. Algorithms for Graphics and Image Processing.
Rockville, Md.: Computer Science Press, 1982.

DDJ 

(Listings  begin  on  page  66.) 





Dr.  Dobbs  Journal,  November  1987 


ARTICLES 


A  Graphics  Toolbox 
for Turbo C

by  Kent  Porter 


That Borland's Turbo C comes equipped with 350 functions and
macros is pretty impressive, until you realize that not one of
them has anything to do with graphics. To fill this void I've
constructed my own version of a graphics toolkit for Turbo C.

To make the information more easily digestible, I've divided the
project into two bite-size articles. This piece introduces a basic
library of graphics calls and shows how to put them to work in
pixel-oriented graphics; Part 2 (to appear in December's DDJ) will
combine this graphics library with data structures and C functions
to control pop-down menus, dialog boxes, and other appearance
features that users have come to expect in contemporary software.

The first article in a two-part series on designing a library for
PC graphics

Kent Porter, 1909-4 Montecito Rd., Mountain View, CA 94043. Kent
has written 17 books about programming and hundreds of magazine
articles on computer hardware and software. He is a technical
editor for DDJ.

The MS-DOS Underpinning

IBM PCs, or equivalents, rely on ROM BIOS interrupt 10h (16
decimal) for their graphics capabilities. This interrupt furnishes
"generalized" functions for various aspects of text and APA
(all-points addressable, or pixel-oriented) graphics. I say
"generalized" because the ROM BIOS accommodates numerous modes,
some of which may not be available on certain hardware
configurations. For example, the ROM BIOS handles video modes for
the EGA monitor and the PCjr that don't exist when a CGA or
monochrome adaptor is present in the system.

There are 16 core functions in the ROM BIOS video services. These
functions govern display aspects such as the position or shape of
the cursor, the video mode (text and APA options), the active
display page, character output, pixel graphics, and others.

The ROM BIOS functions are not known for their speed. In general,
they're adequate in text operations but less than adequate in APA
graphics. You could devise much faster pixel manipulations by
writing to the display memory directly, and indeed, many
commercial packages do just that. The speed advantages of this
approach, however, are obtained with a high risk of
incompatibility: environments such as Windows and OS/2 are much
more intolerant of direct display memory access.

In selecting the ROM BIOS trade-off rather than directly writing
to screen memory, I've deliberately opted for the more
conservative approach on two grounds: first, the ROM BIOS calls
are the approved method for doing graphics and future machines
will probably support them within the definition of
"well-behaved"; and second, faster processors will cancel out
their slower speed, resulting in a zero loss/gain from the user's
perspective.

While this is an arguable position, and no doubt some readers will
dispute it, at least you know my underlying assumptions. Now let's
see how to access those ROM BIOS functions.

ROM BIOS Calls from Turbo C

Like DOS itself, the ROM BIOS routines under interrupt 10h expect
a function code in register AH, with other parameters passed as
required by specific functions in the rest of the registers. Any
number of Microsoft, IBM, and commercial publications document the
ROM BIOS calls; for this project I referred to Ray Duncan's
indispensable Advanced MS-DOS (pages 399-420).

ROM BIOS calls are made using the Turbo C int86() function.
Registers are set up in a structured variable bound to the REGS
union as defined in Turbo C's DOS.H header file.

The REGS union includes all the general registers of the 80x86
line of processors and furnishes a notation convention for byte
(8-bit) and word (16-bit pair) registers. For example, if you
declare the structured variable as union REGS r; you can load the
value 0Ch into byte register AH with r.h.ah = 0x0C; and the same
value into word register BX with r.x.bx = 0x0C;. The structure
notation .h. indicates a byte register, and notation .x. indicates
a word register. The REGS union encompasses all byte registers
(AH-DL) and all word registers (AX-DX) as well as SI, DI, and the
flags. It does not include the segment registers (CS, DS, ES, and
SS), which are irrelevant to ROM BIOS calls.
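The two notations are simply two views of the same storage, which a small sketch can make concrete. Under Turbo C the real union comes from DOS.H; the stand-in declarations in the #else branch are an assumption that mirrors that layout so the example also compiles elsewhere (on a little-endian machine), and load_ax() is a hypothetical helper, not a library function.

```c
#ifdef __TURBOC__
#include <dos.h>    /* the real union REGS */
#else
/* Stand-in declarations (an ASSUMPTION mirroring the DOS.H layout) so
   that the sketch compiles outside Turbo C on a little-endian machine. */
struct WORDREGS { unsigned short ax, bx, cx, dx, si, di, cflag, flags; };
struct BYTEREGS { unsigned char al, ah, bl, bh, cl, ch, dl, dh; };
union REGS { struct WORDREGS x; struct BYTEREGS h; };
#endif

/* Hypothetical helper: load AH and AL through the byte view (.h.) and
   read the combined word back through the word view (.x.). */
unsigned load_ax(unsigned char hi, unsigned char lo)
{
    union REGS r;

    r.x.ax = 0;     /* word register AX                 */
    r.h.ah = hi;    /* high byte of AX via byte view    */
    r.h.al = lo;    /* low byte of AX via byte view     */
    return r.x.ax;  /* both bytes seen as one word      */
}
```

Loading 0Ch into AH and 0 into AL in this way leaves 0C00h in AX, which is why a function code placed in r.h.ah does not disturb a parameter already placed in r.h.al.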

To call a ROM BIOS video service routine, you place the function
code in r.h.ah and the required parameters in other registers and
execute the Turbo C int86() function int86(0x10, &r, &r);.

Function int86() performs the following:

• saves the caller's registers
• loads the CPU registers from the union whose address is passed
  in the second argument
• executes the interrupt given in the first argument
• on return from the ROM BIOS, places register contents in the
  union passed as the third argument, that is, the second address
  argument (the same union as the first address argument in this
  example)
• restores the caller's registers
• resumes execution in the calling program

On resumption you can pluck any BIOS-returned value from the
variable bound to the REGS union (r in the examples shown here) by
using the appropriate register-type notation.

For example, suppose you want to determine the current cursor
position in video page 0. The setup is:

    r.h.ah = 0x03;          /* read position */
    r.h.bh = 0;             /* in page 0 */
    int86 (0x10, &r, &r);   /* call BIOS */

Afterward you can fetch the row with cursRow = r.h.dh; and the
column with cursCol = r.h.dl;. This is the same as, but easier
than, performing an equivalent call in assembly language. An added
benefit is that the structured variable bound to the REGS union
retains the returned register values for as long as it exists or
until the next call to the ROM BIOS routines. Thus, you can fetch
the returned register values at your convenience. You can
translate these ROM BIOS calls into C functions.
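As a sketch of what such a translation can look like, the cursor query above can be wrapped in a function. The name cursor_position() is hypothetical and need not match anything in Listing One. Under Turbo C the code uses the real int86() from DOS.H; the #else branch is a stand-in, written only so the example can be compiled and exercised away from DOS, and its fake int86() answers function 03h with a fixed, made-up position.

```c
#include <string.h>

#ifdef __TURBOC__
#include <dos.h>    /* real union REGS and int86() */
#else
/* Stand-ins for non-DOS compilers (ASSUMPTIONS, not the real BIOS). */
struct WORDREGS { unsigned short ax, bx, cx, dx, si, di, cflag, flags; };
struct BYTEREGS { unsigned char al, ah, bl, bh, cl, ch, dl, dh; };
union REGS { struct WORDREGS x; struct BYTEREGS h; };

static int int86(int intno, union REGS *in, union REGS *out)
{
    if (out != in)
        *out = *in;
    if (intno == 0x10 && in->h.ah == 0x03) {
        out->h.dh = 12;   /* pretend the cursor is on row 12   */
        out->h.dl = 40;   /* pretend the cursor is in column 40 */
    }
    return out->x.ax;
}
#endif

/* Hypothetical wrapper: read the cursor position of a video page
   through ROM BIOS interrupt 10h, function 03h. */
void cursor_position(int page, int *row, int *col)
{
    union REGS r;

    memset(&r, 0, sizeof r);       /* start from known register values */
    r.h.ah = 0x03;                 /* read position */
    r.h.bh = (unsigned char)page;  /* in the requested page */
    int86(0x10, &r, &r);           /* call BIOS */
    *row = r.h.dh;
    *col = r.h.dl;
}
```

A caller then writes cursor_position(0, &cursRow, &cursCol); and never touches the registers directly.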

Syntactic  Sugar 

Many  of  Turbo  C’s  350  functions 
and  macros  are  simply  DOS  and 
ROM  BIOS  calls  sugar-coated  to  look 
like  C.  In  the  same  spirit,  you  can 
pack  a  little  sugar  around  the  ROM 
BIOS  video  service  routines  and  call 
them  "a  basic  library  of  Turbo  C 
graphics  functions.” 


Listing One, page 82, shows an #include file, VIDEO.I, which
contains the fundamental 16 ROM BIOS calls plus a couple of
extended functions that synthesize several calls. Any program that
has to perform graphics operations can obtain the full set of
functions with the simple statement #include <video.i>, calling
whichever specific ones it needs. Because the source code is
included in the compilation, the functions will automatically
adapt to the memory model currently in use. Also note that every
function is defined before being referenced by other functions,
thus eliminating any need for a header file containing prototypes
(if you program using ANSI C conventions).

The down side of using source-level #include files such as VIDEO.I
is that they waste memory. Turbo C doesn't optimize itself by
pulling out unreferenced code. As a result, all the code for all
the functions appears in every .EXE program that #includes
VIDEO.I, even if the program calls only a single function. My
solution to this is to make the functions into a linkable library
(.LIB file).

Creating  the  Libraries 

Building and linking with user-created libraries in Turbo C can be
a bit of a chore. In the small model, the inclusion of VIDEO.I
adds 1,376 bytes to the .EXE file; in the large model, 1,520
bytes. The overhead consists of total bytes minus the sizes of the
functions you actually call. If that amount of code space is
important to you, it might be worth turning VIDEO.I into one or
more libraries so that only those routines that are actually used
get linked into the .EXE file.

The first step is to write a file VIDEO.H that defines all the
graphics function prototypes. Listing Two, page 84, shows this
header file, which you should #include in any programs that link
with the video library. You'll also use VIDEO.H in creating the
library's individual members. Each function file should #include
VIDEO.H at the top. Purists might also want to add comments
explaining what the function does and to what library it belongs.

When all the functions have been broken out into separate files,
compile them one by one with Turbo C to create a matching set of
.OBJ files. Now you're ready to make the library.

The .OBJ files produced by Turbo C are in Microsoft-compatible
format, so you can use the DOS LIB utility to combine them into a
linkable library. To produce a library called VIDEOS.LIB
containing the functions videomode() and activepage(), for
example, type:

    LIB VIDEOS + VIDEOMODE + ACTIVEPAGE;

Later you can add setmode() and setcursor() with the similar:

    LIB VIDEOS + SETMODE + SETCURSOR;

You can, of course, add more than two modules at a time to the
library simply by specifying the library name first, followed by
an entire command line of module names separated by plus signs.

The  S  on  the  end  of  the  library 
name  is  a  convention  borrowed 
from  Turbo  C  to  indicate  the 
memory  model;  here,  it  is  assumed 
that  you  compiled  the  separate 
units  in  the  small  model. 

To  link  with  this  library,  first  make 
sure  it’s  in  the  directory  where 
Turbo  C  looks  for  source  files  (not 
in  the  \TURBOC\LIB  directory  with 
the  vendor’s  libraries).  Put  the  entry 
VIDEOS.LIB  in  your  project  file  (.PRJ). 
Turbo  C  then  knows  that  you  want 
to  link  to  a  user-created  library  and 
where  to  find  it. 

Repeat the compilation and LIB procedures for each memory model
you use, varying the library file name accordingly (VIDEOT.LIB,
VIDEOC.LIB, and so on). You'll have to modify the library name in
your .PRJ file if you decide to change memory models during the
development of a program.

Because it takes an hour or two to get through this whole business
of making libraries, weigh the price in time investment vs. saving
a thousand bytes or so in the .EXE file. It's your choice.

Adapting  to  the  Adaptor 

The box accompanying this article (page 34) describes how to
determine which video adaptor is present in the machine. Refer to
it for the "theory"; this article concentrates on the practical
aspects of applying the adaptor identification to APA graphics,
specifically on the CGA and EGA. The principles discussed here are
applicable to other adaptors, such as the Hercules, as well.

In text modes there's little adapting to do for a specific display
type. The screen is always 25 rows high and either 40 or 80
columns wide. The 40-column mode only applies when writing text on
a 320 x 200-pixel graphics screen; otherwise you're always in
80-column mode. The monochrome display adaptor (MDA) can't produce
colors, but it can do normal, intense, underlined, and blinking
(attributes remarkably similar to colors). Thus, although the demo
program later in this article does a few text tricks using the
video library, I'm not going to discuss them here. Because text
graphics has its own set of problems and solutions, I'll save that
discussion for Part 2.


When your software knows the video adaptor it's working with, it
can alter its behavior to suit. After identifying the video board,
the software can then adjust its coordinate system or color
selections to fit the APA display. Here, in the interest of
limited space, I'll show you how to make a program dynamically
modify its coordinate system.

Suppose, for example, you want to draw a border around the screen,
starting 15 scan lines (y points) below the top and ending 15
above the bottom. The width in either case is 640 pixels and the
top of the rectangle is at y=15. However, the CGA has a mere 200
vertical scan lines, whereas the EGA has 350. Therefore, you can
set up an assignment condition such as:

    if (adaptor == cga)
        bottom = 199 - 15;
    else
        bottom = 349 - 15;

The trick here is to locate the bottom of the box, which is
determined by the adaptor type. The vertical line-drawing routine
can then repeatedly call the plot() function from the video
library, with the number of calls (pixels plotted) being
determined at run time by the type of adaptor available.
Similarly, the horizontal line-drawing routine can draw the bottom
at the appropriate elevation (y = 184 or y = 334, for CGA and EGA,
respectively).

This is a somewhat simplistic scenario implemented in the demo
program; a more complex application might consider CGA the default
and supply a multiplier of 1.0 in calculating y coordinates. If an
EGA is present, however, the program can substitute 1.75 for the y
multiplier (because 350/200 = 1.75). Furthermore, if four-color
graphics are required, it can use 1.0 as the x multiplier for the
default (mode 04h = CGA 320 x 200 four color) and substitute 2.0
if an EGA is present (mode 10h = EGA 640 x 350 four color). Thus
the program dynamically redefines its coordinate workspace in both
dimensions based on the available video adaptor.
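A minimal sketch of the multiplier idea follows. The enum and the function names are hypothetical, and the x multiplier assumes the four-color case described above, where the default screen is 320 pixels wide and the EGA screen 640.

```c
/* Hypothetical adaptor codes; the demo program's own names may differ. */
enum adaptor_type { ADAPTOR_CGA, ADAPTOR_EGA };

/* Map a y coordinate laid out for the 200-line default screen onto
   the adaptor actually present (1.0 for CGA, 350/200 = 1.75 for EGA). */
int scale_y(int y, enum adaptor_type adaptor)
{
    double multiplier = (adaptor == ADAPTOR_EGA) ? 350.0 / 200.0 : 1.0;
    return (int)(y * multiplier);   /* fraction is truncated */
}

/* Map an x coordinate laid out for the 320-pixel four-color default
   (mode 04h) onto the adaptor present (2.0 for EGA mode 10h). */
int scale_x(int x, enum adaptor_type adaptor)
{
    double multiplier = (adaptor == ADAPTOR_EGA) ? 640.0 / 320.0 : 1.0;
    return (int)(x * multiplier);
}
```

Because the result is truncated, the last default scan line, y = 199, lands on EGA line 348 rather than 349; a program that must reach the very bottom of the screen would still special-case that edge, as the border example does.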

A program that does extensive APA graphics will rely on the video
library's plot() function, which plots individual pixels, but it
could clearly benefit from some higher-level drawing routines.

Adding  APA  Graphics 

Ultimately, all objects appearing on the display are composed of
three kinds of lines: vertical, horizontal, and diagonal. Even a
solid such as a block cursor or a complex filled polygon consists
of some number of contiguous lines arranged so as to approximate a
form that the eye recognizes. This suggests enhancing the video
library with line-drawing routines that you can employ to create
shapes.

There's less overhead in drawing orthogonal lines (vertical or
horizontal) than in drawing diagonals. Why? Because diagonals
proceed along a slope, or ratio between vertical and horizontal
motion. Slopes are, almost without exception, fractional numbers
that entail floating-point operations, which are inherently slower
in computers. Slope is not an issue when drawing a line that is
perfectly vertical or horizontal, and thus it can be done with
more efficient integer arithmetic.

For efficiency, I developed orthogonal line routines and a
separate, less efficient routine for diagonals. The latter is
called if I know or suspect that the line is not orthogonal.

In the orthogonal routines, you pass the start and end points, the
axis coordinate along which the line is to proceed, and the color
of the pixel you want plotted. The routine then uses integer
arithmetic to control a loop that plots the individual points.
Listing Three, page 84, supplies two such routines, called hdraw()
and vdraw(), for horizontal and vertical lines, respectively.

Because callers will probably use these routines frequently,
passing variables as arguments, you should not place on them the
burden of ordering the start and end points to suit the
expectations of the routines. Consequently these routines sort the
start and end into the low-to-high order they expect. The calls
hdraw (35, 231, 20, 1); and hdraw (231, 35, 20, 1); both produce
exactly the same result: namely, a horizontal line between x
coordinates 35 and 231 inclusive along y coordinate 20 in color
index 1.
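The endpoint sorting can be sketched as follows. This is not Listing Three: the plot() here is a stand-in that writes into a memory array instead of calling the ROM BIOS, its argument order is an assumption, and only the hdraw() and vdraw() argument order (start, end, axis coordinate, color) is taken from the calls quoted above.

```c
#define SCREEN_W 640
#define SCREEN_H 350

/* Stand-in for the display: one byte per pixel, indexed [y][x]. */
static unsigned char screen[SCREEN_H][SCREEN_W];

/* Stand-in for the library's plot(); the argument order is an
   ASSUMPTION. Points off the screen are ignored. */
static void plot(int x, int y, int color)
{
    if (x >= 0 && x < SCREEN_W && y >= 0 && y < SCREEN_H)
        screen[y][x] = (unsigned char)color;
}

/* Horizontal line from x1 to x2 inclusive along row y. */
void hdraw(int x1, int x2, int y, int color)
{
    int x, tmp;

    if (x1 > x2) {              /* sort into low-to-high order */
        tmp = x1; x1 = x2; x2 = tmp;
    }
    for (x = x1; x <= x2; x++)  /* integer arithmetic only */
        plot(x, y, color);
}

/* Vertical line from y1 to y2 inclusive along column x. */
void vdraw(int y1, int y2, int x, int color)
{
    int y, tmp;

    if (y1 > y2) {
        tmp = y1; y1 = y2; y2 = tmp;
    }
    for (y = y1; y <= y2; y++)
        plot(x, y, color);
}
```

Sorting inside the routine costs one comparison per call, which is negligible next to the per-pixel cost of plotting.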

In considering the diagonal routine draw(), it's important to
recognize several factors:

• The requested line might be orthogonal, resulting in a slope
  that is either 0 (horizontal) or infinity (vertical). To avoid
  division-by-zero failures, the routine must anticipate these
  problems in advance and take corrective action by preassigning
  the correct stepping value.

• The diagonal must move along the axis that results in the
  densest line, or in other words, along the axis that has the
  greatest distance to travel. If the motion is greater along the
  x axis than along the y, the more dense line results from
  plotting each y intersection per x increment. Thus the routine
  must determine the axis with the greatest movement and make that
  the controlling axis.

• The stepping value along the shorter-motion axis is a
  floating-point number that is the ratio of short-motion to
  long-motion. For example, if the x axis moves +120 points and
  the y axis -30, the x axis is controlling and the y moves -0.25
  points per x (-30/120 = -0.25), which is its slope relative to
  the controlling axis.

• The stepping value is additive. That is, for each controlling
  increment, the stepping value is added to the current controlled
  value. This motion accumulates and is truncated or rounded for
  each controlling step, according to the way the routine is
  written (Listing Three truncates it).
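The four factors above can be combined into a short sketch. It is one possible arrangement, not the draw() of Listing Three: here both step values are obtained by dividing by the longer of the two distances, so the only division-by-zero case left is a single point, and the stand-in plot() merely records the points it is given so the result can be inspected.

```c
#include <stdlib.h>   /* abs() */

#define MAX_POINTS 1024

/* Stand-in for plot(): records each point instead of drawing it. */
static int plotted_x[MAX_POINTS], plotted_y[MAX_POINTS];
static int plotted_count = 0;

static void plot(int x, int y, int color)
{
    (void)color;
    if (plotted_count < MAX_POINTS) {
        plotted_x[plotted_count] = x;
        plotted_y[plotted_count] = y;
        plotted_count++;
    }
}

/* Line from (x1, y1) to (x2, y2) inclusive, stepping along the axis
   with the greater distance to travel (the controlling axis). */
void draw(int x1, int y1, int x2, int y2, int color)
{
    int dx = x2 - x1, dy = y2 - y1;
    int steps = (abs(dx) >= abs(dy)) ? abs(dx) : abs(dy);
    double xstep, ystep, x, y;
    int i;

    if (steps == 0) {            /* single point: nothing to divide by */
        plot(x1, y1, color);
        return;
    }
    xstep = (double)dx / steps;  /* +1.0 or -1.0 on the controlling axis */
    ystep = (double)dy / steps;  /* fractional on the shorter axis */
    x = x1;
    y = y1;
    for (i = 0; i <= steps; i++) {
        plot((int)x, (int)y, color);   /* truncate, as the text describes */
        x += xstep;                    /* stepping values are additive */
        y += ystep;
    }
}
```

For the example in the text, a line moving +120 in x and -30 in y plots 121 points with a y step of -0.25. Steps that are not exactly representable in binary, such as 1/3, can truncate one pixel low at some points; rounding instead of truncating is the usual remedy.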




Identifying  the  Video  Adaptors  of  IBM  PCs 


ROM BIOS interrupt 10h on IBM PCs and compatibles furnishes video
services to accommodate a variety of hardware configurations.
Although many of the calls, notably those dealing with text, are
device independent, others are device specific. If your graphics
software is to function effectively in this kind of variable
environment, it must adapt its behavior to suit the hardware. And
to do that, it must first find out what the video hardware is.

This isn't as easy as it might sound. Programs that want to
determine the video capabilities at their disposal usually have to
look in several places, piecing together many clues.


    Value  Adaptor
      0    undefined (1)
      1    40 by 25 B&W text, color graphics (2)
      2    80 by 25 text, CGA (3)
      3    80 by 25 text, monochrome

    1. used by Compaq, EGA, and CGA
    2. also used for non-IBM video devices such as Hercules
    3. unreliable

Table 1: BIOS video equipment flag


The  Video  Equipment  Flag 

First I'll discuss sources of information about the video
environment, then I'll show how to use them to identify the video
hardware. At memory address 40h:10h, the ROM BIOS keeps a byte
called the equipment flag, which describes the machine's hardware
configuration. Bits 4 and 5 give some information about the
attached video device. By masking out the other bits and shifting
right four places, you can isolate a device number from 0 to 3. In
C, for example, you might write this operation as:

    adaptor = (peek (0x40, 0x10) & 0x30) >> 4;

A similar effect can be obtained in BASIC with the statements:

    DEF SEG = &H40
    ADAPTOR = (PEEK (&H10) \ 16) AND &H03

Table 1 shows the full range of resulting values. As this table
indicates, the BIOS video equipment flag byte raises more
questions than it answers. The only thing that's sure is that when
its value is 3, you have a monochrome (that is, text-only)
adaptor. Even so, you might want to check further; if the video
mode discussed next is 7, it's definite that the video device is
incapable of APA graphics and color text.
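The masking step is easy to check in isolation. In the sketch below the equipment byte is passed in as a parameter rather than peeked from memory, so the code runs on any machine; both function names are hypothetical.

```c
/* Video field of the BIOS equipment byte: keep bits 4 and 5
   (mask 30h) and shift them down to give a value from 0 to 3. */
unsigned video_equipment_bits(unsigned char equipment)
{
    return (equipment & 0x30u) >> 4;
}

/* Hypothetical helper: per the text, 3 is the only value that can
   be trusted on its own, and it means a monochrome adaptor. */
int is_monochrome(unsigned char equipment)
{
    return video_equipment_bits(equipment) == 3;
}
```

Under Turbo C the argument would come from a call such as peekb(0x40, 0x10); the other bits of the byte describe unrelated hardware and are discarded by the mask.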

The  Video  Mode 

The ROM BIOS furnishes two methods for obtaining the video mode,
that is, the current mode of operation for the adaptor. One is to
call interrupt 10h with function code 0Fh in the AH register. On
return, AL contains the mode value. An alternative is to bypass
the ROM BIOS and do yourself what function 0Fh does indirectly:
read the video mode byte from location 40h:49h.

Either way, the result is a byte whose possible settings are shown
in Table 2. These values can also be used to set the video mode,
assuming the attached device is capable of supporting them. These
values suggest the capabilities of the monitor by inference only
and can be misleading. For example, although mode 07h indicates an
active monochrome display adaptor (MDA), it is also valid for the
EGA in plain text mode.

Note that modes 0Bh and 0Ch are undefined; presumably these are
reserved for future developments or for graphics adaptors that
were never introduced.

The  Enhanced 
Graphics  Adaptor 

If  the  current  mode  is  anything  less 
than  08h,  you  have  to  investigate 
further  to  find  out  if  an  EGA  card  is 
present  in  the  system.  The  ROM 
BIOS  furnishes  no  call  for  this,  so 
you  have  to  read  memory  location 
40h:87h  to  find  out.  This  address 
contains  the  EGA  information  byte. 

Although each bit or bit pair in the EGA information byte has some unique significance, in general it's sufficient to know that the byte is 0 when an EGA is not present and nonzero when one is. Therefore, in C, you can define a Boolean function isEGA() as follows:

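#include <dos.h>   /* Turbo C: declares peekb() */

char isEGA (void)
{
    /* EGA information byte at 40h:87h is 0 when no EGA is present */
    return (peekb (0x40, 0x87));
}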

which returns FALSE when there is no EGA attached and TRUE (some nonzero value) when one is. Similarly, you can test for an EGA in BASIC with statements such as:

DEF SEG = &H40
IF PEEK (&H87) <> 0 THEN (EGA present) ELSE (not)

A  brief  discussion  of  the  EGA  is 
perhaps  a  worthwhile  digression 
here.  The  EGA  furnishes  several  ex¬ 
tended  all-points  addressable  (APA) 
graphics  capabilities.  That  is,  you 


Mode  Meaning

00h   40 by 25 B&W text, color graphics (obsolete)
01h   40 by 25 color text
02h   80 by 25 B&W text
03h   80 by 25 color text
04h   320 x 200 4-color graphics
05h   320 x 200 4-color graphics (color burst off)
06h   640 x 200 2-color graphics
07h   monochrome adaptor or EGA text mode
08h   160 x 200 16-color graphics (PCjr)
09h   320 x 200 16-color graphics (PCjr)
0Ah   640 x 200 4-color graphics (PCjr)
0Bh   not used
0Ch   not used
0Dh   320 x 200 16-color graphics (EGA)
0Eh   640 x 200 16-color graphics (EGA)
0Fh   640 x 350 2-color graphics (EGA)
10h   640 x 350 4-color or 16-color graphics (EGA)
      (depends on available display memory)

Table 2: IBM PC line video modes



Dr.  Dobbs  Journal,  November  1987 



can  have  more  pixels  of  higher  den¬ 
sity  with  more  colors  than  in  the 
earlier  graphics  adaptors.  How  the 
EGA  works  is  beyond  the  scope  of 
this  article.  What’s  important  is  that 
the  EGA  is  downward  compatible; 
that  is,  it  supports  all  the  modes 
available  in  the  earlier  MDA  and 
CGA,  plus  some  of  its  own.  The  EGA 
offers  no  text  capabilities  beyond 
the  CGA,  except  that  the  characters 
are  more  dense  and  thus  more  read¬ 
able;  this  is  a  given,  beyond  pro¬ 
grammer  control.  Otherwise  the 
same  combinations  of  16  foreground 
by  16  background  colors  are  avail¬ 
able. 

The  advantages  of  the  EGA  over 
other  adaptors  thus  involve  only 
APA  graphics.  In  text  modes,  the 
only  discriminator  is  whether  the 
display  is  pure  monochrome  (MDA) 
or not. If it's pure MDA, you can't
do colors; otherwise you can.

Other  Common  Adaptors 

The  popular  Hercules  card  offers 
APA  graphics  enhancements  but 
nothing  extraordinary  by  way  of  text. 
The  Hercules  provides  for  gray  shad¬ 
ing  of  text  as  a  substitute  for  text 
colors  and  APA  graphics  on  a 
720  X  348-pixel  monochrome  display. 

The  Compaq  adaptor  is  second 
only  to  the  EGA  in  versatility.  In  text 
mode,  it  acts  like  the  EGA’s  text 
mode,  producing  high-resolution 
text  in  any  combination  of  16  fore¬ 
ground  and  16  background  colors. 
You can also switch it into any of the CGA graphics modes (04h-06h) and it behaves just like the CGA adaptor.

Identifying a "Herc" or a Compaq, like the others, is a matter of recognizing a pattern of indicators.

The  Software 
Sherlock  Holmes 

Now  let's  see  how  these  clues  fit 


together.  Table  3,  below,  shows  the 
“signatures”  of  several  popular  video 
adaptors  in  their  default  (power-up) 
conditions.  Notice  that  each  adaptor 
has  a  unique  pattern  of  values.  Given 
this,  you  can  quickly  sift  through 
three  items  of  information  and  iden¬ 
tify  the  adaptor.  Expressed  in  pseu¬ 
docode,  the  algorithm  is: 

if EGA byte <> 0 then
    adaptor = EGA;
else
    case BIOS equipment flag of
        0: if video mode = 2 then
               adaptor = Compaq;
           else
               adaptor = CGA;
           end if;
           break;
        1: adaptor = Hercules; (see note)
           break;
        3: adaptor = MDA;
    end case;
end if;

Note:  Some  other  monitors  use 
the  same  signature  as  the  Hercules 
does.  An  example  is  the  MDS  Genius 
full-page  display  popular  with  desk¬ 
top  publishing  systems.  In  this  case, 
you  have  to  look  at  the  display 
buffer  size,  which  you  can  fetch  as 
an  integer  from  40h:4Ch.  The  Hercu¬ 
les  display  buffer  is  16,384  bytes; 
that  for  the  Genius  is  8,192  bytes. 
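As a rough illustration, the pseudocode and the note can be combined into one C function. This is a sketch, not the article's listing: the enum, the function name, and the parameter names are invented here, and the caller is assumed to have already fetched the equipment flag video bits, the video mode, the EGA information byte, and the display buffer size from 40h:4Ch.

```c
#include <assert.h>

/* Adaptor kinds distinguished by the signature table (names are illustrative). */
typedef enum {
    ADAPTOR_UNKNOWN, ADAPTOR_MDA, ADAPTOR_CGA, ADAPTOR_EGA,
    ADAPTOR_COMPAQ, ADAPTOR_HERCULES, ADAPTOR_GENIUS
} Adaptor;

/* equip_bits:  the 2-bit video field of the BIOS equipment flag (0..3)
   video_mode:  the byte at 40h:49h
   ega_byte:    the byte at 40h:87h
   buffer_size: the word at 40h:4Ch, used only to split Hercules from Genius */
Adaptor identify_adaptor(unsigned equip_bits, unsigned video_mode,
                         unsigned ega_byte, unsigned buffer_size)
{
    assert(equip_bits <= 3);

    if (ega_byte != 0)
        return ADAPTOR_EGA;

    switch (equip_bits) {
    case 0:
        return (video_mode == 2) ? ADAPTOR_COMPAQ : ADAPTOR_CGA;
    case 1:
        /* Same signature for both; the buffer size breaks the tie. */
        return (buffer_size == 8192) ? ADAPTOR_GENIUS : ADAPTOR_HERCULES;
    case 3:
        return ADAPTOR_MDA;
    default:
        return ADAPTOR_UNKNOWN;   /* value 2 is not covered by the table */
    }
}
```

Returning an explicit "unknown" value for the uncovered case keeps the caller from silently treating an unlisted adaptor as one of the five signatures.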

Obviously,  then,  Table  3  and  the 
algorithm  are  not  comprehensive. 
The  great  majority  of  video  adaptors 
for  IBM  PCs  and  compatibles  emu¬ 
lates  one  of  these  de  facto  stan¬ 
dards,  however,  and  thus  the 
method  presented  here  is  reliable  in 
almost  all  cases. 


             EGA      CGA  MDA  Compaq  Hercules

BIOS equip.  0        0    3    0       1
Video mode   3        3    7    2       7
EGA byte     nonzero  0    0    0       0

Table 3: Signatures of popular video adaptors




GRAPHICS  TOOLBOX 

(continued from page 33)

These  considerations  enable  the  rou¬ 
tine  to  plot  a  diagonal  or  orthogonal 
line  in  any  direction.  The  cost  of 
this  flexibility  is  slower  performance. 

As  you  use  the  video  library,  you’ll 
undoubtedly  add  your  own  graph¬ 
ics  routines  for  such  things  as 
polylines,  filled  shapes,  circles,  arcs, 
and  so  on.  The  DRAW.I  file  given 
here  is  merely  a  suggestion  of  the 
kinds  of  things  you  can  do  to  make 
graphics  easier  in  Turbo  C. 

You  can  also  make  life  a  little 
easier,  and  your  source  code  more 
readable,  by  including  the  file 
COLORS.H  (see  Listing  Four,  page 
85)  in  programs  that  require  colors. 
It’s  always  easier  to  understand  iden¬ 
tifiers  than  “magic  numbers”  requir¬ 
ing  you  to  memorize,  for  example, 
that  07h  is  light  gray. 

A  Graphics  Sampler 

Now  let's  put  this  discussion  to 
work  in  a  demonstration  program. 
Listing  Five,  page  85,  contains  the 
demonstration  program  VID.C, 
which  calls  a  number  of  the  rou¬ 
tines  in  the  video  library.  It  pro¬ 
duces  some  simple  but  illustrative 
effects  in  both  text  and  graphics 
modes. 

The  program  first  identifies  the 
adaptor  on  the  host  machine  so 
that  it  can  subsequently  tailor  its 
behavior  to  suit  (or  bypass  opera¬ 
tions  that  the  adaptor  can’t  handle, 
such  as  APA  graphics  on  an  MDA). 
This  program  is  written  to  run  on 
the  MDA,  CGA,  EGA,  and  Compaq;  it 
doesn’t  recognize  the  Hercules  card. 

Next it produces a text graphics display that showcases cursor positioning with gotoxy() and (if the monitor is capable of color) colored text. Except for the string-writing functions that produce a label at the top and a prompt at the bottom of each demonstration display, this is the only use of text graphics in the program.

In  the  third  display,  the  program 
draws  a  border  around  the  screen 
and  a  large  corner-to-corner  X  inside 
it.  It  calls  the  functions  in  DRAW.I. 
This  routine  shows  how  to  make  a 
run-time  adjustment  to  the  coordi¬ 
nate  system  according  to  whether 
the  adaptor  is  a  CGA  (or  Compaq) 


or  an  EGA.  If  you’re  running  with  an 
MDA,  the  program  bypasses  this  dis¬ 
play  and  the  next  because  the  hard¬ 
ware  can't  handle  it. 

The  fourth  panel  is  in  CGA 
320  X  200-pixel  four-color  graphics.  It 
draws  a  three-color  hourglass  illus¬ 
trating  two  things:  first,  that  solids 
on  a  display  are  indeed  composed 
of  numerous  horizontal  lines;  and 
second,  that  the  EGA  (if  that’s  what 
you’re  using)  can  emulate  a  CGA 
monitor. 

The  final  display  merely  an¬ 
nounces  that  the  demo  is  finished. 

If you haven't converted the video functions into a linkable library but instead are using Listing One directly, change the indicated directive to #include <video.i> near the top of the file before compiling. On the other hand, if you've gone to the trouble of creating VIDEO.LIB, set up a project file VID.PRJ containing the following entries:

vid 

video.lib 

Make  sure  VIDEO.LIB  is  in  the  same 
directory  as  your  source  program 
VID.C,  then  start  the  compile,  link, 
and  go  session  with  Alt-R. 

Availability 

All  the  source  code  for  articles  in 
this  issue  is  available  on  a  single 
disk.  To  order,  send  $14.95  to  Dr. 
Dobb's  Journal,  501  Galveston  Dr., 
Redwood  City,  CA  94063,  or  call  (415) 
366-3600,  ext.  216.  Please  specify  the 
issue  number  and  format  (MS-DOS, 
Macintosh,  Kaypro). 

DDJ 

(Listings  begin  on  page  82.) 



ARTICLES 


A  Graphics  Toolkit 
for  Turbo  Pascal 


Although I've been using Turbo Pascal for custom graphics work for some time, I haven't come across much information on how to uncover the sometimes subtle graphics capability in Version 3.0. This article shows how I used several nonstandard Turbo Pascal procedures, including GETPIC, PUTPIC, GETMEM, FREEMEM, BLOCKWRITE, and BLOCKREAD, to create a set of tools for handling rectangular regions. You can use these tools, for which I have included all the source code, to develop:

•  graphics  pop-up  overlay  screens 
in  either  medium-resolution  (320  X 
200  pixels)  or  high-resolution  (640  X 
200  pixels)  graphics  modes 
•  multiple  concurrent  graphics  re¬ 
gions  that  require  rapid  display 
•  rapid  block-move  animation 

This set of tools consists of three procedures (SaveRegion, RestoreRegion, and FreeBuffer) along with one function (CRTmode), which collectively let you store to buffer and restore complete graphics screen regions or any upright rectangular subregions of the screen. I have also included two supporting procedures (SaveBlockToDisk and GetBlockFromDisk), which are useful for moving buffered screen regions to and from disk efficiently. (You need a CGA to make use of these routines.)


Dr.  Callihan  is  chair  of  the  Computer 
Science  Department  at  the  University 
of  Pittsburgh  at  Johnstown  where  he 
has  taught  for  the  past  19  years.  For 
the  last  7  years,  he  has  been  actively 
involved  in  developing  graphics  tools. 


by  Hubert  D.  Callihan 


Saving  and  restoring 
graphics  screen 
regions 


Despite  what  you  might  think 
about  the  limitations  of  Turbo 
Pascal,  the  number  of  graphics  re¬ 
gions  that  you  can  save  to  buffer 
simultaneously  is  limited  only  by 
heap  space  (that  is,  the  dynamically 
accessible  memory  not  already  con¬ 
sumed  by  DOS,  resident  programs, 
the  Turbo  Pascal  environment,  the 
static  program  code,  and  static  data). 
By  operating  in  dynamic  memory 
space,  these  tools  exploit  all  the 
memory  available  on  the  PC  that  is 
managed  by  DOS.  Turbo  Pascal  pro¬ 
grammers  who  think  they  must  stay 
within  the  64K  static  limit  will  ap¬ 
preciate  this  feature;  frankly,  most 
of  the  criticism  that  has  been  lev¬ 
eled  at  Turbo  Pascal  regarding  the 
64K  static  limit  is  unfounded  be¬ 
cause,  for  a  little  inconvenience  in 
notation  and  the  need  to  allocate 
and  deallocate  memory,  you  can 
always  get  around  the  static  limita¬ 
tion  by  using  pointers. 

I  have  successfully  used  these 
tools  to  develop  menu  overlays, 
images  of  fractals,  Mandelbrot  and 
Julia  sets,  splines,  and  custom  ani¬ 
mation  code.  Animation  using  re¬ 
peated  overlays  of  several  images  is 
a  particularly  good  application  for 
the  tools  because  it  depends  in  no 
way  upon  the  complexity  of  the 
image  but  strictly  on  the  amount  of 
time  required  to  restore  the  region 
to  the  screen  once  it  is  buffered. 
Even  as  complex  an  image  as  a  Man¬ 


delbrot  set,  which  may  take  hours 
to  generate,  can  be  buffered  and 
restored  at  animation  speed  if  no 
logical  changes  need  to  occur  in  the 
image.  Drawing  the  individual 
frames,  buffering  each  in  memory, 
and  displaying  them  in  rapid  se¬ 
quence  will  produce  the  flip-card 
motion  effect  common  in  some 
types  of  animation,  and  because  you 
don't  need  to  erase  one  image 
before  drawing  another,  the  result  is 
nearly  smooth  and  flicker-free. 

Saving  Screen  Regions 

Listing  One,  page  92,  contains  the 
Turbo  Pascal  code  for  SaveRegion, 
the  procedure  that  saves  a  screen 
region  into  a  buffer.  You  call  Save¬ 
Region  with  arguments  defining  the 
rectangular  screen  region  to  be  buff¬ 
ered,  expressing  the  coordinates  in 
absolute  screen  values;  that  is, 
medium-resolution  x  range  0-319 
and  y  range  0-199  or  hi-res  x  range 
0-639  and  y  range  0-199.  In  addi¬ 
tion,  you  must  declare  a  pointer 
referring  to  a  screen  buffer  contain¬ 
ing  the  saved  screen  as  a  variable 
parameter  argument. 

SaveRegion  itself  calls  GETMEM  to 
allocate  sufficient  memory  to  hold 
the  screen  buffer,  and  then  it  calls 
GETPIC  to  get  the  picture  from  the 
screen  and  copy  it  into  this  buffer. 
The  details  for  the  SaveRegion  input 
and  output  header  are: 

TYPE
  buffermemory  = ARRAY [1..3] OF INTEGER;
  bufferaddress = ^buffermemory;

PROCEDURE SaveRegion ( VAR buff : bufferaddress;
                       x1, y1,    {upper left coords}
                       x2, y2     {lower right coords}
                       : INTEGER );




Notice that array of three integers: buff^[1], buff^[2], and buff^[3]. Although they are somewhat enigmatically described in the Turbo Pascal reference manual, these three integers describing the parameters of the buffered screen must contain the following values:

•  buff^[1]: an integer code for the current screen mode, namely 2 for GRAPHMODE or GRAPHCOLORMODE or 1 for HIRES mode

•  buff^[2]: the width in pixels of the buffered region according to the mode in buff^[1]

•  buff^[3]: the height in pixels of the buffered region

GETPIC stores the remaining screen image data in successive bytes after buff^[3].

The  program  needs  to  know  how 
much  space  the  buffer  for  this 
screen  region  will  need  so  it  can 
dynamically  allocate  the  memory  via 
GETMEM  and  inform  GETPIC  about 
it.  The  documentation  for  GETPIC  in 
the  Turbo  Pascal  reference  manual 
requires  that  this  size  be  computed 
for  medium  resolution  as: 

size := ((width + 3) DIV 4) * height * 2 + 6

and for high resolution as:

size := ((width + 7) DIV 8) * height + 6

These  expressions  account  for  the 
three  integers  (6  bytes)  and  the  total 
number  of  bits  needed  to  represent 
all  pixels,  where  each  pixel  is  either 
2  bits  for  medium  resolution  (four 
colors)  or  1  bit  for  high  resolution 
(two  colors).  In  either  case,  pixel 
width  and  pixel  height  are  com¬ 
puted  as  follows: 

width  := ABS ( x1 - x2 ) + 1;
height := ABS ( y1 - y2 ) + 1
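To check the arithmetic, the two size formulas can be restated in C. This is a sketch for working the numbers, not part of the Pascal tools; the type and function names are made up for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { RES_MEDIUM, RES_HIGH } Resolution;   /* illustrative names */

/* Bytes needed to buffer the rectangle (x1,y1)-(x2,y2):
   a 6-byte header (three integers) plus the packed pixel rows.
   Medium resolution packs 4 pixels per byte and the row count is doubled,
   as the formula quoted from the reference manual requires;
   high resolution packs 8 pixels per byte. */
long getpic_buffer_size(Resolution res, int x1, int y1, int x2, int y2)
{
    long width  = labs((long)x1 - x2) + 1;
    long height = labs((long)y1 - y2) + 1;

    assert(res == RES_MEDIUM || res == RES_HIGH);

    if (res == RES_MEDIUM)
        return ((width + 3) / 4) * height * 2 + 6;
    return ((width + 7) / 8) * height + 6;
}
```

For a full medium-resolution screen (0,0)-(319,199) this gives 32,006 bytes, and for a full high-resolution screen (0,0)-(639,199) it gives 16,006 bytes.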

Calls to GETMEM and GETPIC with these parameters allocate contiguous memory dynamically at the location held in the pointer buff and copy the screen region into it:

GETMEM ( buff, size );
GETPIC ( buff^, x1, y1, x2, y2 );

I  use  GETMEM  rather  than  the  stan¬ 


dard  NEW  allocation  procedure  in 
Pascal  because  the  amount  of 
memory  varies  at  run  time  for  differ¬ 
ent  screen  buffer  sizes  and  must  be 
computed.  NEW  allocates  space  at 
run  time  but  only  according  to  the 
static  TYPE  declared  at  compile 
time;  GETMEM  permits  it  to  be  com¬ 
puted  on  the  fly. 

GETPIC actually permits any standard TYPE variable to be used to declare the buffer. An integer would suffice in most cases, but I use an array of three integers so that the resolution, width, and height are conveniently accessible using buff^[1], buff^[2], and buff^[3], respectively. You will need these later when you want to display a screen region using RestoreRegion and/or free the memory consumed by its buffer.

Note that in Listing One I use the max and min functions locally within SaveRegion to filter the passed coordinates and thereby guarantee that they lie within the specified range for the current resolution mode. Note also the function CRTmode, which returns the current resolution mode of the graphics display, thus determining how size is computed.

My original versions of SaveRegion and RestoreRegion didn't have a CRTmode function; they required a global variable to carry the currently active resolution. Such globals are a nuisance when designing self-contained tools that abide by the loose coupling principles commonly used in well-structured systems. In this case, I found that I could virtually eliminate side effects by creating a function to determine the current video mode. You just load the AX register with $0F00 and exercise ROM BIOS interrupt $10, and you get back the CRT mode-of-operation parameters in the AX and BX registers (see AT&T's 6300 documentation). Namely, the low byte of the 16-bit AX register contains an integer code corresponding to the current mode of operation (see Listing Two, page 92, for these codes). Although not used here, the high byte of the AX register contains the number of character columns in the text display if text mode is active and should be ignored in any graphics mode. The high byte of the BX register contains the current display-memory page.

Testing  the  CRT  mode  is  a  simple 
matter  of  declaring  the  usual  regis¬ 
ter  variables  within  a  RECORD  struc¬ 
ture,  setting  the  conditions  for  the 
interrupt  to  occur,  calling  the  Turbo 
Pascal  INTR  procedure,  and  inter¬ 
preting  the  returned  results,  as  fol¬ 
lows: 

FUNCTION CRTmode ( VAR char_columns, display_page : BYTE ) : BYTE;
TYPE
  regpack = RECORD
    ax, bx, cx, dx, bp, si, di, ds, es, flags : INTEGER
  END; {regpack}
VAR
  dosreg : regpack;
BEGIN
  WITH dosreg DO BEGIN
    ax := $0F00;                  {function 0Fh in the high byte}
    INTR ( $10, dosreg );
    CRTmode := LO ( ax );         {current video mode}
    char_columns := HI ( ax );    {text columns, if in a text mode}
    display_page := HI ( bx )     {active display page}
  END {WITH}
END; {CRTmode}
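The register unpacking that CRTmode performs can be tried out without a BIOS by decoding sample AX and BX values. The following C sketch assumes only the register layout described above; the struct and function names are hypothetical.

```c
#include <assert.h>

/* Values returned by interrupt 10h, function 0Fh (illustrative struct). */
typedef struct {
    unsigned char mode;      /* low byte of AX  */
    unsigned char columns;   /* high byte of AX */
    unsigned char page;      /* high byte of BX */
} CrtState;

/* Split the 16-bit AX and BX values into the three fields,
   mirroring the LO() and HI() calls in the Pascal function. */
CrtState decode_crt_state(unsigned ax, unsigned bx)
{
    CrtState s;

    assert(ax <= 0xFFFFu && bx <= 0xFFFFu);

    s.mode    = (unsigned char)(ax & 0xFF);
    s.columns = (unsigned char)((ax >> 8) & 0xFF);
    s.page    = (unsigned char)((bx >> 8) & 0xFF);
    return s;
}
```

For example, AX = 5003h and BX = 0100h decode to mode 03h, 80 columns, and display page 1.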

Restoring  Regions 
to  the  Screen 

Listing Three, page 93, contains the complete code for the RestoreRegion procedure. The heading for the RestoreRegion procedure is:

PROCEDURE RestoreRegion ( VAR buff : bufferaddress;
                          x, y : INTEGER;
                          freeup : BOOLEAN );

You call it with arguments describing the lower-left screen coordinates (x,y) where the buffered region is to be placed on the current screen, and it uses the Turbo Pascal-supplied procedure PUTPIC to place the image at the point (x,y). RestoreRegion also expects the buffer pointer used by a previous SaveRegion call to be transferred in buff. Finally, there's that Boolean argument. I chose to use a BOOLEAN (called freeup) to indicate either that the restored buffer should be deallocated (freeup = TRUE) or that it should not be deallocated (freeup = FALSE), implying that the region may be restored to the screen again and again, as with a menu.

One caution: do not use the same buffer name on two successive calls to SaveRegion because this will render the first buffer inaccessible and you won't be able to deallocate it. If you must use the same buffer name in this manner, then copy the existing buffer pointer into another bufferaddress variable prior to the second call. Otherwise, a region corresponding to a given named buffer should be restored with the freeup argument set to TRUE before the same buffer is used again in a SaveRegion call.

RestoreRegion uses the resolution, width, and height parameters previously saved in the buffer to compute the amount of memory in size to be deallocated by FREEMEM. Note that it won't restore the image if any part of the region extends beyond the screen edges (x_ok and y_ok); two bells will sound in this case. Note, too, that the logic in this procedure permits you to remove this bounding-edge test, permitting partial image restoration.
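One plausible form of the bounding-edge test is sketched below in C. It is an assumption about what x_ok and y_ok check, not a transcription of Listing Three: the region is anchored at its lower-left corner (x,y), so it extends right from x and up from y, and the screen is taken to be 320 or 640 pixels wide and 200 high. The function and parameter names are invented.

```c
#include <assert.h>

#define SCREEN_HEIGHT 200

/* Return nonzero when a width-by-height region whose lower-left corner
   is at (x,y) lies entirely on a screen that is screen_width pixels wide. */
int region_fits(int x, int y, int width, int height, int screen_width)
{
    int x_ok, y_ok;

    assert(width > 0 && height > 0);
    assert(screen_width == 320 || screen_width == 640);

    /* Columns x .. x+width-1 must fall inside 0 .. screen_width-1. */
    x_ok = (x >= 0) && (x + width - 1 < screen_width);

    /* Rows y-height+1 .. y must fall inside 0 .. SCREEN_HEIGHT-1. */
    y_ok = (y < SCREEN_HEIGHT) && (y - height + 1 >= 0);

    return x_ok && y_ok;
}
```

A caller would skip the restore and signal the error when the function returns zero.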

Figure 1, page 44, illustrates a pop-up graphics menu that is used to
overlay  a  complex  fractal  image.  The 
fractal  image  is  previously  buffered 
using  SaveRegion,  and  when  restored 
using  RestoreRegion,  it  overwrites 
the  image  containing  the  menu.  The 
effect  in  such  a  case  is  to  erase  the 
menu  and  restore  the  background 
to  its  previous  state. 

Managing  Memory 

I found it wise to include in my bag of tools a procedure called FreeBuffer to allow me to deallocate an image buffer without restoring the image visually to the screen. Like RestoreRegion, FreeBuffer computes the size of the memory block to be freed and calls FREEMEM to complete the job. Listing Four, page 94, contains the complete FreeBuffer procedure. The procedure heading takes this form:

PROCEDURE FreeBuffer ( VAR buff : bufferaddress );

Figure 1: The pop-up menu, previously drawn and saved, overlaying the fractal image

Figure 2: A saved graphic region used in Listing Six as buffer

Figure 3: The full screen saved as buffer2 in Listing Six

Storing  Regions  on  Disk 

Because  an  image  is  stored  within  a 
contiguous  section  of  memory,  it  is 
possible  to  save  it  rapidly  to  an  “un¬ 
typed"  disk  file  and  later  load  it 
from  disk  into  a  buffer.  Turbo  Pascal 
provides  for  such  untyped  files,  and 
its  BLOCKWRITE  and  BLOCKREAD, 
as  described  on  page  114  of  the 
reference  manual,  will  suffice  for  writ¬ 
ing  and  reading  the  images. 

You call BLOCKWRITE like this:

numrecs := 1 + (size - 1) DIV 128;
BLOCKWRITE ( FileVariable, buff^, numrecs )

This saves a block of memory, consisting of numrecs 128-byte records starting at buff^, to the file associated with FileVariable. The last record may contain fewer than 128 bytes of true image data, wasting a few bytes, but the sacrifice of a few bytes of memory and disk space is well worth what you gain in speed by using block disk transfers.
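The record count is a round-up division, which is easy to verify with a few numbers. The C sketch below restates the numrecs expression and adds a helper (hypothetical, not in the article) that reports how many padding bytes the last record carries.

```c
#include <assert.h>

#define RECORD_SIZE 128L   /* block size used by BLOCKWRITE and BLOCKREAD */

/* Number of 128-byte records needed to hold `size` bytes (rounds up). */
long records_needed(long size)
{
    assert(size > 0);
    return 1 + (size - 1) / RECORD_SIZE;
}

/* Padding bytes in the final record. */
long wasted_bytes(long size)
{
    return records_needed(size) * RECORD_SIZE - size;
}
```

A full medium-resolution screen of 32,006 bytes needs 251 records and wastes 122 bytes; a buffer of exactly 128 bytes needs one record and wastes none.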

To  load  an  image  back  from  disk 
into  a  buffer,  you  determine  the  size 
of  the  image  file  (in  128-byte  re¬ 
cords)  using  Turbo  Pascal’s  FILE- 
SIZE  function,  allocate  sufficient 
memory  to  hold  the  image  using 
GETMEM,  and  finally,  BLOCKREAD 
the  records  from  the  original  "block 
written”  file  into  the  buffer: 

SizeOfFile := FILESIZE ( FileVariable );
GETMEM ( buff, SizeOfFile * 128 );
BLOCKREAD ( FileVariable, buff^, SizeOfFile );

Listing  Five,  page  94,  contains  the 
simple  procedures  SaveBlockToDisk 
and  GetBlockFromDisk  for  saving 
and  loading  these  buffered  regions 
to  and  from  disk. 

Examples 

Having hopefully whetted your appetite to see a demo, I submit the demo program in Listing Six, page 95. It creates the graphic region shown in Figure 2, page 44, saves it in a buffer (buffer), and restores it to various locations on the screen. A second buffer, buffer2, contains a full-screen image (see Figure 3, page 44) that serves as background for subsequent pop-up overlays. The program restores this image alternately with the smaller overlay, creating a motion effect.

Note that I have included GRAPH.P Extended Graphics (supplied with Turbo Pascal, Version 3.0) as an option if you want to use windows via GRAPHWINDOW and the window-clearing operation FILLSCREEN. If you use Turbo Pascal windows in this manner, then the coordinates in SaveRegion and RestoreRegion refer to the current window, where (0,0) is its upper-left corner.


Bibliography 

System Programmer's Guide. AT&T 6300 Personal Computer Documentation, 1986.

Turbo  Pascal  V3.0  Reference  Manual. 
Scotts  Valley,  Calif.:  Borland  Interna¬ 
tional,  1987. 

DDJ 

(Listings  begin  on  page  92.) 



ARTICLES 


Using  EGA  Graphics 
Screens  in  Your  Programs 


Creating  detailed  images  for 
the  Enhanced  Graphics 
Adapter  (EGA)  usually  means 
graph  paper,  binary-to-hexadecimal 
conversion,  and  counting  lots  of 
dots.  Commercial  "paint”  programs 
such  as  EGAPaint  and  PC  Paint¬ 
brush,  on  the  other  hand,  offer  real¬ 
time,  see-what-you-draw  image  crea¬ 
tion  but  typically  only  allow  full 
screens  of  images  to  be  chained  in  a 
sort  of  “slide  show.” 

Once  you  display  a  screen  (cre¬ 
ated  by  such  a  paint  program)  with 
your  programming  language,  you 
can  then  use  your  language's  graph¬ 
ics  commands  to  modify  the  screen. 
You  can  capture  bit  blocks,  animate 
portions  of  the  screen,  and  save 
images  to  disk — all  this  and  more, 
with  no  graph  paper  or  binary-to- 
hexadecimal  conversion  and  only  a 
little  pixel  counting. 

This  article  presents  Forth  rou¬ 
tines  that  can  be  used  to  load  an 
EGAPaint  file  into  video  memory  for 
producing  animation  (see  Listing 
One,  page  88).  Although  Forth  is  the 
language  used  in  the  listing,  the  pseu¬ 
docode  in  Listing  Two,  page  88, 
should  enable  you  to  adapt  the 
method  to  any  language  supporting 
EGA  graphics.  The  process  is  rela¬ 
tively  straightforward.  Listing  Three, 
page  89,  contains  a  demonstration 
of  the  routines’  use  in  a  simple 
game. But first, let's take a brief look at the Enhanced Graphics Adapter.

J. Brooks Breeden, 367 Brynhild Rd., Columbus, OH 43202. Brooks is a professor at Ohio State University. He worked with CAI on mainframes from 1976 until 1981, when he switched to using the IBM PC. He has worked with Forth exclusively since 1984.

by J. Brooks Breeden

Getting a full-color screen from a paint program into EGA display memory from a high-level language

An  EGA  Review 

Since  its  introduction  in  1984,  the 
IBM  Enhanced  Graphics  Adapter 
(EGA)  has  received  much  less  press 
than  did  its  older  sibling,  the  Color 
Graphics  Adapter  (CGA),  in  its  first 
three  years.  The  CGA’s  introduction 
was  followed  by  a  plethora  of  $20 
books  on  how  to  program  graphics 
(mostly  in  BASIC),  but  despite  the 
introduction  of  many  low-price  EGA 
clone  boards,  information  on  taking 
advantage  of  the  EGA’s  capabilities 
is  still  scarce.  The  EGA  is  more  com¬ 
plex  than  the  CGA  and  program¬ 
ming  the  EGA  does  require  some 
understanding  of  the  fundamental 
differences  between  them. 

In  its  high-resolution  mode 
(640  x  200  pixels,  black  and  white), 
the  CGA  uses  1  byte  to  store  data  for 
eight  pixels,  1  bit  per  pixel,  either 
on  or  off  (see  Figure  1,  page  47).  The 
CGA  memory  required  for  a  single 
page  of  hi-res  display  is  therefore 
16,000  bytes  (640  X  200/8  =  16,000),  al¬ 
though  16K  (16  X  1,024  =  16,384  bytes) 
is  allocated.  CGA  memory  begins  at 
B800h  for  even  scan  lines;  odd  scan 
lines  are  offset  by  8K  (8,192  bytes). 
The  extra  192  bytes  are  not  used  in 
either  area. 
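The even/odd split can be made concrete with a small address calculation. The C sketch below assumes 80 bytes per scan line (640/8) and that the leftmost pixel of each byte is its most significant bit; the struct and function names are illustrative only.

```c
#include <assert.h>

/* Location of one pixel in CGA 640 x 200 display memory. */
typedef struct {
    unsigned offset;        /* byte offset from the start of the B800h segment */
    unsigned char mask;     /* bit within that byte                            */
} PixelAddr;

PixelAddr cga_hires_pixel(int x, int y)
{
    PixelAddr p;

    assert(x >= 0 && x < 640 && y >= 0 && y < 200);

    /* Odd scan lines live in a second bank 8K (8,192 bytes) higher;
       within a bank, consecutive lines are 80 bytes apart. */
    p.offset = (unsigned)((y & 1) * 8192 + (y >> 1) * 80 + (x >> 3));

    /* Leftmost pixel is bit 7 (assumed bit order). */
    p.mask = (unsigned char)(0x80 >> (x & 7));
    return p;
}
```

The last pixel on the screen, (639,199), lands at offset 16,191, which is byte 7,999 of the odd bank, so the final 192 bytes of each 8K bank are indeed never touched.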

In contrast, EGA hi-res graphics display memory begins at A000h (see Figure 2, page 47). Unlike the CGA, display memory is continuous, not separated into even-line and odd-line locations. In its high-resolution 16-color mode, the EGA uses 1 byte to represent 16-color data for eight pixels, 1 bit per pixel. No, you didn't read that wrong.

Forgetting  color  for  the  moment, 
the  EGA  display  is  a  matrix  of 
224,000  dots  (640X350  =  224,000).  If 
the  display  is  either  black  or  white, 
each  dot  or  pixel  can  be  represented 
by  1  bit  (either  on  or  off),  just  like 
the  CGA  in  its  hi-res  mode.  One  byte 
represents  eight  contiguous  pixels 
on  a  scan  line,  so  28,000  bytes 
(224,000/8  =  28,000)  contain  pixel  data 
for  one  full  screen. 
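Because EGA display memory is continuous, the equivalent address calculation is simpler than the CGA's. This C sketch assumes 80 bytes per scan line (640/8) in the 640 x 350 mode; the names are invented for the example.

```c
#include <assert.h>

#define EGA_WIDTH        640
#define EGA_HEIGHT       350
#define EGA_BYTES_PER_LINE (EGA_WIDTH / 8)
#define EGA_PLANE_BYTES  ((long)EGA_BYTES_PER_LINE * EGA_HEIGHT)

/* Byte offset from A000h of the byte holding pixel (x,y).
   There is no even/odd interleave: line y simply starts at y * 80. */
unsigned ega_byte_offset(int x, int y)
{
    assert(x >= 0 && x < EGA_WIDTH && y >= 0 && y < EGA_HEIGHT);
    return (unsigned)(y * EGA_BYTES_PER_LINE + x / 8);
}
```

The bottom-right pixel (639,349) falls in byte 27,999, the last of the 28,000 bytes that make up one full screen.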

But  what  about  color?  Let’s  first 
examine  how  a  color  printer 
"paints”  an  image.  Color  printers 
usually  make  four  passes  per  line. 
Simply  stated,  color  printing  involves 
mapping  areas  of  a  page  that  receive 
a  color,  then  applying  the  ink  to 
those  areas  of  the  page.  Successive 
mapping  and  application  of  cyan, 
magenta,  yellow,  and  black  inks  (pig¬ 
ments  that  reflect  light)  yield  a  full- 
color  image. 

The  EGA  similarly  uses  a  "map 
mask"  to  determine  which  phos¬ 
phors  (that  emit  light)  should  be 
excited  in  red,  green,  blue,  and  in¬ 
tensity  "planes.”  One  screen  full  of 
pixels  (28,000  bytes)  maps  all  areas 
of  the  screen  that  contain  red  in  the 
displayed  color,  a  second  28,000- 
byte  screen  maps  all  areas  contain¬ 
ing  blue,  a  third  maps  green,  and  a 
fourth  maps  the  associated  inten¬ 
sity.  The  pixels  mapped  from  all 
four  planes  blend  visually  to  make 
the  full-range  color  display.  The  EGA 
display,  then,  can  be  visualized  as 
four overlaid planes of color, numbered
0-3, each having the same
address: A000h. Yes, the same address.
If the CGA is a single-family
residence, the EGA is a four-unit
apartment building.
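To make the "four overlaid planes" idea concrete, here is a minimal Python sketch that combines the four bytes found at one address into the 4-bit color of a single pixel. The plane numbering used (0 = blue, 1 = green, 2 = red, 3 = intensity) is the usual EGA convention and is an assumption here, since the text only says the planes are numbered 0-3. The function name is hypothetical.

```python
def pixel_color(plane_bytes, bit):
    """Combine one bit from each of the four planes into a 4-bit color.

    plane_bytes: the bytes read from planes 0-3 at the same address
                 (assumed order: blue, green, red, intensity).
    bit:         7 for the leftmost pixel of the byte, 0 for the rightmost.
    """
    color = 0
    for plane, value in enumerate(plane_bytes):
        if value & (1 << bit):
            color |= 1 << plane
    return color

# Leftmost pixel set in blue, red, and intensity but not green:
print(pixel_color((0x80, 0x00, 0x80, 0x80), 7))
```

Eight pixels share each address, so the same four bytes are consulted eight times, once per bit position.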

In  addition  to  display  memory, 
the  EGA  card  contains  control  regis¬ 
ters  and  temporary  latches.  The  con¬ 
trol  registers  are  accessed  by  writing 
data  to  a  port.  To  write  to  the  dis¬ 
play,  you  first  set  the  control  regis¬ 
ters  and  read  the  byte  at  the  desired 
address  and  then  write  the  new  byte 
to  the  same  address.  Reading  the 
byte  (eight  pixels  at  a  time)  causes 
the  EGA  to  read  4  bytes,  one  from 
each  plane,  into  the  temporary 
latches  (eight  pixels’  worth  of  data 
in  four  colors).  When  a  data  byte  is 
then  written  back  to  the  same  ad¬ 
dress,  the  values  in  the  control  regis¬ 
ters  at  that  time  determine  how  the 
data  gets  written  to  each  of  the  four 
planes.  Map  masks  determine  which 
plane  is  written  to,  and  bit  masks 
determine  which  bits  are  on  in  the 
byte  that  is  written  to  the  plane.  The 
bibliography  lists  references  that 
cover  this  sort  of  low-level  program¬ 
ming  in  detail. 
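The read-then-write sequence is easier to follow in a small software model. The Python sketch below imitates write mode 0 only, under simplifying assumptions: no set/reset, no data rotation, and a read that always returns plane 0. The class and attribute names are hypothetical and are not taken from the listings.

```python
class PlanarMemory:
    """Toy model of four EGA planes sharing one address space."""

    def __init__(self, size):
        self.planes = [bytearray(size) for _ in range(4)]
        self.latches = [0, 0, 0, 0]
        self.map_mask = 0x0F   # which planes a write may touch
        self.bit_mask = 0xFF   # which bits of the byte come from the CPU

    def read(self, addr):
        # A read loads one byte from every plane into the latches.
        self.latches = [plane[addr] for plane in self.planes]
        return self.latches[0]   # simplification: always report plane 0

    def write(self, addr, data):
        for i, plane in enumerate(self.planes):
            if self.map_mask & (1 << i):
                kept = self.latches[i] & ~self.bit_mask & 0xFF
                plane[addr] = (data & self.bit_mask) | kept


mem = PlanarMemory(28000)
mem.planes[2][5] = 0x0F      # existing data in plane 2
mem.map_mask = 0b0100        # write to plane 2 only
mem.bit_mask = 0xF0          # change only the left four pixels
mem.read(5)                  # fill the latches first
mem.write(5, 0xAA)
print(hex(mem.planes[2][5]))
```

The bits outside the bit mask survive because they come back from the latches, which is why the read before the write matters.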

High-level  languages  supporting 
the  EGA  typically  hide  the  complex¬ 
ity  of  such  bit-level  operations  in 
commands  such  as  LINE,  ARC,  and 
so  on,  but  most  do  not  implement 
all  of  the  EGA  functions.  To  do  some 
things,  you  still  have  to  resort  to 
low-level  programming.  Although 
map-mask/bit-mask  programming 
may  appear  excessively  tedious,  it  is 
easy  to  do  some  things  with  the 
EGA,  and  moving  data  from  storage 
to  the  display  is  one  of  the  easiest. 
Displaying  a  screen  image  on  the 
EGA  means  simply  moving  data  rep¬ 
resenting  the  blue,  red,  green,  and 
intensity  planes  from  the  source  to 
the display address (the same address,
the video segment:offset). You
set  the  control  registers  (by  writing 
values  to  the  ports)  such  that  the 
latches  send  the  data  being  read  to 
the  appropriate  plane.  (Sending  the 
blue  data  to  the  red  plane  results  in 
bizarre  color  schemes.  Go  ahead,  try 
it!)  The  only  problem  is  finding  out 
where  the  data  for  each  plane  is 
stored  in  your  paint  program’s  file. 

Paint  Your  Wagon 

EGAPaint, a popular paint program
from RIX Softworks, is used in the
listings. Like other paint programs,
it allows screen images created
with the program to be printed or
saved to disk as files (either compressed
or uncompressed). A printer
utility, EGAPrint, is included to allow
capture of anything being displayed
to an uncompressed file in EGAPaint
format.

The  standard  640  X  350-pixel,  16- 
color  file  format  for  EGAPaint  2001 
is 112,016 bytes: 16 bytes of color
palette  information,  followed  by 
28,000  bytes  each  of  blue,  red,  green, 
and  intensity,  in  that  order.  (I  called 
RIX  and  asked!) 
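Given that layout, splitting a file into its parts is simple slicing. The Python sketch below follows the byte counts and plane order described above; the function name and the map-mask table are assumptions (the table uses the usual EGA bit assignment of blue = 1, green = 2, red = 4, intensity = 8, which the text does not state).

```python
PALETTE_BYTES = 16
PLANE_BYTES = 28000
FILE_ORDER = ("blue", "red", "green", "intensity")   # order within the file

# Assumed map-mask value that routes a write to each plane.
MAP_MASK = {"blue": 0x01, "green": 0x02, "red": 0x04, "intensity": 0x08}

def split_egapaint(data):
    """Return (palette, planes) from an uncompressed 640 x 350 image."""
    expected = PALETTE_BYTES + 4 * PLANE_BYTES
    if len(data) != expected:
        raise ValueError("expected %d bytes, got %d" % (expected, len(data)))
    palette = data[:PALETTE_BYTES]
    planes = {}
    for i, name in enumerate(FILE_ORDER):
        start = PALETTE_BYTES + i * PLANE_BYTES
        planes[name] = data[start:start + PLANE_BYTES]
    return palette, planes

# Synthetic stand-in for a real file: each plane is filled with its initial.
fake = bytes(range(16)) + b"".join(c * PLANE_BYTES for c in (b"B", b"R", b"G", b"I"))
palette, planes = split_egapaint(fake)
print(len(fake), [MAP_MASK[name] for name in FILE_ORDER])
```

Note that the file order (blue, red, green, intensity) is not the same as ascending map-mask order, so the mask has to be looked up per plane rather than simply doubled each time.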

The  newer  EGAPaint  2005  file  for¬ 
mats  vary  depending  upon  the  type 
of  screen — for  example,  640X350, 
640  X  480,  and  so  on.  RIX  says  it  will 
share  uncompressed  file-format  in¬ 
formation  with  any  registered  owner 
of  the  program  who  writes  and  re¬ 
quests  it  (I’m  still  waiting),  but  it  will 
not  share  the  compression  scheme 
it uses. ARC seems to work fine for
long-term storage, though. Other
paint  programs  may  save  color 
plane  data  in  a  different  order  and 
might  include  palette  color  data  in 
a  different  location  (or  not  at  all).  A 
letter  or  call  to  the  program  vendor 
is  worthwhile  if  some  experimenta¬ 
tion  doesn’t  discover  the  proper  se¬ 
quence  quickly. 

Moving  Right  Along 

You  can  program  the  EGA  using  any 
language  that  lets  you  do  the  follow¬ 
ing  things: 

1.  read  and  write  to  an  absolute 
address 

2.  read  or  write  to  a  specific  I/O  port 

3.  call  an  external  subroutine  or  a 
DOS  interrupt 

For example, to set write mode 0
in Turbo Pascal you use
port[$03CE]:=5 to select the register,
then port[$03CF]:=0 to set the
register value. In BASIC you use OUT:

100 OUT &H3CE,5
110 OUT &H3CF,&H0

Figure 1: Conceptual representation of CGA display memory (B800:0000h
and B800:0000h+8K). The plane is 16,000 bytes; 1 byte controls eight pixels.

Figure 2: Conceptual representation of EGA display memory (A000:0000h).
Each plane is 28,000 bytes; 1 byte controls eight pixels for all planes.
You read and write to the same address. The latches deliver to the
previously selected plane(s).

In UR/FORTH, the word PC! sends a
byte to a port. The sequence becomes
a word:

: WriteMode0 ( -- )
  HEX 5 3CE PC! 0 3CF PC! DECIMAL ;

Listing  One  contains  the  required 
routines  written  in  UR/FORTH  1.01 
(MS-DOS  version),  a  Forth-83  im¬ 
plementation  from  Laboratory  Mi¬ 
crosystems;  Listing  Two  contains  a 
pseudocode  version.  The  listings 
don't  include  saving  a  screen  to  disk 
because  the  paint  program  creates 
the  image  and  does  the  save-to-disk. 
As  they  say  in  academe,  reversing 
the  process  to  save  a  screen  to  disk 
is  left  as  an  exercise  .... 

A  Mindless  Game 
of  Motor  Skill 

Figure 3, page 49, is a dump from
my sample program. EGAPaint was
used to create a rear view of a
Fokker Dr.I triplane. After 8X Zoom
"detailing,”  the  Fokker  was  saved  as 
an  EGAPaint  file  of  112,016  bytes.  A 
UR/FORTH  application  then  loaded 
the  EGAPaint  image  using  the  rou¬ 
tines  shown  in  Listing  One.  Once 
the  EGAPaint  screen  was  displayed, 
UR/FORTH’s  bit-block  save  routine, 
@BLOCK,  was  used  to  save  a  rectan¬ 
gle  of  screen  image  surrounding  the 
Fokker  to  memory.  (That’s  where  the 
pixel-counting  comes  in.)  The  block 
image  was  then  written  to  disk  by 
UR/FORTH  as  a  DOS  file  of  approxi¬ 
mately  3K,  and  the  112,016  byte 
EGAPaint  file  was  deleted. 

This Fokker image was used as
the principal image for a World War
I shoot-'em-up dogfight. In the
game, the Fokker image is loaded
from disk to an allotted memory
area and repeatedly written to the
display using UR/FORTH's !BLOCK.
The cursor keys control (move
the Fokker relative to) the gunsight
of your aircraft, and the Fokker pilot
has a will (mind?) of his own: his
range of random motion changes as
hits are scored.

Animation of detailed images is a
natural outgrowth of this technique
because complicated rotations and
so on can be captured as a series of
frames that can be loaded into
memory, rapidly swapped, then discarded.
Currently, I'm using this
method in developing computer-assisted
instruction (CAI) modules
for my courses. As virtual memory
becomes commonplace with OS/2 et
al., we may see this technique used
increasingly. Until then, experiment,
and if you discover something, for
goodness' sake, publish it!

The program begins with initial
credits and a menu. When play
begins, a Fokker Dr.I triplane, a gunsight,
your remaining ammunition,
and the number of hits scored are
displayed. The Fokker is moving
slowly and is unaware of your presence.
Your mission is to maneuver
your plane (the gunsight) into position
behind the Fokker and score 20
hits in the fuselage area. The space
bar fires the guns.

Remember, you are flying the pursuit
aircraft (the gunsight), not the
Fokker, so the controls take a little
getting used to. The number pad is
your "stick." Pressing the up arrow
key (8) pushes the stick forward, and
because the gunsight is "fixed" in
the center of the screen, diving
makes the Fokker move (relatively)
up! Similarly, pulling back on the
stick with the down arrow key pulls
your plane's nose up, making the
Fokker move down. Left and right
arrow keys move you left (the Fokker
moves right) and right (the Fokker
moves left). Got it? The diagonal keys
work, too. This is useful because
you will often want to pull up and
left, or push down and right, simultaneously.

When  no  cursor  key  is  pressed, 
the  Fokker  simply  flies  off-screen 
upward  to  the  left,  which  is  analo¬ 
gous  to  your  plane  diving  to  the 
right.  This  is  undoubtedly  a  func¬ 
tion  of  the  random-number  genera¬ 
tor’s  less-than-perfect  randomness. 
What  is  ironic,  though,  is  that  it  is 
exactly  what  would  tend  to  happen 
were  you  to  let  go  of  the  controls  in 
a Sopwith Camel. "The Camel spun
very quickly, had a very sensitive
elevator control, and was very quick
on right-hand turns due to the gyroscopic
effects of the heavy rotary
engine and the short fuselage."
(Campbell, 1984).

Figure 3: Display of Fokker Dr.I triplane (measurements in pixels).
Range of random motion (limited to ensure overlap of bit planes).

To  just  type  and  run  the  program 
without  modification,  you  need  UR/ 
FORTH (a segmented Forth-83
model) from Laboratory Microsystems.
The video driver, EGAGRAPH.EXE,
must be installed before
UR/FORTH. The file DERFOKKR.IMG,
which  contains  the  image  created 
with  EGAPaint,  must  also  be  in  the 
active  directory.  Using  the  technique 
I  have  already  described,  you  may 
wish  to  build  your  own  aircraft  with 
a  paint  program,  load  it  into  video 
RAM,  and  use  the  bit-block  operator 
@BLOCK  to  capture  the  image  to 
memory.  You  then  only  have  to 
write  the  image  to  a  file  from  which 
it  can  be  loaded  whenever  needed. 

The fundamentals of the game are
specific neither to Forth nor to the
EGA. Alternatively, you can build an
aircraft using simple lines and boxes
that will work adequately with the
CGA. Figure 3 shows the basis for
the  bit  map  and  how  that  image  is 
overwritten.  Because  the  Fokker  bit 
map  is  located  by  the  coordinates 
of  the  upper-left  corner,  the  fuselage 
“hit  zone”  is  simply  offset  by  a  simi¬ 
lar  range  of  coordinates  surround¬ 
ing  the  corner. 

The  area  of  sky  within  the  bit  map 
beyond  the  actual  image  is  used  to 
blot  parts  of  the  previous  image 
when  a  new  one  is  written.  The 
extent  of  the  “extra”  sky  is  thus 
directly  related  to  the  amount  of 
random  motion  allowed  with  the 
image.  Experimentation  resulted  in 
limiting  the  maximum  range  of 
random  movement  from  the  current 
location  to  plus  or  minus  four 
pixels.  Combined  with  a  possible 
two  additional  pixels’  movement 
from  a  cursor  keypress,  the  maxi¬ 
mum  Fokker  movement  is  six  pixels 
per  loop  cycle.  The  area  of  sky  sur¬ 
rounding  the  Fokker  would  allow 
for  more  random  motion  without 
leaving  garbage  all  over  the  screen, 
but I don't think you'll really want it.

The  program  was  developed  using 
both  a  Compaq  Deskpro  (8086)  and 
a 6-MHz IBM PC/AT, with the
QUICKEYS.COM keyboard speedup program.
QUICKEYS.COM was listed in
a back issue of PC Magazine and can
be downloaded from its IRS bulletin
board. (Ray Duncan reports that
Cruise Control also works.) FOKKER
runs  fine,  albeit  a  little  slowly,  on  a 
vanilla  PC  without  QUICKEYS,  but 
with  anything  faster  than  a  4.77- 
MHz  8088,  it  requires  some  sort  of 
keyboard  speedup  program.  The 
normal  keyboard  routines  just  don’t 
read  the  stick  routines  fast  enough, 
and  the  Fokker  flies  off  the  screen 
despite  all  efforts  to  catch  him. 

UR/FORTH users may want to use
Tom Almy's Native Code Compiler
to make the game faster on a standard
PC, but it really doesn't need
the extra speed on faster machines.
On both Compaq 386 and Zenith
386 models, it is really too fast and
probably ought to have a slow-down
loop added to the gun-firing
routines. The rate of fire varies with
the  processor,  also.  An  80386  will 
fire  at  about  480  rounds  per  minute 
(per  gun),  a  possible  but  high  rate. 
(World  War  I  aircraft  synchronizers 
varied  the  rate  of  fire  with  engine 
speed.) 

The  Game  Listing 

The source is heavily commented.
Probably the easiest way to understand
the game is to examine the
main loop definition on Screen 15.
After  setting  initial  values,  display 
mode,  and  so  on,  DOGFIGHT  enters 
a  BEGIN ...  AGAIN  loop.  DR.l  puts 
the  Fokker  bit-map  address  on  the 
stack.  Next,  the  current  screen  coor¬ 
dinates  of  the  upper-left  CORNER  of 
the  bit  map  are  fetched  to  the  stack. 
MOVEFOKKER  leaves  two  random 
values  within  the  current  RANGE  of 
random  motion  on  the  stack.  These 
are vector-added to CORNER's x and
y. ?STICK then determines if a
cursor keypress is in the buffer and,
if it is, leaves values appropriate to
the key pressed on the stack; otherwise,
it leaves a pair of 0s. The
?STICK values are then vector-added
to the previously "moved" corner
values, and the Fokker is displayed
at the new location by !BLOCK. What
this  does  is  randomly  shift  the 
Fokker's  position  and  allow  your 
keypress  to  counteract  (somewhat) 
the  random  shift. 
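For readers who do not follow Forth stack gymnastics, one pass of that movement logic can be sketched in Python. This is not a translation of the listing: the function name, argument layout, and sign conventions are hypothetical, and only the idea (random offset plus stick offset added to the corner) comes from the text.

```python
import random

def move_fokker(corner, motion_range, stick, rng=random):
    """Return the new upper-left corner of the Fokker bit map.

    corner:       current (x, y) of the bit map's upper-left corner
    motion_range: random motion is drawn from -motion_range..+motion_range
    stick:        (dx, dy) from the keypad, (0, 0) if no key is waiting
    rng:          anything with a randint(a, b) method
    """
    x, y = corner
    rand_dx = rng.randint(-motion_range, motion_range)
    rand_dy = rng.randint(-motion_range, motion_range)
    stick_dx, stick_dy = stick
    return x + rand_dx + stick_dx, y + rand_dy + stick_dy

# With a range of 4 and a stick value of 2, no step exceeds six pixels.
print(move_fokker((300, 150), 4, (2, -2), random.Random(1987)))
```

Passing the random source in as a parameter is only a convenience for testing; the game itself just calls its own generator.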

Next,  SHOOTING?  (defined  in 
Screen  10)  polls  the  keyboard  and,  if 
a  key  has  been  pressed,  checks  to 
see  if  it  was  the  space  bar.  If  it  was, 
then  if  there  is  any  AMMO  left,  it 
fires  the  guns,  decrementing  the 
ammo  by  two  rounds,  and  calls 
HIT?  (defined  in  Screen  9)  to  see  if 
CORNER  is  within  the  x,y  range  that 
describes  a  hit  in  the  Fokker’s  cock¬ 
pit  zone.  If  there  is  a  hit,  it’s  actually 
two  hits  (two  guns,  right?),  and  so 
#HITS  is  incremented  by  2.  With 
each  four  hits  EXCITEMENT  incre¬ 
ments  the  RANGE  of  random  motion 
and  the  Fokker  pilot  shouts  a  curse, 
in  German,  to  distract  you.  To  avoid 
offending  anyone,  I  have  made  the 
curses  in  Screen  8  merely  illustra¬ 
tive,  but  you  can  probably  come  up 
with  somewhat  more  irritating  and 
appropriate  phrases. 
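The firing logic can be summarized in a few lines of Python. This is a sketch under stated assumptions, not the code from Screens 9 and 10: the state dictionary, the function name, and the hit-zone rectangle are hypothetical, and the cap of 4 on the range reflects the plus-or-minus-four-pixel limit mentioned earlier.

```python
MAX_RANGE = 4   # random motion is limited to plus or minus four pixels

def fire(state, corner, hit_zone):
    """Fire both guns once; return True if the burst scored.

    state:    dict with 'ammo', 'hits', and 'motion_range'
    corner:   current (x, y) of the bit map's upper-left corner
    hit_zone: ((x_min, y_min), (x_max, y_max)) corner values that count as a hit
    """
    if state["ammo"] <= 0:
        return False
    state["ammo"] -= 2                      # two guns, two rounds
    (x_min, y_min), (x_max, y_max) = hit_zone
    x, y = corner
    hit = x_min <= x <= x_max and y_min <= y <= y_max
    if hit:
        state["hits"] += 2                  # two guns, two hits
        if state["hits"] % 4 == 0:          # every four hits, more excitement
            state["motion_range"] = min(MAX_RANGE, state["motion_range"] + 1)
    return hit

state = {"ammo": 10, "hits": 0, "motion_range": 1}
zone = ((100, 100), (110, 110))             # made-up numbers for illustration
print(fire(state, (105, 105), zone), state)
```

Because the bit map is located by its corner, the hit test never looks at pixels; it only compares the corner against a fixed rectangle.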

Back  in  the  DOGFIGHT  loop,  you 
now  check  #HITS  again  to  see  if  the 
Fokker  has  been  hit  20  times.  If  it 
has,  it  EXPLODES  with  three 
BURSTS,  and  you  WIN  and  EXIT.  If 
it  hasn't  been  hit  20  times,  you 
check  the  AMMO  remaining.  If  it’s 
zero,  you  LOSE  and  EXIT.  Finally, 
you  redisplay  the  GUNSIGHT. 

The code for ?STICK in Screen 7
leaves -2s, 0s, or +2s depending on
the key pressed. Initially, the RANGE
of Fokker motion is set at -1 to +1.
This means that you can outfly the
Fokker easily, at first. With each four
hits in the fuselage of the triplane,
however, RANGE expands by -1 and
+1: after the first 4 hits, RANGE is
-2 to +2; after 16 hits, it is -4 to
+4, twice the range of control you
have  with  the  numeric  keypad.  Be¬ 
cause  you  are  making  a  conscious 
effort  to  move  in  one  direction,  and 
the  Fokker  is  moving  randomly,  you 
can  still  get  him  in  your  sights  after 
16  hits  (statistically).  But  he  doesn’t 
stay  there  very  long! 

Screen  1  loads  UR/FORTH’s  DOS 
level-2  interface  and  builds  the  file¬ 
handling  words.  Screen  2  assumes 
you  have  a  file  containing  the  Fokker 
image,  called  DERFOKKR.IMG,  in  the 
same  directory  and  loads  it  into 
memory.  Note  that  DR.l  allots  3,000 
bytes  whereas  the  file  is  only 
2,866  +  16  (palette)  bytes  long. 

There is something funny going
on in LMI's sizing of arrays for
@BLOCK and !BLOCK, and the "correct"
values seem to pick up garbage
somehow; the "oversized" 3,000
bytes is empirically adequate.

Availability 

All  the  source  code  for  articles  in 
this  issue  is  available  on  a  single 
disk.  To  order,  send  $14.95  to  Dr. 
Dobb’s  Journal,  501  Galveston  Dr., 
Redwood  City,  CA  94063,  or  call  (415) 
366-3600,  ext.  216.  Please  specify  the 
issue  number  and  format  (MS-DOS, 
Macintosh,  Kaypro). 

For  non-Forth  programmers  with 
an  EGA  display  who  want  to  pit 
their  skill  against  the  red  Fokker,  a 
complete  unprotected  run-time  ver¬ 
sion  (load-and-go  .EXE  file)  is  avail¬ 
able  from  me  for  $14 — cash,  check, 
or  M.O.  only;  no  purchase  orders, 
please.  Have  fun! 

Bibliography 

Campbell, Christopher. Aces and Aircraft of World War I. New York:
Greenwich House, distributed by Crown Publishers Inc., 1984: 139.

Cockerham, John T. "The EGA Standard." PC Tech Journal, vol. 4, no. 10
(October 1986): 49-79.

Hoffmann, Thomas V. "Graphic Enhancement." PC Tech Journal, vol. 3,
no. 4 (April 1985): 58-77.

Mansfield, Victor. "Scientific Graphics with the EGA." PC Tech Journal,
vol. 3, no. 9 (September 1985): 163-170.

IBM Personal Computer Seminar Proceedings, vol. 2, no. 11 (November
1984).

Petzold, Charles. "Exploring the EGA, Part 1." PC Magazine, vol. 5, no.
14 (August 1986): 367-384.

Petzold, Charles. "Exploring the EGA, Part 2." PC Magazine, vol. 5, no.
15 (September 16, 1986): 367-384.

Ross, Hugh N. "Programming the Enhanced Graphics Adapter." Exchange
(September/October 1986): 14-17. International Business Machines Corp.

Wilton, Richard. "Programming the Enhanced Graphics Adapter." Byte,
vol. 10, no. 11 (Fall 1985).


DDJ 

(Listings  begin  on  page  88.) 

Vote for your favorite feature/article.
Circle Reader Service No. 4.


Dr.  Dobb’s  Journal,  November  1987 



ARTICLES 


Automated  Interrupt 
Handling  in  C 


by  Ron  Miller 


A  good  case  could  be  made 
that  anything  that  makes  the 
writing  of  resident  programs 
easier  is  socially  counterproductive. 
There  are  too  many  of  those  darned 
things  already.  But  at  times  it  can 
be  useful  to  hack  out  a  quick  resi¬ 
dent  utility  to  serve  a  transient  pur¬ 
pose  such  as  reconfiguring  hard¬ 
ware  on  the  fly  or  offering  an  extra 
help  screen,  in  much  the  same  way 
as  it  is  sometimes  useful  to  hack  out 
a  file  filter  or  a  printer  configuration 
utility.  If  you  need  to  do  such  things, 
doing  them  in  a  high-level  language 
adds  immeasurably  to  the  ease  and 
accuracy  of  the  results. 

Any  MS-DOS  programmer  who 
writes  memory-resident  utilities  or 
works  with  serial  communications 
knows  that  coding  interrupt  han¬ 
dlers  is  a  task  that  quickly  becomes 
tedious  without  ever  becoming  rou¬ 
tine.  Debugging  one  more  run  of 
assembly  language — especially  some 
assembly  language  triggered  by  an¬ 
other  program — inevitably  burdens 
any project with considerable overhead.
If only there were some way
to get the requisite PUSHs, POPs,
CLIs, STIs, and MOVs straight once
and for all.

If  a  set  of  assembly-language  rou¬ 
tines  for  interrupt  handling  could 
be  regularized  and  debugged  and 
compiled,  the  modules  could  be 
stored  away  in  a  library  for  linking 
with  high-level  code  used  for  the 
bulk  of  the  interrupt  handlers  them¬ 
selves. 

Make your own TSR utilities

Ron Miller, 1157 Ellison Dr., Pensacola,
FL 32503. Ron is a regular
contributor to Micro Cornucopia.

Unfortunately, interrupt handlers
are exceedingly specialized objects.
Different registers must be set and
preserved; different interrupts called;
and in some cases, flags must be
returned intact. The programmer
who  writes  everything  from  scratch 
in  assembly  language  at  least  retains 
absolute  control  over  all  registers 
and  flags,  but  once  you  move  into  a 
high-level  language  such  as  C,  the 
registers  and  flags  cease  to  be  easy 
to  supervise.  If  every  line  of  C  must 
be  scrutinized  to  determine  what 
registers  will  be  altered  and  what 
flags  will  be  set,  why  not  just  stick 
with  low-level  coding? 

I  have,  I  believe,  devised  some¬ 
thing  close  to  a  minimal  system  for 
automating  the  process  so  that  in¬ 
terrupt  handling,  no  matter  how  com¬ 
plex,  can  be  taken  out  of  the  realm 
of  assembly  language  and  be  made 
routine.  Inevitably,  there  are  patches 
of  assembly  language  in  the  system, 
and  some  of  it  is  compiler  and/or 
memory  model  dependent.  These 
routines,  however,  can  be  done 
once,  then  put  away  for  linking  on 
demand. 

A  Sketch  of  the  Process 

The  key  to  successfully  implement¬ 
ing  this  system  is  a  two-step  proce¬ 
dure  in  which  a  standardized  low- 
level handler is invoked by the interrupt
itself. This invariant first stage
transforms the registers (and the
stack, if necessary) so that the routine
can act as a simple C function
that calls another C function to do
the actual work of the handler. This
reestablishment  of  the  C  environ¬ 
ment  allows  the  high-level  handler 
to  use  the  full  range  of  C  syntax,  to 
access  the  run-time  package,  and  to 
employ  automatic  and  previously  in¬ 
itialized  static  variables.  Variations 
from  handler  to  handler  can  thus 
be  confined  to  the  far  more  easily 
maintained  context  of  a  high-level 
language. 

Four  linkable  assembly-language 
routines  are  involved  in  every  appli¬ 
cation: 

1.  A  trivial  initializing  routine  that 
stores  the  data  segment  of  the  C 
code  in  the  code  segment  of  the 
low-level  handler. 

2.  A  generalized  low-level  handler 
routine  that: 

•  PUSHes the registers

•  resets  the  ds  register  (and  per¬ 
haps  other  registers)  to  the  values 
needed  for  operation  by  the  actual 
resident  C  code 

•  calls  the  high-level  handler  itself 

•  POPs  the  registers  on  return 
from  the  high-level  handler 

•  returns  from  the  interrupt  itself 

3.  An  interrupt  function  that  can 
address  the  operating  system  from 
within  the  high-level  handler.  As  you 
will  see,  this  function  must  use  the 
register  stack  generated  by  the  low- 
level  handler  routine  as  its  data-in 
and  data-out  structure. 

4.  A  routine  that  swaps  the  32-bit 
address  of  the  low-level  interrupt 
handler  for  the  interrupt  vector 
being  captured  while  moving  the 
original  vector  to  an  unoccupied  lo¬ 
cation  in  the  interrupt  table  for  call¬ 
ing  or  chaining. 




Dr.  Dobb’s  Journal,  November  1987 



In the code fragments to follow, I
neglect to set up the necessary
PROCs, ASSUMEs, and SEGMENTs
for linking because those housekeeping
details will vary from compiler
to compiler. The [bp+xx] addressing
may be displaced by a word or
two  from  what  you  need,  but  a 
glance  at  a  bit  of  assembly-language 
output  from  your  compiler  should 
reveal  the  necessary  adjustments. 

Storing  the  Data  Segment 
Address 

At  the  very  least,  for  a  handler’s 
resident  C  code  to  work  properly, 
the  ds  register  must  be  reset  to  its 
original  value.  This  is  made  possible 
by  inserting  a  saveds( )  call  some¬ 
where  in  the  initializing  code  before 
any  vector  swapping  is  carried  out. 




The object library of the programmer
should therefore contain a compiled
version of the following code:

        PUBLIC  saveds_, c_ds

c_ds    dw      0       ; In-code storage slot for DS.

saveds_:                ; Or however your compiler/assembler alters
                        ; public names for assembly-language reference.
        mov     cs:c_ds, ds     ; Save the C data segment in the code segment.
        ret

This  assumes  that  your  handler  is 
not  so  complex  that  it  requires  a 
separate  stack;  if  it  does,  ss  and  sp 
for  the  internal  stack  could  also  be 
squirreled away for recall. My advice,
however, is to keep auto variables to
the minimum needed for communication
between C functions. Play storage
games with static variables if
considerable storage is needed.
Keep it simple; use the other fellow's
stack.

The Low-Level
Interrupt Handler

The outline given earlier provides
the rationale for the code in Listing
One, page 100. This routine, labeled
Lhand1(), provides the segment/offset
address actually inserted into
the interrupt table. In all the code
to follow, I use the prefix L to tag
low-level interrupt handlers and H
to tag high-level ones.

Because some interrupts return
information in the flags, you cannot
use iret to end the routine. FAR ret
2 strips the old flags off the stack to
preserve the return. If this routine is
not assembled as a FAR procedure,
the ret 2 must be replaced by db
0cah,2,0 so that a FAR return to the
caller is made. I urge you, if at all
possible, to write your high-level handlers
in a large-model C so that all calls
and returns are FAR and so that the
entire memory of the computer is
available to operations using pointers.
The small decrease in code size
using a small-model C is more than
paid for by the need to allocate peek
and poke room in the data segment
when FAR manipulations must be
made.




You'll note that the routine calls
an EXTERN Hhand1_. Naturally,
you must name your high-level handler
Hhand1() so that the linker can
find it. The 1 in the name allows you
to place several versions (Lhand1(),
Lhand2(), Lhand3(), and so on) into
your library for use in complex programs
involving several interrupt handlers.
As long as each low-level handler
calls its proper high-level partner,
there isn't any confusion.

As you will see in the next section,
the power of this strategy depends
upon having the call to the
high-level handler immediately preceded
by the push-the-registers statements.
If you insist upon setting up
your own stack, swap ss and sp
before the PUSHs and after the POPs
so the register stack also serves as
the stack for the C routine.

The  High-Level  Handler 

The  design  of  the  high-level  handler 
depends  on  an  interesting  feature 
of  the  C  language:  setting  up  the 
stack  before,  and  cleaning  up  the 
stack  after  a  function  call,  are  the 
responsibilities  of  the  calling  rou¬ 
tine,  not  of  the  function  itself.  C 
does  this,  I  gather,  to  allow  for  the 
passing  of  a  variable  number  of  argu¬ 
ments.  One  secondary  consequence 
of  this  design  is  that  your  higher- 
level  function  can  be  fooled  into 
treating  the  stack  it  inherits  as  if  it 
contained  function  arguments. 
Thus,  if  you  give  your  C  code  han¬ 
dler  a  fake  argument  such  as: 

Hhand1(fake)
int fake;
{...}

the compiler will treat the 2-byte
region containing the pushed value
of the ax register as though it were
an  integer  that  had  been  passed  to 
the  function.  In  effect,  the  entire 
stack  of  pushed  register  values  be¬ 
comes  available  to  the  high-level  han¬ 
dler  as  an  automatic  variable. 

If within the handler you declare:

typedef struct {
    int ax,bx,cx,dx,di,si,bp,ds,es,flags;
} REGS;

REGS *regs;

and then initialize regs to point to
the base of the stack:

regs = (REGS *) &fake;

the stack is mapped to that structure.
(In practice, of course, this structural
definition would be handled
globally with an #include statement.)
Want to set cx on the stack to 0?
Write regs->cx = 0; without leaving
C. After the higher-level routine is
exited and the low-level routine
POPs the stack it had PUSHed, the
original calling routine will see a
return of 00 in cx. It doesn't matter
what the C routine has done to the
actual value of cx in the meantime,
because upon POPping the registers,
the low-level routine will restore the
old values, unless some purposeful
changes have been made to the virtual
structure *regs.
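The overlay trick can be imitated away from a PC with Python's ctypes module, which makes it easy to see that writing through the structure really changes the underlying "stack" bytes. This is only an illustration of the idea, not the handler itself: the bytearray stands in for the pushed registers, the class name is hypothetical, and 16-bit fields are assumed to match the int size of the compilers discussed.

```python
import ctypes

class Regs(ctypes.LittleEndianStructure):
    # Same field order as the REGS structure in the text, 16 bits each.
    _fields_ = [(name, ctypes.c_uint16) for name in
                ("ax", "bx", "cx", "dx", "di", "si", "bp", "ds", "es", "flags")]

# Stand-in for the block of register values pushed by the low-level handler.
stack = bytearray(ctypes.sizeof(Regs))
stack[4:6] = (0x1234).to_bytes(2, "little")   # pretend cx held 1234h

regs = Regs.from_buffer(stack)   # overlay the structure on the raw bytes
cx_before = regs.cx

regs.cx = 0                      # the equivalent of regs->cx = 0;
print(hex(cx_before), bytes(stack[4:6]))
```

Nothing is copied: the structure and the byte buffer are two views of the same memory, which is exactly why the popped registers reflect whatever the high-level code stored.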

The value of this ploy becomes
clearer with a specific example. Suppose
you wish to write a resident
program that captures interrupt 16h
and checks for five different hot keys
that trigger five alternative routines.
You would put a long pointer to
Lhand1() in the place of int 16h in
the interrupt table and move the old
int 16h vector out at NEW16. The
higher-level routine could be written
as in Listing Two, page 100.

This, I submit, is considerably
easier to write and maintain than
an equivalent assembly-language routine
full of CMPs, labels, and
leapfroggings.

The  Interrupt  Function 

With  luck,  your  particular  version  of 
C  offers  an  "interrupt”  function  that 
permits  the  stack  manipulations  de¬ 
scribed  in  the  previous  section.  The 
requirements  are: 

•  the  use  of  a  pointer-to-a-register- 
structure  as  an  argument  for  the 
calling  function 

•  the  inclusion  of  all  data-bearing 
registers  (including  a  separate  "flag” 
integer)  in  that  structure 

•  the  potential  for  using  a  single 
memory  region  as  both  *inregs  and 
*outregs,  to  use  the  old  Lattice 
int86( )  terminology.  In  this  case  all 
you  need  to  do  is  be  sure  to  PUSH 
and  POP  the  registers  in  the  low- 
level  handler  in  exactly  the  order 
ordained  by  the  function  definition 
you  have  inherited. 

If,  as  seems  likely,  your  compiler 
has  made  other  choices,  you  can 
find  a  suitable  version  for  assem¬ 
bling  in  Listing  Three,  page  100. 
There  is  probably  no  need  to  ana¬ 
lyze  the  code  at  length,  except  to 
observe  that  the  version  offered  as¬ 
sumes  32-bit  pointers  and  a  need  to 
preserve  ds  and  bp  across  the  func¬ 
tion  call.  Anyone  who  will  profit 
from  automating  interrupt  handling 
will  be  able  to  check  his  or  her  own 
compiler’s  assembler  output  and 
modify  the  base  pointer  addressing 
to  fit  the  brand  and  the  memory 
model at hand. The basic strategy is
to load the registers from the structure
pointed at, make the call in question,
reload the structure with the returning
values in the registers, and exit.

It is worth noting that the interrupt function in Listing Three
has the added virtue of being reentrant, which can be significant
if you're capturing more than one interrupt vector in a resident
program. Earlier drafts of this system broke down because I tried
movsbing parts of the stack back and forth to the global array
used by my compiler's interrupt function. Why not adapt the
situation, I reasoned, to the ready-made function? Things went
swimmingly in simple hot-key routines. When I used the clock
interrupt to poll the DOS-is-interruptible flag given by interrupt
21h, service 34h, however, the register stacks for various
interrupts began to corrupt one another 18.2 times a second.
Moreover, the compiler's routine did not return the raw flags but
provided separate carry and zero flag Booleans. I found myself
having to AND and OR the stack to set up the return. It seemed
simpler to write my own function.

The  Interrupt  Vector 
Swapping  Routine 

The code for the interrupt vector swapping routine is in Listing
Four, page 101. It uses int 21h, functions 35h and 25h, to get and
put interrupt vectors. By setting up a standard function that uses
the old interrupt number, the new (or chained) interrupt number,
and a pointer to the low-level handler, you can carry out the
entire process of vector swapping without explicit recourse to
assembly language. Notice that the swap routine returns nonzero in
ax (modify it if ax is not your integer-return register) if the
chaining interrupt is in use. Such a feature could be useful in a
find-an-unused-interrupt routine.

Once again I assume 32-bit pointers and preserve bp and ds. A
small-model C routine would require more explicit loading of
segment registers into ds before using service 25h. In any case,
the name of the low-level interrupt handler (in this case, Lhandl)
is used as the final function argument because C compilers treat a
function name by itself as a function pointer, 32- or 16-bit
depending on the memory model.


Setting  Things  Up 

If the four functions I've discussed have been compiled and stored
in an object file library, setting up a resident program is
scarcely more complicated than the writing of the handler code
itself. Listing Five, page 102, provides a template for
initializing a generic resident program. For clarity's sake I
capture a single interrupt; however, there is no limit to the
number of vectors that can be inserted into the table with
multiple chgint()s. Certainly, in production software the
error-handling would be more complex.

To estimate the program size, I ordinarily use interrupt 21h,
service 51h, to obtain the PSP address of the C program, then
subtract that number from the paragraph address of the top of the
static heap, plus a couple more paragraphs just to be on the safe
side.

DDJ 

(Listings begin on page 100.)


ARTICLES 


An  Alternative 
to  Soundex 

by  Jim  Howell 


... if things had to be identical before
you could recognize them, you would
never recognize anything at
all.... The fact is that our ability to
work and to act in the real world depends
on our accepting a "tolerance"
in our recognition and in our language.

— Jacob  Bronowski,  A  Sense  of  the 
Future 

Much  of  the  art  of  communi¬ 
cation  relies  on  the  ability 
to  differentiate  what  is 
meant  from  what  is  said.  This  goes 
far  toward  explaining  why  I  have  so 
much  trouble  at  the  keyboard;  com¬ 
puters  have  none  of  the  tolerance 
that a bad (and slightly dyslexic) typist
such as myself needs. My computer
doesn't understand that "dri"
means "dir," so it yells "Bad command
or file name," then sits there,
waiting for something intelligible.

To  you  and  me,  it’s  obvious  what 
command  I  actually  meant,  but  to  the 
computer,  lacking  the  recognition 
tolerance  shaped  by  millions  of  years 
of  evolution,  the  input  is  gibberish. 
As  simple  as  this  example  seems,  it 
has  proved  a  difficult  problem  to 
solve  through  some  mechanism  that 
a  computer  can  use  efficiently.  In 
spite  of  the  time  and  effort  that  have 
gone  into  attempts  to  solve  this  prob¬ 
lem  (the  venerable  Soundex  algo¬ 
rithm  dates  from  World  War  I),  no 
wholly  satisfactory  solution  has  yet 
come  to  light. 

But that isn't to say that there haven't been gains on the
problem. In fact, another promising solution has recently arisen
that is certainly worth passing on, warts and all. The algorithm
comes from Michael Bickel and was published in the March issue of
the CACM.1

Upping the word-matching hit rate

Jim Howell, 45 Lodgepole Dr., Evergreen, CO 80439. Jim is a former
geologist who now works as a programmer for Reference Technology
in Evergreen.

The  method  is  simple — each  letter 
has  a  weight  associated  with  it.  The 
"likeness  score”  between  two  words 
is  calculated  by  summing  the  weights 
of  all  the  letters  in  common  between 
the  two  words,  each  letter  being 
counted  only  once.  Letters  are  con¬ 
verted  to  uppercase,  and  numbers, 
punctuation,  and  white  space  are  ig¬ 
nored.  The  method  can  be  easily 
made  sensitive  to  case  or  to  consider 
numbers. 

The weights are based on the frequencies of the letters, being the
negative of the logarithm (base 2) of the frequency of the letter
(the frequencies sum to 1.0). This gives greater weight to the
less common letters. Bickel's weights are given in Table 1, below.

Weight   Letters
  3      a e i n o s t
  4      d h l r u
  5      c f g m p w
  6      b v
  7      k q
  8      j x y
  9      z
  0      all others

Table 1: Letter weights used in Bickel's algorithm

         Bit
         7  6  5  4  3  2  1  0
Byte 0   -  a  e  i  n  o  s  t
Byte 1   -  -  -  d  h  l  r  u
Byte 2   -  -  c  f  g  m  p  w
Byte 3   b  v  k  q  j  x  y  z

Table 2: Positions in the letter set

As an example, say you want to
match the word ecdysiast. You first
create a set (I call this a letter set;
Bickel uses the name MASK), acdeisty,
from the letters in the word.
Now consider the word ecstasy,
which has the letter set acesty. To
compare these, take the intersection
of the two sets, namely, acesty. The
first  letter  in  the  new  set  is  a.  The 
weight  of  this  letter  is  3,  so  the  like¬ 
ness  score  is  set  to  this.  The  next  letter 
is  c,  so  you  add  5  to  the  score.  Then 
add  3  for  e,  3  for  s,  3  more  for  t,  and  8 
for  y  to  give  a  likeness  of  25. 

To  compare  ecdysiast  to  ectoplasm, 
you  form  the  letter  set  acelmopst. 
The  intersection  with  the  set  of  ec¬ 
dysiast  is  acest,  and  so  the  likeness 
score  of  these  two  words  is  only  17.  If 
you  were  to  transpose  two  letters 
and  compare  edcysiast,  say,  the  like¬ 
ness  scores  would  still  be  25  and  17. 
The  score  of  comparing  ecdysiast 
and  edcysiast  would  be  32,  however, 
so  ecdysiast  becomes  the  word  to 
look  at  more  closely. 

Implementation 

Bickel  illustrated  the  algorithm  using 
NOMAD2  (a  fourth-generation  lan¬ 
guage),  but  it  can  just  as  easily  be  im¬ 
plemented  in  C  using  a  bit  set  with  26 
members.  For  each  word,  set  the  ap¬ 
propriate  bit  for  each  letter,  leaving 
the  rest  unset.  Then  combine  the  two 
sets  with  a  logical  AND.  Only  the  com¬ 
mon  letters  are  left  in  the  combined 
set,  which  are  then  added  using  the 
appropriate  weights. 
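
The simplest form of this idea can be sketched with one 32-bit word per letter set, bit 0 standing for a and bit 25 for z. This is only an illustration under that assumption, not Bickel's NOMAD2 code or the layout used in my listing; the names weight, letter_set(), and likeness() are made up for the example, and the weights are those of Table 1.

```c
#include <ctype.h>
#include <stdint.h>

/* Table 1 weights, indexed by letter - 'a'. */
static const int weight[26] = {
 /* a  b  c  d  e  f  g  h  i  j  k  l  m */
    3, 6, 5, 4, 3, 5, 5, 4, 3, 8, 7, 4, 5,
 /* n  o  p  q  r  s  t  u  v  w  x  y  z */
    3, 3, 5, 7, 4, 3, 3, 4, 6, 5, 8, 8, 9
};

/* Build the letter set: one bit per distinct letter, case ignored,
   digits, punctuation, and white space skipped. */
uint32_t letter_set(const char *word)
{
    uint32_t set = 0;

    for (; *word; word++) {
        int c = tolower((unsigned char)*word);
        if (c >= 'a' && c <= 'z')
            set |= (uint32_t)1 << (c - 'a');
    }
    return set;
}

/* AND the two sets, then add the weight of every letter left over. */
int likeness(uint32_t a, uint32_t b)
{
    uint32_t common = a & b;
    int score = 0;
    int i;

    for (i = 0; common != 0; i++, common >>= 1)
        if (common & 1)
            score += weight[i];
    return score;
}
```

With these definitions, likeness(letter_set("ecdysiast"), letter_set("ecstasy")) reproduces the score of 25 worked out above, and the ectoplasm comparison gives 17.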

You can do this in several ways. I chose to use a 4-byte array (of
type unsigned char) to hold the letter set. A static data array,
which is indexed by lowercase letters, is initialized with the
index to the appropriate byte in the letter set array along with
the mask necessary to set the correct bit (see Listing One, page
110). The letter set array is designed so that each of the first
three bytes holds letters of a single weight, with the rest being
stuffed into the fourth byte (see Table 2, page 54).

To  construct  the  letter  set  for  a 
word,  go  through  the  word  charac¬ 
ter  by  character.  If  the  character  is  a 
letter,  convert  it  to  lowercase  and  use 
this  as  an  index  into  the  data  array  to 
set  the  appropriate  bit  in  the  letter  set 
array.  Duplicate  letters  result  in 
needless  repetition;  once  a  bit  is  set, 
setting  it  again  doesn’t  do  any  good, 
but  any  scheme  to  avoid  setting  an 
already  set  bit  would  have  too  much 
overhead,  so  the  problem,  such  as  it 
is,  is  ignored. 
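
The construction step might look like the following sketch. It follows the bit positions of Table 2 but is not a copy of Listing One: the names slot, slots, and make_letter_set() are hypothetical, and the table is indexed by letter minus 'a', which assumes a character set (such as ASCII) in which the lowercase letters are contiguous.

```c
#include <ctype.h>
#include <string.h>

/* For each letter: which byte of the letter set it lives in,
   and the mask that sets its bit (positions from Table 2). */
struct slot {
    unsigned char byte;
    unsigned char mask;
};

static const struct slot slots[26] = {
    {0, 0x40}, /* a */  {3, 0x80}, /* b */  {2, 0x20}, /* c */
    {1, 0x10}, /* d */  {0, 0x20}, /* e */  {2, 0x10}, /* f */
    {2, 0x08}, /* g */  {1, 0x08}, /* h */  {0, 0x10}, /* i */
    {3, 0x08}, /* j */  {3, 0x20}, /* k */  {1, 0x04}, /* l */
    {2, 0x04}, /* m */  {0, 0x08}, /* n */  {0, 0x04}, /* o */
    {2, 0x02}, /* p */  {3, 0x10}, /* q */  {1, 0x02}, /* r */
    {0, 0x02}, /* s */  {0, 0x01}, /* t */  {1, 0x01}, /* u */
    {3, 0x40}, /* v */  {2, 0x01}, /* w */  {3, 0x04}, /* x */
    {3, 0x02}, /* y */  {3, 0x01}  /* z */
};

/* Fill set[0..3] from word; nonletters are ignored, and a repeated
   letter simply sets an already set bit again. */
void make_letter_set(const char *word, unsigned char set[4])
{
    memset(set, 0, 4);
    for (; *word; word++) {
        int c = tolower((unsigned char)*word);
        if (c >= 'a' && c <= 'z')
            set[slots[c - 'a'].byte] |= slots[c - 'a'].mask;
    }
}
```

For ecdysiast this yields the bytes 73h, 10h, 20h, and 02h: a, e, i, s, and t in the first byte, d in the second, c in the third, and y in the fourth.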

Calculating the likeness score from the two letter sets is
straightforward. The first three bytes use a common algorithm.
Combine the appropriate bytes from the two letter sets into a mask
with a logical AND; then examine the rightmost bit of the
resulting byte, and if it is set, increase the likeness score by
the appropriate weight; and finally, shift the mask to the right
one bit and examine the rightmost bit again, and so on through the
bits, using a for loop.

The  fourth  byte  is  more  complicat¬ 
ed  because  its  letters  have  different 
weights.  The  easiest  solution  is  to  op¬ 
erate  on  each  bit  individually,  in  ef¬ 
fect  hard-wiring  the  weights.  Speed 
is  most  important  here,  space  and  el¬ 
egance  less  so. 
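
Put together, the scoring step could be sketched as below. This is an illustration under the Table 2 layout rather than the code of my listing; the function name likeness() and the array byte_weight are assumptions, and the loop stops as soon as the mask is empty instead of always running through all eight bits.

```c
/* Score two 4-byte letter sets laid out as in Table 2. */
int likeness(const unsigned char a[4], const unsigned char b[4])
{
    /* Every letter in bytes 0, 1, and 2 shares one weight per byte. */
    static const int byte_weight[3] = { 3, 4, 5 };
    int score = 0;
    int i;
    unsigned char m;

    for (i = 0; i < 3; i++)
        for (m = a[i] & b[i]; m != 0; m >>= 1)
            if (m & 1)
                score += byte_weight[i];

    /* The fourth byte mixes weights, so each bit is tested by hand. */
    m = a[3] & b[3];
    if (m & 0x80) score += 6;   /* b */
    if (m & 0x40) score += 6;   /* v */
    if (m & 0x20) score += 7;   /* k */
    if (m & 0x10) score += 7;   /* q */
    if (m & 0x08) score += 8;   /* j */
    if (m & 0x04) score += 8;   /* x */
    if (m & 0x02) score += 8;   /* y */
    if (m & 0x01) score += 9;   /* z */
    return score;
}
```

Fed the letter sets for ecdysiast (73h, 10h, 20h, 02h) and ecstasy (63h, 00h, 20h, 02h), the function returns 25, the same figure obtained by hand earlier.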

As  an  example,  let’s  search  the  set 
of  keywords  in  Table  3,  right,  using 
several  common  (for  me)  misspell¬ 
ings  of  geochemistry.  Using  geochem- 
sitry  (s  and  i  transposed),  the  highest 
likeness  scores  are: 


geochemistry  (46) 

biochemistry  (41) 

geochronometry  (40) 

geothermics  (38) 

geophysical  (34) 


A search with another misspelling,
geochemistru (u mistyped for y),
gives:

geochemistry  (38) 

geothermics  (38) 

biochemistry  (33) 


geochronometry  (32) 

geochemical  (28) 

And  using  geochemistr  (y  dropped), 
you  have: 


geochemistry  (38) 

geothermics  (38) 

biochemistry  (33) 

geochronometry  (32) 

geochemical  (28) 


If  the  search  word  is  simply  chem- 
sitry  (with  the  transposition),  the  best 
matches  are: 


geochemistry  (38) 

biochemistry  (38) 

geochronometry  (32) 

geothermics  (30) 

geophysical  (26) 


The  last  three  examples  point  out 
the  main  weakness  of  this  method: 
letter  order  is  not  significant.  To  us, 
geochemistr  is  nothing  like  geother¬ 
mics,  but  the  method  is  too  simple  to 
detect  this.  Other  algorithms  can 
check  for  transpositions,  or  altered 
letters,  or  dropped  letters,2  but  be¬ 
cause  of  combinatorial  complexities, 
these  can  be  slow.  The  strength  of 
Bickel's  algorithm  lies  in  its  simplicity, 
which  in  practice  means  speed.  It  can 
be  used  as  a  first  pass  to  select  likely 
candidates  for  the  slower  routines. 
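
One way to use the method as a first pass is to keep only the list entries whose likeness score reaches some cutoff and hand those to the slower routine. The sketch below assumes a 32-bit letter set (bit 0 for a through bit 25 for z) rather than the 4-byte layout of my listing; the name select_candidates() and the cutoff parameter are hypothetical.

```c
#include <ctype.h>
#include <stdint.h>

/* Table 1 weights, indexed by letter - 'a'. */
static const int weight[26] = {
    3, 6, 5, 4, 3, 5, 5, 4, 3, 8, 7, 4, 5,
    3, 3, 5, 7, 4, 3, 3, 4, 6, 5, 8, 8, 9
};

uint32_t letter_set(const char *word)
{
    uint32_t set = 0;

    for (; *word; word++) {
        int c = tolower((unsigned char)*word);
        if (c >= 'a' && c <= 'z')
            set |= (uint32_t)1 << (c - 'a');
    }
    return set;
}

int likeness(uint32_t a, uint32_t b)
{
    uint32_t common = a & b;
    int score = 0;
    int i;

    for (i = 0; common != 0; i++, common >>= 1)
        if (common & 1)
            score += weight[i];
    return score;
}

/* Store in out[] the indices of the entries of list[] that score at
   least min_score against query; return how many were stored.
   out[] must have room for n entries. */
int select_candidates(const char *query, const char *const list[], int n,
                      int min_score, int out[])
{
    uint32_t q = letter_set(query);
    int found = 0;
    int i;

    for (i = 0; i < n; i++)
        if (likeness(q, letter_set(list[i])) >= min_score)
            out[found++] = i;
    return found;
}
```

With geochemsitry as the query and a cutoff of 40, only geochemistry, geochronometry, and biochemistry survive from the scores quoted above, so the expensive comparison runs three times instead of once per keyword.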

Limitations 

Bickel's algorithm has some trouble with short words, especially
if the letters are high-frequency ones. Such a word could match a
great many other words, a major problem if a given word list is
long. In fact, any word composed of frequent letters behaves in
this way, no matter what the length. The longer the list to be
searched, the greater the problem.

geochemical
geochemistry
geochronology
geochronometry
geodesy
geographic
geologic
geological
geomac
geometry
geomorphology
geophysical
geophysics
geopressure
geothermal
geothermics
biochemistry

Table 3: Keywords taken from a reference list.

As  an  example,  a  search  of  the 
keyword  list  using  meter  gives: 


geochemistry  (15) 

geothermics  (15) 

geochronometry  (15) 

biochemistry  (15) 

geothermal  (15) 

geometry  (15) 


In  a  long  word  list,  the  search  should 
be  restricted — for  example,  by  as¬ 
suming  that  the  first  letter  is  correct. 
Even  this  may  not  be  enough,  as  with 
George: 


geomorphology  (15) 

geochemistry  (15) 

geochronology  (15) 

geochronometry  (15) 

geopressure  (15) 

geographic  (15) 

geothermal  (15) 

geothermics  (15) 

geometry  (15) 


Here  additional  criteria,  such  as  hav¬ 
ing  the  lengths  of  the  words  be 
"close,”  could  be  profitably  used  to 
narrow  down  the  search. 
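
Both restrictions are cheap to apply before any scoring is done. The following sketch shows one possible prefilter; the function name plausible() and the idea of passing the allowed length difference as a parameter are assumptions for the example, and what counts as "close" is left to the caller.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return nonzero if candidate is worth scoring against query:
   the first letters must agree (case ignored) and the lengths
   may differ by at most max_len_diff characters. */
int plausible(const char *query, const char *candidate, int max_len_diff)
{
    int diff;

    if (tolower((unsigned char)query[0]) !=
        tolower((unsigned char)candidate[0]))
        return 0;

    diff = (int)strlen(query) - (int)strlen(candidate);
    return abs(diff) <= max_len_diff;
}
```

With a tolerance of two characters, George would still be compared with geodesy but no longer with geochronometry, and meter would not be compared with any of the geo- keywords at all.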

Thoughts  on  Optimisation 

In  a  real  application  of  this  algorithm, 
you  would  probably  want  to  build  it 
for  speed,  which  will  mean  a  bout 
with  the  assembler.  But  preplanning 
can  help.  If  there  are  only  a  handful 
of  words  in  the  list  to  be  searched, 
any  savings  in  time  will  likely  be  too 
small  to  be  noticed  by  the  user.  Also, 
you  could  save  some  time  by  con¬ 
structing  the  letter  sets  in  advance  so 
that  speed  wouldn't  be  as  important. 
Fast  code  is  most  essential  when 
there  is  a  large  list  to  search  and  in 
interacting  with  an  impatient  user. 

For a fast assembler routine, I would unfold all the loops, making
the code bulky but avoiding all the overhead of incrementing and
comparing the counter. The masks and weights would be built in as
immediate data, again sacrificing space for speed. Order is not
important here, so by putting the least common letters at the end,
it would be possible to check for an empty set and return early
from the function if there was nothing more to check.

It  is  possible  to  save  some  space  and 
a  little  speed  by  eliminating  the  two 
most  common  letters,  e  and  t.  The  let¬ 
ter  set  could  then  be  stored  in  three 
bytes,  and  two  nearly  inevitable  com¬ 
parisons  could  be  eliminated  from 
the  likeness  calculation. 

Comparison  with  the 
Soundex  Algorithm 

The  Soundex  algorithm  is  a  creaking 
ancient  by  computer  standards,  but 
because  it  is  still  widely  used,  it  is 
worth  comparing  it  with  Bickel's 
new  algorithm.  To  review  (or  intro¬ 
duce)  it,  the  Soundex  algorithm  con¬ 
structs  a  code  by  preserving  the  first 
letter  of  a  name.  From  the  rest,  elimi¬ 
nate  all  vowels  and  the  letters  h  and 
w  along  with  all  doubled  letters 
(keeping one of them). The remaining
letters (except the first) are replaced
with numbers. The key is the
initial letter and the first three numbers
of the rest of the name, padded
with 0s if necessary.3
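
For readers who have not met it, the Soundex rules can be sketched in a few lines of C. This follows the description in Knuth (note 3), in which letters with the same code are merged only when they are adjacent in the original name; other published variants treat h and w differently. The function name soundex() and the table name digit are made up for the example.

```c
#include <ctype.h>

/* Soundex digit for each letter a..z; '0' marks a dropped letter
   (the vowels and h, w, y). */
static const char digit[27] = "01230120022455012623010202";

/* Write the four-character Soundex key of name into key[0..3],
   followed by a terminating '\0'. */
void soundex(const char *name, char key[5])
{
    int n = 0;          /* characters written so far */
    char prev = '0';    /* code of the previous letter */

    for (; *name && n < 4; name++) {
        int c = tolower((unsigned char)*name);
        char d;

        if (c < 'a' || c > 'z')
            continue;                   /* skip nonletters */
        d = digit[c - 'a'];
        if (n == 0)
            key[n++] = (char)toupper(c);    /* keep the first letter */
        else if (d != '0' && d != prev)
            key[n++] = d;               /* new code, not a repeat */
        prev = d;
    }
    while (n < 4)
        key[n++] = '0';                 /* pad with zeros */
    key[4] = '\0';
}
```

Knuth and Kant both come out as K530, and Lloyd as L300, which shows how much information the key throws away compared with a likeness score.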

The  main  advantage  the  Soundex 
algorithm  has  over  Bickel's  is  that  the 
computation  is  done  only  once  per 
word.  The  key  is  more  like  a  hash 
function  in  that  it  is  usually  arbitrary 
and  there  is  no  good  way  to  compare 
two  values.  Bickel’s  algorithm  re¬ 
quires  a  set  of  calculations  at  each 
comparison  but  does  a  good  job  of  in¬ 
dicating  the  amount  of  similarity. 

Notes 

1. Michael Allen Bickel, "Automatic
Correction to Misspelled Names: A
Fourth-Generation Language Approach,"
CACM 30 (March 1987): 224-228.

2.  Dave  Taylor,  “Wordz  That  Almost 
Match,”  Computer  Language  3  (No¬ 
vember  1986):  47-59. 

3.  Donald  E.  Knuth,  The  Art  of  Com¬ 
puter  Programming,  Volume  3: 
Searching  and  Sorting  (Reading, 
Mass.:  Addison-Wesley,  1973):  391- 
392. 

DDJ 

(Listing  begins  on  page  98.) 



3-D  IMAGES 


Listing One (Text begins on page 18.)

/*--
    main.c

    Program skeleton for testing the contour map algorithm

    This program reads in a MacPaint document, unpacks
    it, and calls the contour analysis algorithms.

    Once the analysis is complete the program writes out a
    file called grid.dat that can be read by hidlinpix and
    displayed in 3D perspective with hidden lines removed.

    hidlinpix is a program published in "Programming
    Principles in Computer Graphics" by Leendert
    Ammeraal (John Wiley & Sons, 1986)

    Other notes:

    Data structure libraries are derived from Alan Holub's
    column in Doctor Dobb's Journal. These include the AVL
    tree and the queue.

    I modified the Lightspeed C stdio library so that the stdio
    console window occupies only the lower 1/3 of the screen.

    I then use the upper 2/3 for a window to display the source
    contour map and a second window to display the progress of the
    algorithm.

    William May
    303A Ridgefield Circle
    Clinton, MA 01510

    Jan 25, 1986
--*/

#include <stdio.h>
#include <WindowMgr.h>
#include "contour.h"

TREE        *root = 0L;
WindowPtr   left_wind;
WindowPtr   right_wind;
BitMap      bmap;
curve_data  curves[100];        /* array of curve data */

main()
{
    char c;
    Rect r;
    int width;

    /* force the LSC stdio window to open by doing a printf */
    printf("Program Starting\n\n");

    /* suppress the LSC dialog window when the program ends */
    Click_On(false);

    /* open some windows... */
    init_windows();

    init_mem();

    /* read in the MacPaint document, named "test" */
    if (read_MP("\ptest", &bmap)) {
        r.top = 0;
        r.bottom = VBITMAX;
        r.left = 0;
        r.right = HBITMAX;
        show_bmap(left_wind, &bmap, &r);

        find_all_contours();

        /* make the input file for hidlinpix */
        make_hidlin();
    }

    /*
        A "real" program should return allocated memory to the
        system here, etc. I will just return and let the heap
        reinitialization take care of everything.
    */
    printf("Program complete. <cr> to continue: ");
    c = getchar();
}

/*--
    init_windows opens the two windows used for testing graphics
    algorithms. The console window is opened automatically
    by stdio when a printf is called. No window template resources
    are used. The two window records are kept in application global
    space.
--*/

init_windows()
{
    static WindowRecord left_wrec, right_wrec;
    int height, width;
    Rect r;

    height = 180;
    width = 144;

    r.top = 22;
    r.left = 20;
    r.bottom = r.top + height;
    r.right = r.left + width;

    left_wind = NewWindow(&left_wrec, &r, "\p", true, altDBoxProc,
                          FrontWindow(), false, 0L);

    r.left = 184;
    r.right = r.left + width;

    right_wind = NewWindow(&right_wrec, &r, "\p", true, altDBoxProc,
                           FrontWindow(), false, 0L);
}

End Listing One

Listing Two

/*--
    global.h

    contains the external references: global variables
--*/

extern TREE        *root;           /* root of AVL tree */
extern WindowPtr   left_wind;
extern WindowPtr   right_wind;
extern BitMap      bmap;            /* bitmap being used */
extern curve_data  curves[100];     /* array of curve data */

End Listing Two

Listing Three


/*--
    contour.h

    defines the leaf structure and does the usual includes
--*/

#include <WindowMgr.h>

typedef struct {
    union {
        long  Key;          /* Key = h,v concatenated */
        Point p;
    } header;
    int curve;              /* curve that this point is on */
} LEAF;

#define HBITMAX 576         /* max pixels in horiz direction */
#define VBITMAX 720         /* max pixels in vert direction */

#include "tree.h"

typedef struct curve_data {
    Point p;                /* starting point of curve */
    int parent;             /* parent (enclosing) curve */
    int elevation;          /* elevation of the curve */
    int searched;           /* has interior been searched? */
} curve_data;

End Listing Three

Listing Four


/*--
    tracer.c

    There is only one externally visible function: find_all_contours

    Implements contour search and tracing algorithms

    Some sections are based on Pavlidis, "Algorithms for Graphics and
    Image Processing"

    Quick description: the algorithm examines the interiors of known
    contours to find additional contours.

    It is started by treating the border of the image itself as a contour.

    The final outcome of this algorithm is a tree containing all points
    on contours in the image. The key for a leaf in the tree is the point
    coordinates. Each leaf also contains an index for the contour containing
    the point, and the elevation for the point.

    William May
    created  Jan 31, 1987
    modified Apr 10, 1987   improve the trace logic to handle
                            contours that are multiple pixels in
                            thickness.
--*/

#define DEBUG

#ifdef DEBUG
#define D(x) x
#else
#define D(x)
#endif

#include <QuickDraw.h>
#include <MemoryMgr.h>
#include <stdio.h>
#include "getpixel.h"
#include "contour.h"
#include "global.h"

/* two defines for our trace directions */
#define CLOCKWISE         -1
#define COUNTERCLOCKWISE   1

/* the index of the image borders is BASE (the exterior contour) */
#define BASE -1

/* define the queue elements */
typedef struct {
    int h, v, chain;
} q_item;




static char *qp;                    /* pointer to the queue */
static int qmax = 0;                /* max items in the queue */
static int current_elevation = 0;   /* current elevation of region */
static int next_curve = 0;          /* number of next index in curve array */
static long point_count;            /* count the points we have found */

/*--
    find_all_contours finds all the contours in a bitmap
    first the left side of the bitmap is entered into the queue to
    start off the search, then each time find_contours returns
    (meaning no more contours at that level) the elevation is incremented
    for the next round of searches.
--*/

int find_all_contours()
{
    char ch;
    Point pt;
    GrafPtr old_port;
    int draw_point();
    int item;
    register int i;
    char *makequeue();

    qp = makequeue(4000, sizeof(q_item));

    start_queue();          /* set up search for exterior of bitmap */
    find_contours(BASE);    /* find the first level of contours */
                            /* parent = -1 */

    /* while there is an unsearched curve */
    while (next_point(&item, &pt)) {
        trace(pt, COUNTERCLOCKWISE, true);  /* set up queue for next search */
        find_contours(item);                /* go get 'em. item = parent */
    }

    free(qp);               /* all done, get rid of the queue */
    set_elevations();       /* set elevations in the array */

    printf("max used in queue is %d\n", qmax);
    printf("number of curves found is %d\n", next_curve);
    printf("number of points found is %ld\n", point_count);
    printf("Curve search complete. <cr> to continue: ");
    ch = getchar();

    GetPort(&old_port);
    SetPort(right_wind);
    EraseRect(&(right_wind->portRect));
    D(printf("Now traversing the tree.\n"));
    D(traverse(root, draw_point));
    SetPort(old_port);
}


/*--
    start_queue puts the left edge of the bitmap into the
    queue to act as start for the bitmap search

    only the left edge needs to be put in because the other edges
    would be ignored anyway
--*/

static start_queue()
{
    register int i;
    q_item q;

    q.h = 0;
    q.chain = 6;    /* all starting queue items are going south */

    for (i = 0; i < VBITMAX; i++) {
        q.v = i;
        if (!enqueue(&q, qp))
            syserr(0, "Queue overflow in startup function\n");
    }
}


/*--
    set the next curve starting point if there is one,
    otherwise return 0;
--*/

static int next_point(item, pt)
int *item;
Point *pt;
{
    static int last_searched = -1;  /* last array index used */

    if (++last_searched < next_curve) {
        *item = last_searched;
        pt->h = curves[last_searched].p.h;
        pt->v = curves[last_searched].p.v;
        return 1;
    }
    else
        return 0;
}


/*--
    set the elevations in the array
    assume all contours are ascending
    and increment = 100
--*/

set_elevations()
{
    register curve_data *p = curves;
    register int parent, i;

    for (i = 0; i < next_curve; i++, p++) {
        if ((parent = curves[i].parent) == BASE)
            curves[i].elevation = 100;
        else
            curves[i].elevation = curves[parent].elevation + 100;
    }
}


/*--
    find_contours searches for all contours within a closed contour,
    based on the queue elements

    if any are found a non-zero value is returned
--*/

static int find_contours(parent)
int parent;         /* index of enclosing curve */
{
    q_item item, *show_next();
    register int c;
    Point p;
    int B, next_chain;
    Point first, last;

    while (dequeue(&item, qp)) {
        c = item.chain;
        if (c == 8) {               /* skip the start-of-contour marker */
            dequeue(&item, qp);
            c = item.chain;
        }
        if ((c >= 5 && c <= 7) ||
            (c == 0 && ((next_chain = (*show_next(qp)).chain) >= 5) &&
             (next_chain <= 7))) {
            p.h = item.h + 1;
            p.v = item.v;
            D(first = p);

            do {
                search_x(&B, &p);
                if (B == 1) {
                    D(last = p);
                    D(show_line(first, last));
                    /*
                        if trace found a new contour then break off
                        the scan. If the point is a new point on a
                        known contour, then continue.
                    */
                    if (trace(p, CLOCKWISE, false)) {
                        /* and add curve to array */
                        curves[next_curve].p = p;
                        curves[next_curve].parent = parent;
                        curves[next_curve].elevation = 0;
                        curves[next_curve].searched = false;
                        next_curve++;
                        break;
                    } else
                        p.h++;      /* bump up to next point */
                }
                else if (B >= 2) {
                    D(last = p);
                    D(show_line(first, last));
                    break;
                }
            } while (1);
        }
    }
    return 1;
}


/*--
    scan pixels in the x direction at point p
    if the bitmap shows a pixel return 1
    if the bitmap shows a pixel and the pixel is in the tree
    (i.e. the point has been scanned before) return 2
    update the starting point
--*/

static search_x(b, p)
int *b;
Point *p;
{
    long dcmp();
    /*
        the union makes it easy to use the point coordinates
        as the key in the tree
    */
    union {
        Point pt;
        long key;
    } q;

    q.pt = *p;
    if (*b = getpixel(q.pt.h, q.pt.v, &bmap)) {
        if (find(root, q.key, dcmp)) (*b)++;
        return;         /* don't increment h on a hit */
    }

    if (++p->h >= HBITMAX) {
        *b = 2;         /* when we hit the right edge, indicate already found */
        return;
    }
}


/*--
    trace traces a contour beginning at the specified point
    at each found point the x,y coordinates and chain code are
    stored both in the queue and the tree.

    two result codes are returned;
    0 indicates that the starting point represented a new point on an
      existing (known) contour, trace was aborted.
    1 indicates a new contour was successfully traced
--*/

static int trace(p, dir, q_only)
Point p;            /* p is the starting point */
int dir;            /* dir is the trace direction */
Boolean q_only;     /* trace and put in queue only,
                       do not check for dupes in tree */
{
    Point c, t;     /* current point, and transformed point for testing */
    int s;          /* search direction */
    Boolean found, test_pixel();
    int count, used, chain;
    q_item item;
    LEAF *lp, *lp2;
    long icmp(), dcmp();
    GrafPtr old_port;
    union temp {
        long key;
        Point p;
    } temp;

    if (dir == COUNTERCLOCKWISE)
        s = 6;
    else
        s = 2;

    c = p;          /* start at the given point (this line is not legible
                       in the printed listing and is assumed here) */

    if (q_only) {
        /* has this curve interior been searched before? */
        temp.p.h = c.h;
        temp.p.v = c.v;
        lp = find(root, temp.key, dcmp);
        if (curves[lp->curve].searched == true) {
            printf("Curve %d already searched\n", lp->curve);
            return 0;
        }
        else
            curves[lp->curve].searched = true;

        item.h = c.h;       /* add data to queue */
        item.v = c.v;
        item.chain = 8;     /* mark beginning of new contour */
        if (!enqueue(&item, qp))
            syserr(0, "Queue overflow in trace\n");

        GetPort(&old_port);
        SetPort(right_wind);
        PenPat(white);
        SetPort(old_port);
    }
    else {
        if (!(lp = talloc(sizeof(LEAF))))
            syserr(MemErr, "Insufficient memory for AVL tree");
        else {
            lp->header.p.h = c.h;
            lp->header.p.v = c.v;
            lp->curve = next_curve;

            if (lp2 = insert(&root, lp, icmp)) {
                /* tfree(lp); not implemented in getmem */
                return 0;   /* starting point already in the tree */
            }
            else {  /* if not in the tree then enqueue the new point */
                point_count++;
                item.h = c.h;       /* add data to queue */
                item.v = c.v;
                item.chain = 8;     /* mark beginning of new contour */
                if (!enqueue(&item, qp))
                    syserr(0, "Queue overflow in trace\n");

                GetPort(&old_port);
                SetPort(right_wind);
                PenPat(black);
                SetPort(old_port);
            }
        }
    }

    do {
        found = false;
        count = 0;

        while (!found && (count < 3)) {
            count++;
            if (test_pixel(c, add_chain(s, -1 * dir))) {
                set_pixel(&c, (chain = add_chain(s, -1 * dir)));
                s = add_chain(s, -2 * dir);
                found = true;
            }
            else {
                if (test_pixel(c, s)) {
                    set_pixel(&c, (chain = s));
                    found = true;
                }
                else {
                    if (test_pixel(c, add_chain(s, 1 * dir))) {
                        set_pixel(&c, (chain = add_chain(s, 1 * dir)));
                        found = true;
                    }
                    else
                        s = add_chain(s, 2 * dir);
                }
            }
        }
        D(show_point(c));

        if (q_only) {
            item.h = c.h;           /* add data to queue */
            item.v = c.v;
            item.chain = chain;
            if (!enqueue(&item, qp))
                syserr(0, "Queue overflow in trace\n");
        }
        else {
            if (!(lp = talloc(sizeof(LEAF))))
                syserr(MemErr, "Insufficient memory for AVL tree");
            else {
                lp->header.p.h = c.h;
                lp->header.p.v = c.v;
                lp->curve = next_curve;

                if (lp2 = insert(&root, lp, icmp)) {
                    /* tfree(lp); not implemented in getmem */
                    if ((c.h == p.h) && (c.v == p.v))
                        return 1;   /* we have returned to the start */
                    else
                        return 0;
                }
                else {
                    /* if not in the tree then enqueue point */
                    point_count++;
                    item.h = c.h;   /* add data to queue */
                    item.v = c.v;
                    item.chain = chain;
                    if (!enqueue(&item, qp))
                        syserr(0, "Queue overflow in trace\n");
                }
            }
        }
        if ((used = ap_used(qp)) > qmax) qmax = used;

    } while ((c.h != p.h) || (c.v != p.v));

    GetPort(&old_port);
    SetPort(right_wind);
    PenPat(black);
    SetPort(old_port);

    return 1;
}



Dr.  Dobb's  Journal,  November  1987 




/*
    test the s neighbor of point p
*/
static Boolean test_pixel(p, s)
Point p;
int s;  /* direction to test */
{
    set_pixel(&p, s);
    return (getpixel(p.h, p.v, &bmap));
}

/*
    set point p to point + chain code s
*/
static set_pixel(p, s)
Point *p;
int s;
{
    switch (s) {
    case 0:
        (*p).h++;
        break;
    case 1:
        (*p).h++;
        (*p).v--;
        break;
    case 2:
        (*p).v--;
        break;
    case 3:
        (*p).h--;
        (*p).v--;
        break;
    case 4:
        (*p).h--;
        break;
    case 5:
        (*p).h--;
        (*p).v++;
        break;
    case 6:
        (*p).v++;
        break;
    case 7:
        (*p).h++;
        (*p).v++;
        break;
    }
}

/*
    add c to chain code s
*/
static int add_chain(s, c)
register int s, c;
{
    register int i;

    if ((i = s + c) >= 8)
        return (i - 8);
    else {
        if (i < 0)
            return (i + 8);
        else
            return i;
    }
}
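The chain-code arithmetic above can be tried outside the Macintosh environment. The following is only a sketch in present-day C: the `Pt` type and the `step` function are hypothetical stand-ins for the Mac `Point` type and for `set_pixel`, and the two offset tables simply restate the eight cases of the switch (0 = right, 2 = up, 4 = left, 6 = down, odd codes are the diagonals).

```c
/* Offsets for chain codes 0..7, in the same order as the switch in
   set_pixel: v decreases toward the top of the bitmap. */
static const int DH[8] = { 1,  1,  0, -1, -1, -1, 0, 1 };
static const int DV[8] = { 0, -1, -1, -1,  0,  1, 1, 1 };

typedef struct { int h, v; } Pt;    /* hypothetical stand-in for Point */

/* Add c to chain code s, wrapping the result into 0..7.
   Assumes c lies in -8..8, which covers the +/-1 and +/-2 turns
   used by the tracing loop. */
static int add_chain(int s, int c)
{
    int i = s + c;
    if (i >= 8) return i - 8;
    if (i < 0)  return i + 8;
    return i;
}

/* Return the neighbor of p in chain-code direction s (0..7). */
static Pt step(Pt p, int s)
{
    p.h += DH[s];
    p.v += DV[s];
    return p;
}
```

Calling `step(p, add_chain(s, -1))` gives the neighbor one eighth of a turn away from direction `s`, which is the pattern the tracing loop uses with `test_pixel`.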


#ifdef DEBUG

/*
    The code below is used to display the progress of the contour
    analysis in a Macintosh window. It is obviously totally
    nonportable. Fixed point arithmetic is used for scaling the MacPaint
    size document to a much smaller window. Fixed point is much faster
    than floating point, and is often used for two and three dimensional
    graphics on the Mac. This code is for debugging purposes.
*/

/* ratio is used to scale down picture size by 1/4 */
static Fixed ratio = 0x00004000;

/*
    convert a 16 bit integer to fixed point
*/
static Fixed int_to_fix(i)
int i;
{
    asm {
        move.w  i, d0           ; get the number
        ext.l   d0              ; clear top of d0
        swap    d0              ; make it a fixed point number, all done
    }
}

/*
    convert a Fixed point number to 16 bit integer
*/
static int fix_to_int(i)
Fixed i;
{
    asm {
        move.l  i, d0           ; get the number
        add.l   #0x8000, d0     ; add .5 to number, for rounding
        swap    d0              ; make the int, ignore upper word of d0
    }
}


/*
    draw the point from a LEAF, scaled down to the actual window size
*/
static draw_point(p)
LEAF *p;
{
    GrafPtr oldPort;
    Point q;
    Rect r;

    q = p->header.p;

    GetPort(&oldPort);
    SetPort(right_wind);

    r.top = fix_to_int(FixMul(int_to_fix(q.v), ratio));
    r.left = fix_to_int(FixMul(int_to_fix(q.h), ratio));
    r.right = r.left + 1;
    r.bottom = r.top + 1;

    FrameRect(&r);

    SetPort(oldPort);
}

/*
    draw the point, scaled down to the actual window size
*/
static show_point(p)
Point p;
{
    GrafPtr oldPort;
    Rect r;

    GetPort(&oldPort);
    SetPort(right_wind);

    r.top = fix_to_int(FixMul(int_to_fix(p.v), ratio));
    r.left = fix_to_int(FixMul(int_to_fix(p.h), ratio));
    r.right = r.left + 1;
    r.bottom = r.top + 1;

    FrameRect(&r);

    SetPort(oldPort);
}

/*
    draw a line, scaled down to the actual window size
*/
static show_line(first, last)
Point first, last;
{
    GrafPtr oldPort;

    GetPort(&oldPort);
    SetPort(right_wind);

    first.v = fix_to_int(FixMul(int_to_fix(first.v), ratio));
    first.h = fix_to_int(FixMul(int_to_fix(first.h), ratio));
    MoveTo(first.h, first.v);

    last.v = fix_to_int(FixMul(int_to_fix(last.v), ratio));
    last.h = fix_to_int(FixMul(int_to_fix(last.h), ratio));
    LineTo(last.h, last.v);

    SetPort(oldPort);
}

#endif


End  Listing  Four 
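For readers without a 68000 at hand, the fixed point scaling used by the debugging code in Listing Four can be approximated in portable C. This is a sketch under stated assumptions, not the Toolbox implementation: `Fixed16`, `fix_mul`, and `scale_quarter` are hypothetical names, `fix_mul` truncates and is only a stand-in for the Toolbox `FixMul`, and the right shifts assume non-negative inputs (screen coordinates), since shifting negative values is implementation-defined in C.

```c
#include <stdint.h>

typedef int32_t Fixed16;            /* 16.16 fixed point (hypothetical name) */

#define RATIO_QUARTER 0x00004000    /* 0.25 in 16.16, the ratio in the listing */

/* Place the integer in the high 16 bits; the fraction is zero. */
static Fixed16 int_to_fix(int16_t i)
{
    return (Fixed16)i * 65536;
}

/* Add 0.5 (0x8000) and drop the fraction, i.e. round to nearest.
   Assumes f is non-negative. */
static int16_t fix_to_int(Fixed16 f)
{
    return (int16_t)((f + 0x8000) >> 16);
}

/* Multiply two 16.16 values through a 64-bit intermediate (truncating). */
static Fixed16 fix_mul(Fixed16 a, Fixed16 b)
{
    return (Fixed16)(((int64_t)a * (int64_t)b) >> 16);
}

/* Scale a bitmap coordinate down to one quarter, as show_point does. */
static int16_t scale_quarter(int16_t coord)
{
    return fix_to_int(fix_mul(int_to_fix(coord), RATIO_QUARTER));
}
```

With these definitions a coordinate of 576 (the MacPaint bitmap width) scales to 144, and values that fall halfway between two integers round upward.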


Listing  Five 


/*
    William May
    2/20/87
*/

typedef int *STACK;

/* function prototypes */

extern int   empty        ( STACK * );      /* is anything on the stack? */
extern int   pop          ( STACK * );
extern int   push         ( int, STACK * );
extern STACK *init_stack  ( int );
extern void  del_stack    ( STACK ** );     /* remove stack */
extern int   top_of_stack ( STACK * );      /* returns top of stack */


End  Listing  Five 


Listing  Six 


/*
    stack.c

    implements a simple stack of integers
    William May
    303A Ridgefield Circle
    Clinton, MA 01510
*/

#ifdef DEBUG
#include <stdio.h>
#endif

#include <storage.h>




typedef struct STACK {
    int max;        /* max items in stack */
    int top;        /* current items in stack */
    int items[];    /* the data */
} STACK;

/* some error codes */

#define FULL    -2
#define EMPTY   -1
#define NOERR   0

#ifdef DEBUG

STACK *mystack;

main()
{
    STACK *init_stack();
    int num, c;

    printf("Stack program begun\n");

    if (!(mystack = init_stack(5))) {
        fprintf(stderr, "Insufficient memory to create stack\n");
        ExitToShell();
    }

    while (1) {
        num = c = -1;

        printf("top = %d\n", mystack->top);
        printf("max = %d\n", mystack->max);
        printf("top item = %d\n", top_of_stack(mystack));

        printf("\n<p(op)/s(push)/q(uit)> -> ");
        while (c != 'p' && c != 's' && c != 'q')
            c = getchar();

        if (c == 's')
        {
            printf("\nenter decimal number -> ");
            scanf("%d", &num);
            printf("\npush(%d) returned %d\n",
                   num, push(num, mystack));
        }
        else if (c == 'p')
        {
            printf("\npop returned: %d\n", pop(mystack));
        }
        else
            break;
    }

    del_stack(&mystack);
    printf("\nStack program complete\n");
}

#endif

/*
    create the stack

    NULL returned if insufficient memory
*/
STACK *init_stack(items)
int items;
{
    STACK *p;

    if (p = (STACK *)calloc((sizeof(int) * items + sizeof(STACK)), 1)) {
        p->max = items;
        p->top = EMPTY;
    }
    return (p);
}

/*
    delete the stack
*/
void del_stack(stackptr)
STACK **stackptr;
{
    free(*stackptr);
    *stackptr = 0L;
}

/*
    is the stack empty?
*/
int empty(stack)
STACK *stack;
{
    return ((stack->top == EMPTY) ? 1 : 0);
}

/*
    return the top element of the stack and decrement
    the stack pointer
*/
int pop(stack)
STACK *stack;
{
    if (empty(stack))
        return EMPTY;
    else {
        return (stack->items[stack->top--]);
    }
}



/*
    push an integer onto the stack and increment the stack pointer
*/
int push(n, stack)
int n;
STACK *stack;
{
    if (++stack->top < stack->max) {
        stack->items[stack->top] = n;
        return NOERR;
    }
    else {
        --stack->top;
        return FULL;
    }
}

/*
    look at the top of the stack without changing the
    stack pointer
*/
int top_of_stack(stack)
STACK *stack;
{
    return (stack->items[stack->top]);
}


End  Listing  Six 
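The stack in Listing Six can be restated in present-day C as a rough sketch. The names `IntStack`, `stk_new`, `stk_push`, and `stk_pop` are hypothetical and not part of the listing. The one deliberate difference is that `stk_pop` reports emptiness through its return value and hands the item back through a pointer, because a `pop` that returns `EMPTY` (-1) on an empty stack cannot be told apart from popping a stored -1.

```c
#include <stdlib.h>

#define STK_FULL  -2
#define STK_EMPTY -1
#define STK_OK     0

typedef struct {
    int max;        /* capacity in items */
    int top;        /* index of the top item, STK_EMPTY when none stored */
    int items[];    /* C99 flexible array member holding the data */
} IntStack;

/* Allocate a stack for max items; returns NULL if memory is short. */
static IntStack *stk_new(int max)
{
    IntStack *s = calloc(1, sizeof *s + sizeof(int) * (size_t)max);
    if (s) {
        s->max = max;
        s->top = STK_EMPTY;
    }
    return s;
}

/* Push n; returns STK_OK, or STK_FULL when there is no room left. */
static int stk_push(IntStack *s, int n)
{
    if (s->top + 1 >= s->max)
        return STK_FULL;
    s->items[++s->top] = n;
    return STK_OK;
}

/* Pop into *out; returns 1 on success, 0 if the stack is empty. */
static int stk_pop(IntStack *s, int *out)
{
    if (s->top == STK_EMPTY)
        return 0;
    *out = s->items[s->top--];
    return 1;
}
```

A caller would create the stack with `stk_new(2000)`, as `make_grid` does with `init_stack`, and release it with `free` when finished.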


Listing Seven

/*
    make_hidlin.h

    header file for make_hidlin

    William May

    created: 3/21/87
*/

#define MAPWIDTH    576
#define MAPHEIGHT   720
#define INTERVAL    24      /* interval, in pixels, between grid lines */
#define VMAX        31      /* bitmap height/interval + 1 */
#define HMAX        25      /* bitmap width/interval + 1 */

End Listing Seven


Listing Eight

/*
    make_hidlin.c

    Creates a grid of elevations and then an
    input file for the 3D hidden line algorithm (hidlinpix)

    The input file creation process is fully explained in Ammeraal's
    book "Programming Principles in Computer Graphics"

    William May
    303A Ridgefield Circle
    Clinton, MA 01510

    created: 3/20/87
*/


#include <stdio.h>
#include "make_hidlin.h"
#include "contour.h"

#define SCALE 100.0

int grid[HMAX+1][VMAX+1];
FILE *fp;

/*
    Coordinate creating an input file for hidlinpix
*/
make_hidlin()
{
    printf("Beginning make_hidlin\n");
    init_grid();
    make_grid();

    fp = fopen("grid.dat", "w");

    /* convert grid to hidlin input format */
    conv_hidlin();

    fclose(fp);
}

/*
    zero the grid

    this may be unnecessary for a global, but doesn't hurt.
    It is, however, necessary if the contour analysis is
    done repeatedly without quitting the program
*/
init_grid()
{
    register int h, v;

    for (h = 0; h <= HMAX; h++)
        for (v = 0; v <= VMAX; v++)
            grid[h][v] = 0;
}


/*
    Convert the grid to a hidlinpix input file

    This function is based on code from Ammeraal's book for
    doing 3D projections of mathematical functions
*/
conv_hidlin()
{
    int i, j, k, l;
    double f();

    fprintf(fp, "%lf %lf %lf\n", (double)HMAX / 2.0, (double)VMAX / 2.0, 0.0);

    printf("Printing the point coordinates\n");

    for (i = 0; i <= HMAX; i++) {
        for (j = 0; j <= VMAX; j++) {
            fprintf(fp, "%d %lf %lf %lf\n",
                    j*(HMAX+1)+i+1,
                    (double)i, (double)j,
                    f(i,j));
        }
    }

    fprintf(fp, "Faces:\n");

    printf("Printing the faces\n");

    /* next two lines switched */
    for (i = 0; i < HMAX; i++) {
        for (j = 0; j < VMAX; j++) {
            k = j*(HMAX+1)+i+1;
            l = k+HMAX+1;
            fprintf(fp, "%d %d %d#\n", k, -(l+1), k+1);
            fprintf(fp, "%d %d %d#\n", k+1, l+1, -k);
            fprintf(fp, "%d %d %d#\n", k, -(l+1), l);
            fprintf(fp, "%d %d %d#\n", l, l+1, -k);
        }
    }
}

/*
    note reversing the h coordinates

    somewhere the grid is being reversed, so I unreverse it here.
*/
double f(h, v)
int h, v;
{
    return ((double)(grid[HMAX - h][v]) / SCALE);
}


End  Listing  Eight 
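The vertex numbering that conv_hidlin writes to the hidlinpix input file is easy to check by hand. The sketch below isolates it; `vertex_number` and `cell_corners` are hypothetical helper names, and `HMAX` is copied from make_hidlin.h. Grid point (i, j) receives the 1-based number j*(HMAX+1)+i+1, so the point directly below it in the next row is HMAX+1 higher.

```c
#define HMAX 25     /* value taken from make_hidlin.h */

/* 1-based vertex number of grid point (i, j), as computed in conv_hidlin. */
static int vertex_number(int i, int j)
{
    return j * (HMAX + 1) + i + 1;
}

/* Fill quad[] with the four vertex numbers of the grid cell whose first
   corner is (i, j): k and k+1 on row j, l and l+1 on row j+1. */
static void cell_corners(int i, int j, int quad[4])
{
    int k = vertex_number(i, j);
    int l = k + HMAX + 1;

    quad[0] = k;
    quad[1] = k + 1;
    quad[2] = l;
    quad[3] = l + 1;
}
```

These four numbers are the ones combined, with signs, in the `fprintf` calls of the faces loop.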


Listing Nine

/*
    make_grid

    converts graph data into a grid that can be interpreted
    by hidlinpix

    William May
    303A Ridgefield Circle
    Clinton, MA 01510

    created: Feb 16, 1986
*/

#define DEBUG

#ifdef DEBUG
#define P(x) x
#else
#define P(x)
#endif

#include <stdio.h>
#include "stack.h"
#include "make_hidlin.h"
#include "contour.h"
#include "global.h"

extern int stack_error;
extern int grid[HMAX+1][VMAX+1];    /* must match the definition in make_hidlin.c */

static STACK *stack;

/*
    move vertically
    assigning elevations to each grid point
*/
make_grid()
{
    register int h;

    stack = init_stack(2000);   /* should be plenty! */

    for (h = 1; h < (HMAX-1); h++) {
        traverse_vert(h);
    }

    del_stack(&stack);
}


/*
    Traverse a vertical grid line, determining elevations along the way.
    This is mostly simple: i.e. keep track of the elevation of the
    contour line crossed last.

    The main complications are tangents (in which case we want to ignore
    the contour line) and inflection points (in which case we don't
    ignore the contour line).
*/
traverse_vert(h)
register int h;
{
    register int ver, hor;
    register int vmax = (VMAX - 1) * INTERVAL;

    hor = h * INTERVAL;

    push(0, stack);     /* push 0 (elev of border) onto the stack */

    P(show_line(hor, 1, hor, vmax));

    for (ver = 1; ver < vmax; ver++) {      /* traverse a vertical grid line */

        /* on a grid intersection set array to value on the stack */
        if (!(ver % INTERVAL)) {
            P(show_point(hor-3, ver));
            P(show_point(hor+3, ver));
            set_elevation(h, (ver / INTERVAL));
        }

        /* are we crossing a contour? are we tangent to it? */
        if (getpixel(hor, ver, &bmap)) {
            if (is_crossing(hor, ver) || start_inflection(hor, ver))
                ver += check_hit(hor, ver);
        }
    }
}


/*
    set grid[h][v] to the value of top of stack
*/
static set_elevation(h, v)
register int h, v;  /* note these are grid coordinates !! */
{
    grid[h][v] = top_of_stack(stack);
}


/*
    a pixel was hit, get the curve index from tree
    get curve elevation from curve array
    if curve elevation = elevation on stack
        pop stack
    else
        push new elevation
    end
    return the number of pixels to skip-1 (i.e.
    return 0 if we skip 1, etc.)
*/
static short check_hit(h, v)
register int h, v;  /* note these are pixel coordinates !! */
{
    short cnt = 1;  /* number of pixels */
    short ver;
    union temp {
        long key;
        Point p;
    } temp;
    LEAF *lp;
    long dcmp();
    int elev;

    /* traverse all pixels, if more than one */
    for (ver = v+1; getpixel(h, ver++, &bmap); cnt++)
        ;

    temp.p.h = h;
    temp.p.v = v;

    lp = find(root, temp.key, dcmp);
    elev = curves[lp->curve].elevation;

    if (elev == top_of_stack(stack))
        pop(stack);
    else
        push(elev, stack);

    return (cnt-1);
}
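/*
   The push/pop rule in check_hit can be exercised without a bitmap or
   an AVL tree. What follows is a minimal sketch, not part of the
   original listing: elevation_after and MAXDEPTH are hypothetical
   names, the contours crossed along one grid line are supplied as a
   plain array of elevations, and the border is taken to be elevation
   0 as in traverse_vert. One assumption differs from the listing: the
   sketch never pops the border entry itself.
*/

```c
#define MAXDEPTH 64     /* hypothetical limit on nesting depth */

/* Walk along one grid line. crossings[i] is the elevation of the i-th
   contour crossed, in order. Returns the elevation in effect after the
   last crossing. */
static int elevation_after(const int *crossings, int n)
{
    int stack[MAXDEPTH];
    int top = 0;
    int i;

    stack[0] = 0;                           /* border elevation */
    for (i = 0; i < n; i++) {
        if (top > 0 && crossings[i] == stack[top])
            top--;                          /* leaving the contour we entered */
        else if (top + 1 < MAXDEPTH)
            stack[++top] = crossings[i];    /* entering a new contour */
    }
    return stack[top];
}
```

/*
   Crossing the 100 and 200 contours of a hill and then the same two
   contours on the way down leaves the walker back at elevation 0.
*/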


/*
    is_crossing: tests a point on a contour to figure out if
    we are crossing the contour or tangent to the contour. If we are
    tangent to it we don't want to bump the elevation yet.

    The test is performed by examining the six pixels to the side:

        1        4
        2  h,v   5
        3        6

    Crossings are:
        two or more hits in range 1-6
        1+ in 1-3 and 1+ in 4-6

    return false for a tangent, true for a crossing.
*/
static int is_crossing(h, v)
int h, v;
{
    register int h_test, v_test, i;

    h_test = h - 1;     /* test the 1-3 */
    v_test = v - 1;
    for (i = 0; !getpixel(h_test, v_test, &bmap) && i < 3; i++, v_test++)
        ;
    if (i == 3)
        return false;   /* a tangent was hit */

    h_test = h + 1;     /* test the 4-6 */
    v_test = v - 1;
    for (i = 0; !getpixel(h_test, v_test, &bmap) && i < 3; i++, v_test++)
        ;
    if (i == 3)
        return false;   /* a tangent was hit */

    return true;        /* looks good! */
}
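/*
   The six-pixel crossing test can be tried on a small array instead
   of a QuickDraw bitmap. This is only an illustrative sketch, not
   part of the original listing: the 5 x 5 array, getpix, and
   column_hit are hypothetical, and pixels outside the array are
   assumed to read as 0.
*/

```c
#define W 5
#define H 5

/* Returns 1 if pixel (h, v) is set; out-of-range pixels read as 0. */
static int getpix(unsigned char img[H][W], int h, int v)
{
    if (h < 0 || h >= W || v < 0 || v >= H)
        return 0;
    return img[v][h] != 0;
}

/* Returns 1 if any of the three pixels in column h_test,
   rows v-1 through v+1, is set. */
static int column_hit(unsigned char img[H][W], int h_test, int v)
{
    int dv;

    for (dv = -1; dv <= 1; dv++)
        if (getpix(img, h_test, v + dv))
            return 1;
    return 0;
}

/* A crossing needs a hit on both sides of (h, v); a hit on only one
   side means the contour merely touches the grid line (a tangent). */
static int is_crossing(unsigned char img[H][W], int h, int v)
{
    return column_hit(img, h - 1, v) && column_hit(img, h + 1, v);
}
```

/*
   A diagonal stroke through (h, v) has neighbors on both sides and
   counts as a crossing; a "<" shape whose tip touches (h, v) has
   neighbors on one side only and does not.
*/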


/*
    start_inflection: tests to see if this point is the start of an
    inflection in the contour. An inflection will not look like a crossing,
    but should be counted as one.

    An inflection looks like

        1
           h,v
            2
            3
            4
               5

    for example. Only the first point on the inflection is counted.
*/
static int start_inflection(h, v)
int h, v;
{
    register int h_test, v_test, dir = 0, i;

    if (getpixel(h, v-1, &bmap))
        return false;   /* only count the start of an inflection */

    h_test = h - 1;     /* test the 1-3 */
    v_test = v - 1;
    for (i = 0; !getpixel(h_test, v_test, &bmap) && i < 3; i++, v_test++)
        ;
    if (i < 3)
        dir = 1;

    /* test the 4-6 */
    h_test = h + 1;
    v_test = v - 1;
    for (i = 0; !getpixel(h_test, v_test, &bmap) && i < 3; i++, v_test++)
        ;
    if (i < 3)
        dir = -1;

    if (dir == 0)
        return false;   /* no inflection here */

    /*
        final test: track pixels downward
        if the turn at the end is in the opposite direction
        as the original turn (indicated by dir)
        then it is an inflection
    */
    v_test = v;
    while (getpixel(h, v_test+1, &bmap))
        v_test++;

    if (v_test != v) {
        v = v_test;

        /* let's test the pixels again! */
        h_test = h - 1;     /* test the 1-3 */
        v_test = v - 1;
        for (i = 0; !getpixel(h_test, v_test, &bmap) && i < 3; i++, v_test++)
            ;
        if (i < 3) {
            if (dir != 1)
                return true;    /* no change in direction, inflection */
            else
                return false;   /* reversed direction, was a tangent */
        }

        /* test the 4-6 */
        h_test = h + 1;
        v_test = v - 1;
        for (i = 0; !getpixel(h_test, v_test, &bmap) && i < 3; i++, v_test++)
            ;
        if (i < 3) {
            if (dir != -1)
                return true;    /* no change in direction, inflection */
            else
                return false;
        }
    }
    else
        return false;
}


#ifdef DEBUG
/*
    Here is some code to display the progress of the algorithm.
    Fixed point math is used to speed up calculations.

    Fixed point math is tricky in a typed language like C or
    Pascal, and a cinch in assembly.
*/

#include <QuickDraw.h>

static Fixed int_to_fix(i)
int i;
{
    asm {
        move.w  i, d0           ; get the number
        ext.l   d0              ; clear top of d0
        swap    d0              ; make it a fixed point number, all done
    }
}

static int fix_to_int(i)
Fixed i;
{
    asm {
        move.l  i, d0           ; get the number
        add.l   #0x8000, d0     ; add .5 to number, for rounding
        swap    d0              ; make the int, ignore upper word of d0
    }
}

static Fixed ratio = 0x00004000;

static show_point(h, v)
register int h, v;
{
    /*
        draw each point in right window
        scaled like the MacPaint draw function
    */
    GrafPtr oldPort;
    Rect r;

    GetPort(&oldPort);
    SetPort(right_wind);

    r.top = fix_to_int(FixMul(int_to_fix(v), ratio));
    r.left = fix_to_int(FixMul(int_to_fix(h), ratio));
    r.right = r.left + 1;
    r.bottom = r.top + 1;

    FrameRect(&r);

    SetPort(oldPort);
}

static show_line(h1, v1, h2, v2)
register int h1, v1, h2, v2;
{
    /*
        draw each line in right window
        scaled like the MacPaint draw function
    */
    GrafPtr oldPort;

    GetPort(&oldPort);
    SetPort(right_wind);

    MoveTo(fix_to_int(FixMul(int_to_fix(h1), ratio)),
           fix_to_int(FixMul(int_to_fix(v1), ratio)));
    LineTo(fix_to_int(FixMul(int_to_fix(h2), ratio)),
           fix_to_int(FixMul(int_to_fix(v2), ratio)));

    SetPort(oldPort);
}

#endif


End  Listing  Nine 


Listing  Ten 

/*
    getpixel.h

    contains function prototype for getpixel.c
*/

#include <QuickDraw.h>

extern int getpixel(int h, int v, BitMap *bmap);

End  Listing  Ten 


Listing  Eleven 

/*
    getpixel.c

    Returns the value of pixel h, v in the bitmap "bmap".

    Written in 68000 inline assembly both because it is faster
    and because it is quite easy to do. A version of this code
    was used by Mike Morton in his StarFlight program.

    This version has the following limitations:

    1. It assumes that the cursor will not be in the way,
    i.e. it has either been hidden or we are looking at an off-screen
    bitmap. In the contour map algorithm we are looking at an
    off-screen bitmap.

    2. This version also assumes that the coordinates at the upper left
    are (0,0). Any offset to the coordinate system (by a call to SetOrigin)
    is ignored.

    3. This code will probably not work on the screen buffer for large screen
    Macs, color Macs, gray scale Macs, etc. It works fine on off-screen bitmaps
    on my Mac II, as long as the depth is 1 bit per pixel.

    William May
    303A Ridgefield Circle
    Clinton, MA 01510

    Created: 1/20/87
    Modified: 3/21/87   Found a bug: getpixel wouldn't handle
                        bitmaps > 32K.
*/

#include <QuickDraw.h>
#include <asm.h>

#include "getpixel.h"

int getpixel(h, v, bmap)
int h, v;
BitMap *bmap;
{
    asm {
        move.w  v, d0           ; load v and h into registers
        ext.l   d0
        move.w  h, d1
        ext.l   d1
        move.w  d1, d2          ; copy h coord for bit offset
                                ; only need low order byte
        move.l  bmap, a0        ; point to bitmap
        mulu    OFFSET(BitMap, rowBytes)(a0), d0
                                ; v * rowBytes is offset to row
        lsr.l   #3, d1          ; extract byte offset
        add.l   d1, d0
        not.b   d2              ; make bit number 68000 style
        move.l  OFFSET(BitMap, baseAddr)(a0), a0
                                ; get base address of bitmap
        add.l   d0, a0          ; get to the correct byte
        btst    d2, (a0)        ; test the bit
        beq     @0
        moveq   #1, d0          ; set value to 1
        return
@0:     moveq   #0, d0          ; set value to 0
    }
}
End  Listing  Eleven 
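The addressing arithmetic in Listing Eleven translates directly to portable C. The sketch below is an illustration only: `Bits` is a hypothetical stand-in holding just the two QuickDraw `BitMap` fields the assembly reads (`baseAddr` and `rowBytes`), and `get_bit` is a hypothetical name. As in the listing, the origin is assumed to be (0,0), the depth 1 bit per pixel, and the most significant bit of each byte the leftmost pixel.

```c
#include <stddef.h>

/* Minimal stand-in for the BitMap fields used by the listing. */
typedef struct {
    const unsigned char *baseAddr;  /* first byte of pixel data */
    int rowBytes;                   /* bytes per row of the bitmap */
} Bits;

/* Returns 1 if pixel (h, v) is set, 0 otherwise. */
static int get_bit(const Bits *b, int h, int v)
{
    /* v * rowBytes reaches the row, h / 8 reaches the byte within it */
    size_t offset = (size_t)v * (size_t)b->rowBytes + (size_t)(h >> 3);

    /* bit 7 of the byte is the leftmost pixel, hence 7 - (h mod 8) */
    return (b->baseAddr[offset] >> (7 - (h & 7))) & 1;
}
```

The expression `7 - (h & 7)` plays the role of the `not.b d2` instruction, which converts a left-to-right pixel position into the 68000's right-to-left bit number.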


Listing  Twelve 


/*
    mem.h

    definition info for memory management
    William May
    created: 3/25/87
*/

#define SYSERR  -1

/*
    roundew, truncew - round up or truncate address to the next
    even word.
*/
#define roundew(x)  (int *)((3 + (long)(x)) & (~3))
#define truncew(x)  (int *)(((long)(x)) & (~3))

#define DEFSIZE (150000)    /* default size for a new pool */

/*
 * node structure for each node in the free memory list
 */
struct mblock {
    struct mblock *mnext;
    unsigned long mlen;
};

/*
 * structure for pools
 */
struct pool {
    struct pool *pnext;
    struct mblock firstblock;
};

End  Listing  Twelve 
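The two macros in Listing Twelve round a value to a multiple of 4 by masking off the low two bits. A sketch of the same arithmetic as functions, using unsigned integers instead of pointer casts (`round_up4` and `trunc4` are hypothetical names):

```c
#include <stdint.h>

/* Round n up to the next multiple of 4 (unchanged if already one). */
static uintptr_t round_up4(uintptr_t n)
{
    return (n + 3u) & ~(uintptr_t)3;
}

/* Round n down to a multiple of 4. */
static uintptr_t trunc4(uintptr_t n)
{
    return n & ~(uintptr_t)3;
}
```

Adding 3 before masking is what makes the first function round up: any value that is not already a multiple of 4 is pushed past the next boundary, and the mask then drops it back onto that boundary.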


Listing  Thirteen 


/*
    getmem.c

    Implements low overhead memory management for nonrelocatable
    blocks on the Macintosh. This code is based on "Operating System
    Design: The XINU Approach" by Douglas Comer (Prentice Hall, 1984.
    Great book!)

    Motivation:

    The Macintosh Memory Manager is designed to use relocatable
    blocks (i.e. pointers to pointers) for heap data structures. In
    order that relocatable blocks be free to move, the Memory Manager
    tries to keep nonrelocatable blocks (i.e. blocks referred to by
    pointers) out of the way, i.e. in low mem. To do this, when a program
    asks for a nonrelocatable block the Memory Manager starts a linear
    search for available space at the bottom of the heap. In an
    application such as a tree, where many thousands of small blocks are
    needed, the linear search technique quickly dies.

    The code below bypasses this problem by requesting one large block at
    the start of execution (i.e. when init_mem is called). The function
    getmem then maintains a pointer to the next available block. Thus
    any request for memory is satisfied immediately. No search is needed.
    The performance difference in the contour analysis is quite dramatic.

    The code below is quite simple but not flexible. One notable
    lack is a routine to free memory. Such a function is shown in
    Comer's book. Another limitation is that only one block is used.
    The functions can be extended to add new blocks when
    additional space is needed.

    William May
    303A Ridgefield Circle
    Clinton, MA 01510

    created: 3/25/87    very primitive version, but works:
                        one pool only created
                        user has to guess the required space
                        speed increase is dramatic, especially
                        as the contour algorithm progresses (i.e.
                        as the AVL tree gets quite large).
*/


#include <MemoryMgr.h>
#include "mem.h"

struct mblock memlist;      /* head of free memory list */
struct pool *plist;         /* head of pool list */

/*
    init_mem creates a large memory pool, and initializes the necessary
    data structures.
*/
int init_mem()
{
    plist = (struct pool *)NewPtr(sizeof(struct pool) + DEFSIZE);

    if (MemErr == noErr) {
        plist->pnext = (struct pool *)0;

        memlist.mnext = &(plist->firstblock);
        (memlist.mnext)->mnext = (struct mblock *)0;
        (memlist.mnext)->mlen = (long)(DEFSIZE);
    }

    return MemErr;
}

/*
    getmem allocates memory in the pool. On successful completion
    a pointer to the allocated space is returned to the caller. Otherwise
    a null pointer (0L) is returned.
*/
int *getmem(nbytes)
unsigned long nbytes;
{
    register struct mblock *p, *q, *leftover;

    if (nbytes == 0) {
        return (int *)0;
    }

    nbytes = (unsigned long)roundew(nbytes);

    for (q = &memlist, p = memlist.mnext; p != 0L; q = p, p = p->mnext)
        if (p->mlen == nbytes) {
            q->mnext = p->mnext;
            return ((int *)p);
        } else if (p->mlen > nbytes) {
            leftover = (struct mblock *)((unsigned long)p + nbytes);
            q->mnext = leftover;
            leftover->mnext = p->mnext;
            leftover->mlen = (long)(p->mlen - nbytes);
            return ((int *)p);
        }

    return ((int *)0);
}

End  Listing  Thirteen 
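The pool scheme of Listing Thirteen can be sketched in standard C with `malloc` standing in for the Toolbox call `NewPtr`. The names `Block`, `pool_init`, and `pool_get` are hypothetical. One assumption goes beyond the listing: sizes are rounded to a multiple of the header size rather than to 4 bytes, so that the leftover piece carved from a block is always large enough to hold its own header.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct Block {
    struct Block *next;     /* next free block, NULL at the end */
    size_t len;             /* bytes in this free block */
} Block;

static Block free_list;     /* dummy head; free_list.next is the first block */

/* Obtain one pool from malloc. Returns 0 on success, -1 on failure. */
static int pool_init(size_t pool_size)
{
    Block *b;

    pool_size = pool_size / sizeof(Block) * sizeof(Block);
    if (pool_size == 0 || (b = malloc(pool_size)) == NULL)
        return -1;
    b->next = NULL;
    b->len = pool_size;
    free_list.next = b;
    return 0;
}

/* First-fit allocation from the free list; NULL when nothing fits. */
static void *pool_get(size_t nbytes)
{
    Block *p, *q, *rest;

    if (nbytes == 0)
        return NULL;
    nbytes = (nbytes + sizeof(Block) - 1) / sizeof(Block) * sizeof(Block);

    for (q = &free_list, p = free_list.next; p != NULL; q = p, p = p->next) {
        if (p->len == nbytes) {             /* exact fit: unlink the block */
            q->next = p->next;
            return p;
        }
        if (p->len > nbytes) {              /* split: keep the tail free */
            rest = (Block *)((char *)p + nbytes);
            rest->next = p->next;
            rest->len = p->len - nbytes;
            q->next = rest;
            return p;
        }
    }
    return NULL;
}
```

Because nothing is ever returned to the list, the free list stays one block long and every request is answered by the first iteration of the loop, which is the behavior the listing's header comment describes.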


Listing  Fourteen 

/*
    error.c

    very primitive error handler
    William May
    303A Ridgefield Circle
    Clinton, MA 01510
*/

#include <stdio.h>

/*
    if an error number specified then print it,
    otherwise only print the message
*/
syserr(errno, s)
int errno;
char *s;
{
    if (errno) printf("Error (%d) : %s\n", errno, s);
    else printf("Error: %s\n", s);

    ExitToShell();
}

End Listings




TURBO  C  TOOLBOX 

Listing One (Text begins on page 30.)

/* VIDEO.I: Contains ROM BIOS video calls to be used in Turbo C */

#define ROM 0x10

union REGS inreg, outreg;

int videomode (int *ncols)              /* get current display mode */
{                                       /* return number of cols via *ncols */
    inreg.h.ah = 0x0F;
    int86 (ROM, &inreg, &outreg);
    *ncols = outreg.h.ah;               /* number of cols */
    return (outreg.h.al);               /* return mode */
} /* ------------------------------------------------------------ */

int activepage (void)                   /* return active display page */
{
    inreg.h.ah = 0x0F;
    int86 (ROM, &inreg, &outreg);
    return (outreg.h.bh);
} /* ------------------------------------------------------------ */

void setmode (int mode)                 /* set video mode */
{
    inreg.h.al = mode;
    inreg.h.ah = 0x00;
    int86 (ROM, &inreg, &outreg);
} /* ------------------------------------------------------------ */

void setcursor (int start, int end)     /* set cursor shape */
{
    inreg.h.ch = start;
    inreg.h.cl = end;
    inreg.h.ah = 0x01;
    int86 (ROM, &inreg, &outreg);
} /* ------------------------------------------------------------ */

int curstart (void)                     /* get cursor starting line */
{
    inreg.h.bh = 0;                     /* cursor shape same in all pages */
    inreg.h.ah = 0x03;
    int86 (ROM, &inreg, &outreg);
    return (outreg.h.ch);
} /* ------------------------------------------------------------ */

int cursend (void)                      /* get cursor ending line */
{
    inreg.h.bh = 0;
    inreg.h.ah = 0x03;
    int86 (ROM, &inreg, &outreg);
    return (outreg.h.cl);
} /* ------------------------------------------------------------ */

void cursoff (void)                     /* turn cursor off */
{
    inreg.h.ch = curstart () | 0x10;    /* turn on bit 4 */
    inreg.h.cl = cursend ();
    inreg.h.ah = 0x01;
    int86 (ROM, &inreg, &outreg);
} /* ------------------------------------------------------------ */

void curson (void)                      /* turn cursor on */
{
    inreg.h.ch = curstart () & 0x07;    /* turn off high order bits */
    inreg.h.cl = cursend ();
    inreg.h.ah = 0x01;
    int86 (ROM, &inreg, &outreg);
} /* ------------------------------------------------------------ */

void gotoxy (int col, int row, int page)    /* set cursor pos */
{                                           /* must specify page */
    inreg.h.bh = page;
    inreg.h.dh = row;
    inreg.h.dl = col;
    inreg.h.ah = 0x02;
    int86 (ROM, &inreg, &outreg);
} /* ------------------------------------------------------------ */

int wherex (int page)                   /* return cursor column in page */
{
    inreg.h.bh = page;
    inreg.h.ah = 0x03;
    int86 (ROM, &inreg, &outreg);
    return (outreg.h.dl);
} /* ------------------------------------------------------------ */

int wherey (int page)                   /* return cursor row in page */
{
    inreg.h.bh = page;
    inreg.h.ah = 0x03;
    int86 (ROM, &inreg, &outreg);
    return (outreg.h.dh);
} /* ------------------------------------------------------------ */

void setpage (int page)                 /* set active display page */
{
    inreg.h.al = page;
    inreg.h.ah = 0x05;
    int86 (ROM, &inreg, &outreg);
} /* ------------------------------------------------------------ */

void cls (void)                         /* clear active screen */
{
    inreg.h.al = 25;                    /* entire screen */
    inreg.h.bh = 0x07;                  /* set to gray on black */
    inreg.h.ah = 0x06;
    inreg.h.cl = 0;  inreg.h.ch = 0;
    inreg.h.dl = 79; inreg.h.dh = 24;
    int86 (ROM, &inreg, &outreg);
    gotoxy (0, 0, activepage ());
} /* ------------------------------------------------------------ */

void window (int x1, int y1,            /* window upper left corner */
             int x2, int y2,            /* lower right corner */
             char attrib)               /* text attribute inside */
{
    inreg.h.al = y2 - y1 + 1;           /* clear entire window */
    inreg.h.bh = attrib;
    inreg.h.cl = x1; inreg.h.ch = y1;
    inreg.h.dl = x2; inreg.h.dh = y2;
    inreg.h.ah = 0x06;
    int86 (ROM, &inreg, &outreg);
} /* ------------------------------------------------------------ */

void winScroll (int x1, int y1, int x2, int y2,     /* scroll window */
                int attr)                           /* one line upward */
{
    inreg.h.al = 1;
    inreg.h.cl = x1; inreg.h.ch = y1;
    inreg.h.dl = x2; inreg.h.dh = y2;
    inreg.h.bh = attr;
    inreg.h.ah = 0x06;
    int86 (ROM, &inreg, &outreg);
} /* ------------------------------------------------------------ */

char chattr (int foregrnd, int backgrnd)    /* character attrib */
{
    return ((backgrnd << 4) + foregrnd);
} /* ------------------------------------------------------------ */

char rdchara (int page,                 /* read char at curs pos */
              char *attrib)             /* return attribute indirectly */
{
    inreg.h.bh = page;
    inreg.h.ah = 0x08;
    int86 (ROM, &inreg, &outreg);
    *attrib = outreg.h.ah;
    return (outreg.h.al);
} /* ------------------------------------------------------------ */

void wrtcha (char ch, char attrib,      /* write char + attrib */
             int page)                  /* at cursor pos on page */
{                                       /* NOTE: does not advance cursor */
    inreg.h.al = ch;
    inreg.h.bh = page;
    inreg.h.bl = attrib;
    inreg.x.cx = 1;
    inreg.h.ah = 0x09;
    int86 (ROM, &inreg, &outreg);
} /* ------------------------------------------------------------ */

void wrtstra (char *str,                /* write string */
              char attrib,              /* with attribute */
              int page)                 /* to page */
{                                       /* starting at cursor position */
    int r, c, n, p = 0;

    videomode (&n);                     /* get width of screen */
    r = wherey (page);                  /* get current row */
    while (str[p]) {
        wrtcha (str[p++], attrib, page);    /* write next char */
        if ((c = wherex (page)) < (n - 1))
            gotoxy (c + 1, r, page);        /* advance cursor */
        else
            gotoxy (0, ++r, page);          /* else wrap */
    }
} /* ------------------------------------------------------------ */

void wrtch (char ch, int color,         /* write char in color */
            int page)                   /* at cursor position on page */
{
    inreg.h.al = ch;
    inreg.h.bl = color;
    inreg.h.bh = page;
    inreg.x.cx = 1;
    inreg.h.ah = 0x0A;
    int86 (ROM, &inreg, &outreg);
} /* ------------------------------------------------------------ */

void wrtstr (char *str,                 /* write string */
             char color,                /* in color */
             int page)                  /* to page */
{                                       /* starting at cursor position */
    int r, c, n, p = 0;

    videomode (&n);                     /* get width of screen */
    r = wherey (page);                  /* get current row */
    while (str[p]) {
        wrtcha (str[p++], color, page);     /* write next char */
        if ((c = wherex (page)) < (n - 1))
            gotoxy (c + 1, r, page);        /* advance cursor */
        else
            gotoxy (0, ++r, page);          /* else wrap */
    }
} /* ------------------------------------------------------------ */


void palette (int palno)                /* set color palette */
                                        /* valid only in mode 4 on CGA */
{
  inreg.h.bh = 1;
  inreg.h.bl = palno;
  inreg.h.ah = 0x0B;
  int86 (ROM, &inreg, &outreg);
} /* ------------------------ */

void graphbackground (int color)        /* set graphics b/g color */
{
  inreg.h.bh = 0;
  inreg.h.bl = color;
  inreg.h.ah = 0x0B;
  int86 (ROM, &inreg, &outreg);
} /* ------------------------ */

void plot (int x, int y, int pixel)     /* plot pixel at x, y */
                                        /* pixel (color) value */
{
  inreg.h.al = pixel;
  inreg.h.bh = 0;
  inreg.x.cx = x;
  inreg.x.dx = y;
  inreg.h.ah = 0x0C;
  int86 (ROM, &inreg, &outreg);
} /* ------------------------ */

int pixel (int x, int y)                /* return pixel value at x, y */
{
  inreg.x.cx = x;
  inreg.x.dx = y;
  inreg.h.ah = 0x0D;
  int86 (ROM, &inreg, &outreg);
  return (outreg.h.al);
} /* ------------------------ */


End  Listing  One 
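
The attribute packing and the cursor advance-or-wrap rule used by the string writers above can be tried without any BIOS calls. The following is a minimal portable sketch, not part of the original library: `pack_attr`, `cursor_pos`, and `advance_cursor` are hypothetical names introduced here, and the column count is passed in rather than read from the video BIOS.

```c
#include <assert.h>

/* Pack a text attribute byte the way the listing's return statement does:
   background in the high nibble, foreground in the low nibble. */
unsigned char pack_attr(int foregrnd, int backgrnd)
{
    assert(foregrnd >= 0 && foregrnd <= 15);
    assert(backgrnd >= 0 && backgrnd <= 15);
    return (unsigned char)((backgrnd << 4) + foregrnd);
}

/* Hypothetical stand-in for the wherex/wherey/gotoxy bookkeeping. */
typedef struct {
    int col;
    int row;
} cursor_pos;

/* After one character is written: step right if there is room on the
   line, otherwise wrap to column 0 of the next row. ncols plays the role
   of the screen width that the listing obtains from videomode(). */
cursor_pos advance_cursor(cursor_pos cur, int ncols)
{
    assert(ncols > 0);
    if (cur.col < ncols - 1) {
        cur.col++;
    } else {
        cur.col = 0;
        cur.row++;
    }
    return cur;
}
```

For example, `pack_attr(14, 1)` (yellow on blue in the COLORS.H numbering) gives 0x1E, and advancing from column 79 on an 80-column screen lands at column 0 of the next row. Like the listing, the sketch does not scroll or clamp when the row runs past the bottom of the screen.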



Dr.  Dobb’s  Journal,  November  1987 



TURBO  C  TOOLBOX 


Listing Two (Text begins on page 30.)

/* VIDEO.H: Prototypes for contents of user-written VIDEO.LIB */
/* Describes calls to ROM BIOS video functions */

int  videomode (int *ncols);
int  activepage (void);
void setmode (int mode);
void setcursor (int start, int end);
int  curstart (void);
int  cursend (void);
void cursoff (void);
void curson (void);
void gotoxy (int col, int row, int page);
int  wherex (int page);
int  wherey (int page);
void setpage (int page);
void cls (void);
void window (int x1, int y1, int x2, int y2, char attrib);
char chattr (int fgrnd, int bgrnd);
char rdchara (int page, char *attr);
void wrtcha (char ch, char attr, int page);
void wrtstra (char *str, char attr, int page);
void wrtch (char ch, int color, int page);
void wrtstr (char *str, char color, int page);
void palette (int palno);
void graphbackground (int color);
void plot (int x, int y, int pixel);
int  pixel (int x, int y);

End Listing Two


Listing  Three 

/* DRAW.I: Draws lines in graphics mode */

/* hdraw draws horizontal line along y between x1 and x2 */
void hdraw (int x1, int x2, int y, int color)
{
int x;

  if (x1 > x2) {               /* sort x's into left-to-right order */
    x = x1; x1 = x2; x2 = x;
  }
  for (x = x1; x <= x2; x++)
    plot (x, y, color);
} /* ------------------------ */

/* vdraw draws vertical line along x between y1 and y2 */
void vdraw (int y1, int y2, int x, int color)
{
int y;

  if (y1 > y2) {               /* sort y's into top-to-bottom order */
    y = y1; y1 = y2; y2 = y;
  }
  for (y = y1; y <= y2; y++)
    plot (x, y, color);
} /* ------------------------ */

/* draw() does lines on the diagonal */
void draw (int x1, int y1, int x2, int y2, int color)
{
double xstep, ystep, xcum = 0.0, ycum = 0.0;
int dx, dy;                              /* deltas */
register x, y;

  dx = x2 - x1;
  dy = y2 - y1;
  if (abs (dx) >= abs (dy)) {            /* plot along x axis */
    ystep = (double) dy / dx;            /* movement along y axis per x */
    if (dy < 0) {                        /* y travels to the left */
      if (ystep > 0)
        ystep *= -1;           /* adjust for wrong sign from -y/-x */
    } else                               /* y travels to the right */
      if (ystep < 0)
        ystep *= -1;                     /* adjust as above */
    dx /= abs (dx);                      /* x increment */
    for (x = x1, y = y1; x != x2; x += dx) {
      plot (x, y, color);
      ycum += ystep;                     /* cum motion along y axis */
      y = y1 + ycum;                     /* next y */
    }
  } else {                               /* plot along y axis */
    xstep = (double) dx / dy;            /* movement along x axis per y */
    if (dx < 0) {
      if (xstep > 0)
        xstep *= -1;
    } else
      if (xstep < 0)
        xstep *= -1;
    dy /= abs (dy);                      /* y increment */
    for (y = y1, x = x1; y != y2; y += dy) {
      plot (x, y, color);
      xcum += xstep;                     /* cum motion along x axis */
      x = x1 + xcum;                     /* next x */
    }
  }
} /* ------------------------ */



End  Listing  Three 
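
The stepping idea behind draw() (advance one pixel along the longer axis and accumulate a fractional step along the other) can be exercised on any machine by plotting into a small array instead of video memory. This is a sketch under stated assumptions, not the listing's code: `grid`, `plot_cell`, `clear_grid`, `cell`, and `draw_line` are hypothetical names, coordinates are assumed non-negative, and unlike the loops as printed above (which appear to stop one step short of the end point and to divide by zero when both end points coincide) this variant plots both end points and treats a zero-length line as a single dot.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GRID_W 16
#define GRID_H 16

static unsigned char grid[GRID_H][GRID_W];   /* stand-in for video memory */

/* Stand-in for plot(): store the color, ignoring off-grid points. */
static void plot_cell(int x, int y, int color)
{
    if (x >= 0 && x < GRID_W && y >= 0 && y < GRID_H)
        grid[y][x] = (unsigned char)color;
}

void clear_grid(void)
{
    memset(grid, 0, sizeof grid);
}

int cell(int x, int y)
{
    assert(x >= 0 && x < GRID_W && y >= 0 && y < GRID_H);
    return grid[y][x];
}

/* Incremental line: one unit step along the major axis per iteration,
   a fractional step along the minor axis, rounded to the nearest cell. */
void draw_line(int x1, int y1, int x2, int y2, int color)
{
    int dx = x2 - x1;
    int dy = y2 - y1;
    int steps = abs(dx) >= abs(dy) ? abs(dx) : abs(dy);
    double x = x1, y = y1, xstep, ystep;
    int i;

    if (steps == 0) {                 /* both end points coincide */
        plot_cell(x1, y1, color);
        return;
    }
    xstep = (double)dx / steps;       /* +/-1 on the major axis */
    ystep = (double)dy / steps;
    for (i = 0; i <= steps; i++) {    /* <= so the end point is drawn */
        plot_cell((int)(x + 0.5), (int)(y + 0.5), color);
        x += xstep;
        y += ystep;
    }
}
```

Calling `clear_grid()` and then `draw_line(0, 0, 5, 5, 1)` marks the six cells on the main diagonal. Dividing both deltas by the same step count gives each step the correct sign automatically, so the sign adjustments in draw() are not needed in this form.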


Listing  Four 


/* COLORS.H: Maps color names */

#define BLACK         0
#define BLUE          1
#define GREEN         2
#define CYAN          3
#define RED           4
#define MAGENTA       5
#define BROWN         6
#define LIGHTGRAY     7
#define DARKGRAY      8
#define LIGHTBLUE     9
#define LIGHTGREEN   10
#define LIGHTCYAN    11
#define LIGHTRED     12
#define LIGHTMAGENTA 13
#define YELLOW       14
#define WHITE        15

End Listing Four


Listing  Five 

/* VID.C: Demos video functions */

/* TURBO C INCLUDES */
#include <stdio.h>
#include <bios.h>
#include <dos.h>

/* USER-WRITTEN INCLUDES */
#include <colors.h>
#include <video.i>
#include <draw.i>

/* LOCAL FUNCTION PROTOTYPES */
int  vidIdent (int *vidmode);
void wait (void);
void stairsteps (void);
int  isEGA (void);
void label (void);
void bigX (int adap, int vmode);
void hourglass (int adap, int vmode);

/* GLOBALS */
enum vidTypes {mda, cga, ega, Compaq, other};

main ()
{
int adaptor, mode, cols;

  cls ();                               /* clear screen */
  adaptor = vidIdent (&mode);           /* identify video adaptor */
  if (adaptor != other) {
    stairsteps ();                      /* cursor positioning demo */
    bigX (adaptor, mode);               /* graphics demo #1 */
    hourglass (adaptor, mode);          /* graphics demo #2 */
    label ();
    gotoxy (36, 12, 0);
    puts ("All done!");
  }
} /* ------------------------ */

int vidIdent (int *vidmode)             /* identify video adaptor */
{
int flag, adap, width;

  label ();                             /* label display */
  puts ("\n\nDISPLAY INFORMATION:");
  *vidmode = videomode (&width);        /* get video mode */
  flag = (biosequip () & 0x18) >> 4;    /* get video eqpt flag */
  if (isEGA ()) {
    adap = ega;
    puts ("\n\n  Enhanced Graphics Adaptor");
  } else
    switch (flag) {
      case 0:  if (*vidmode == 2) {
                 adap = Compaq;
                 puts ("\n\n  Compaq adaptor");
               } else {
                 adap = cga;
                 puts ("\n\n  Color Graphics Adaptor");
               }
               break;
      case 3:  adap = mda;
               puts ("\n\n  Monochrome Display Adaptor");
               break;
      default: adap = other;
               puts ("\n\n  Adaptor not usable in this demo. Sorry.");
    } /* end of switch */
  printf ("\n  Text screen size is %d columns x 25 rows\n", width);
  if ((*vidmode < 4) || (*vidmode == 7))
    puts ("\n  Text mode currently active");
  else
    puts ("\n  Graphics mode currently active");
  wait ();
  return (adap);
} /* ------------------------ */

int isEGA (void)                        /* determine if EGA is attached */
                                        /* return TRUE if so, FALSE if not */
{
  return (peekb (0x40, 0x87));          /* check EGA info byte */
} /* ------------------------ */


void wait (void)             /* prompt to continue, wait for keypress */
{
int tab, width;

  videomode (&width);                   /* get width in columns */
  tab = (width - 33) / 2;               /* starting column for text */
  gotoxy (tab, 24, 0);
  wrtstr ("Press any key to continue demo...", WHITE, 0);
  getch ();
  cls ();
} /* ------------------------ */

void label (void)                /* label the screen at top center */
{
  gotoxy (30, 0, 0);
  wrtstra ("Video demonstration", chattr (YELLOW, BLUE), 0);
} /* ------------------------ */

void stairsteps (void)
{
int color = 1;

  label ();
  gotoxy (31, 2, 0);
  wrtstr ("Cursor positioning", color++, 0);  /* trailing arguments lost in the printed listing; color++, 0 assumed */
  gotoxy (10, 4, 0);   wrtstr ("Stair", color++, 0);
  gotoxy (20, 10, 0);  wrtstr ("steps", color++, 0);
  gotoxy (30, 16, 0);  wrtstr ("going", color++, 0);
  gotoxy (40, 22, 0);  wrtstr ("down",  color++, 0);
  gotoxy (50, 16, 0);  wrtstr ("and",   color++, 0);
  gotoxy (60, 10, 0);  wrtstr ("back",  color++, 0);
  gotoxy (70, 4, 0);   wrtstr ("up",    color, 0);
  wait ();
} /* ------------------------ */

void bigX (int vidAdap, int vmode)      /* APA graphics demo #1 */
{                        /* draws full-screen border and X, adjusting */
                         /* for EGA or CGA as indicated */
int x1 = 0, x2 = 639;
int y1 = 15, y2;

  if ((vidAdap == mda) || (vidAdap == other))   /* can't do it */
    return;                             /* so return with no action */

  if (vidAdap == ega) {            /* EGA demo: 640 x 350 (mode 0Fh) */
    y2 = 349 - 15;                 /* set bottom of graphics screen */
    setmode (0x0F);                /* go to EGA mono graphics mode */
  } else {                  /* CGA/Compaq demo: 640 x 200 (mode 06h) */
    y2 = 199 - 15;
    setmode (0x06);
  }
  label ();                             /* label the screen */
  hdraw (x1, x2, y1, 1);                /* draw line across top */
  hdraw (x1, x2, y2, 1);                /* then bottom */
  vdraw (y1, y2, x1, 1);                /* down left side */
  vdraw (y1, y2, x2, 1);                /* down right side */
  draw (x1, y1, x2, y2, 1);             /* main diagonal */
  draw (x1, y2, x2, y1, 1);             /* cross diagonal */
  wait ();
  setmode (vmode);
} /* ------------------------ */

void hourglass (int vidAdap, int vmode) /* graphics demo #2 */
{                     /* operates in 320 x 200 four-color (CGA) mode */
int y, x1 = 60, x2 = 260, pixval = 1;

  if ((vidAdap == mda) || (vidAdap == other))   /* can't do it */
    return;                                     /* so go back */

  setmode (4);                          /* go to CGA graphics mode */
  gotoxy (8, 0, 0);
  puts ("320 x 200 color graphics");    /* show mode */
  palette (0);
  for (y = 50; y < 151; y++) {          /* draw figure */
    hdraw (x1, x2, y, pixval);
    x1 += 2; x2 -= 2;                   /* change x's */
    if (y == 84) pixval = 2;            /* change colors */
    if (y == 117) pixval = 3;
  }
  wait ();
  setmode (vmode);
} /* ------------------------ */


End  Listings 
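
The adaptor test in vidIdent() rests on the BIOS equipment word, in which bits 4 and 5 hold the initial video mode. The sketch below decodes that field on its own so the four possible values can be checked without a PC BIOS. It is an illustration under assumptions: `decode_video_bits` and the enumerator names are hypothetical, and it masks with 0x30 (bits 4 and 5), whereas the listing as printed masks with 0x18; whether the printed constant is what the author intended cannot be confirmed from this text.

```c
#include <assert.h>

/* Values of the two-bit initial-video-mode field of the equipment word. */
enum initial_video {
    VIDEO_NONE_OR_EGA = 0,   /* no CGA/MDA setting (e.g. EGA present) */
    VIDEO_COLOR_40    = 1,   /* 40 x 25 color */
    VIDEO_COLOR_80    = 2,   /* 80 x 25 color */
    VIDEO_MONO_80     = 3    /* 80 x 25 monochrome */
};

/* equip_word is what biosequip() would return on a PC; here it is just
   an argument so the decoding can be tested anywhere. */
enum initial_video decode_video_bits(unsigned equip_word)
{
    assert(equip_word <= 0xFFFFu);
    return (enum initial_video)((equip_word & 0x30u) >> 4);
}
```

With this mask, an equipment word of 0x0030 decodes to `VIDEO_MONO_80` (the value the listing's `case 3` treats as a Monochrome Display Adaptor), and 0x0020 decodes to `VIDEO_COLOR_80`.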




Listing  One  (Text  begins  on  page  46.) 


Screen  #  0 

\                                               jbb 15:47 08/04/87


Copyright  1987,  All  Rights  Reserved 
by 

J.  Brooks  Breeden 
Columbus,  Ohio 
August  4,  1987 

Loads  an  EGAPAINT  file  into  video  memory 
from  within  UR/FORTH  for  subsequent  animation,  etc. 


Screen # 1

\ Misc.                                         jbb 13:01 08/04/87

DOSINT                           \ fetch level 2 DOS interface
HCB IMAGEHCB                     \ create the handle block
CREATE PALTAB 16 ALLOT           \ to hold egapaint's palette

: 2MEMREQ ( x1 y1 x2 y2 - bytes) \ calc. memory req to @BLOCK
   ROT - 1+ -ROT SWAP - 1+ 2/ 1+ * 4 + ;

HEX
: RESTORE-EGA ( - )              \ restore EGA's default mode:
   2 3C4 PC!  0F 3C5 PC!         \ default map mask
   3 3CE PC!   0 3CF PC!         \ default data rotate reg value
   5 3CE PC!   0 3CF PC!         \ default write mode 0
   8 3CE PC!  FF 3CF PC! ;       \ default bit mask
DECIMAL
-->


Screen # 2

\ EGA map mask control                          jbb 15:37 08/04/87
HEX

: READMODE0 ( - ) 5 3CE PC! 8 3CF PC! 2 3CE PC! 0 3CF PC! ;

\ set map-masks to select active EGA color bit-planes

: BLUEPLANE    ( - ) 2 3C4 PC! 1 3C5 PC! ;   \ plane 0
: GREENPLANE   ( - ) 2 3C4 PC! 2 3C5 PC! ;   \ plane 1
: REDPLANE     ( - ) 2 3C4 PC! 4 3C5 PC! ;   \ plane 2
: INTENSEPLANE ( - ) 2 3C4 PC! 8 3C5 PC! ;   \ plane 3

Screen # 3

\ File control                                  jbb 12:59 08/04/87

: OPENIMAGEFILE ( "filename - )
   IMAGEHCB NAME>HCB                 \ force filename
   IMAGEHCB O_RD FOPEN               \ open file for reading
   IF ABORT" Can't open file." THEN ;

: CLOSEIMAGEFILE ( - ) IMAGEHCB FCLOSE DROP ;

: MAKEIMAGEFILE ( "filename - )
   IMAGEHCB NAME>HCB                 \ force filename
   IMAGEHCB 0 FMAKE                  \ make "normal" file
   IF ABORT" Can't make file." THEN ;

-->


Screen # 4

\ Read EGAPAINT file from disk to video         jbb 15:45 08/04/87

\ To load EGAPAINT disk image into EGA video segment...
\ set map mask, read bytes from disk, repeat for each plane

: READBYTES ( - )         \ read 28,000 bytes to video segment
   IMAGEHCB ?VSEG 0       \ source=handle; dest.=videoseg:offset
   28000 FREADL           \ read 28,000 bytes
   DROP ;                 \ drop flag.

: READPAINTFILE ( - )     \ read all four planes in sequence...
   BLUEPLANE READBYTES    REDPLANE READBYTES
   GREENPLANE READBYTES   INTENSEPLANE READBYTES ;

-->


Screen # 5

\                                               jbb 15:44 08/04/87

: LOADPAINTFILE ( "filename - )
   640X350 VMODE CLS             \ hi-res 16-color
   OPENIMAGEFILE                 \ Open the file, and
   IMAGEHCB PALTAB 16 FREAD      \ read palette data.
   IF PALTAB !PALETTE            \ If read was successful,
   THEN                          \ set palette to new colors.
   READMODE0                     \ Set read mode 0 and
   READPAINTFILE                 \ read the bit-plane data.
   CLOSEIMAGEFILE                \ Close the file, and
   RESTORE-EGA ;                 \ restore the EGA defaults.


Screen # 6

\ Load the EGAPAINT full-screen image           jbb 15:17 08/04/87

CREATE BARON  ,C" THEFOKKR.IMG"  \ orig 112K EGAPAINT disk file
CREATE FOKKER ,C" DERFOKKR.IMG"  \ new smaller @BLOCK disk file
CREATE DR.1 3072 ALLOT           \ memory req as multiple of 8

: SAVEDR.1 1 1 106 53 DR.1 @BLOCK ;         \ get block to memory
: GETIMAGE BARON LOADPAINTFILE SAVEDR.1 ;   \ get orig image>mem

: SAVEFOKKER FOKKER MAKEIMAGEFILE   \ build .img file on disk
   IMAGEHCB PALTAB 16 FWRITE DROP   \ include palette info
   IMAGEHCB DR.1 2904 FWRITE DROP   \ write DR.1 mem to disk
   CLOSEIMAGEFILE ;                 \ close it out...


Screen # 7

\ CHECKOUT JUNK                                 jbb 16:07 08/04/87

: CK1 HR DR.1 267 148 !BLOCK
   277 156 361 192 FRAME            \ Fokker image limits
   LTBLUE FG 267 148 372 200 FRAME  \ image area saved
   262 143 272 153 FRAME            \ hit zone around CORNER
   314 165 324 175 FRAME ;          \ hit zone on Fokker

: GUNSIGHT ( - ) ORANGE FG 269 170 369 170 LINE
   319 133 319 274 LINE 319 170 50 circle ;

: CK CK1 GUNSIGHT ;


End  Listing  One 
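
The buffer-size arithmetic in the memory-requirement word of Screen 1 can be checked by hand or in a few lines of C. The sketch below follows the stack operations as reconstructed there (height = y2 - y1 + 1, width = x2 - x1 + 1, bytes = height * (width / 2 + 1) + 4); `block_bytes` is a hypothetical name, and reading the formula as "two pixels per byte plus a four-byte header" is an interpretation, not something the listing states.

```c
#include <assert.h>

/* Bytes needed to hold the screen block (x1,y1)-(x2,y2), following the
   arithmetic of the Forth word: each row takes width/2 + 1 bytes and
   four bytes are added for the whole block. */
long block_bytes(int x1, int y1, int x2, int y2)
{
    long height, width;

    assert(x2 >= x1 && y2 >= y1);
    height = (long)(y2 - y1) + 1;
    width  = (long)(x2 - x1) + 1;
    return height * (width / 2 + 1) + 4;
}
```

For the block saved by SAVEDR.1, `block_bytes(1, 1, 106, 53)` gives 53 * 54 + 4 = 2866, which matches the 2866 bytes that GETFOKKER reads back in Listing Three and fits within the 3000 and 3072 bytes allotted for DR.1.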


Listing  Two 


Listing 2. Pseudocode to move a full 112,000 byte image
from the disk to the screen. NOTE: ALL PORT ADDRESSES AND PORT
VALUES ARE HEXADECIMAL; BYTE COUNTS ARE DECIMAL

Set  the  video  mode  to  graphics  mode  xx  however  your  language 
does  it. 

Open  the  file  using  a  DOS  handle  however  your  language  does 
it. 

If palette data is the first data in the file, then read the
palette data and set the new palette. (This should be covered
in your language's high level routines.)

Set  read  mode  0 
Write  5  to  port  3CE,  then 
write  8  to  port  3CF,  then 
write  2  to  port  3CE,  then 
write 0 to port 3CF.

Set  map  mask  to  plane  0  (blue  plane) 

Write  2  to  port  3C4,  then 
write  1  to  port  3C5. 

Read  28,000  bytes  from  the  file  to  A000:0000  however  your 
language  does  it. 

Set  map  mask  to  plane  2  (red  plane) 

Write  2  to  port  3C4,  then 

write  4  to  port  3C5. 

Read  28,000  bytes  from  the  file  to  A000:0000  however  your 
language  does  it. 

Set map mask to plane 1 (green plane)

Write 2 to port 3C4, then
write 2 to port 3C5.

Read  28,000  bytes  from  the  file  to  A000:0000  however  your 
language  does  it. 

Set  map  mask  to  plane  3  (intensity  plane) 

Write  2  to  port  3C4,  then 
write 8 to port 3C5.

Read  28,000  bytes  from  the  file  to  A000:0000  however  your 
language  does  it. 

Close  the  file,  however  your  language  does  it. 

Restore  EGA  default  values. 

(set map mask default: all planes active)

Write 2 to port 3C4, then
write 0F to port 3C5.

(set data rotate register default)
Write 3 to port 3CE, then
write 0 to port 3CF.

(set default write mode 0)

Write 5 to port 3CE, then
write 0 to port 3CF.

(set default bit-mask)
Write 8 to port 3CE, then
write FF to port 3CF.


End Listing Two
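
The plane-by-plane sequence in the pseudocode can be rehearsed without EGA hardware by modelling the map mask and the four bit planes in ordinary memory. The following is a simulation sketch only: `planes`, `map_mask`, `write_video`, `load_image`, and `plane_byte` are hypothetical names, no I/O ports are touched, and the image buffer is assumed to start just past any palette bytes. It uses the plane size 640 * 350 / 8 = 28,000 bytes and the file order given above (blue, red, green, intensity).

```c
#include <assert.h>
#include <string.h>

#define PLANE_BYTES 28000L          /* 640 * 350 pixels / 8 bits per byte */
#define NUM_PLANES  4

static unsigned char planes[NUM_PLANES][PLANE_BYTES]; /* simulated bit planes */
static unsigned char map_mask = 0x0F;                 /* simulated map mask   */

/* Simulated CPU write to video memory: the data lands in every plane
   whose bit is set in the map mask. */
static void write_video(const unsigned char *src, long n)
{
    int p;

    for (p = 0; p < NUM_PLANES; p++)
        if (map_mask & (1 << p))
            memcpy(planes[p], src, (size_t)n);
}

/* Map-mask value for each successive 28,000-byte chunk of the file:
   blue (plane 0), red (plane 2), green (plane 1), intensity (plane 3). */
static const unsigned char file_order_mask[NUM_PLANES] = {
    0x01, 0x04, 0x02, 0x08
};

/* image points at 4 * PLANE_BYTES bytes of plane data. */
void load_image(const unsigned char *image)
{
    int i;

    assert(image != 0);
    for (i = 0; i < NUM_PLANES; i++) {
        map_mask = file_order_mask[i];            /* select one plane   */
        write_video(image + (long)i * PLANE_BYTES, PLANE_BYTES);
    }
    map_mask = 0x0F;                              /* default: all planes */
}

int plane_byte(int plane, long offset)
{
    assert(plane >= 0 && plane < NUM_PLANES);
    assert(offset >= 0 && offset < PLANE_BYTES);
    return planes[plane][offset];
}
```

The point of the model is the ordering: the second chunk of the file ends up in plane 2 and the third in plane 1, which is why the mask values run 1, 4, 2, 8 rather than 1, 2, 4, 8. Four chunks of 28,000 bytes account for the 112,000-byte image size quoted in the listing title.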


Listing  Three 

Screen  #  0 

\  jbb  17:28  07/04/87 


FOKKER.SCR

A  Mindless  Game  of  Motor  Skill 
(somewhere  over  the  trenches...) 

NOTE:  YOU  ABSOLUTELY  MUST  HAVE  CRUISE  CONTROL,  OR  SOME  OTHER 
KIND  OF  KEYBOARD  SPEEDUP  SOFTWARE  FOR  THIS  TO  WORK  ! ! ! 

Standard  Version 

Copyright  (C)  1987,  All  Rights  Reserved 


by 

J.  Brooks  Breeden 
EGAGRAPH.EXE  must  be  loaded  first 


Screen # 1

\ File control...                               jbb 13:08 08/04/87
\ this screen is required to load image initially.
\ For turnkey, the image is in memory & this scr isn't required

DOSINT                             \ level 2 DOS interface
HCB IMAGEHCB                       \ create the handle block

: OPENIMAGEFILE ( "filename - )
   IMAGEHCB NAME>HCB               \ force filename
   IMAGEHCB O_RD FOPEN             \ open file for reading
   IF ABORT" Can't open file." THEN ;

: CLOSEIMAGEFILE ( - )
   IMAGEHCB FCLOSE DROP ;          \ close, ignore status


Screen # 2

\ File control...                               jbb 09:49 08/01/87

CREATE DR.1 3000 ALLOT              \ to hold the bit map image
CREATE PALTAB 16 ALLOT              \ new palette of colors
CREATE FOKKERIMG ,C" DERFOKKR.IMG"  \ name of the image file

: GETFOKKER                  \ load binary image from disk to memory
   DR.1 3000 ERASE                  \ clean out memory area
   FOKKERIMG OPENIMAGEFILE          \ open 'er up
   IMAGEHCB PALTAB 16 FREAD DROP    \ read in palette info
   IMAGEHCB DR.1 2866 FREAD DROP    \ read in the image
   CLOSEIMAGEFILE ;                 \ close the file


Screen # 3

\ Palette colors                                jbb 17:10 08/02/87

: FG FOREGROUND ;                   \ shorthand for LMI's words
: BG BACKGROUND ;
: HR 640X350 VMODE ;                \ high res EGA 16-color mode

\ Named colors for the EGAPAINT default palette
 0 CONSTANT BLACK        8 CONSTANT GREEN
 1 CONSTANT DKGRAY       9 CONSTANT LTGREEN
 2 CONSTANT GRAY        10 CONSTANT CYAN
 3 CONSTANT DKRED       11 CONSTANT LTBLUE
 4 CONSTANT RED         12 CONSTANT BLUE
 5 CONSTANT ORANGE      13 CONSTANT DKBLUE
 6 CONSTANT YELLOW      14 CONSTANT PURPLE
 7 CONSTANT DKGREEN     15 CONSTANT WHITE
-->


Screen # 4

\ Utilities                                     jbb 09:53 08/01/87

: TAB ( row col - ) SWAP GOTOXY ;   \ clearer for text
: ? @ . ;

\ returns unique IBM keycode for each keypress
: MYKEY ( - n) KEY DUP 0= IF DROP KEY 128 + THEN ;

: WAIT ( - ) MYKEY DROP ;           \ wait for a keypress

\ vector math...
: V+ ( a b c d - a+c b+d) >R ROT + SWAP R> + ;
: V* ( a b c d - a*c b*d) >R ROT * SWAP R> * ;   \ not used;
: V/ ( a b c d - a/c b/d) >R ROT / SWAP R> / ;   \ for info..

: S-LINE ( row - ) 0 TAB 80 0 DO 196 EMIT LOOP ;  \ a line
: D-LINE ( row - ) 0 TAB 80 0 DO 205 EMIT LOOP ;  \ dbl. line
-->


Screen # 5

\ Random number generator                       jbb 09:56 08/01/87

VARIABLE SEED
@TIME COMBINE SEED !                \ plant seed w/system clock

: random ( - n )                    \ 0 <= n <= 32767
   SEED @ 259 * 3 + 32767 AND DUP SEED ! ;

: RANDOM ( n1 - n2 )                \ 0 <= n2 < n1
   random M* 32768 UM/MOD NIP ;

: BETWEEN ( lo# hi# - inbetween#) OVER - RANDOM + ;
: WITHIN? ( n lo# hi# - flag) >R 1- OVER < SWAP R> < AND ;


Screen # 6

\ Movement

2VARIABLE CORNER    \ xy-coords of up-left corner of fokker box
-1 1 2EQU RANGE     \ range of random fokker movement

\ add to Baron's excitement by bumping allowed RANGE
: EXCITEMENT ( - ) RANGE -1 1 V+ 2EQU RANGE ;

\ leaves random x and y within RANGE by which to move Fokker
: MOVEFOKKER ( - n n) RANGE BETWEEN RANGE BETWEEN ;

: GUNSIGHT ( - ) ORANGE FG 269 170 369 170 LINE
   319 133 319 274 LINE 319 170 50 circle ;
-->


Screen # 7

\ Stick control                                 jbb 10:05 08/01/87

\ set gunsight movement based on #key pressed
: ?stick ( adr - adr n n) MYKEY
   CASE 199 OF  2 -2 ENDOF    200 OF  0 -2 ENDOF
        201 OF -2 -2 ENDOF    205 OF -2  0 ENDOF
        209 OF -2  2 ENDOF    208 OF  0  2 ENDOF
        207 OF  2  2 ENDOF    203 OF  2  0 ENDOF
        ( otherwise) 0 0      \ leave 0's for V+ to add...
        ROT                   \ move index to top of stack
   ENDCASE ;                  \ for ENDCASE to drop...

: ?STICK ( adr - adr n n) ?TERMINAL   \ key pressed?
   IF ?stick                          \ check stick control
   ELSE 0 0 THEN ;                    \ leave zeros for "V+"
-->


Screen # 8

\ Exclamations!                                 jbb 10:12 08/01/87

\ These aren't real serious curses; you make up your own!
CREATE CURSES ," Dankeschoen!        Welkommen!          Wie Gehts?          Was ist das?        Weiner Schnitzel!   Bratwurst!          Sauerbrauten!       Knockwurst!         Sauerkraut!         Weisswurst!"

: EXCLAMATION     \ shout a curse at the user, for distraction
   3 30 TAB CLREOL 3 33 TAB ORANGE FG
   10 RANDOM 20 * CURSES + 20 -TRAILING TYPE ;

GETFOKKER         \ load the Fokker image during the compile...
-->               \ Note: Obviously, you'd better have the image
                  \ file in your directory at this point!


Screen # 9

\ Damage report                                 jbb 10:14 08/01/87

VARIABLE #HITS                      \ # of hits in cockpit zone
VARIABLE AMMO                       \ rounds of ammunition left

: .HITS GRAY FG 22 44 TAB #HITS ? ; \ print out number of
: .AMMO GRAY FG 23 44 TAB AMMO ? ;  \ hits & ammo left.

: HIT? CORNER 2@ 143 153 WITHIN?    \ y in hit zone?
   SWAP 262 272 WITHIN? AND         \ x in hit zone? Both?
   IF 2 #HITS +! .HITS              \ Increment hits & show #.
      #HITS @ 4 MOD 0=              \ For every 4 hits, increase
      IF EXCITEMENT EXCLAMATION     \ movement & display curse!
      THEN
   THEN ;
-->


Screen # 10

\ Shooting...                                   jbb 10:19 08/01/87

: 2SHOTS 219 349 319 170 LINE 419 349 320 170 LINE ;

: FIREGUNS   \ fire two shots, decrement ammo, show ammo left
   WHITE FG 2SHOTS
   5000 0 DO LOOP                   \ kill some time
   DKBLUE FG 2SHOTS -2 AMMO +! .AMMO ;


USING EGA SCREENS

Listing Three (Listing continued, text begins on page 46.)

: SHOOTING? ?TERMINAL               \ If there is keyboard input...
   IF MYKEY 32 =                    \ was it the spacebar?
      IF AMMO @                     \ Yes, if we have any ammo,
         IF FIREGUNS HIT? THEN      \ shoot & check if we hit the
      THEN                          \ Fokker.
   THEN ;


Screen # 11

\ Explosion                                     jbb 15:01 08/04/87

: BURST 320 0     \ random dots expanding in all four quadrants
   DO 319 I RANDOM + 174 I RANDOM - !PEL
      319 I RANDOM + 174 I RANDOM + !PEL
      319 I RANDOM - 174 I RANDOM + !PEL
      319 I RANDOM - 174 I RANDOM - !PEL LOOP ;

: EXPLODE 3 0 DO BURST LOOP ;       \ it pulses the bursts...

HEX               \ clear bios buffer of waiting keypresses
: CLR-KEY-BUF 0C01 regAX ! 21 INT86 ;
DECIMAL

: WAIT-FOR-ESC CLR-KEY-BUF BEGIN MYKEY 27 = UNTIL ;


Screen # 12

\ Win message                                   jbb 17:35 07/04/87

: WIN             \ what happens if you shoot him down!
   WHITE FG EXPLODE DKBLUE FG REVERSE ORANGE FG 6 14 TAB
   ." What a pilot! You downed the baron with "
   100 AMMO @ - . ." rounds. "
   7 14 TAB
   ." Now it's time to head for home and celebrate. "
   20 25 TAB
   ." Press ESC to go home in GLORY. " GRAY FG REVERSE
   WAIT-FOR-ESC ;


Screen # 13

\ Lose message                                  jbb 10:20 08/01/87

: LOSE            \ what happens if you DON'T shoot him down...
   DKBLUE FG REVERSE ORANGE FG 6 14 TAB
   ." You are a rotten pilot! You waste ammunition, and "
   7 14 TAB
   ." let yourself get outflown by the bloody Red Baron. "
   8 14 TAB
   ." Tuck your tail between your legs and head for home. "
   20 23 TAB
   ." Now press ESC to go home in SHAME! "
   DKBLUE FG REVERSE GRAY FG WAIT-FOR-ESC ;


Screen # 14

\ Setup                                         jbb 10:22 08/01/87

F: FOKKER                               \ forward reference

: SETUP HR PALTAB !PALETTE CLS          \ use EGAPaint palette
   DKBLUE BG GUNSIGHT                   \ paint sky and gunsight
   -1 1 2EQU RANGE                      \ fokker motion range
   175 375 BETWEEN 104 254 BETWEEN      \ random starting points
   CORNER 2! 0 #HITS ! 100 AMMO !       \ setup the variables
   GRAY FG
   22 34 TAB ." # OF HITS: " .HITS      \ show hits
   23 34 TAB ." AMMO LEFT: " .AMMO ;    \ show ammo


Screen # 15

\ Major loop                                    jbb 10:23 08/01/87

: DOGFIGHT SETUP                    \ get ready...
   BEGIN DR.1                       \ address of fokker bitmap
      CORNER 2@ MOVEFOKKER V+       \ +random movement
      ?STICK V+ ( - adr x y)        \ +stick control
      2DUP CORNER 2! !BLOCK         \ update corner; show plane
      SHOOTING? #HITS @ 19 >        \ check hits in pilot zone
      IF WIN EXIT THEN              \ blow up plane; you win.
      AMMO @ 0=                     \ out of ammo?
      IF LOSE EXIT THEN             \ you lose...
      GUNSIGHT                      \ redisplay gunsight
   AGAIN ;


Screen # 16

\ Options                                       jbb 10:23 08/01/87

: D-LINES
   B/W CLS PURPLE FG 7 D-LINE 19 D-LINE RED FG ;  \ double lines

: .OPTIONS D-LINES
   6 0 TAB ." Fokker" 6 36 TAB ." OPTIONS" LTBLUE FG
   11 31 TAB ." F1.  " GRAY FG ." Instructions" LTBLUE FG
   13 31 TAB ." F2.  " GRAY FG ." Play"         LTBLUE FG
   15 31 TAB ." F10. " GRAY FG ." Exit"
   20 0 TAB ;


Screen # 17

\ Credits                                       jbb 13:25 07/31/87

: .CREDITS D-LINES 6 36 TAB ." FOKKER" LTBLUE FG 8 23 TAB
   ." (A Mindless Game of Motor Skill)" 11 20 TAB RED FG
   ." Somewhere over the trenches in France..."
   GRAY FG 14 22 TAB
   ." Copyright 1987, All Rights Reserved" 15 38 TAB ." by "
   16 32 TAB ." J. Brooks Breeden" 17 32 TAB ." Columbus, Ohio"
   ORANGE FG 20 26 TAB ." Press Any Key to Continue..."
   23 11 TAB LTBLUE FG
   ." Written in UR/FORTH from Laboratory Microsystems, Inc."
   24 9 TAB
   ." Fokker Dr.1 created with EGAPAINT from RIX Softworks, Inc." ;


Screen # 18

\ Help...                                       jbb 13:24 07/31/87

: HELP1 B/W CLS RED FG 1 20 TAB ." Your Mission:"
   PURPLE FG 2 S-LINE LTBLUE FG 3 20 TAB
   ." Your mission is to down the Red Baron with 20 hits"
   4 20 TAB
   ." in the cockpit area of his Fokker Dr.1 triplane."
   5 20 TAB
   ." When you shoot, the Baron gets excited, and takes"
   6 20 TAB
   ." evasive action. You have 100 rounds of ammo left,"
   7 20 TAB
   ." and your guns have a tendency to jam..." ;


Screen # 19

\ Help...                                       jbb 13:21 07/31/87

: HELP2 RED FG 9 20 TAB
   ." To play..." PURPLE FG 10 S-LINE LTBLUE FG 11 20 TAB
   ." The cursor keys control the gunsight like a joystick."
   12 20 TAB
   ." The up arrow key pushes the stick forward. The down"
   13 20 TAB
   ." arrow key pulls back on the stick. Left, right, and"
   14 20 TAB
   ." diagonal keys move the stick as you would expect."
   16 20 TAB
   ." Fire by holding down the space bar."
   21 20 TAB
   ." Press Ctrl-Break to quit at any time." ;


Screen # 20

\ Warning...                                    jbb 10:24 08/01/87

: WARNING D-LINES 6 34 TAB ." WARNING!!!" LTBLUE FG
    9 20 TAB ." YOU ABSOLUTELY MUST HAVE CRUISE CONTROL,"
   10 20 TAB ." QUICKEYS, OR OTHER TSR KEYBOARD SPEED-UP"
   11 20 TAB ." RESIDENT IN ORDER TO RUN THIS PROGRAM !!!"
   13 20 TAB ." If you do not have a keyboard speed-up"
   14 20 TAB ." program resident, press Ctrl-Break to exit."
   16 20 TAB ." If a keyboard speed-up program is resident..."
   ORANGE FG 20 26 TAB ." Press Any Key to Continue..."
   WAIT ;

: .HELP HELP1 HELP2 ORANGE FG 23 26 TAB
   ." Press Any Key to Continue..." WAIT FOKKER ;


Screen # 21

\ Main resolved                                 jbb 10:24 08/01/87

: INTRO CLS B/W .CREDITS WAIT 20 0 TAB CLREOL ;

R: FOKKER .OPTIONS MYKEY        \ resolve FOKKER forward reference
   CASE 187 OF .HELP FOKKER ENDOF      \ show instructions
        188 OF DOGFIGHT FOKKER ENDOF   \ do the dogfight
        196 OF BYE ENDOF               \ exit to DOS
        DROP FOKKER                    \ recurse to options
   ENDCASE ;

: MAIN HR PALTAB !PALETTE WARNING INTRO FOKKER ;


End  Listings 
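
The random-number words of Screen 5 translate almost directly into C, which makes their ranges easy to verify. The sketch below is a port under assumptions, not the author's code: `seed_random`, `random15`, `random_below`, and `between` are hypothetical names, and the seed is supplied by the caller instead of being taken from the system clock.

```c
#include <assert.h>

static unsigned int seed;           /* 15-bit generator state */

void seed_random(unsigned int s)
{
    seed = s & 32767u;
}

/* Same recurrence as the Forth word "random":
   seed = (seed * 259 + 3) AND 32767, so 0 <= result <= 32767. */
unsigned int random15(void)
{
    seed = (seed * 259u + 3u) & 32767u;
    return seed;
}

/* Scale to 0 <= result < n, as the Forth word "RANDOM" does with
   a double-length multiply followed by a divide by 32768. */
unsigned int random_below(unsigned int n)
{
    assert(n <= 32768u);
    return (unsigned int)(((unsigned long)random15() * n) / 32768ul);
}

/* Counterpart of "BETWEEN": lo <= result < hi. */
int between(int lo, int hi)
{
    assert(hi > lo);
    return lo + (int)random_below((unsigned int)(hi - lo));
}
```

Starting from a seed of 0, the first three values of `random15()` are 3, 780, and 5415. Note that `between(-1, 1)` can return only -1 or 0, because the upper bound is exclusive; the same holds for the Forth BETWEEN as reconstructed above, so the initial RANGE of -1 1 never moves the Fokker in the positive direction until EXCITEMENT widens it.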




TURBO  PASCAL  GRAPHICS 


Listing  One 

(Text  begins  on  page  38.) 

0Listing  10.  The  0SaveRegion@  Procedure  copies  any 
upright  rectangular  graphics  screen  region  into 
a  buffer,  0buff0,  where  its  upper-left  (xl,yl) 
and  lower-right  (x2,y2)  screen  coordinates  are 
specified.  GETMEM  is  used  to  allocate  memory, 
rather  than  the  standard  NEW,  because  the  buffer 
size  needs  to  be  computed  at  run  time. 


FUNCTION  CRTmode (VAR  char  columns, 

dispray_page  :  BYTE)  :  BYTE; 

(Returns  CRT  mode  of  operation.  It  uses  registers  and 
software  interrupt  lOh  to  BIOS  video  services.  Also 
returns  the  number  of  character  columns  in  the  current 
video  mode  (80  or  40)  and  the  video  display  page  as 
VARiable  parameters. 


(The  following  code  is  the  property  of  H.D. 

Callihan,  University  of  Pittsburgh  at  Johnstown, 
Johnstown,  Pa.  Personal  use  is  encouraged.  Feel 
free  to  make  copies  for  distribution  to  other 
personal  users.  Commercial  use  is  prohibited 
without  written  permission  and  an 
appropriate  license. 

Version;  4  (requires  the  CRTmode  function  in 
file:  CRTMODE .INC)  Purpose: 
save  and  restore  a  current  screen 
region  in  low-  or  hi-res  mode. 

Date:  6/22/87 

Author:  H.D.  Callihan,  Ph.D.  (C)  Copyright  1987 
Applic:  Turbo  V3.0  for  IBM  PC  and  true  compatibles. 
File:  GREGIQN4 . INC 

> 


TYPE
  buffermemory  = ARRAY[1..3] OF INTEGER;
  bufferaddress = ^buffermemory;

PROCEDURE SaveRegion (VAR buff : bufferaddress;
                          x1,y1,            {upper left}
                          x2,y2             {lower right}
                               : INTEGER);

  {--- Local functions to SaveRegion ---}

  FUNCTION max (a,b : INTEGER) : INTEGER;
  BEGIN
    IF a<b THEN max := b ELSE max := a
  END;

  FUNCTION min (a,b : INTEGER) : INTEGER;
  BEGIN
    IF a<b THEN min := a ELSE min := b
  END;

  {--- End local functions ---}

VAR width, height, size : INTEGER;
    dummy1, dummy2      : BYTE;

BEGIN {SaveRegion}

  {correct for negative x and y}
  x1 := max(x1,0);  x2 := max(x2,0);
  y1 := max(y1,0);  y2 := max(y2,0);

  {correct for large y}
  y1 := min(y1,199);  y2 := min(y2,199);

  {compute height of image in pixels}
  height := ABS(y1-y2) + 1;

  CASE CRTmode(dummy1, dummy2) OF
                        { dummy1 and dummy2 not used }

    4,5: {one of the low resolutions}
      BEGIN
        x1 := min(x1,319);  x2 := min(x2,319);

        {compute width of image in pixels}
        width := ABS(x1-x2) + 1;

        {compute size of buffer needed to store image}
        size := ((width+3) DIV 4) * height * 2 + 6;

        GETMEM(buff, size);
        GETPIC(buff^, x1,y1,x2,y2)
      END;

    6: {high resolution}
      BEGIN
        x1 := min(x1,639);  x2 := min(x2,639);
        width := ABS(x1-x2) + 1;

        size := ((width+7) DIV 8) * height + 6;
        GETMEM(buff, size);
        GETPIC(buff^, x1,y1,x2,y2)
      END;

    ELSE WRITE(^G); {unacceptable mode}

  END {CASE}

END; {SaveRegion}


End  Listing  One 
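
The buffer-size arithmetic in SaveRegion can be checked away from the PC. The following is a minimal C sketch, not part of the original listing: the names RES_HIGH, RES_LOW, clamp and region_buffer_size are hypothetical, and the resolution codes (1 = high, 2 = low) are the ones RestoreRegion reads back from the first word of the buffer. It repeats the clamping and the two size formulas exactly as the Pascal code states them.

```c
#include <stdlib.h>

/* Resolution codes as RestoreRegion reads them from the saved buffer:
   1 = high (640x200), 2 = low (320x200). Names are illustrative only. */
enum { RES_HIGH = 1, RES_LOW = 2 };

/* Clamp v into [lo, hi], as SaveRegion does with max() and min(). */
static int clamp(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Bytes SaveRegion would request from GETMEM for the given corners.
   Returns 0 for an unknown resolution code. */
static long region_buffer_size(int resolution, int x1, int y1, int x2, int y2)
{
    int  xmax = (resolution == RES_HIGH) ? 639 : 319;
    long width, height;

    x1 = clamp(x1, 0, xmax);  x2 = clamp(x2, 0, xmax);
    y1 = clamp(y1, 0, 199);   y2 = clamp(y2, 0, 199);

    width  = labs((long)x1 - x2) + 1;   /* pixels across */
    height = labs((long)y1 - y2) + 1;   /* pixels down   */

    switch (resolution) {
    case RES_LOW:  return ((width + 3) / 4) * height * 2 + 6;
    case RES_HIGH: return ((width + 7) / 8) * height + 6;
    default:       return 0;
    }
}
```

For the 151 by 81 pixel region saved in Listing Six, region_buffer_size(RES_LOW, 0, 0, 150, 80) gives 38 * 81 * 2 + 6 = 6162 bytes. The trailing 6 bytes correspond to the three header integers (resolution, width, height) that RestoreRegion and FreeBuffer read back.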


Listing  Two 

Listing 2. The CRTmode function determines which
display mode is currently active. It returns an integer
code as well as two arguments which determine text row
and column information if a text mode is active.


{
Version: 2

Author:  H. D. Callihan, PhD, UPJ
Date:    2/29/87
File:    CRTMODE.INC

VIDEO MODES:

   0  -  40x25 monochrome
   1  -  40x25 color
   2  -  80x25 monochrome
   3  -  80x25 color
   4  -  320x200 color graphics (40x25 text)
   5  -  320x200 mono graphics (40x25 text)
   6  -  640x200 b/w high res graphics (80x25 text)

The following values are returned on the AT&T 6300 for
superres mode (640x400).

 $40  -  640x400 mono superres graphics
         (80x25 high quality text)
 $48  -  640x400 mono superres graphics
         (80x50 tiny text)

Register input :  (AH) <-- 0Fh
Register output:  (AL) --> current mode
                  (AH) --> number of character
                           columns on screen
                  (BH) --> current active display page

Uses software interrupt 10h for BIOS video service.
}

FUNCTION CRTmode (VAR char_columns, display_page : BYTE) : INTEGER;

TYPE regpack = RECORD
                 ax,bx,cx,dx,bp,si,di,ds,es,flags : INTEGER
               END {regpack};

VAR dosreg : regpack;

BEGIN { CRTmode }

  WITH dosreg DO
    BEGIN
      {set high byte $0F for register input}
      ax := $0F00;
                     { $0F = 00001111 binary}

      INTR($10, dosreg);            {software interrupt 10h}
      CRTmode      := LO(ax);       {mask low byte}
      char_columns := HI(ax);       {mask high byte}
      display_page := HI(bx)        {mask high byte}
    END

END; {CRTmode}


End  Listing  Two 
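
The register unpacking that CRTmode performs with LO and HI is plain byte masking. As a hedged illustration (the VideoState type and decode_video_state name are invented for this sketch, and no BIOS call is made), the same split of AX and BX after INT 10h, AH = 0Fh can be written in C:

```c
#include <stdint.h>

/* Values returned by BIOS video service 0Fh, per the listing's header:
   AL = current mode, AH = character columns, BH = active display page. */
typedef struct {
    unsigned mode;      /* from AL */
    unsigned columns;   /* from AH */
    unsigned page;      /* from BH */
} VideoState;

/* Split the 16-bit AX and BX register images into their byte fields.
   This only decodes values; obtaining them still requires the interrupt. */
static VideoState decode_video_state(uint16_t ax, uint16_t bx)
{
    VideoState s;
    s.mode    = ax & 0xFFu;          /* Pascal LO(ax) */
    s.columns = (ax >> 8) & 0xFFu;   /* Pascal HI(ax) */
    s.page    = (bx >> 8) & 0xFFu;   /* Pascal HI(bx) */
    return s;
}
```

For example, AX = 5003h and BX = 0100h decode to mode 3 (80x25 color), 80 columns, display page 1.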


Listing  Three 

Listing 3. The RestoreRegion procedure copies any
upright rectangular graphics screen region from a
buffer, buff, previously saved by SaveRegion, to the place
where its lower-left screen coordinates (x,y) are specified.
FREEMEM is used, rather than the standard DISPOSE,
because the buffer size needs to be computed at run
time.


PROCEDURE RestoreRegion
            (VAR buff   : bufferaddress;   {pointer}
                 x, y   : INTEGER;         {lower left corner}
                 freeup : BOOLEAN);        {freemem after restore}

VAR width, height, resolution, size : INTEGER;
    x_ok, y_ok : BOOLEAN;

BEGIN

  resolution := buff^[1];
  width      := buff^[2];
  height     := buff^[3];

  {check screen boundary y limits}
  y_ok := (y-height+1 >= 0) AND (y < 200);

  CASE resolution OF
    2: {low}
      BEGIN
        {check x limits}
        x_ok := (x >= 0) AND (x+width-1 < 320);

        IF x_ok AND y_ok THEN
          BEGIN
            PUTPIC(buff^,x,y);

            IF freeup THEN
              BEGIN
                size := ((width+3) DIV 4) * height * 2 + 6;
                FREEMEM(buff, size)
              END
          END
      END;

    1: {high}
      BEGIN
        x_ok := (x >= 0) AND (x+width-1 < 640);

        IF x_ok AND y_ok THEN
          BEGIN
            PUTPIC(buff^,x,y);

            IF freeup THEN {free the buffer}
              BEGIN
                size := ((width+7) DIV 8) * height + 6;
                FREEMEM(buff, size)
              END
          END
      END;

    ELSE WRITE(^G^G)

  END {CASE}

END; {RestoreRegion}


End Listing Three


Listing Four


Listing 4. The FreeBuffer procedure permits memory to
be freed dynamically without displaying the image on the
screen. Like RestoreRegion, it also requires that the
size of the memory block be computed based upon the
block previously saved by SaveRegion.


PROCEDURE FreeBuffer ( VAR buff : bufferaddress );

VAR resolution, width, height, size : INTEGER;

BEGIN

  resolution := buff^[1];
  width      := buff^[2];
  height     := buff^[3];

  CASE resolution OF
    2 : {LOW}
        size := ((width+3) DIV 4) * height * 2 + 6;
    1 : {HIGH}
        size := ((width+7) DIV 8) * height + 6;
    ELSE WRITE(^G^G^G)
  END; {CASE}

  IF resolution IN [1,2] THEN FREEMEM(buff, size)

END; {FreeBuffer}


End Listing Four


Listing Five

Listing 5. The procedure, SaveBlockToDisk, facilitates
saving a screen region to disk that was previously
stored in a memory buffer using SaveRegion. The file
name is passed into the procedure as a string argument.
GetBlockFromDisk is a procedure which does the opposite
by retrieving a block from disk and placing it into
contiguous memory located at buffer. Resolution and
size are also computed and returned as a matter of
convenience. This recovered region may now be placed
onto the screen using RestoreRegion.


{H D CALLIHAN, UNIVERSITY OF PGH AT JOHNSTOWN, 1987}

TYPE FileString80 = STRING[80];

PROCEDURE SaveBlockToDisk
            ( FileName : FileString80;
              buffer   : bufferaddress   {mem address of block}
            );

VAR
  size, resolution, width, height : INTEGER;
  FileVariable : FILE;   {untyped file}

BEGIN

  resolution := buffer^[1];
  width      := buffer^[2];
  height     := buffer^[3];

  CASE resolution OF
    1 : {HIGH}
        size := ((width + 7) DIV 8) * height + 6;
    2 : {LOW}
        size := ((width + 3) DIV 4) * height * 2 + 6;
    ELSE WRITE(^G^G^G^G)
  END; {CASE}

  IF resolution IN [1,2] THEN BEGIN
    ASSIGN ( FileVariable, FileName );
    REWRITE ( FileVariable );
    BLOCKWRITE ( FileVariable, buffer^, 1+(size-1) DIV 128 );
    CLOSE ( FileVariable )
  END {IF}

END; { BlockSave }


PROCEDURE GetBlockFromDisk
            (     FileName   : FileString80;
              VAR buffer     : bufferaddress;
              VAR resolution,              {1=hi, 2=lo}
                  size       : INTEGER );

VAR
  FileVariable : FILE;
  width, height,
  SizeOfFile   : INTEGER;
               {number of 128-byte records in block file}

BEGIN

  ASSIGN ( FileVariable, FileName );
  RESET ( FileVariable );
  SizeOfFile := FILESIZE ( FileVariable );

  IF SizeOfFile <> 0 THEN BEGIN
    GETMEM ( buffer, SizeOfFile * 128 );
    BLOCKREAD ( FileVariable, buffer^, SizeOfFile );
    resolution := buffer^[1];
    width      := buffer^[2];
    height     := buffer^[3];

    CASE resolution OF
      1 : {HIGH}
          size := ((width+7) DIV 8)*height + 6;
      2 : {LOW}
          size := ((width+3) DIV 4)*height*2 + 6;
      ELSE WRITE(^G^G^G^G^G)
    END {CASE}
  END; {IF}

  CLOSE ( FileVariable );

END; {GetBlock}


End Listing Five
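
The expression 1+(size-1) DIV 128 in SaveBlockToDisk rounds the byte count up to whole 128-byte records, the unit in which Turbo Pascal's BLOCKWRITE and BLOCKREAD work on untyped files. A small C sketch of the same rounding (records_for is a hypothetical name used only here):

```c
/* Number of 128-byte records needed to hold `size` bytes (size >= 1),
   mirroring the listing's 1 + (size - 1) DIV 128. */
static long records_for(long size)
{
    return 1 + (size - 1) / 128;
}
```

A 6162-byte buffer therefore occupies 49 records, or 6272 bytes on disk. This is why GetBlockFromDisk allocates SizeOfFile * 128 bytes, which can exceed the size value it recomputes from the header.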


Listing  Six 

Listing 6. The following program demonstrates the use
of SaveRegion and RestoreRegion. Figure 2 contains
images saved in buffer and buffer2.


PROGRAM Test_SaveRegion_and_RestoreRegion;

{$I graph.p}       { --> extended Turbo Graphics}
{$I crtmode.inc}   { --> Callihan function to check
                         current crt mode}
{$I gregion4.inc}  { --> Callihan save and restore
                         region Turbo Code
                         and FreeBuffer code}

VAR
  i : INTEGER;
  buffer,
  buffer2 : bufferaddress;
          {TYPE declared in the above Callihan file.}
          {The TYPE must be used here for SaveRegion,
           RestoreRegion, and FreeBuffer to work.}

BEGIN

  GRAPHCOLORMODE;  {Pick your resolution}
  { HIRES; }       { use only colors 0 and 1 if this is used}

  GRAPHWINDOW(50,100,200,180);
                   {pick window: (50,100) to (200,180)}

  FILLSCREEN(2);
                   {clear current window to green=1, black=0, etc.}
                   {red=2, yellow=3}

  {Now, draw lines in black=0, green=1, etc.}
  {draw a nice border around the window}
  DRAW(148,78,148, 2, 3);
  DRAW(148, 2,  2, 2, 3);
  DRAW(  2, 2,  2,78, 3);
  DRAW(  2,78,148,78, 3);

  {Now, let's draw a couple circles in the window
   with center and radius determined by a loop index.
   Use color as last parameter.}
  FOR i := 10 TO 25 DO CIRCLE(3*i, 2*i, i, 1);

  READLN;  {wait for user to press return}

  SaveRegion(buffer, 0,0,150,80);
                   {saves in current window coords}
                   {151 pixels wide by 81 high}

  { GRAPHCOLORMODE; }  {this call clears and resets
                        window to full screen}

  FILLSCREEN(0);   {clear window to black,
                    don't reset to full screen}
  READLN;

  RestoreRegion(buffer,0,80,false);
                   {Restores in current window coords}
                   {at the lower left point 0,80}
                   {but does not free the buffer.}
  READLN;

  GRAPHCOLORMODE;  {Pick resolution again}
  { HIRES; }

  RestoreRegion(buffer,0,80,false);
  READLN;

  FOR i := 1 TO 10 DO
                   {let's get fancy}
    RestoreRegion(buffer,
                  150 - 8*i, 200 - 10*i, false);
  READLN;

  SaveRegion(buffer2,0,0,319,199);
                   {Use buffer2 for background}
  FILLSCREEN(0);

  FOR i := 1 TO 10 DO BEGIN
    RestoreRegion(buffer, 10+4*i, 200-5*i, false);
    READLN;
    RestoreRegion(buffer2,0,199,false)
                   {Restore background}
  END; {for}

  READLN;

  TEXTMODE   {Return to the standard text screen}

END.

End  Listings 




SOUNDEX  ALTERNATIVE 

Listing One (Text begins on page 62.)

/* A C implementation of Bickel's name comparison     */
/* algorithm, CACM 30/3 (March 1987), p. 244.         */
/* This is generic C code and should work on any      */
/* compiler with no modification.                     */
/* Jim Howell, March 1987.                            */
/* This code is placed in the public domain. You are  */
/* free to use it in any way you see fit.             */

#include <ctype.h>
#include <stdio.h>

/* Two entries per letter: the byte of the letter set  */
/* the letter belongs to, then its bit in that byte.   */
unsigned char MaskArray [52] = { 0, 0x40,   /* a */
                                 3, 0x80,   /* b */
                                 2, 0x20,   /* c */
                                 1, 0x10,   /* d */
                                 0, 0x20,   /* e */
                                 2, 0x10,   /* f */
                                 2, 0x08,   /* g */
                                 1, 0x08,   /* h */
                                 0, 0x10,   /* i */
                                 3, 0x08,   /* j */
                                 3, 0x20,   /* k */
                                 1, 0x04,   /* l */
                                 2, 0x04,   /* m */
                                 0, 0x08,   /* n */
                                 0, 0x04,   /* o */
                                 2, 0x02,   /* p */
                                 3, 0x10,   /* q */
                                 1, 0x02,   /* r */
                                 0, 0x02,   /* s */
                                 0, 0x01,   /* t */
                                 1, 0x01,   /* u */
                                 3, 0x40,   /* v */
                                 2, 0x01,   /* w */
                                 3, 0x04,   /* x */
                                 3, 0x02,   /* y */
                                 3, 0x01 }; /* z */

/* Demonstrate the algorithm with some short examples. */

main ()
{
    unsigned char LetterSet0 [4],
                  LetterSet1 [4];

    MakeLetterSet ("ecdysiast", LetterSet0);
    MakeLetterSet ("ecstasy", LetterSet1);
    printf ("The likeness of 'ecdysiast' and 'ecstasy' is %d.\n",
            CompareSets (LetterSet0, LetterSet1));

    MakeLetterSet ("ectoplasm", LetterSet1);
    printf ("The likeness of 'ecdysiast' and 'ectoplasm' is %d.\n",
            CompareSets (LetterSet0, LetterSet1));

    MakeLetterSet ("edcysiast", LetterSet1);
    printf ("The likeness of 'ecdysiast' and 'edcysiast' is %d.\n",
            CompareSets (LetterSet0, LetterSet1));
}   /* main */

/* Make the letter set by going through the string */
/* one character at a time.                         */

MakeLetterSet (Name, Mask)
char *Name;
unsigned char *Mask;
{
    char *pC;
    unsigned char *pM;
    int I,
        Lk;

    pC = Name;
    for (I = 0; I < 4; ++I)
        Mask [I] = 0;

    while (*pC) {
        /* Use letters only, and convert */
        /* upper to lower case.          */
        if (isalpha (*pC)) {
            pM = &MaskArray [2 * (tolower (*pC) - 'a')];
            Mask [*pM] |= *(pM + 1);
        }
        ++pC;
    }
}   /* MakeLetterSet */


/* This particular version constructs a mask by using   */
/* a logical AND of the relevant bytes of the two sets. */
/* The rightmost bit is checked to see if the likeness  */
/* score needs to be increased by the appropriate       */
/* weight, and the mask is shifted right one bit. An    */
/* alternative would be to use a bit mask and to        */
/* shift it instead of the name mask. The two bytes     */
/* of the least common letters are checked for content  */
/* to eliminate any unnecessary calculations.           */

CompareSets (Set1, Set2)
unsigned char *Set1,
              *Set2;
{
    unsigned char Mask;
    int I,
        Lk;

    Lk = 0;

    /* For the first byte. */
    Mask = Set1 [0] & Set2 [0];
    for (I = 0; I < 7; ++I) {
        if (Mask & 0x01)
            Lk += 3;
        Mask >>= 1;
    }

    /* The second byte. */
    Mask = Set1 [1] & Set2 [1];
    for (I = 0; I < 5; ++I) {
        if (Mask & 0x01)
            Lk += 4;
        Mask >>= 1;
    }

    /* The third byte. */
    Mask = Set1 [2] & Set2 [2];
    if (Mask)
        for (I = 0; I < 6; ++I) {
            if (Mask & 0x01)
                Lk += 5;
            Mask >>= 1;
        }

    /* The last byte is more complicated. */
    Mask = Set1 [3] & Set2 [3];
    if (!Mask)
        return (Lk);
    if (Mask & 0x01)            /* z */
        Lk += 9;
    Mask >>= 1;
    if (Mask & 0x01)            /* y */
        Lk += 8;
    Mask >>= 1;
    if (Mask & 0x01)            /* x */
        Lk += 8;
    Mask >>= 1;
    if (Mask & 0x01)            /* j */
        Lk += 8;
    Mask >>= 1;
    if (Mask & 0x01)            /* q */
        Lk += 7;
    Mask >>= 1;
    if (Mask & 0x01)            /* k */
        Lk += 7;
    Mask >>= 1;
    if (Mask & 0x01)            /* v */
        Lk += 6;
    Mask >>= 1;
    if (Mask & 0x01)            /* b */
        Lk += 6;
    Mask >>= 1;

    return (Lk);
}   /* CompareSets */

End  Listings 
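
The scoring idea behind the listing is compact: reduce each name to the set of letters it contains, intersect the two sets, and add a fixed weight per shared letter. The sketch below restates that in present-day C with one 26-bit word per name instead of four bytes. It is an illustration under stated assumptions rather than the author's code: the names kWeight, letter_set and likeness are hypothetical, and the weights are simply the values the listing adds for each letter (3, 4 and 5 for the three common groups, 6 to 9 for the rare letters).

```c
#include <ctype.h>
#include <stdint.h>

/* Weight per letter a..z, read off the listing's MaskArray groups and
   the amounts CompareSets adds for each group. */
static const int kWeight[26] = {
    3, 6, 5, 4, 3, 5, 5, 4, 3,   /* a b c d e f g h i */
    8, 7, 4, 5, 3, 3, 5, 7, 4,   /* j k l m n o p q r */
    3, 3, 4, 6, 5, 8, 8, 9       /* s t u v w x y z   */
};

/* Build a 26-bit set with bit n set when letter 'a'+n occurs in name.
   Non-letters are ignored and case is folded, as in MakeLetterSet. */
static uint32_t letter_set(const char *name)
{
    uint32_t set = 0;
    for (; *name; ++name)
        if (isalpha((unsigned char)*name))
            set |= (uint32_t)1 << (tolower((unsigned char)*name) - 'a');
    return set;
}

/* Sum the weights of the letters the two names have in common. */
static int likeness(const char *a, const char *b)
{
    uint32_t common = letter_set(a) & letter_set(b);
    int score = 0, i;

    for (i = 0; i < 26; ++i)
        if (common & ((uint32_t)1 << i))
            score += kWeight[i];
    return score;
}
```

With these weights, likeness("ecdysiast", "ecstasy") is 25 (a, c, e, s, t, y shared), likeness("ecdysiast", "ectoplasm") is 17, and the transposition "edcysiast" scores the full 32 because letter order does not matter. The sketch assumes names contain only unaccented letters a to z.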




INTERRUPT  HANDLING 

Listing  One  (Text  begins  on  page  54.) 

        PUBLIC  Lhandl
        EXTERN  c_ds,Hhandl     ;See text for NEAR/FAR choice.
Lhandl:
        pushf                   ;Push all registers. The order is
        push    ds              ; arbitrary so long as it is known to the
        push    es              ; interrupt function in Hhandl().
        push    bp
        push    si
        push    di
        push    dx
        push    cx
        push    bx
        push    ax
        mov     ds,cs:c_ds      ;Adjust DS and whatever other registers
                                ; your version of C requires.
        call    Hhandl          ;Call high-level handler.
        pop     ax              ;Restore registers. These may have been
        pop     bx              ; altered by the interrupt function
        pop     cx              ; called by the high-level handler.
        pop     dx
        pop     di
        pop     si
        pop     bp
        pop     es
        pop     ds
        popf
        ret     02              ;Substitute db 0cah,2,0 if a NEAR procedure

End  Listing  One 

Listing  Two 

Hhandl(fake)
int fake;
{
    REGS *regs = &fake;
    char serv;

    serv = regs->ax >> 8;       /* preserve service # from caller's AH */
    interrupt(NEW16, regs);     /* chain to keyboard i/o bios routine */
    if (serv == 0) switch (regs->ax) {
                                /* if get-input service, test return */
        case HOTKEY1: hot1();   /* run through the hot keys */
                      break;
        case HOTKEY2: hot2();
                      break;
        case HOTKEY3: hot3();
                      break;
        ...and so forth...
    }
}

End  Listing  Two 

Listing  Three 

;interrupt(nn, pptr)
;  int nn;
;  REGS *pptr;
        PUBLIC  interrupt
interrupt:
        push    bp
        mov     bp,sp           ;Set up base pointer for stack
        push    ds              ;Push whatever registers need to be preserved
        mov     al,0cdh         ;CD (do-interrupt) machine code into low byte
        mov     ah,[bp+6]       ;Now reverse-byte AX reads "CDnn": nn=int#
        mov     cs:intnum[0],ax ;Move interrupt code to instruction site
        lds     bx,[bp+8]       ;Load seg:ofs of struct pointer into DS:BX
        mov     ax,[bx+0]       ;Move structure's AX value into AX
        push    [bx+2]          ;Push final BX value onto stack
        mov     cx,[bx+4]       ;Load the rest, skipping BP & FLAGS
        mov     dx,[bx+6]
        mov     di,[bx+8]
        mov     si,[bx+10]
        mov     es,[bx+14]
        mov     ds,[bx+16]
        pop     bx              ;Pointer gone: must pop BX
        push    bp              ;Somebody else may have stolen things beforehand
intnum:
        dw      00              ;Scene of crime of self-modifying code
        pop     bp
        push    bx              ;Save returned DS & BX so we can use pointer
        push    ds
        lds     bx,[bp+8]       ;Load pointer
        mov     [bx+0],ax       ;Inverse of sequence above
        mov     [bx+4],cx
        mov     [bx+6],dx
        mov     [bx+8],di
        mov     [bx+10],si
        mov     [bx+14],es
        pushf                   ;Must consider flags this time
        pop     [bx+18]         ;Flags into flags slot
        pop     [bx+16]         ;DS into structure
        pop     [bx+2]          ;And BX
        pop     ds              ;Recover preserved registers
        pop     bp
        ret

End  Listing  Three 
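
The byte offsets used in Listing Three, together with the push order in Listing One, pin down the layout of the REGS structure even though regsdef.h is not reproduced in these listings. The following C sketch is a reconstruction under that assumption: the field names are guesses, only ax is confirmed by the C code in Listing Two, and service_number is a hypothetical helper showing how the handler recovers the caller's AH.

```c
#include <stddef.h>
#include <stdint.h>

/* Register image as the low-level handler leaves it on the stack:
   AX is pushed last, so it sits at the lowest address. */
typedef struct {
    uint16_t ax;      /* offset  0 */
    uint16_t bx;      /* offset  2 */
    uint16_t cx;      /* offset  4 */
    uint16_t dx;      /* offset  6 */
    uint16_t di;      /* offset  8 */
    uint16_t si;      /* offset 10 */
    uint16_t bp;      /* offset 12, skipped by interrupt() */
    uint16_t es;      /* offset 14 */
    uint16_t ds;      /* offset 16 */
    uint16_t flags;   /* offset 18 */
} REGS;

/* Service number requested by the caller: the high byte of AX (AH). */
static unsigned service_number(const REGS *regs)
{
    return (regs->ax >> 8) & 0xFFu;
}
```

Checking offsetof(REGS, es) == 14 and offsetof(REGS, flags) == 18 against the [bx+14] and [bx+18] operands in the assembly is a quick way to confirm that a C declaration and the hand-written offsets agree.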


Listing  Four 


;chgint(old,new,lowhandler)
;  int old,new,(*lowhandler)();   /*old vector, new, pointer to handler*/
        PUBLIC  chgint_
chgint_:
        push    bp
        mov     bp,sp           ;Set up base pointer to stack
        push    ds
        mov     ah,35h          ;First check to see if new vector is in use
        mov     al,[bp+8]       ;Destination interrupt # into AL
        int     21h             ;Present vector contents into ES:BX
        mov     ax,es           ;Set up for possible error return in AX
        or      ax,bx           ;Are both seg & offset zero?
        jnz     error           ;If not, report error and exit
        mov     ah,35h          ;Get-vector service again
        mov     al,[bp+6]       ;Old interrupt # into AL
        int     21h             ;Pick up "old" in ES:BX
        mov     ax,es           ;ES:BX->DS:DX for function 25h call
        mov     ds,ax
        mov     dx,bx
        mov     ah,25h
        mov     al,[bp+8]       ;Interrupt destination into AL
        int     21h             ;Load DS:DX into table at "new"
        lds     dx,[bp+10]      ;Now load pointer to low-level handler
        mov     ah,25h
        mov     al,[bp+6]       ;Old int# into AL again
        int     21h             ;Put function pointer into table
        mov     ax,0            ;Return of zero in AX means O.K.
error:  pop     ds              ;Restore registers (AX is still nonzero on error)
        pop     bp
        ret


End Listing Four


Listing Five

#include <regsdef.h>    /* typedef of REGS structure */

#define OLDINT XX       /* interrupt captured */
#define NEWINT YY       /* new location for old vector */

/*****************************/

Hhandl(xx)
int xx;
{
    REGS *regs = &xx;

    /*
     * Do whatever it is here that needs to be done. Obviously, other
     * C functions could be called from this one. Don't forget to chain
     * the interrupt so that other resident programs get their chances.
     * If a hardware interrupt is being captured and your handler is fairly
     * lengthy, you may need to set a static flag to block reentry and
     * include _inb(0x20,0x20) -- an EOI to the PIC.
     */
}

/****************************/

main()                  /* Initialization routine */
{
    int Lhandl();       /* tell the compiler about the low-level handler */
    REGS rr;            /* allocate space for initializer's interrupt calls */

    saveds();           /* save data segment */

    /*
     * Set up the high-level handler here by initializing globals -- e.g.,
     * screen parameters, addresses in the 00:400h memory area, strings for
     * hot-key substitutions, a pointer to the DOS-can-be-interrupted flag,
     * a pointer to the memory allocation chain, etc. Do it now, in C,
     * while you are not in residence.
     */

    if (chgint(OLDINT, NEWINT, Lhandl))     /* test & swap vectors */
        puts("\n\nVECTOR IN USE\7\7");
    else {
        rr.ax = 0x3100;                     /* exit, stay resident service. */
        rr.dx = PROGRAMSIZE;                /* PROGRAMSIZE in paragraphs */
        puts("\n\nDONE & INSTALLED");
        interrupt(0x21, &rr);
    }
}


End  Listings 
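
The vector swap that chgint performs through DOS services 35h and 25h follows a simple rule: refuse if the spare vector is already occupied, otherwise park the old handler in the spare slot and install the new handler in its place. The C sketch below models only that rule on an ordinary array. Everything in it is hypothetical (Handler, VECTORS, chgint_sim and the two stub functions); it does not touch real interrupt vectors.

```c
#include <stddef.h>

#define VECTORS 256                 /* size of the simulated vector table */

typedef void (*Handler)(void);

/* Stand-ins for the original BIOS routine and the new low-level handler. */
static void bios_stub(void)  { }
static void my_handler(void) { }

/* Simulate chgint(old, spare, low) on a table of function pointers.
   Returns 0 on success, nonzero if the spare slot is already in use,
   matching the listing's "zero in AX means O.K." convention. */
static int chgint_sim(Handler table[], int old, int spare, Handler low)
{
    if (table[spare] != NULL)       /* vector in use: report error */
        return 1;
    table[spare] = table[old];      /* old handler stays reachable */
    table[old]   = low;             /* new handler takes its place */
    return 0;
}
```

After a successful call the original routine can still be reached through the spare slot, which is how Hhandl chains to it with interrupt(NEW16, regs) in Listing Two. A second installation attempt on the same spare slot fails, which corresponds to the VECTOR IN USE message printed by main.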




ASYNC  APPLETALK 


Listing  One  (Listing  continued,  text  in  October.) 


;
;       Put up alerts
;
;       MDoWarn -- Control call to put up a warning
;               Entry:  A0 -> IOQelement
;
MDoWarn
        move    CSParam(a0),D0          ; get the error code into D0
        bsr.s   DoWarn                  ; warn 'em and return status in D0
        bra     AbusExit                ; and exit

;
;       DoWarn -- Warn the user... Give a beep, and display a dialog;
;               wait for their choice, then try the NNNN one more time.
;               If they choose "Use new address" from Mismatch dialog,
;               set SysLAPaddr and SysNetNum to zero before exiting.
;
;               Entry:  D0 = PortNotCf
;                            noAnswer
;               Exit:   D0 = -4001 (user clicked OK/Try again)
;                            -4002 (user clicked Use New)
;                            -193  (ResFNotFound)
;
resfile         EQU     -2              ; res file number
dlgwindow       EQU     -6              ; dialog window handle
warning         EQU     -8              ; error number/return status
MyCurMap        EQU     -10             ; save the current res file

DoWarn
        _SUBR   10

; Warn the user about troubles

        move.w  D0,warning(A6)          ; remember the warning
        move.w  CurMap,MyCurMap(A6)     ; and the current res file
        _InitCursor                     ; make it an arrow again

; Open our resource file.

        subq.l  #2,sp                   ; make space for result
        pea     fileName                ; point to file name
        _OpenResFile
        move.w  (SP)+,resfile(A6)       ; save the resfile number
        cmp.w   #-1,resfile(a6)         ; check for failure
        bne.s   @3                      ; branch if ok
        move.w  #60,-(A7)               ; else beep (long)
        _SysBeep
        move.w  #ResFNotFound,D0        ; return bad status
        bra.s   @20                     ; and quit

; beep at 'em
@3
        move.w  #6,-(A7)                ; 1/10 second beep
        _SysBeep

; choose a dialog to display

        move.w  #PortNCalrt,D0
        cmp.w   #PortNotCf,warning(a6)  ; which warning?
        beq.s   @5
        move.w  #Noansalrt,D0           ; noAnswer dialog

; now display the dialog
@5
        subq.l  #4,sp                   ; space for result of _GetNewDialog
        move.w  D0,-(sp)                ; dialog resource ID
        clr.l   -(sp)                   ; dialog record in heap
        move.l  #-1,-(sp)               ; in front of other windows
        _GetNewDialog
        move.l  (SP)+,dlgwindow(A6)     ; save the dialog's handle

; Now do the dialog stuff

        subq.l  #2,sp                   ; result on stack
        clr.l   -(sp)                   ; normal filterproc
        pea     4(sp)                   ; point to result space
        _ModalDialog                    ; Do it

; discard dialog

        move.l  dlgWindow(a6),-(sp)     ; point to dialog
        _DisposDialog

; What did they hit?

        move.w  (sp)+,d0                ; get the button's item #
        cmp.w   #1,D0                   ; (Try Again or OK (=1)) or Use New?
        beq.s   @10                     ; go if not "Use New"
        clr.b   SysLAPAddr(a2)          ; otherwise, Use New
        clr.w   SysNetNum(a2)           ;  and reset net and node adrs
@10
        neg.l   D0                      ; item will be 1 or 2; return -4001
        sub.l   #4000,D0                ;  or -4002 as the status
        move.w  D0,warning(A6)          ; save it

; discard resource file

        move.w  resfile(A6),D0          ; get the refnum
        cmp.w   MyCurMap(A6),D0         ; was it the current resource file
        beq.s   @15                     ; go if so (someone else opened it)
        move.w  D0,-(sp)                ; else, push it
        _CloseResFile                   ; and close it
@15
        move.w  warning(A6),D0          ; get the status from the NNNN

@20     ; D0 is result code for this routine
        _SUBEND 'DOWARN  '              ; and exit

FileName
        DC.B    15
        DC.B    'Async AppleTalk'
        DC.B    'V1.2a6'
        ALIGN   2

; end of lap.a


End  Listing  One 


Listing  Two 

{
  CRC Calculations

  This file contains a CRC calculation in Pascal. It was used with
  preliminary versions of Async AppleTalk, and computes the same
  function as the code in the M68000 listing.

  The NextCRC algorithm simulates the feedback shift register which
  normally implements a CRC calculation. NextCRC takes each four-
  bit nibble of the input char and uses a table (crctbl) to select
  a mask which is exclusive-or'd with the current CRC accumulator.
}

{ pseudo-CONST -- put this in the initialization code of your program

  crctbl[00] := $0000;    crctbl[01] := $CC01;
  crctbl[02] := $D801;    crctbl[03] := $1400;
  crctbl[04] := $F001;    crctbl[05] := $3C00;
  crctbl[06] := $2800;    crctbl[07] := $E401;
  crctbl[08] := $A001;    crctbl[09] := $6C00;
  crctbl[10] := $7800;    crctbl[11] := $B401;
  crctbl[12] := $5000;    crctbl[13] := $9C01;
  crctbl[14] := $8801;    crctbl[15] := $4400;
}

VAR crctbl : array [0..15] of integer;

function NextCRC (crc : integer; c : QDbyte) : integer;
VAR
  j : integer;

BEGIN
  j   := crctbl[ band(bxor(crc,c), $000F) ];
  crc := bxor(bsr(crc,4), j);
  c   := bsr(c,4);

  j   := crctbl[ band(bxor(crc,c), $000F) ];
  crc := bxor(bsr(crc,4), j);
  nextcrc := crc;
END; { NextCRC }

function crc16 (p : qdptr; len : integer) : integer;
VAR
  i,j : integer;   { sixteen bits wide }
  c   : qdbyte;    { an eight bit value }
  crc : integer;   { the CRC accumulator }

BEGIN
  crc := 0;
  for i := 1 to len do begin
    c   := p^;
    p   := pointer(ord(p) + 1);
    crc := NextCRC(crc,c);
  end;
  crc16 := crc;
END; { crc16 }


End  Listing  Two 
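
The nibble-at-a-time update in NextCRC carries over to C almost symbol for symbol. The sketch below is a transcription for illustration, not the author's code: the names next_crc and crc16 mirror the Pascal, the fixed-width types from stdint.h stand in for the 16-bit integer and QDbyte, and the sixteen table values are the ones given in the listing.

```c
#include <stddef.h>
#include <stdint.h>

/* Mask selected by the low nibble of (crc XOR data), as in crctbl. */
static const uint16_t crctbl[16] = {
    0x0000, 0xCC01, 0xD801, 0x1400, 0xF001, 0x3C00, 0x2800, 0xE401,
    0xA001, 0x6C00, 0x7800, 0xB401, 0x5000, 0x9C01, 0x8801, 0x4400
};

/* Fold one byte into the CRC, low nibble first, then high nibble. */
static uint16_t next_crc(uint16_t crc, uint8_t c)
{
    crc = (uint16_t)((crc >> 4) ^ crctbl[(crc ^ c) & 0x0F]);
    c >>= 4;
    crc = (uint16_t)((crc >> 4) ^ crctbl[(crc ^ c) & 0x0F]);
    return crc;
}

/* CRC of len bytes starting at p, with the accumulator starting at zero. */
static uint16_t crc16(const uint8_t *p, size_t len)
{
    uint16_t crc = 0;
    while (len--)
        crc = next_crc(crc, *p++);
    return crc;
}
```

Working one byte by hand shows the mechanism: for the single byte 01h the first step selects crctbl[1] = CC01h, the second step shifts that right four bits (0CC0h) and XORs in CC01h again, giving C0C1h. Using unsigned types avoids the sign-extension questions that a signed 16-bit Pascal integer raises when shifted right.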


Listing  Three 


;
;  _AssumeEq Arg1,Arg2 -- macro to generate a compile-time error if two
;       arguments are unequal.
;
;       To optimize code size, we will be making various assumptions,
;       mainly as to offset values. This macro is a way of formalizing
;       those assumptions within the code.
;
        BLANKS  ON
        STRING  ASIS

        MACRO
        _AssumeEq
        IF &Eval(&Syslst[1]) <> &Eval(&Syslst[2]) THEN
        _ERR                    ; Invalid statement - will cause error
        ENDIF
        ENDM

;
;  _StatCount Arg1 -- increment a statistics count if stat keeping is enabled
;       Assumes A2 points to the driver variables
;
        MACRO
        _StatCount
        IF debug THEN
        ADDQ.L  #1,&Syslst[1](A2)       ; Update the count
        ELSE
        nop                     ; commented out
        ENDIF
        ENDM

;
;  _Subr -- assembles a "Link A6 [...]
;       works for _SUBR <no [...] and _SUBR ###
;
        MACRO
        _Subr   &size
        IF &size = '' THEN
        Link    A6,#0
        ELSE
        Link    A6,#(-&size)
        ENDIF
        ENDM

;
;  _Subend NAME,$xx -- Subroutine epilog
;       If debugging, put in Unlk and the name
;
        MACRO
        _Subend &name
        Unlk    A6              ; unlink the stack frame
        rts                     ; and return
        DC.B    &name           ; the name
        ALIGN   2
        ENDM

End  Listings 




C  CHEST 


Listing  One 

(Listing  continued,  text  begins  on  page  116.) 

Listing 1 -- calendar.c, Printed 12/31/1987


#include <stdio.h>
#include <time.h>
#include <ctype.h>


/*
 *  CALENDAR.C   This program searches a file called "calendar"
 *               in the current directory and prints all lines
 *  that start with today's or tomorrow's date. Note that this
 *  behaviour differs from Unix, which allows the date to be
 *  anywhere on the line. Leading whitespace on the line is
 *  ignored, however. Like Unix, "tomorrow" is extended to
 *  Monday if calendar is run on a weekend.
 *
 *  If the date is a weekday (monday, tuesday, wednesday),
 *  which can be abbreviated by the first few characters of the
 *  name (mon, tu, wed) then the line is printed on that day,
 *  regardless of the actual date. For example, lines starting
 *  with "tue" will be printed every tuesday. Explicit dates
 *  can be listed too and most common abbreviations work:
 *
 *      Sept. 7, 1987
 *      September 7 '87
 *      sept. 7
 *      Sep 7 87
 *      9-7
 *      9/7
 *      9/7/87
 *      9 7 1987
 *      9-7-1987
 *
 *  and so forth. Like unix, "7 September" doesn't work, however.
 *  The day must follow the month.
 *
 *  If a * is found in the month's section of any of the above
 *  dates (as in "*-7-87" or "* 7") then that day in any month
 *  will match. If no day is specified, then all days in the
 *  indicated month will match.
 *
 *  Abbreviations are formed by looking at the first few
 *  characters of the word. The shortest possible abbreviations
 *  are:
 *
 *      su      sunday
 *      mo      monday
 *      tu      tuesday
 *      w       wednesday
 *      th      thursday
 *      fr      friday
 *      sa      saturday
 *
 *      ja      january
 *      fe      february
 *      mar     march
 *      ap      april
 *      may     may
 *      jun     june
 *      jul     july
 *      au      august
 *      se      september
 *      o       october
 *      n       november
 *      d       december
 *
 *  BUGS: * If you specify both the day of the week and the date,
 *          as in "Tuesday, July 7, 1987", the "Tuesday" is
 *          processed and the date is ignored.
 *
 *        * The following won't work:
 *
 *              Jun 1 49er's today
 *
 *          because calendar will pick off 49 as the current year.
 *          Put some nonnumeric character that can't be part of a
 *          date string in front of the "49." You can use any
 *          character but a number or one of: - . / , '
 */

#define JAN 31          /* Days in the month */
#define FEB 28
#define MAR 31
#define APR 30
#define MAY 31
#define JUN 30
#define JUL 31
#define AUG 31
#define SEP 30
#define OCT 31
#define NOV 30
#define DEC 31

int offset_to_first_of_month[] =
{
    0,                      /* Offset to the first  */
    JAN,                    /* of the month from    */
    JAN +FEB,               /* January 1st.         */
    JAN +FEB +MAR,
    JAN +FEB +MAR +APR,
    JAN +FEB +MAR +APR +MAY,
    JAN +FEB +MAR +APR +MAY +JUN,
    JAN +FEB +MAR +APR +MAY +JUN +JUL,
    JAN +FEB +MAR +APR +MAY +JUN +JUL +AUG,
    JAN +FEB +MAR +APR +MAY +JUN +JUL +AUG +SEP,
    JAN +FEB +MAR +APR +MAY +JUN +JUL +AUG +SEP +OCT,
    JAN +FEB +MAR +APR +MAY +JUN +JUL +AUG +SEP +OCT +NOV
};


char *skipstuff(p)
char *p;
{
    /* All of the following can be used in a date string */

    while( isspace(*p) || *p=='-' || *p=='/' || *p=='.' || *p==','
                                                        || *p=='\'' )
        ++p;

    return p;
}


main  () 

< 

char 

FILE 

struct  tra 

long 

int 

int 


buf [132] ,  *p; 

*fd; 

*t; 

thetime; 

month,  day,  year,  wday; 
cl,  c2,  c3; 


if(  !(fd  -  fopen ("calendar",  "r")  )  ) 

{ 

perror(  "calendar"  ); 
exit (1) ; 

) 

thetime  -  time  (NULL)  ; 

t  -  localtime (  ithetime  ); 

printf ("Today  is  %s\n",  asctime(  t  )  ); 

while (  fgets (  buf,  132,  fd  )  ) 

( 

wday  -  -1; 
month  -  -1; 
day  -  0; 

year  -  0; 

p  -  buf; 

while(  isspace(*p)  ) 

++p; 

if (  isalpha(‘p)  ) 


cl  - 

tolower ( 

p[0] 

); 

c2  - 

tolower ( 

p[l] 

; 

c3  - 

tolower ( 

p[2] 

; 

if 

< 

cl— 

a ' 

ii 

c2— 

P' 

) 

month 

else 

in 

cl— 

a ' 

ii 

c2— 

u' 

) 

month 

else 

in 

cl— 

d' 

) 

month 

else 

in 

cl - 

f' 

ii 

c2— 

e  • 

) 

month 

else 

if< 

cl— 

f ' 

ii 

c2— 

r' 

) 

wday 

else 

in 

cl— 

j' 

ii 

c2— 

a ' 

) 

month 

else 

in 

cl— 

j* 

ii 

c2 — 

u ' 

ii 

c3— 

1' 

) 

month 

else 

in 

cl— 

j' 

ii 

c2— 

u' 

ii 

c3— 

n' 

) 

month 

else 

in 

cl— 

m* 

ii 

c2— 

a ' 

ii 

c3— 

r ' 

) 

month 

else 

iti 

cl— 

m' 

ii 

c2— 

a ' 

ii 

c3— 

y' 

) 

month 

else 

ifi 

cl— 

ra' 

ii 

c2— 

o' 

) 

wday 

else 

in 

cl— 

n' 

) 

month 

else 

in 

cl— 

o ' 

) 

month 

else 

in 

cl— 

s ' 

ii 

c2— 

a ' 

) 

wday 

else 

in 

cl— 

s' 

ii 

c2— 

e ' 

) 

month 

else 

in 

cl— 

s ' 

ii 

c2— 

u* 

) 

wday 

else 

in 

cl— 

t* 

ii 

c2— 

h' 

) 

wday 

else 

if  ( 

cl— 

t* 

ii 

c2— 

u* 

) 

wday 

else 

in 

cl— 

w ' 

) 

wday 

if<  wday  >0  ) 

if (  t->tm_wday  — ■  wday  ||  (t->tm_wday  +1)  —  wday) 
fputs(  buf,  stdout  ); 


while(  isalpha(  "p  )  II  *p  —  '.' 

p^+; 


        p = skipstuff(p);

        if( isdigit( *p ) || *p == '*' )
        {
            if( *p == '*' )
            {
                month = t->tm_mon;
                ++p;
            }
            else if( month < 0 )
                month = stoi( &p ) - 1;

            p   = skipstuff( p );
            day = stoi( &p );
            p   = skipstuff( p );

            if( (year = stoi(&p)) > 99 )
                year -= 1900;
        }

        if( !year     ) year  = t->tm_year;
        if( month < 0 ) month = t->tm_mon;
        if( !day      ) day   = t->tm_mday;

        /* Convert the specified date to an offset, in days,
         * from Jan 1. Unix provides the offset for today in the
         * tm_yday field. It makes life difficult by making 1 Jan.
         * be 0, however. We also have to compensate for leap
         * year. If the year is a multiple of 4, it's a leap year
         * unless it's also the first year of the century (1900
         * was not a leap year, even though it's a multiple of 4).
         */

        day += offset_to_first_of_month[ month ] - 1;
        if( year % 4 == 0 && year % 100 != 0 && month >= 2 )
            day++;


Dr.  Dobb's  Journal,  November  1987 



C  CHEST 


Listing  One 

(Listing  continued,  text  begins  on  page  116.) 


        /* Print the line if required. The first test is the
         * normal situation (today or tomorrow). The second
         * test takes care of Saturday, which must include
         * the following Monday in its definition of "tomorrow."
         * The third test takes care of December 31, which must
         * recognize January 1 as "tomorrow." Note that I'm
         * intentionally not testing for the years being different
         * because it seems reasonable that, if no year is
         * specified and today is December 31, that the next year
         * is implied by January 1, with no date.
         */

        if( t->tm_year == year &&
                (day == t->tm_yday || day == t->tm_yday + 1) )
            fputs( buf, stdout );

        if( t->tm_wday == 6 && day == t->tm_yday + 2 )
            fputs( buf, stdout );

        if( t->tm_mday == 31 && t->tm_mon == 11 && day == 0 )
            fputs( buf, stdout );
    }
}


End  Listing  One 


Listing  Two 

Listing  2  —  dateh.c.  Printed  7/26/1987 


#include <stdio.h>
#include <dos.h>
#include <time.h>
#include <sys/types.h>
#include <sys/stat.h>


/*  Creates or updates file called date.h that looks like:
 *
 *      #define _DAY_       "6-17-87"
 *      #define _TIME_24_   "12:5:23"
 *      #define _TIME_      "12:5 PM"
 *      #define _DATE_      "6-17-87 12:5 PM"
 *
 *  and holds the current time and date. The file is created
 *  if it doesn't exist. If it does exist, it is modified, but
 *  only if it was not modified or created earlier on the same
 *  day. All of this behaviour can be modified by command-line
 *  switches [see usage(), below].
 *
 *  This program uses the Unix time functions. For them to work,
 *  you need to set the various codes in the TZ environment
 *  variable. For example:
 *
 *      setenv TZ=PST8PDT       (Unix or Sh)
 *      set TZ=PST8PDT          (COMMAND.COM)
 *
 *  Says the Pacific Standard Time is 8 hours off of Greenwich
 *  and the PST is used as the abbreviation for it. PDT is used
 *  as an abbreviation for daylight savings time.
 *
 *  Exit status is 0 if the file is untouched, 1 if it's created
 *  or modified, 2 on an error of some sort.
 */

main( argc, argv )
char    **argv;
{


    FILE         *fd;
    struct stat  stats;
    struct tm    *t;
    int          day;
    long         thetime;
    int          pm;
    char         *p;

    int          verbose = 0;
    int          force   = 0;
    int          create  = 0;

    if( argc != 1 )
    {
        p = argv[1];

        if( *p != '-' )
            usage();
        else
            for( ++p ; *p ; p++ )
                switch( *p )
                {
                case 'v':   verbose = 1;    break;
                case 'f':   force   = 1;    break;
                case 'c':   create  = 1;    break;
                default:    usage();
                }
    }

    thetime = time(NULL);

    if( stat( "date.h", &stats ) != 0 )
    {
        if( !create )               /* File doesn't exist */
            exit( 0 );

        t = localtime( &thetime );
        printf("Creating DATE.H: %s\n", asctime(t) );
    }
    else
    {
        t = localtime( &(stats.st_mtime) );
        if( verbose )
            printf( "DATE.H last modified: %s", asctime(t) );
        day = t->tm_yday;

        t = localtime( &thetime );
        if( verbose )
            printf( "Today is: %s", asctime(t) );

        if( t->tm_yday == day && !force )
            exit( 0 );
        else
            printf("Modifying DATE.H\n");
    }


    if( !(fd = fopen( "date.h", "w" )) )
    {
        perror( "date.h" );
        exit( 2 );
    }
    else
    {
        /* tm_mon counts from 0 (January == 0), so 1 is added
         * to print a conventional month number.
         */
        fprintf(fd, "#define _DAY_      \"%d-%d-%d\"\n",
                                        t->tm_mon + 1,
                                        t->tm_mday,
                                        t->tm_year );

        fprintf(fd, "#define _TIME_24_  \"%d:%d:%d\"\n",
                                        t->tm_hour,
                                        t->tm_min,
                                        t->tm_sec );

        /* Convert the 24-hour clock to a 12-hour clock. */
        pm = t->tm_hour >= 12;

        if( t->tm_hour > 12 )
            t->tm_hour -= 12;
        else if( t->tm_hour == 0 )
            t->tm_hour = 12;

        fprintf(fd, "#define _TIME_     \"%d:%d %s\"\n",
                                        t->tm_hour,
                                        t->tm_min,
                                        pm ? "PM" : "AM" );

        fprintf(fd, "#define _DATE_     \"%d-%d-%d %d:%d %s\"\n",
                                        t->tm_mon + 1,
                                        t->tm_mday,
                                        t->tm_year,
                                        t->tm_hour,
                                        t->tm_min,
                                        pm ? "PM" : "AM" );
        fclose(fd);
    }

    exit( 1 );
}


#define E(x) fprintf(stderr,x)

usage()
{
    E("Usage dateh [-fvc]\n");
    E("If date.h doesn't exist, create it, otherwise if\n");
    E("the date stamp isn't today, update it\n");
    E("\n");
    E("-f  forces an update, even if the date stamp is ok\n");
    E("-v  verbose operation\n");
    E("-c  create file if it doesn't exist\n");
    E("\n");
    exit( 2 );
}

End Listings




STRUCTURED  PROGRAMMING 


Listing One (Text begins on page 124.)


Listing  1.  Pascal  function  for  the  simple  heuristic  search  in  an 
unordered  array. 

FUNCTION Search1(VAR Data  : AnyArrayType; { input  }
                 VAR Index : IntegerArray; { in/out }
                 NData     : INTEGER;      { input  }
                 Item      : ScalarType    { input  }) : INTEGER;
{ function returns the index of the matching
  array element, or zero if no match is found }

VAR I, Tempo : INTEGER;

BEGIN
   I := 1;
   { scan array }
   WHILE (Item <> Data[ Index[I] ]) AND (I <= NData) DO
      I := I + 1;
   IF I <= NData
   THEN BEGIN { match found }
      Search1 := Index[I]; { returned result }
      { Swap indices ? }
      IF I > 1 THEN BEGIN
         Tempo := Index[I];
         Index[I] := Index[I-1];
         Index[I-1] := Tempo;  { restore the saved index }
      END { IF }
   END
   ELSE Search1 := 0; { not found }
END; { Search1 }

End  Listing  One 


Listing  Two 

Listing  2.  Pascal  function  for  the  heuristic  search  method  for 
unordered  arrays. 

FUNCTION Search2(VAR Data  : AnyArrayType;  { input  }
                 VAR Index : IntegerArray;  { in/out }
                 VAR Loctn : SmallIntArray; { in/out }
                 NData     : INTEGER;       { input  }
                 Item      : ScalarType     { input  }) : INTEGER;
{ function returns the index of the matching
  array element, or zero if no match is found }

CONST Factor = 0.7; { time-series factor }
      { limits used to select search schemes }
      UPPER_LIMIT = 0.65; { = 0.5 + 0.15 }
      LOWER_LIMIT = 0.35; { = 0.5 - 0.15 }
      { number of Loctn elements used in predicting the next
        matching location }
      N = 4;

VAR Continue : BOOLEAN;
    I, J, Tempo, This_Location, Median,
    Skip, Result : INTEGER;
    Next_Location, Power : REAL;

PROCEDURE Swap_Indices(K : INTEGER);
{ Procedure used to swap indices }
BEGIN
   { Swap indices ? }
   IF K > 1 THEN BEGIN
      Tempo := Index[K];
      Index[K] := Index[K-1];
      Index[K-1] := Tempo;  { restore the saved index }
   END { IF }
END;

BEGIN
   { Estimate the next location }
   Next_Location := 0.0; { initialize next location }
   Power := 1.0;
   FOR I := N DOWNTO 1 DO BEGIN
      Next_Location := Next_Location + Power * Loctn[I];
      Power := Power * Factor;
   END;
   { calculate predicted next search location as a fraction }
   Next_Location := (1.0 - Factor) * Next_Location / NData;

   Result := 0; { default value for no match }
   IF Next_Location > UPPER_LIMIT THEN BEGIN
      { Search last-to-first }
      I := NData;
      WHILE (Item <> Data[ Index[I] ]) AND (I > 0) DO
         I := I - 1;
      IF I > 0 THEN BEGIN
         Result := Index[I];
         Swap_Indices(I); { swap indices }
      END { IF }
   END
   ELSE IF Next_Location < LOWER_LIMIT THEN BEGIN
      { Search first-to-last }
      I := 1;
      WHILE (Item <> Data[ Index[I] ]) AND (I <= NData) DO
         I := I + 1;
      IF I <= NData THEN BEGIN
         Result := Index[I];
         Swap_Indices(I); { swap indices }
      END { IF }
   END
   ELSE BEGIN
      { Perform bidirectional search starting at the middle }
      Median := NData div 2;
      IF Item <> Data[ Index[Median] ] THEN BEGIN
         Skip := 1;
         Continue := TRUE;
         REPEAT
            I := Median + Skip;
            IF I <= NData THEN
               IF Item = Data[ Index[I] ] THEN BEGIN
                  Result := Index[I];
                  Continue := FALSE;
                  Swap_Indices(I); { swap indices }
               END; { IF }
            J := Median - Skip;
            IF J > 0 THEN
               IF Item = Data[ Index[J] ] THEN BEGIN
                  Result := Index[J];
                  Continue := FALSE;
                  Swap_Indices(J); { swap indices }
               END; { IF }
            IF (I > NData) AND (J < 1) THEN { out of bounds }
               Continue := FALSE;
            Skip := Skip + 1;
         UNTIL (NOT Continue);
      END
      ELSE BEGIN
         Result := Index[Median];
         Swap_Indices(Median)
      END; { IF }
   END; { IF }

   IF Result > 0 THEN BEGIN
      { Update location array }
      FOR I := 1 TO N-1 DO
         Loctn[I] := Loctn[I+1];
      Loctn[N] := Result
   END; { IF }

   Search2 := Result; { return result }
END; { Search2 }


End  Listing  Two 


Listing  Three 

Listing  3.  Pascal  code  for  function  implementing  the  first 

heuristic  search  method  for  fixed  ordered  arrays. 


PROCEDURE Initialize(VAR Data  : AnyArrayType; { input  }
                     VAR Table : RecordArray;  { in/out }
                     TableSize : INTEGER;      { input  }
                     NData     : INTEGER       { input  });
{ initialize index table by inserting data at equal intervals }
{
TYPE TableRecord = RECORD
                      Key   : ScalarType;
                      Index : INTEGER;
                   END;
     RecordArray = ARRAY[1..MAX_TABLE_SIZE] OF TableRecord;
}

VAR I, J, Delta : INTEGER;

BEGIN
   Delta := NData div TableSize;
   J := 1;
   FOR I := 1 TO TableSize DO BEGIN
      Table[I].Key := Data[J];
      Table[I].Index := J;
      J := J + Delta;
   END;
END; { Initialize }


FUNCTION Search3(VAR Data  : AnyArrayType; { input  }
                 VAR Table : RecordArray;  { in/out }
                 TableSize,                { input  }
                 NData     : INTEGER;      { input  }
                 Item      : ScalarType    { input  }) : INTEGER;

VAR Found, NoMatch : BOOLEAN;
    First, Last, I, K, Result : INTEGER;




BEGIN
   Result := 0; { initialize result with default value }
   { Search for Item in index table }
   I := 1;
   WHILE (Item > Table[I].Key) AND (I <= TableSize) DO
      I := I + 1;
   Found := FALSE;
   IF I <= TableSize
      THEN Found := (Item = Table[I].Key)
      ELSE I := I - 1;
   IF Found
   THEN
      Result := Table[I].Index
   ELSE BEGIN
      { Get range for search limits }
      First := Table[I-1].Index;
      Last := Table[I].Index;
      K := I - 1;
      NoMatch := TRUE;
      I := First;
      WHILE (I <= Last) AND NoMatch DO
         IF Item <> Data[I].Key THEN I := I + 1
         ELSE NoMatch := FALSE;

      IF NOT NoMatch THEN BEGIN
         Result := Table[I].Index; { store result }
         IF K > 1 THEN BEGIN { update table entry }
            Table[K].Key := Item;
            Table[K].Index := Result
         END; { IF }
      END; { IF }
   END; { IF }
   Search3 := Result { return result }
END; { Search3 }


End Listings




COLUMNS 


C  CHEST 


Using  the  Unix/ANSI  Time  Functions 


by Allen Holub

This month I'm going to look at the Unix (and now ANSI) time
functions and demonstrate their use with two short but useful
programs: calendar and dateh.

Calendar is an implementation of the Unix program of the same name.
It searches a file called calendar and prints every line that starts
with today's or tomorrow's date. If you put it into your AUTOEXEC.BAT
file and put the calendar file in your root directory, the program
prints reminders of what you have to do on any given day when you
start up your system.

The second program, dateh, creates (or modifies) a file called date.h
to hold C #defines for a few strings that represent today's date and
the time when dateh was run. You can use these #defines in a
program's sign-on message to keep track of when the program was
compiled. A sample dateh output file is shown in Example 1, below.
The file is modified only if it already exists and if it hasn't been
modified earlier today, though you can change this behavior with
command-line switches. The default behavior facilitates using dateh
along with a make utility. You can call it every time you compile,
but you'll only have to recompile the file that #includes date.h once
a day. Dateh returns an exit status of 1 if the file is modified (0
if it's not, 2 on a command-line syntax error). So, you can use the
COMMAND.COM ERRORLEVEL mechanism (or my shell's $status variable; see
Note 1) in a batch file or shell script to decide whether or not to
recompile.

The  Time  Functions 

The Unix time functions provide a portable way to access the time and
date from within a program. You can use them to get both the time and
date during execution of a program and the time-and-date stamp
associated with a file. Because they have been incorporated into
ANSI, they provide a degree of system independence not available if
you do direct DOS calls. All these functions require an
#include <time.h> at the top of your program.

There are two classes of time functions. The lowest-level function,
called time(), provides the time and date as the number of seconds
elapsed since midnight, January 1, 1970 GMT. The returned value is of
type long. Because there are roughly 31,557,600 seconds in a year
(365.25 x 24 x 60 x 60), and because the largest (signed) 32-bit
number that a long int can hold is 2,147,483,647, the time will roll
over a little more than 68 years after 1970, on January 19, 2038 at
3:14:07 a.m. GMT. This means, of course, that all Unix programs that
use time() will fail on the morning of January 19, 2038; you can't
have everything.
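As a quick check on the rollover arithmetic, the sketch below asks gmtime() to break the largest signed 32-bit count into calendar fields. The function name rollover_utc() is made up for this illustration, and the code assumes a host whose time_t can hold INT32_MAX and whose library provides strftime().

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Format the last instant a signed 32-bit seconds counter can hold.
 * Assumes time_t on the host is wide enough to hold INT32_MAX.
 * Returns the number of characters written, 0 on failure.
 */
size_t rollover_utc(char *out, size_t n)
{
    time_t last = (time_t)INT32_MAX;    /* 2,147,483,647 seconds */
    struct tm *g = gmtime(&last);

    if (g == NULL)
        return 0;
    return strftime(out, n, "%Y-%m-%d %H:%M:%S", g);
}
```

Calling rollover_utc() with a 32-byte buffer should leave the Greenwich date and time of the last representable second in the buffer.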

#define _DAY_       "6-17-87"
#define _TIME_24_   "13:11:41"
#define _TIME_      "1:11 PM"
#define _DATE_      "6-17-87 1:11 PM"

Example 1: Date.h file created by dateh

The ANSI standard differs from Unix in that the return value from
time() is an object of type time_t rather than long, and the actual
type of time_t is left undefined. This means that a compiler
manufacturer could typedef time_t as a double, for example, in order
to move the rollover point well into the next century. I'll use the
ANSI convention for the rest of this article. The ANSI calling syntax
for time() is:

    time_t time( timeptr );
    time_t *timeptr;

and the Unix syntax is:

    long time( timeptr );
    long *timeptr;

The time() function always returns the current time. In addition, if
timeptr is not NULL, it will also load the time into the object
pointed at by timeptr. The function returns ((time_t)(-1)) if the
calendar time isn't available.
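The two ways of getting the answer out of time() can be sketched as follows. The helper name time_both_ways() is hypothetical; it only exists to show that the returned value and the stored value agree, and to show the check against (time_t)(-1).

```c
#include <time.h>

/* Calls time() with a non-NULL pointer and compares the returned
 * value with the stored one. Returns 1 if they agree, 0 if they
 * don't, and -1 if the calendar time isn't available.
 */
int time_both_ways(time_t *stored)
{
    time_t returned = time(stored);     /* also loads *stored */

    if (returned == (time_t)(-1))
        return -1;
    return returned == *stored;
}
```

Passing NULL instead of a pointer is the more common idiom when only the return value is wanted, as both listings do.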

A second low-level function that's useful in conjunction with the
time() function is stat(). The calling syntax is:

    #include <sys/types.h>
    #include <sys/stat.h>

    int stat( path, buf )
    char *path;
    struct stat *buf;

The first argument is the path name of a file; the second is a
pointer to a structure declared in stat.h. Stat() fills the structure
with information about the file, such as the permission mask, file
size, and so forth. There are three fields that are time related:
st_atime, st_mtime, and st_ctime. These hold the file's date stamps,
and they use the same elapsed-time-since-1970 mechanism that is used
by time(). In Unix systems, st_atime holds the time of last access,
st_mtime is the time when the file was last modified, and st_ctime is
the time of the last change to the file's status information (it is
often loosely called the create time). In MS-DOS compilers such as
Microsoft's, all three fields are set to the last-modified time
because that's the only information available from DOS.
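A minimal sketch of fetching a file's date stamp with stat() follows. The wrapper name file_mtime() is an assumption made for this example; only stat() and the st_mtime field come from the text above.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Fetch a file's last-modified time. Returns 0 on success, -1 if
 * stat() fails (for example, because the file doesn't exist).
 */
int file_mtime(const char *path, time_t *mtime)
{
    struct stat stats;

    if (stat(path, &stats) != 0)
        return -1;
    *mtime = stats.st_mtime;    /* same units as time() returns */
    return 0;
}
```

The value stored through mtime can be handed directly to localtime() or ctime(), which is how dateh uses it.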

A related function, fstat(), works rather like stat() does, but it
takes an open file descriptor (a file handle, such as fileno()
returns for a FILE pointer) as its first argument rather than a path
name.

The difftime() function returns the difference between two times, as
returned by time() or stat(). Its calling syntax is:

    double difftime( t1, t2 )
    time_t t1, t2;

Note that the returned value is of type double, not time_t.
Difftime() is necessary because you do not know the actual type of a
time_t (its representation is left to the implementation).
Consequently, you can't just subtract two times to get the difference
if you want your code to be portable.
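The argument order is easy to get backwards: difftime(a, b) yields a minus b, in seconds. A small sketch, using the made-up wrapper name seconds_between():

```c
#include <time.h>

/* Seconds elapsed from `earlier` to `later`, computed portably.
 * difftime(a, b) yields a - b, so the later time goes first.
 */
double seconds_between(time_t earlier, time_t later)
{
    return difftime(later, earlier);
}
```

A negative result simply means the arguments were passed in the opposite order.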

Note that the finest resolution available from time() is seconds. If
you need fractional time, you need to use the ftime() function. Its
calling syntax is:

    #include <sys/types.h>
    #include <sys/timeb.h>

    void ftime( timep )
    struct timeb *timep;

The timeb structure is shown in Example 2, below. The two fields of
interest are the time field, which holds the same value as would be
returned from time(), and the millitm field, which holds an
additional number of milliseconds (thousandths of a second). Note
that the resolution here is no better than your system clock. For
example, the IBM PC clock ticks 18.2 times a second, so the best
resolution on a PC is about 55 milliseconds.

The second level of time functions all represent the time as a
structure, also declared in time.h (and shown in Example 3, below).

Two functions, gmtime() and localtime(), convert the time_t times, as
returned by time() or stat(), into tm structures. They both have the
same calling syntax:

    struct tm *gmtime( time_ptr )
    struct tm *localtime( time_ptr )
    time_t *time_ptr;

Gmtime() converts to Greenwich Mean Time, and localtime() goes to the
current local time. Note that you're passing in a pointer to a
variable that holds the time, not the time itself.
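To see the field conventions in action, the sketch below breaks time zero into a tm structure with gmtime(). The function name epoch_fields() is invented for this example.

```c
#include <time.h>

/* Break time zero (midnight, January 1, 1970 GMT) into fields.
 * The structure is copied because gmtime() returns a pointer to
 * storage that later calls may overwrite.
 */
struct tm epoch_fields(void)
{
    time_t zero = (time_t)0;
    struct tm copy = *gmtime(&zero);    /* note: pointer to the time */

    return copy;
}
```

The result has tm_year equal to 70 (years since 1900), tm_mon and tm_yday equal to 0, tm_mday equal to 1, and tm_wday equal to 4, because January 1, 1970 was a Thursday.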

The local time is determined in mysterious ways, depending on your
system. Most Unix systems have it hard-coded into the localtime()
function, which automatically figures out the peculiarities of
daylight savings time and so forth. Many compilers, however, use an
environment variable to determine the local time. For example,
Microsoft's uses an environment variable called TZ for this purpose.
It's set to a string something like PST8PDT, which says that Pacific
Standard Time (PST) is 8 hours off Greenwich and that Pacific
Daylight Time (PDT) is 1 hour off that. Microsoft provides a tzset()
function that makes the library reread this environment variable from
within a program. In Unix, the name of the current time zone is
available through the timezone() function. Consult the ctime(3)
section of the manual for details.
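The effect of TZ can be sketched on a present-day POSIX system as shown below. This is an assumption-laden illustration, not the Microsoft mechanism the article describes: it relies on the POSIX setenv() and tzset() calls, the function name pacific_hour() is made up, and it changes TZ for the whole process.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L     /* ask for setenv() and tzset() */
#endif
#include <stdlib.h>
#include <time.h>

/* Hour of the day, Pacific time, for a given time_t. Assumes a
 * POSIX host; overwrites any existing TZ setting.
 */
int pacific_hour(time_t when)
{
    setenv("TZ", "PST8PDT", 1);     /* 8 hours west of Greenwich */
    tzset();                        /* make the library reread TZ */
    return localtime(&when)->tm_hour;
}
```

Time zero is midnight GMT on January 1, 1970, which is 4 p.m. the previous day in Pacific Standard Time, so pacific_hour((time_t)0) should yield 16.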

The mktime() function goes in the other direction. It takes as input
a tm structure and returns the equivalent time in a time_t, as would
have been returned by time(). The calling syntax is:

    time_t mktime( time_ptr )
    struct tm *time_ptr;

Mktime() can also be used to fill in the tm_wday and tm_yday fields
of the structure. That is, if you want to find out the day of the
week for a particular day, you can fill up a tm structure yourself,
leaving the tm_wday and tm_yday fields empty, and then mktime() will
modify those fields as required.
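The day-of-the-week trick can be sketched as follows. The function name weekday_of() and its calling convention (full year, month counted from 1) are choices made for this example; the conversion to tm conventions happens inside.

```c
#include <string.h>
#include <time.h>

/* Day of the week (Sunday == 0) for a calendar date, or -1 if
 * mktime() can't represent it. `year` is the full year and
 * `month` counts from 1; both are converted to tm conventions.
 */
int weekday_of(int year, int month, int mday)
{
    struct tm when;

    memset(&when, 0, sizeof when);
    when.tm_year  = year - 1900;
    when.tm_mon   = month - 1;
    when.tm_mday  = mday;
    when.tm_hour  = 12;     /* midday, clear of any DST switch */
    when.tm_isdst = -1;     /* let the library decide */

    if (mktime(&when) == (time_t)(-1))
        return -1;
    return when.tm_wday;    /* filled in by mktime() */
}
```

For July 22, 1987, the date used in the asctime() sample later in this article, the function returns 3 (Wednesday).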

Three functions are provided for printing the time: strftime(),
ctime(), and asctime(). Strftime() is an ANSI function, not supported
by Unix, that works something like sprintf() does. Because it's not
implemented by most compilers, I won't discuss it further. Calling
syntaxes for the other two functions are:

    char *asctime( time )
    struct tm *time;

    char *ctime( tp )
    time_t *tp;

struct timeb
{
    time_t          time;       /* Time, like that returned from time().  */
    unsigned short  millitm;    /* fraction of a second, in milliseconds  */
    short           timezone;   /* The difference moving westward, in     */
                                /* minutes, between Greenwich Mean Time   */
                                /* and the local time.                    */
    short           dstflag;    /* Nonzero if daylight savings time is    */
                                /* in effect.                             */
};

Example 2: The timeb structure


struct tm
{
    int tm_hour;    /* hour                                 */
    int tm_min;     /* minute                               */
    int tm_sec;     /* second                               */
    int tm_isdst;   /* daylight savings time is active      */
    int tm_year;    /* year - 1900                          */
    int tm_mon;     /* month; January == 0                  */
    int tm_mday;    /* day of the month                     */
    int tm_wday;    /* day of the week; Sunday == 0         */
    int tm_yday;    /* number of elapsed days since         */
                    /* January 1, January 1 == 0            */
};

Example 3: A tm structure




They both return a string that is exactly 26 characters long and has
the following form:

    Wed Jul 22 11:57:45 1987\n\0

Asctime() takes a pointer to a tm structure as its argument, as is
returned from gmtime() or localtime(). Ctime() takes a pointer to a
time_t argument, as is returned by time() or stat(). Note, again,
that this argument is a pointer to a variable holding the time, not
the time itself.
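The 26-character claim can be checked with a hand-built tm structure, as in this sketch. The function name sample_stamp() is hypothetical. Asctime() formats the fields as given, so tm_wday has to be filled in by hand here.

```c
#include <string.h>
#include <time.h>

/* Build the tm for the sample timestamp by hand and let asctime()
 * format it. Copies the result into out (at most n - 1 characters)
 * and returns the length of the copied string.
 */
size_t sample_stamp(char *out, size_t n)
{
    struct tm when;

    memset(&when, 0, sizeof when);
    when.tm_year = 87;      /* 1987 */
    when.tm_mon  = 6;       /* July (January == 0) */
    when.tm_mday = 22;
    when.tm_wday = 3;       /* Wednesday (Sunday == 0) */
    when.tm_hour = 11;
    when.tm_min  = 57;
    when.tm_sec  = 45;

    strncpy(out, asctime(&when), n - 1);
    out[n - 1] = '\0';
    return strlen(out);
}
```

With a 32-byte buffer the function returns 25: the 24 visible characters plus the newline. The terminating '\0' makes up the 26th character.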

Calendar.c 

Calendar is a DOS version of the Unix memo program. It searches a
file called calendar, which must be in the current directory, for
lines starting with a date, and if that date is either today or
tomorrow, it prints the line. Note that this behavior differs from
that of the Unix program, which allows the date to be anywhere on the
line. My implementation ignores white space that precedes the date,
but no other characters can be present. "Tomorrow" for weekends is
extended to include the following Monday.

My calendar, unlike the Unix version, returns an exit status of 0
when it prints something (nonzero if it doesn't). I invoke it from
the LOGIN.BAT file used by my shell with the following:

    calendar
    if( $status == 0 ) then
        bell
        bell
    endif

Bell is a two-line program that rings the bell on the PC. This way I
get both the message and an audible signal when I have something to
do. You could do the same thing from within a COMMAND.COM batch file
by using the ERRORLEVEL mechanism.

The program recognizes most forms of dates, though the date must take
the form of month, then day, then year; neither 87-9-17 nor 17 Sept
is recognized. All the following are recognized, however:

    Sept. 7, 1987
    September 7 '87
    Sep. 7
    sep 7 87
    9-7
    9/7
    9/7/87
    9 7 1987
    9-7-1987

as are several other variants. Experiment and see if your favorite
form works. If you use an asterisk in place of a month, then that day
of every month is recognized. For example, */9 will be recognized on
the ninth of every month.

I've also added a non-Unix feature that recognizes days of the week
(monday, tuesday, wednesday, and so forth). When these are used,
memos are printed on the given day of every week (say, every Monday).

You can abbreviate month and day-of-the-week names by leaving off
letters at the ends of the words, but you can't leave off so many
letters that a conflict is created. For example, you can't use Ju
because the program won't know if you mean June or July. The smallest
possible abbreviations are shown in Listing One, page 108, on lines
42-51. Case is ignored, so SEP, sep, and Sep are all acceptable.

Calendar gets confused if you start the text portion of the line with
a character that could be part of the date and a partial date is
specified. For example, if you say:

    9/17 4 for feast at Fran's

calendar will think that the date is September 17, 1904.

    9/17/87 4 for . . .
    9/17: 4 for . . .

are both OK, however; the first because the whole date is specified,
and the second because the colon will separate off any trailing
numbers. In general, avoid numbers, slashes, apostrophes, dashes, and
periods at the beginning of the memo text.

The program itself is in Listing One. Most of it is in the main()
subroutine, which starts on line 110. Today's date is fetched on
lines 125 and 126, using the time() and localtime() functions
described earlier. Word-to-month conversion is done in the if
statement on lines 140-178. The program just looks at the first few
characters in the word in the stupidest way possible. It didn't seem
worth the effort to do things in a more efficient manner.
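For comparison, a table-driven version of the word lookup might look like the sketch below. It is not the method used in Listing One: the structure name name_entry, the table names[], and the function lookup_name() are all invented for this illustration. The prefixes are the shortest unambiguous abbreviations, and the values follow the tm_mon and tm_wday conventions.

```c
#include <ctype.h>
#include <stddef.h>

struct name_entry {
    const char *prefix;     /* shortest unambiguous abbreviation */
    int is_weekday;         /* 1: value is a tm_wday, 0: a tm_mon */
    int value;
};

static const struct name_entry names[] = {
    {"ap", 0, 3},  {"au", 0, 7},  {"d", 0, 11},  {"fe", 0, 1},
    {"fr", 1, 5},  {"ja", 0, 0},  {"jul", 0, 6}, {"jun", 0, 5},
    {"mar", 0, 2}, {"may", 0, 4}, {"mo", 1, 1},  {"n", 0, 10},
    {"o", 0, 9},   {"sa", 1, 6},  {"se", 0, 8},  {"su", 1, 0},
    {"th", 1, 4},  {"tu", 1, 2},  {"w", 1, 3}
};

/* Returns the entry whose prefix starts the word (ignoring case),
 * or NULL if the word matches none of them.
 */
const struct name_entry *lookup_name(const char *word)
{
    size_t i, k;

    for (i = 0; i < sizeof names / sizeof names[0]; i++) {
        const char *pre = names[i].prefix;

        for (k = 0; pre[k] != '\0'; k++)
            if (tolower((unsigned char)word[k]) != pre[k])
                break;              /* also stops at end of word */
        if (pre[k] == '\0')
            return &names[i];
    }
    return NULL;
}
```

With this table, "Sept." maps to month 8 and "WEDNESDAY" to weekday 3, while the ambiguous "Ju" matches nothing. The table is no faster than the if chain for 19 entries; its main advantage is that the abbreviations live in one place.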

Numeric parts of the date are split off on lines 180-195. Note that
the month is only modified (on line 188) if it hasn't been set
previously (by the word-processing code). An asterisk is translated
to the current month on line 182. The stoi() subroutine has appeared
in this column several times in the past (it's not part of this
month's listing). It works just like atoi() does, but it is passed a
pointer to a character pointer and updates that pointer to point past
the number.
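Since stoi() isn't part of this month's listing, here is a minimal stand-in that follows the description above. It is an assumption, not the column's actual routine: it is named stoi_sketch() to make that clear, and it handles only unsigned decimal numbers.

```c
#include <ctype.h>

/* Hypothetical stand-in for the column's stoi(): parse an unsigned
 * decimal number at *strp and advance *strp past it. Returns 0 and
 * leaves *strp unchanged if no digit is present.
 */
int stoi_sketch(char **strp)
{
    char *s = *strp;
    int value = 0;

    while (isdigit((unsigned char)*s))
        value = value * 10 + (*s++ - '0');

    *strp = s;              /* caller's pointer now follows the number */
    return value;
}
```

Because the caller's pointer is advanced, successive calls separated by a skip over the punctuation pick the month, day, and year out of a string such as 9/17/87 in turn.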

The  tests  on  lines  198-200  check 
for  unspecified  parts  of  the  date  and 
fill  in  the  equivalent  information  for 
today  if  necessary.  That  is,  if  the 
year  isn’t  specified,  the  current  year 
is  used  and  so  forth. 

The date is converted to an offset, in days, from the first of the
year on lines 211-213. I'm doing this to make it easier to find
tomorrow. Otherwise, the end of the month would present problems. The
offset_to_first_of_month array is declared on lines 78-92. The if
statement on line 212 adjusts for leap years (any year that is an
even multiple of 4 except for even centuries; 1900 wasn't a leap
year). Finally, the appropriate lines are printed on lines 227-235.
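The offset calculation can be sketched on its own as shown below. The names days_before_month[] and day_of_year() are made up for this example, and the function takes the full year rather than the years-since-1900 value the listing works with. It also applies the complete Gregorian rule (century years divisible by 400 are leap years), which goes one step beyond the test described above.

```c
/* Days before the first of each month in a non-leap year. */
static const int days_before_month[12] = {
    0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
};

/* Zero-based day of the year, matching tm_yday. `year` is the full
 * year (1987, not 87); `month` counts from 0 like tm_mon; `mday`
 * counts from 1 like tm_mday.
 */
int day_of_year(int year, int month, int mday)
{
    int leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    int yday = days_before_month[month] + mday - 1;

    if (leap && month >= 2)     /* March or later in a leap year */
        yday++;
    return yday;
}
```

For July 22, 1987 this gives 181 + 22 - 1 = 202, and for March 1, 1988 it gives 60, one more than the 59 it gives for March 1, 1900.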

Dateh.c 

The dateh program is used to automatically create #defines for the
current date and time so that you can put a visible date stamp into a
program's sign-on message. You can put it into a DOS batch file so
that it's invoked automatically before calling make. If you're using
my shell, you can do it with an alias:

    alias make 'dateh; make'

The output file was described earlier. It's rewritten only when a
file called date.h already exists in the current directory and it was
not modified today. It behaves in this way for two reasons. First,
you don't want myriad date.h files littering all your directories.
Second, if you're using a make utility, you only want to recompile
the file that #includes date.h once a day.

The program's behavior can be modified with the following
command-line switches:

    -f  forces an update, even if the date stamp is today's.

    -v  verbose operation. The program tells you about the current
        date and so forth as it runs.

    -c  forces dateh to create the file if it doesn't already exist.

The source code for dateh is in Listing Two, page 110. It is much
more straightforward than calendar's. Command-line switches are
processed on lines 50-65. The stat() call on line 69 does two things.
It tests for the existence of a file called date.h (stat() returns -1
if the file doesn't exist), and if the file does indeed exist, it
initializes the stats structure with the relevant statistics. Here,
I'm interested in the last-modified date stamp held in the st_mtime
field. This field is converted to a tm structure by the localtime()
call on line 79. I do the same thing for today's date and time on
line 86. Finally, the two times are compared on line 91. The rest of
the subroutine just modifies the date.h file if the date stamps don't
match, getting the relevant information from the tm structure pointed
to by t.
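The comparison step can be sketched in isolation as follows. The function name same_local_day() is hypothetical. Listing Two saves tm_yday from the first localtime() call before making the second call, and it compares only tm_yday; this sketch copies the whole structure for the same reason and, as an addition of its own, compares tm_year as well.

```c
#include <time.h>

/* Nonzero if two times fall on the same local calendar day.
 * localtime() returns a pointer to storage that the next call may
 * overwrite, so the first result is copied before the second call.
 */
int same_local_day(time_t a, time_t b)
{
    struct tm first = *localtime(&a);
    struct tm *second = localtime(&b);

    return first.tm_year == second->tm_year
        && first.tm_yday == second->tm_yday;
}
```

Passing a file's st_mtime and the value returned by time() tells you whether the file was last modified today.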

Note 

1. My shell is described in my book On Command: Writing a Unix-Like
Shell for MS-DOS (Redwood City, Calif.: M&T Books, 1987).

DDJ 

(Listings  begin  on  page  108.) 



COLUMNS 


STRUCTURED  PROGRAMMING 


This  month  I'm  going  report  on 
some  new  algorithms  for  heuris¬ 
tic  searching.  Heuristic  approaches, 
as  you  may  know,  differ  from  con¬ 
ventional  searching  by  improving 
their  performance  over  time,  by 
“learning”  the  characteristics  of  the 
data  and  the  requests  normally 
made  of  it.  In  this  column  I'll  pre¬ 
sent  algorithms  for  searching  both 
unsorted  and  sorted  arrays:  the  meth¬ 
ods  for  searching  through  unsorted 
arrays  borrow  from  statistical  time- 
series  analysis  to  create  a  short-term 
heuristic  memory,  whereas  the  meth¬ 
ods  for  sorted  arrays  are  variations 
on  the  traditional  index  table  search 
scheme. 

Unsorted  Arrays 

The  simplest  case  of  a  heuristic 
search  involves  searching  through 
an  unsorted  array  or  list.  The  search 
begins  with  the  first  array  element 
and  examines  the  rest  of  the  array 
until  a  match  is  found.  If  the 
matched  element  is  not  the  first 
array  member,  it  is  swapped  with 
the  previous  array  element.  Thus, 
the  most  frequently  sought  elements 
tend  to  bubble  toward  the  front  end 
of  the  array. 

Listing  One,  page  112,  shows  a 
Pascal  function  that  implements  a 
version  of  the  simple  heuristic 
search algorithm. The function takes
an array of data as well as an array
of indices. The values of the indices,
rather than the members of array
Data, are swapped. This
approach  is  more  convenient  and 


by  Namir  Clement  Shammas 

faster  when  record  structures  are 
used  and  the  search  is  performed 
for  various  keys. 
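
The swap-with-predecessor scheme can be sketched as follows. This is a Python illustration rather than the Pascal of Listing One; the function name and the -1 "not found" value are assumptions. As in the text, only the index array is rearranged and the data array is left alone.

```python
def transpose_search(data, order, key):
    """Linear search of data in the sequence given by order.

    On a match that is not already first, the matching index is swapped
    with its predecessor in order. Returns the index into data, or -1.
    """
    for pos, idx in enumerate(order):
        if data[idx] == key:
            if pos > 0:
                # Move the hit one step closer to the front.
                order[pos - 1], order[pos] = order[pos], order[pos - 1]
            return idx
    return -1
```

Starting with order = [0, 1, 2, 3], two successive searches for the element stored at data[2] leave order as [2, 0, 1, 3], so a third search for it succeeds on the first comparison.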

This  simple  method  may  be  too 
simple,  though.  Using  it,  you  may 
experience  any  of  the  following  situ¬ 
ations,  based  on  the  pattern  of  the 
data  and  requests: 




Heuristic  Searching 




1.  The  statistical  probability  distribu¬ 
tion  for  seeking  any  element  is  uni¬ 
form.  In  this  case,  the  benefit  of 
using  this  method  over  a  simple 
linear  search  depends  on  the  actual 
sequence  of  sought  elements.  Thus, 
the  advantage  of  a  heuristic  search, 
at  least  of  this  simple  kind,  is  most 
likely  minimal. 

2.  The  statistical  probability  distribu¬ 
tion  for  seeking  array  elements 
strongly  favors  a  particular  set  of 
data.  Using  the  simple  heuristic 
method  is  the  most  efficient  in  this 
situation. 

3.  There  is  a  shifting  trend  for  seek¬ 
ing  different  data  sets  in  the  array 
(that  is,  the  statistical  probability  dis¬ 
tribution  for  seeking  any  element  is 
also  a  function  of  time  or  call  se¬ 
quence).  Here  the  efficiency  of  the 
algorithm  is  also  shifting:  after  a  spe¬ 
cific  set  of  data  bubbles  toward  the 
front  of  the  array,  the  trend  shifts  in 
favor  of  another  set.  This  causes  the 
members  of  the  new  set  to  bubble 
toward  the  front  and  displace  the 
ones  from  the  previous  set.  As  a 
result,  the  efficiency  of  the  algorithm 
drops  to  a  minimum  during  a  shift 
in  such  trends  because  the  sought 
elements  are  either  in  the  middle  or 
at  the  tail. 

It  is  this  last  case  that  I  want  to 
examine  more  closely.  It  is  possible 
to  improve  the  algorithm  to  take 
advantage  of  any  patterning  of  need 
if  you  consider  the  following  ques¬ 
tions:  When  do  you  start  searching 
at  the  lower  array  bound?  When  do 
you  start  backward  searching  at  the 
upper  array  bound?  And  when  do 
you  start  bidirectional  searching 


beginning  at  the  median  element? 

These  three  questions  have  a 
common  solution:  you  predict  the 
location  of  the  next  element  from 
the  history  of  search  locations. 
There  exists  in  statistical  time-series 
analysis  a  tool  to  do  just  this.  In  the 
following  discussion,  I  use  the 
weighted  average  method  to  provide 
projections  for  the  next  sought  loca¬ 
tion.  The  equation  (which  I’ll  call 
equation  1)  involved  is: 

$$L[n+1] = (1 - f)\left(L[n] + f\,L[n-1] + f^{2}\,L[n-2] + \cdots\right)$$

where L[1] ... L[n] is the array of
locations  found  and  f  is  a  fractional 
factor.  L[n]  is  the  location  of  the  last 
search,  L[n-1]  is  the  location  of  the 
one  before  it,  and  so  on.  The  value 
of  f  is  usually  calculated  from  the 
autocorrelation  coefficient  associ¬ 
ated  with  a  given  set  of  values  of 
array  L.  For  the  sake  of  reducing 
computing  time,  however,  f  is  as¬ 
signed  a  fixed  value  (say  0.70).  As  the 
value  of  the  f  factor  approaches 
unity,  more  weight  is  given  to  the 
“older”  data,  and  vice  versa. 
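
As a rough check on equation 1 (arithmetic added here, not taken from the listings), consider f = 0.70 with only the four most recent locations kept, which is what the text later says of Listing Two. The weights on those four locations are:

```latex
(1-f)\,\bigl(1,\ f,\ f^{2},\ f^{3}\bigr) = (0.300,\ 0.210,\ 0.147,\ 0.1029),
\qquad
\sum_{k=0}^{3} (1-f)\,f^{k} = 1 - f^{4} = 0.7599 .
```

Over an unlimited history the weights sum to exactly 1, since (1-f)(1 + f + f^2 + ...) = 1. Truncated to four terms they sum to about 0.76, so the raw result underestimates the location unless it is divided by 1 - f^4. Whether the listing applies such a correction is not stated in the text.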

With  the  predicted  next  location 
at  hand,  you  can  determine  the 
search  scheme:  first  to  last,  bidirec¬ 
tional  to  middle,  and  last  to  first. 
This  is  expressed  in  the  following 
conditions: 

1. If L[n+1]/(array size) > 0.65, then
use the last-to-first search scheme.

2. If L[n+1]/(array size) < 0.35, then
use the first-to-last search scheme.

3. If 0.35 <= L[n+1]/(array size) <=
0.65, then use the bidirectional-middle
search scheme.

Listing Two, page 112, shows the
Pascal function Search2, which employs
the modified heuristic method
just  described.  It  uses  equation  1 
with  f=0.70.  The  last  four  search 
location  values  are  used  to  predict 
the  next  search  location,  and  the 



predicted  next  location  is  calculated 
as  a  fraction  of  the  total  number  of 
array  elements.  This  fractional  value 
is  then  used  in  selecting  the  search 
scheme.  Notice  that  the  function  up¬ 
dates  the  search  location  history 
only  when  matches  are  found. 
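
The prediction and scheme selection can be sketched as below. This is a Python illustration, not the Pascal of Listing Two. The names, the 0-based indices, and the division by the sum of the weights (so that a short history still yields a proper average) are all assumptions of the sketch.

```python
F = 0.70          # weighting factor f from equation 1
HISTORY = 4       # number of past match locations remembered


def predict_next(history, f=F):
    """Weighted average of past locations, most recent first (normalized)."""
    weights = [f ** k for k in range(len(history))]
    return sum(w * loc for w, loc in zip(weights, history)) / sum(weights)


def probe_order(n, fraction):
    """Order in which to examine indices 0..n-1 for a predicted fraction."""
    if fraction > 0.65:                       # last to first
        return list(range(n - 1, -1, -1))
    if fraction < 0.35:                       # first to last
        return list(range(n))
    mid = n // 2                              # bidirectional from the middle
    order = [mid]
    for step in range(1, n):
        if mid - step >= 0:
            order.append(mid - step)
        if mid + step < n:
            order.append(mid + step)
    return order


def search2(data, key, history):
    """Search data for key; history is updated only when a match is found."""
    n = len(data)
    fraction = predict_next(history) / n if history else 0.0
    for i in probe_order(n, fraction):
        if data[i] == key:
            history.insert(0, i)              # newest location goes first
            del history[HISTORY:]             # keep only the last few
            return i
    return -1
```

With an empty history the sketch falls back to a plain first-to-last scan; once recent matches cluster near the tail, the scan direction flips.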

Ordered  Arrays 

Conventional  (nonheuristic)  search¬ 
ing  through  ordered  arrays  usually 
employs  index  tables.  The  search  for 
a  given  datum  begins  in  the  index 
table  whose  entries  define  a  range 
to  be  examined  in  the  array.  The 
sought  datum  may  or  may  not  find 
its  match  within  that  range  and  is 
guaranteed  not  to  match  outside  the 
range. Normally, the index table is
either specified ahead of time or
initialized from the available
data. For an array in which no
new  members  are  added  and  none 
are  removed,  the  index  table  re¬ 
mains  the  same. 

This  is  the  basis  on  which  I  will 
now  develop  and  discuss  two  heu¬ 
ristic  search  methods  for  ordered 
arrays.  The  basic  scheme  I  use  is  the 
dynamic  modification  of  the  index 
table  as  influenced  by  the  sought 
data. 

In  the  first  of  these  methods,  the 
index  table  is  constantly  changing 
at  run  time.  It  is  first  initialized  by 
taking  evenly  spaced  elements  from 
the  data  array.  The  index  table  itself 
is  an  array  with  a  lower  bound  of  0. 
The  value  of  the  table  entry  at  the 
lower  bound  points  to  the  first  ele¬ 
ment  (with  common  values  of  0  or 
1)  of  the  data  array.  The  basic  algo¬ 
rithm  for  searching  and  updating 


the  heuristic  index  table  is: 

1. Scan the index table, comparing
the search datum with the
table entries. Determine the indices
of the “first” and “last” members of
the data array that bound the search
range. Let K be the table
index that points to the first
member.

2.  Search  the  data  array  in  the  range 
first  . . .  last. 

3.  Return  the  index  of  the  matching 
element  and  replace  the  Kth  index 
table  entry  with  the  matching  data 
if  a  match  is  found. 

4.  Return  a  special  coded  integer  if 
no  match  is  found. 

Listing  Three,  page  112,  shows  a 
Pascal  function  Search3  that  imple¬ 
ments  this  algorithm  and  a  proce¬ 
dure  that  initializes  the  index  table. 
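
The four steps above can be sketched as follows. This is a Python illustration, not the Pascal of Listing Three. It assumes the table holds indices into the sorted data array, and it uses a binary scan of the table (bisect) where the text only says "scan". The handling of a key smaller than the first table entry is also an assumption.

```python
from bisect import bisect_right


def build_table(data, entries):
    """Evenly spaced indices into sorted data; entry 0 points at element 0."""
    step = max(1, len(data) // entries)
    return list(range(0, len(data), step))[:entries]


def search3(data, table, key):
    """Return the index of key in sorted data, or -1; update the table on a hit."""
    keys = [data[i] for i in table]           # data values the table points at
    k = max(bisect_right(keys, key) - 1, 0)   # table slot K for this key
    if key < keys[0]:
        first, last = 0, table[0]             # key lies before the first entry
    else:
        first = table[k]
        last = table[k + 1] - 1 if k + 1 < len(table) else len(data) - 1
    for i in range(first, last + 1):          # step 2: search first..last
        if data[i] == key:
            table[k] = i                      # step 3: replace the Kth entry
            return i
    return -1                                 # step 4: special "not found" code
```

Because a hit always lies between entry K and entry K+1, replacing entry K keeps the table in ascending order. It also shows how entries can drift together: after a few hits the table may hold something like [2, 18, 20, 30, 40].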

One  of  the  weaknesses  of  this  heu¬ 
ristic  search  method  is  that  the 
index  table  may  contain  entries  of 
clustered  data.  The  loss  of  uniformly 
distributed  or  well-spread  indices  is 
a  major  factor  in  the  deterioration 
of  search  efficiency.  To  remedy  this 
clustering  problem,  I  suggest  a 
second  method.  This  new  variant 
combines  a  few  aspects  from  tradi¬ 
tional  index  tables  with  the  ones  in 
the  previous  method.  The  result  is  a 
heuristic  index  table  with  both  fixed 
and  replaceable  entries.  The  index 
table  consists  of  several  sections 
such  that  each  section  begins  with 
one  fixed  entry  followed  by  m  re¬ 
placeable  entries.  The  algorithm  is 
very  similar  to  that  of  the  previous 
method: 

1.  Search  the  data  array  in  the  range 
first  . . .  last. 

2. If a match is found, return the
index of the matching element.

3. If (K-1) is not 0 or a multiple of
(m+1), replace the Kth index table
entry with the matching data.

4.  If  no  match  is  found,  return  a 
special  coded  integer. 

The  test  in  step  3  protects  the  fixed 
entries  from  being  overwritten.  This 
simple  modification  provides  anchor 
points  in  the  heuristic  index  table. 
You  can  set  up  the  index  table  using 
the  same  method  as  for  the  previous 
search  algorithm. 
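
A sketch of the fixed-plus-replaceable variant follows, again in Python and under assumptions: the table holds ascending indices into the sorted data array, entry 0 points at element 0, and K is counted from 0, so the test "(K-1) is 0 or a multiple of (m+1)" becomes k % (m + 1) == 0. The function name is hypothetical.

```python
from bisect import bisect_right


def search_anchored(data, table, key, m):
    """Search sorted data through an index table whose sections each hold
    one fixed entry followed by m replaceable entries."""
    keys = [data[i] for i in table]
    k = bisect_right(keys, key) - 1
    if k < 0:
        return -1                  # key sorts before data[0], so it is absent
    first = table[k]
    last = table[k + 1] - 1 if k + 1 < len(table) else len(data) - 1
    for i in range(first, last + 1):
        if data[i] == key:
            if k % (m + 1) != 0:   # anchors (fixed entries) are never replaced
                table[k] = i
            return i
    return -1
```

With m = 1 and a five-entry table, slots 0, 2, and 4 stay where initialization put them, so the table can never collapse into a single cluster.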

I've  discussed  all  the  algorithms  I 
promised,  but  I  can’t  resist  pointing 
out  an  interesting  variation  of  the 
second  search  method  that  can  be 
applied  to  data  stored  in  binary 
trees.  In  this  variation,  a  modified 
index  table  employs  pointers  instead 
of  integer  indices,  with  the  pointers 
used  to  indicate  the  node  at  which 
you  start  the  search  in  a  binary  tree 
rather  than  taking  the  traditional  ap¬ 
proach  of  starting  at  its  root  node. 
By  using  the  combination  of  fixed 
and  replaceable  pointers  in  the 
index  table,  efficient  searching  is 
maintained.  Because  the  fixed  array 
is  stored  in  a  binary  tree,  you  can 
easily  construct  a  balanced  or  near- 
balanced  binary  tree,  which  adds 
even  more  to  the  efficiency  of  the 
search. 

If  you  have  developed  other  heu¬ 
ristic  search  schemes,  I  would  like 
to  hear  from  you.  Please  write  to  me 
care  of  DDJ. 

Availability 

All  the  source  code  for  articles  in 
this  issue  is  available  on  a  single 
disk.  To  order,  send  $14.95  to  Dr. 
Dobb’s  Journal,  501  Galveston  Dr., 
Redwood  City,  CA  94063,  or  call  (415) 
366-3600,  ext.  216.  Please  specify  the 
issue  number  and  format  (MS-DOS, 
Macintosh,  Kaypro). 

DDJ 


(Listings  begin  on  page  112.) 



COLUMNS 


ARTIFICIAL  INTELLIGENCE 


Object-Oriented  Programming 


If  you’ve  been  following  this 
column,  you  will  have  noticed  a 
strong  emphasis  on  object-oriented 
tools.  That’s  because  I’ve  been 
deeply  involved  in  the  evaluation  of 
object-oriented  technology  for  the 
past  year  or  two.  I’m  not  alone:  in 
the  course  of  this  evaluation,  I’ve 
met  many  professional  programmers 
from  both  large  and  small  compa¬ 
nies  who  are  evaluating  or  actively 
using  object-oriented  systems.  Large 
companies  such  as  Hewlett-Packard 
and  General  Motors  are  using  object- 
oriented  approaches  in  major  devel¬ 
opment  work.  There  is  a  major 
thrust  going  on  in  the  industry  just 
now  to  introduce  this  technology 
into  important  and  even  critical  pro¬ 
jects. 

Nevertheless,  some  programmers 
and  managers  are  reluctant  to  adopt 
object-oriented  tools.  I  thought  it 
would  be  a  good  idea  to  devote 
some  column  space  this  month  to 
some  of  the  reasons  why  people 
have,  after  investigating  the  para¬ 
digm,  abandoned  or  avoided  it.  Also, 
if  any  of  you  have  had  the  experi¬ 
ence  of  deciding  whether  to  use  the 
object-oriented  approach  in  a  major 
programming  project,  please  write 
and  tell  me  about  your  experience. 
I'm interested both in cases in
which you chose the object-oriented
approach  and  those  in  which,  after 
some  serious  study,  you  decided 
against  it. 

I  also  discuss  this  month  what 
object-oriented  programming  has  to 
offer  to  artificial  intelligence  work, 


by  Ernest  R.  Tello 

and  I  consider  what  tools  are 
needed  to  really  exploit  the  object- 
oriented  paradigm  in  AI. 

Adventures  in  Object- 
Oriented  Programming 

One  programmer  I  spoke  with  is  Al 
Globus,  who  works  at  NASA  Ames 


for  the  contractor  Sterling  Software. 
Al  used  an  object-oriented  C  tool  in 
a  project  involving  simulating  life 
patterns  that  might  occur  on  the 
space  station.  Recently,  Al  went  back 
to  using  straight  C.  His  reasons  for 
switching  back  intrigued  me.  But 
what  intrigued  me  even  more  is  that 
even  though  he  was  using  regular 
C,  he  was  still,  in  a  way,  doing  object- 
oriented  programming.  He  had  aban¬ 
doned  the  tool  but  not  the  para¬ 
digm. 

The  point  is  important  and  is 
often  overlooked:  it  is  not  always 
necessary  to  use  an  object-oriented 
programming  language  to  program 
in  the  object-oriented  paradigm.  Al 
admitted  that  it  was  primarily  be¬ 
cause  he  had  already  worked  on  the 
program  in  an  object-oriented  lan¬ 
guage  that  he  was  able  to  do  object- 
oriented  programming  in  regular  C. 
He  had  got  to  know  the  code  and 
the  problem  addressed  so  well  after 
writing  the  program  once  that  it  was 
not  all  that  difficult  to  write  it  again 
without  the  aid  of  an  object-oriented 
programming  tool. 

I  also  spoke  with  Serge  Hering,  a 
software  engineering  consultant 
who  evaluated  an  object-oriented  C 
package  as  a  candidate  for  a  real¬ 
time  system  in  an  IBM  PC/AT  envi¬ 
ronment  but  decided  against  it.  Al¬ 
though  sympathetic  toward  the  con¬ 
cept  of  object-oriented  systems, 
Serge  doubted  that  current  tools 
and  the  industry  at  large  had  ma¬ 
tured  enough  for  the  concepts  to  be 
put  to  use  in  the  project  he  was 
undertaking.  As  he  put  it,  “Object- 
oriented  concepts  will  have  to  en¬ 
compass  all  aspects  of  computing 


systems  to  be  really  practical.” 

According  to  Serge,  a  true  object- 
oriented  C  system  needs  to  be  more 
than  just  a  preprocessor  to  C.  It 
requires  that  the  object-oriented  para¬ 
digm  be  supported  at  the  operating 
system  level  (he's  concerned  with 
doing  multitasking).  After  calling  one 
of  the  vendors  and  questioning  the 
staff  on  this  point,  he  was  told  that 
object-oriented  C  on  an  AT  was  not 
a  suitable  vehicle  for  implementing 
a  multitasking  application. 

His  basic  position  now  is  that 
object-oriented  programming  is  an 
all-or-nothing  affair.  In  order  for  it 
to  really  work,  it  will  have  to  be 
adopted  across  the  industry,  or  at¬ 
tempts  at  using  it  are  liable  to  back¬ 
fire.  The  additional  overhead  re¬ 
quired  by  object-oriented  systems, 
he  feels,  can  be  handled  in  the 
proper  environment,  but  the  danger 
is  that  aspects  of  a  system  that  have 
not  been  implemented  with  the 
object-oriented  approach  in  view 
could  come  back  to  haunt  it  unless 
the  paradigm  gets  the  kind  of  indus¬ 
trywide  support  he  describes. 

Personally,  I  find  Serge’s  view,  for 
all  its  merits,  a  bit  too  conservative. 
The  most  interesting  point  I  think 
he  raises  is  just  what  an  object- 
oriented  C  that  encompassed  more 
than  just  a  preprocessor  to  C  might 
look  like.  I  would  be  interested  in 
hearing  from  readers  who  may  have 
an  idea  on  what  such  a  “total” 
object-oriented  system  in  the  C  envi¬ 
ronment  might  comprise. 

Object-Oriented 
Programming  for  AI 

But  I  haven’t  really  explained  why  I 
am  concentrating  so  much  on 
object-oriented  programming  in  a 
column  ostensibly  devoted  to  artifi¬ 
cial  intelligence  techniques.  It’s 
simple,  really. 

I  believe  that  organizing  programs 
along  object-oriented  lines  is  not 
only  one  of  the  best  means  we  have 



so  far  for  achieving  greater  program¬ 
ming  efficiency  and  modularity  but 
also  that  it  has  important  implica¬ 
tions  for  the  goals  of  AI.  These  po¬ 
tential  advantages  and  implications 
do  not  come  automatically  with  the 
use  of  the  paradigm,  but  through  it 
many  of  them  seem  to  be  much 
more  attainable.  Possibly  one  of  the 
most  important  issues  is  the  poten¬ 
tial  of  object-oriented  systems  for 
alleviating  the  problem  of  "sequen¬ 
tial  closure”  in  programming. 

What  I  am  referring  to  by  the  term 


sequential  closure  is  the  fact  that,  in 
the  current  stage  of  software  tech¬ 
nology,  if  part  of  a  program  breaks, 
then  very  often  the  whole  program 
breaks.  This  stands  in  contrast  to 
most  living  organisms,  which  show 
considerable  resiliency  or  "fault  tol¬ 
erance”  to  breakdowns  in  their  func¬ 
tioning  parts.  This  phenomenon 
seems  nowhere  more  prevalent  than 
in  the  brain.  Many  studies  have 
shown  the  astounding  degree  to 
which  the  brain  can  adjust  to  mas¬ 
sive  tissue  loss,  for  example.  But,  as 
we  know,  removing  even  a  line  from 
most  programs  can  be  fatal. 

As  sequential  closure  is  so  much 


a  part  of  current  programming  tech¬ 
nology,  it  might  be  useful  to  spell 
out  why  I  think  it  is  not  the  most 
desirable  constraint  to  have.  As  I  see 
it,  there  are  three  main  conse¬ 
quences  of  the  sequential  closure  of 
current  software  technology: 

1.  A  considerable  amount  of  pro¬ 
gramming  time  must  be  spent  not 
only  in  debugging  but  also  in  rede¬ 
bugging. 

2.  There  is  an  inherent  limit  to  how 
“user-friendly”  programs  can  be. 

3.  There  is  an  inherent  limitation  on 
the  flexibility  and  generality  of  prob¬ 
lems  that  programs  can  address. 

Most  programmers  would  agree,  I 
think,  that  advances  are  needed  to 
improve  the  state  of  our  technology 
in at least one, if not all three, of
these areas.

Object-oriented  programming  is 
not  a  magic  fix  for  these  problems, 
but  there  is  at  least  one  area  in 
which,  I  believe,  it  can  help  signifi¬ 
cantly.  The  main  feature  of  the 
object-oriented  approach  that 
makes  this  possible  is  the  ability  to 
accomplish  a  new  and  important 
degree  of  partitioning  of  large-scale 
functionality. 

Now  there  is  nothing  new  about 
partitioning  the  functionality  of  the 
program.  Libraries  provide  this.  We 
currently  take  for  granted  the  way 
that  reusable  library  functions  can 
be  written  so  as  to  be  used  time  and 
time  again  in  a  variety  of  situations. 
Of  course,  this  reusability  often 
comes  only  at  the  expense  of  a  great 
deal  of  effort,  but  it  is  taken  for 
granted  that  it  is  one  of  the  goals  of 
competent  programming. 

The  problem  is  that  functions  are 
still  at  a  fine-grained  level  in  the 
organization  of  most  reasonably 
sized  programs.  The  main  sequence 
of  code  that  calls  the  library  func¬ 
tions  is  still  very  much  subject  to 
sequential  closure.  The  code  con¬ 
sists  of  a  sequence  of  statements 
and  subroutines  that  provides  a  clo¬ 
sure  on  the  correct  functioning  of 
the  program  that  often  must  be  de¬ 
bugged  and  redebugged  each  time 
important  modifications  are  made 
to  the  program.  As  programs  get 
larger  and  more  complex  (as  in  the 
case  of  AI),  it  becomes  imperative 
that  much  larger  functioning  parts 




of  programs  have  this  packaged  func¬ 
tionality  so  that  they  do  not  have  to 
be  repeatedly  adjusted  and  de¬ 
bugged  to  accommodate  changes 
and  additions  in  other  parts  of  the 
code. 

Object-oriented  programming  can 
provide  broader-grained  partitioning. 
This is, of course, one of the reasons
that led Brad Cox, the author of
Object-Oriented Programming: An
Evolutionary Approach (Addison-Wesley,
1986), to suggest that object-oriented
systems could be described as
software ICs.

In  the  case  of  hardware  ICs,  we 
have  units  of  packaged  functionality 
that  can  be  used  in  a  wide  variety  of 
cases  without  modification  or  fur¬ 
ther  debugging.  The  scope  of  ordi¬ 
nary  library  functions  is  so  narrow 
that  no  one  has  suggested  that  they 
be  compared  with  ICs.  It  is  only  in 
the  most  rudimentary  programs  that 
functions  operate  as  main  operating 
parts  within  the  program. 

Functions  are  usually  at  the  same 
time  too  general  and  too  specific  to 
have  this  role.  They  are  too  general 
in  that,  rather  than  being  a  function¬ 
ing  part  of  an  actual  program,  they 
are  more  like  standard  nuts  and 
bolts  that  come  in  an  enormous 
variety of sizes. Until arguments
are supplied, it is often unclear
just what many functions
will do.

On  the  other  hand,  functions  are 
too  specific  in  that  the  variety  of 
situations  to  which  they  can  re¬ 
spond  is  often  very  limited.  A  func¬ 
tion  usually  requires  that  the  pro¬ 
grammer  know  in  advance  just  what 
situation  is  being  covered  each  time 
it  is  called.  Functions  are  also  spe¬ 
cific  in  that  the  types  of  things  they 
are  responsible  for  are  usually  fairly 
detailed,  low-level  operations.  With 
objects,  which  are  usually  made  up 
of  several  closely  related  and  inter¬ 
acting  functions,  we  have  a  way 
around  this  limitation. 

Consider  classes.  Classes  in  the 
object-oriented  paradigm  are  a  kind 
of  IC  template  for  stamping  out 
custom  working  parts  for  a  wide 
variety  of  programs.  There  may  be 
only  a  small  number  of  instances  of 
that  class  that  are  actually  used  in  a 
given  program.  But  in  making  those 
instances,  it  is  typical  to  assign 
values  to  instance  variables  that 


define the conditions of a problem
once and for all, while still leaving
open the issue of just what messages
will be sent to what objects at
what time. What can be set without
hand coding by the programmer,
however, is what arguments these
methods will take in many cases.
It  is  possible  to  have  a  systematic 
way  of  assigning  whole  sets  of  argu¬ 
ments  to  methods  simply  by  setting 
values  to  instance  variables  at  the 
time  an  object  that  acts  as  a  func¬ 
tioning  part  of  a  program  is  initial¬ 
ized.  I  hope  in  a  later  column  to 
provide  some  actual  coded  exam¬ 
ples  of  this. 
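
Pending those examples, here is a minimal sketch of the idea in Python. The class, its slots, and the values are hypothetical and are not drawn from any tool discussed in this column. The limits are set once as instance variables when each working part is created, so the method that uses them needs no hand-coded arguments beyond the reading itself.

```python
class RangeMonitor:
    """A small working part: its limits are fixed once, at initialization."""

    def __init__(self, name, low, high):
        self.name = name
        self.low = low
        self.high = high

    def check(self, reading):
        # The limits come from instance variables, not from the caller.
        if reading < self.low:
            return f"{self.name}: below range"
        if reading > self.high:
            return f"{self.name}: above range"
        return f"{self.name}: ok"


# Two parts stamped from the same template, each configured once.
cabin_temp = RangeMonitor("cabin temperature", 18.0, 27.0)
tank_pressure = RangeMonitor("tank pressure", 90.0, 110.0)
```

Which object receives check, and when, is still decided at run time by the rest of the program.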


Inference  Engines 
for  Objects 

I  believe  that  the  object-oriented  ap¬ 
proach  has  a  lot  to  contribute  to 
furthering  the  goals  of  AI.  Using 
object-oriented  systems  to  model 
problem  domains  can  be  a  little  de¬ 
ceptive,  though,  unless  some  crucial 
design  issues  are  met.  Simply  creat¬ 
ing  a  static  model  of  the  unique 
aspects  of  a  certain  “world”  of  activ¬ 
ity,  such  as  the  world  of  medical 
diagnosis  or  electronic  troubleshoot¬ 
ing,  is  no  guarantee  that  programs 
using  such  models  will  be  at  all 
intelligent. 

In  other  words,  if  the  information 



just  sits  there  statically,  it  does  not 
contribute  the  kind  of  knowledge 
we  are  interested  in,  no  matter  how 
well  the  world  model  may  have  been 
designed.  Everything  depends  on 
the  techniques  used  for  making  use 
of  that  information.  The  more  ac¬ 
tively  a  program  can  be  designed  so 
as  to  be  continually  accessing  the 
knowledge  incorporated  in  a  hierar¬ 
chical  model  of  a  problem  space, 
the  more  intelligence  will  have  been 
gained. 

In  rule-based  systems,  the  infer¬ 
ence  engine  is  a  general  program 
that  accesses  knowledge  in  the  form 
of  rules  and  keeps  things  moving  so 
that  the  knowledge  captured  in 
rules  gets  used  and  applied.  But  so 
far,  there  have  not  been  any  general- 
purpose  “inference  engines”  built 
for  accessing  knowledge  in  the  form 
of  objects.  As  things  now  stand,  we 
are  just  beginning  to  get  the  idea  of 
how  to  use  these  structures  in  a 
very  ad  hoc  way.  It  is  worthwhile, 
though,  to  speculate  a  little  bit  on 
what  generic  routines  might  be  im¬ 
portant  for  AI  programs  using  ob¬ 
jects  to  create  deep  models  of  do¬ 
mains. 

Certainly  some  unusual,  yet  ge¬ 
neric,  search  routines  specific  to 
object-oriented  systems  are  a  neces¬ 
sity.  One  such  basic  routine  would 
be  a  method  that  built  its  own  list 
of  all  the  instances  of  a  given  class 
for  the  purpose  of  search,  then 
made  its  own  ordering  of  the  names 
of  these  objects,  and  then  searched 
them  for  specific  information  by  look¬ 
ing  up  the  values  of  various  slots  or 
instance  variables.  Far  more  intri¬ 
cate  and  flexible  search  routines  are 
possible, too, such as those that can
do a more general type of information
gathering so as to provide a
temporary search space of only the
relevant  data  needed  for  a  given  situ¬ 
ation.  This  idea  of  programs  that 
can  build  temporary  search  spaces 
dynamically  for  use  in  solving  prob¬ 
lems  efficiently  is  one  that  has  a 
broad  applicability  to  several  differ¬ 
ent  types  of  AI  application. 
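
The basic routine described above (list the instances of a class, order them by name, then search their slots) might be sketched like this in Python. All class and slot names are hypothetical. Note that the registry keeps every object alive for the life of the program, which a real system would need to manage.

```python
class Tracked:
    """Base class that remembers every instance created from its subclasses."""

    _instances = []

    def __init__(self, name, **slots):
        self.name = name
        self.__dict__.update(slots)       # slots become instance variables
        Tracked._instances.append(self)

    @classmethod
    def instances(cls):
        """Instances of cls (or its subclasses) created so far, ordered by name."""
        return sorted(
            (obj for obj in Tracked._instances if isinstance(obj, cls)),
            key=lambda obj: obj.name,
        )

    @classmethod
    def find(cls, **wanted):
        """Instances of cls whose slots match every requested value."""
        return [
            obj for obj in cls.instances()
            if all(getattr(obj, slot, None) == value
                   for slot, value in wanted.items())
        ]


class Component(Tracked):
    pass


class Resistor(Component):
    pass


class Capacitor(Component):
    pass
```

A call such as Component.find(status="failed") then acts as a small temporary search space: it gathers only the objects relevant to the question being asked.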

An  interesting  use  of  class  hierar¬ 
chies  in  AI  has  been  the  partitioning 
of  rule-based  knowledge  into  sets  of 
rules  that  are  organized  according 


to  a  network  of  situations  in  a  prob¬ 
lem  space.  Here  only  the  tip  of  the 
iceberg  has  been  touched  so  far. 
There  is  room  for  a  wide  range  of 
approaches  to  providing  flexible  prob¬ 
lem  solving  methods  based  on  so¬ 
phisticated  routines  for  selecting 
rulesets  to  apply  by  rapid  initial 
searches  up  and  down  a  hierarchy 
rather  than  just  by  executing  simple 
test  and  branch  procedures. 

For  example,  let’s  say  that  the  ap¬ 
plication  is  one  of  actually  building 
a  design  for  something.  Here,  one  of 
the  important  services  that  com¬ 
puter  technology  can  provide  is  in 
helping  human  designers  make  sure 
they  have  not  overlooked  anything 
important.  The  problem  is  that  what 
is  important  in  a  given  design  can 
differ  enormously  from  case  to  case, 
depending  on  the  details  and  re¬ 
quirements  of  a  given  design.  In 
such  a  case  it  would  be  useful  to 
have  various  sets  of  constraint  rules 
that  could  be  selected  and  applied 
in  a  selective  way  rather  than  by  a 
rigid  procedure. 

One  of  the  key  types  of  know-how 
that  human  designers  learn  by  expe¬ 
rience  is  how  much  detail  is  rele¬ 
vant  for  evaluating  a  given  aspect  of 
a  design.  This  is  important  for  avoid¬ 
ing  entirely  inappropriate  analyses 
and  conclusions.  So,  for  example,  in 
some  designs,  there  is  enough  stan¬ 
dardization  in  the  parts  being  used 
that  an  iteration  through  each  part 
in  turn  would  be  absurd.  All  that  is 
needed  is  inspection  of  a  few  typical 
elements  to  reach  an  important  gen¬ 
eral  conclusion.  This  is  the  kind  of 
realization  that  a  human  designer 
reaches  by  common  sense  but 
which  can  be  automated.  In  this 
case,  a  general  routine  could  be 
used  that  sifted  quickly  through  all 
the  instances  of  each  type  of  design 
element  used  in  a  proposed  design 
and  decided  which  categories 
would  need  a  detailed,  iterative  treat¬ 
ment  and  which  could  be  evaluated 
by  inspecting  typical  elements. 
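
One way such a sifting routine might look is sketched below in Python. The criterion used here (a category counts as standardized when all of its parts share one specification) is an assumption chosen for illustration, as are the names.

```python
from collections import defaultdict


def plan_evaluation(parts, sample_size=2):
    """Decide, per category, whether to inspect a sample or iterate over all parts.

    parts is a list of (category, spec) pairs. A category whose parts all
    share one spec is treated as standardized and only sampled.
    """
    by_category = defaultdict(list)
    for category, spec in parts:
        by_category[category].append(spec)

    plan = {}
    for category, specs in by_category.items():
        if all(spec == specs[0] for spec in specs):
            plan[category] = ("sample", specs[:sample_size])
        else:
            plan[category] = ("iterate", specs)
    return plan
```

A design with three identical bolts and two differing brackets would be planned as "sample" for the bolts and "iterate" for the brackets.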

The  general  point  that  object-ori¬ 
ented  systems  can  be  effective  for  AI 
applications  only  to  the  degree  that 
an  active  access  is  provided  to  the 
information  represented  as  objects 
has  some  interesting  consequences. 
For  one  thing,  it  means  that  an  ade¬ 
quate  object-oriented  programming 
environment  must  provide  at  least 


some  minimum  features  to  make 
this  possible. 

First  on  the  list  is  the  ability  to 
keep  track  of  all  the  objects  cur¬ 
rently  in  the  system,  which  includes 
both  classes  and  instances,  accord¬ 
ing  to  their  place  in  the  hierarchy. 
That’s  more  powerful  than  it 
sounds;  part  of  what  we  consider 
simple  common  sense,  which  all  nor¬ 
mally  functioning  humans  have,  is  a 
sense  of  the  place  of  things  in  an 
implicit  logical  or  conceptual  hierar¬ 
chy.  It's  also  what  conventional  pro¬ 
grams  don't  have. 

For  example,  we  take  it  for 
granted  that  bicycles  and  cars  are  a 
means  of  transportation  and  hence 
are  in  the  category  of  equipment  or 
artifacts  and  that,  therefore,  certain 
kinds  of  behavior  with  them  are  ap¬ 
propriate.  The  boundaries  of  this 
are  subject  to  challenge,  of  course, 
but  we  take  it  for  granted  that  it  is 
not  appropriate  to  use  a  piano  for 
firewood  or  to  risk  our  lives  to 
rescue  a  bookend  from  a  burning 
house.  We  know  that  liquids  cannot 
be  safely  wrapped  in  newspaper  and 
millions  of  other  bits  of  general,  prac¬ 
tical  knowledge  that  seem  to  be 
stored  as  a  general  understanding 
of  types  of  things  and  their  roles 
and  characteristics. 

The  logical  way  to  represent  this 
common  sense  knowledge  of  how 
to  get  around  in  the  practical  world 
is  as  some  kind  of  conceptual  hierar¬ 
chy.  That  way,  the  mere  fact  that  a 
given  object  is  an  instance  or 
member  of  a  class  will  activate  cer¬ 
tain  knowledge  about  how  to  deal 
with  objects  of  that  type. 
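
A minimal Python sketch of this idea follows. Membership in a class is enough to bring the knowledge stored on that class, or on any of its ancestors, into play. The classes and the advice strings are hypothetical.

```python
class Thing:
    advice = {}


class Artifact(Thing):
    advice = {"burn": "not appropriate: artifacts are made to be used"}


class Furniture(Artifact):
    advice = {}


class Piano(Furniture):
    advice = {"move": "needs several people"}


class Liquid(Thing):
    advice = {"wrap in newspaper": "will leak"}


def advice_for(obj, action):
    """Walk from obj's class up through its ancestors and return the first
    (class name, advice) pair stored for action, or None if there is none."""
    for cls in type(obj).__mro__:
        found = vars(cls).get("advice", {}).get(action)
        if found is not None:
            return cls.__name__, found
    return None
```

Nothing about burning is stored on Piano itself; the answer is found two levels up, on Artifact, simply because a piano is an artifact.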

The two main ways this is
currently done in AI are by use of
rule-based  programming  and  proce¬ 
dural  attachments  or  active  values. 
In  many  of  the  larger  professional 
AI  programming  systems,  it  is  possi¬ 
ble  to  store  rules  about  classes  of 
things  that  are  invoked  for  certain 
types  of  objects.  The  problem, 
though,  is  to  provide  a  means  of 
determining  when  it  is  appropriate 
for  such  knowledge  to  be  invoked. 
Active  values  or  procedural  attach¬ 
ments  act  like  "demons”  that  are 
waiting  for  certain  events  in  order 
to  be  invoked.  So  various  operations 
can  be  waiting  in  the  wings,  as  it 
were,  for  certain  values  to  be 
changed  or  accessed. 



So,  for  example,  if  we  attempt  to 
change  the  physical  state  of  an  arti¬ 
fact,  such  as  by  burning  a  piano, 
before  that  operation  can  be  permit¬ 
ted,  certain  procedures  can  be  auto¬ 
matically  invoked  that  apply  knowl¬ 
edge  about  appropriate  behavior  for 
that  class  of  things.  A  demon  for 
wooden  objects  might  state  that 
things  in  the  category  of  useful  fur¬ 
niture  have  a  greater  value  than  sat¬ 
isfying  temporary  needs  for  warmth 
by  destroying  them. 
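The piano example can be sketched in a few lines of Python. This is only an illustration of the idea of an active value guarded by a demon, not the mechanism of any particular AI system; the names ActiveValue, Artifact, and wooden_object_demon are invented for the sketch.

```python
class ActiveValue:
    """A slot whose changes are first offered to any attached demons."""

    def __init__(self, value):
        self.value = value
        self.demons = []

    def attach(self, demon):
        self.demons.append(demon)

    def set(self, owner, new_value):
        # Every demon is consulted before the change is permitted.
        for demon in self.demons:
            if not demon(owner, self.value, new_value):
                return False          # a demon vetoed the operation
        self.value = new_value
        return True


class Artifact:
    """Hypothetical object with a class (category) and a physical state."""

    def __init__(self, name, category, material):
        self.name = name
        self.category = category
        self.material = material
        self.state = ActiveValue("intact")


def wooden_object_demon(obj, old_state, new_state):
    """Refuse to burn wooden things that are classed as useful furniture."""
    if (new_state == "burned" and obj.material == "wood"
            and obj.category == "useful furniture"):
        return False
    return True


piano = Artifact("piano", "useful furniture", "wood")
log = Artifact("log", "firewood", "wood")
for thing in (piano, log):
    thing.state.attach(wooden_object_demon)

piano_burned = piano.state.set(piano, "burned")
log_burned = log.state.set(log, "burned")
print(piano_burned, piano.state.value)
print(log_burned, log.state.value)
```

The knowledge about furniture lives in the demon, not in the code that tries to burn things, so the caller never has to ask the question explicitly.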

When  representing  knowledge  to 
be  used  to  solve  certain  problems, 
you  are  constantly  faced  with  deci¬ 
sions  about  how  implicit  or  explicit 
to  make  the  knowledge.  In  conven¬ 
tional  programming  the  knowledge 
that  is  incorporated  in  a  program  is 
nearly  always  completely  implicit. 
That  is,  it  does  not  really  exist  as 
knowledge  in  the  program  other 
than  as  knowledge  used  by  the  pro¬ 
grammer  to  design  its  functioning. 

In AI paradigms such as rule-based
programming, the aim is to make
knowledge explicit as general rules
so that this knowledge
can  be  applied  to  new  situations  not 
entirely  foreseen  by  the  developer. 
The  ability  to  handle  novel  situations 
is  still  limited,  however.  So  far,  the 
main  advantage  gained  by  making 
knowledge  explicit  is  that  greater 
generality  and  flexibility  can  be  given 
to  applications.  Rather  than  having 
the  knowledge  “hard-wired"  into  a 
conventional  program  that  is  not  ca¬ 
pable  of  comparing  different  pieces 
of  knowledge  to  reach  its  own  con¬ 
clusions,  it  is  useful  to  provide  a 
way  of  modeling  the  general  process 
by  which  practical  results  are  ob¬ 
tained.  The  alternative  is  just  using 
the  results  in  a  purely  ad  hoc  way. 

Obviously,  it  is  never  possible  to 
make  everything  the  result  of  a  proc¬ 
ess  of  explicit  reasoning.  Even 
humans  could  not  function  that 
way.  Nothing  would  ever  get  done 
in  real  time.  We  would  all  end  up 
like  parodies  of  Hamlet,  forever  ana¬ 
lyzing  the  pros  and  cons  of  every 
aspect  of  even  the  most  trivial  ac¬ 
tions.  One  of  the  rules  of  thumb  for 
practical  activity,  whether  by  human 
or  machine,  is  that  certain  things 
must be taken for granted as
implicit, with explicit analysis or
reasoning reserved for a limited
number of events.

From  this  point  of  view,  we  can 
see  one  of  the  main  distinguishing 
features  of  human  as  opposed  to 
machine  analysis.  At  any  given  time, 
we  can  decide  to  change  something 
from the status of implicit to explicit
if  we  need  to  for  any  reason.  It’s  an 
important  part  of  living  to  sense 
when  it  is  necessary  to  question 
things  normally  taken  for  granted. 
So far, with AI systems, the decision
about what knowledge will be
implicit and what explicit is made
by the programmer and cannot be
changed except by rewriting the
program. In a future column I will
describe a program I have been
designing that actually makes this
decision itself, to some degree.

So,  as  things  now  stand,  the  main 
trade-off,  as  always,  is  flexibility  vs. 
efficiency.  The  question  is  how 
much  knowledge  can  be  made  ex¬ 
plicit  without  losing  adequate  per¬ 
formance.  This  is  always  problem 
specific.  For  some  problems,  a  week 
of  processing  time  would  be  accept¬ 
able  if  the  result  obtained  were  accu¬ 
rate  and  reliable  enough.  For  others, 
even  ten  seconds  might  seem  an 
eternity.  Making  the  proper  choice 
between  what  to  make  explicit  and 
implicit  in  an  AI  program  is  often 
something  that  can  only  be  done  by 
developers with a great deal of
experience in the field. Even then,
decisions made early on often turn
out to be wrong later in the game.

The  advantage  of  modular  para¬ 
digms  such  as  rule-based  and  object- 
oriented  systems,  though,  is  that 
changes can usually be made by local
modifications  that  do  not  alter  the 
functioning  of  the  program  as  a 
whole.  Often  knowledge  can  be 
made  explicit  by  adding  and  sub¬ 
tracting  various  rules  and/or  func¬ 
tioning  classes  that  do  not  require 
that  the  application  be  substantially 
rewritten.  And  that  sort  of  functional 
partitioning  is  essential  to  any  large- 
scale  programming  project  and  es¬ 
pecially  to  work  in  artificial  intelli¬ 
gence. 
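As a rough illustration of such a local modification, the sketch below keeps each piece of knowledge as a separate rule and adds one rule without touching the engine or the existing rule. The rule format and the infer function are assumptions made for this example, not taken from any system discussed in the column.

```python
# Each rule is (name, condition, conclusion); the engine knows nothing
# about pianos or lumber, only how to apply rules to a set of facts.
rules = [
    ("furniture-is-valuable",
     lambda f: f.get("category") == "useful furniture",
     ("may_burn", False)),
]


def infer(facts, rules):
    """Apply every rule whose condition holds and report which ones fired."""
    derived = dict(facts)
    fired = []
    for name, condition, (key, value) in rules:
        if condition(derived):
            derived[key] = value
            fired.append(name)
    return derived, fired


piano = {"name": "piano", "category": "useful furniture"}
before, fired_before = infer(piano, rules)

# The local modification: one new rule, no change to infer() or to rule one.
rules.append(("scrap-is-fuel",
              lambda f: f.get("category") == "scrap lumber",
              ("may_burn", True)))

after, fired_after = infer(piano, rules)
plank, fired_plank = infer({"name": "plank", "category": "scrap lumber"}, rules)
print(before == after, plank["may_burn"], fired_plank)
```

The conclusions about the piano are the same before and after the new rule is added, which is the sense in which the change is local.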

DDJ 





LETTERS 

(continued  from  page  12) 


type  unevenly,  the  average  typing 
speed  was  often  much  less  than  ten 
characters  per  second. 

In  spite  of  their  slowness  the  tele¬ 
type  machines  were  the  most  practi¬ 
cal  terminals  for  multiuser  systems 
(then  called  time  sharing  systems) 
and  for  small  computers.  The  elec¬ 
tronic  terminals  using  a  CRT  were 
at  that  time  dumb  terminals  and 
very  expensive.  Furthermore,  if  hard 
copy  was  desired  it  was  necessary 
to  buy  a  printer  which  would  cost 
almost  as  much  as  a  full  teletype 
terminal.  The  widespread  use  of  tele¬ 
type  machines  as  terminals  left  its 
mark  on  computer  standards  in 
many  ways  besides  the  TTY  abbre¬ 
viation.  The  most  important  is  prob¬ 
ably  the  ASCII  code.  This  was  devel¬ 
oped  for  the  teletype  machines  and 
then  adopted  by  computers  when 
the  teletype  machines  were  adopted 
as  terminals. 

Many  computerists  are  puzzled 
by  the  fact  that  all  of  the  control 
codes  except  DEL  are  at  the  begin¬ 
ning  of  the  codes  while  DEL  is  at 
the very end. The old name for DEL,
RUBOUT, gives a clue to this
mystery. Because it was very
difficult to type at anywhere near
the maximum of ten characters per
second on the teletype machines,
most teletypes were equipped with a
paper tape punch and reader. The
message was punched into paper tape
off  line  and  then  transmitted  at  the 
maximum  rate  by  the  tape  reader 
on  line  to  save  expensive  on  line 
time.  But  suppose  you  made  a  mis¬ 
take  and  hit  the  wrong  key  near  the 
end  of  punching  a  long  tape.  Once 
a  key  was  struck  there  was  no  way 
to  unpunch  the  holes  in  the  tape 
and  it  would  be  very  frustrating  to 
have  to  do  the  whole  tape  over 
again.  Instead  the  tape  was  manu¬ 
ally  backed  up  to  the  error  and  the 
error  overpunched  with  the 
RUBOUT  code  which  punched  all 
the  hole  positions.  The  receiving  tele¬ 
type  was  set  to  ignore  RUBOUT 
codes  so  that  the  errors  simply 
didn’t  print. 
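The trick works because holes can be added to a tape but never removed, and the code with every hole punched in a 7-bit row is 127 (hex 7F), the last code in the ASCII table. A short Python sketch of the idea follows; the function names and the model of a tape as a list of integers are inventions for the illustration.

```python
DEL = 0x7F  # RUBOUT: all seven hole positions punched


def overpunch(tape, position):
    """Punch every hole in one row. OR models the fact that holes only add."""
    tape[position] |= DEL


def read_tape(tape):
    """Return what a receiving teletype would print, ignoring RUBOUT rows."""
    return "".join(chr(row) for row in tape if row != DEL)


tape = [ord(ch) for ch in "HELLP"]      # the 'P' was struck by mistake
overpunch(tape, len(tape) - 1)          # back up one row and rub it out
tape += [ord(ch) for ch in "O WORLD"]   # carry on punching

printed = read_tape(tape)
# Whatever was punched before, overpunching always yields the same code.
always_del = all((code | DEL) == DEL for code in range(128))
print(printed, always_del)
```

No other code could serve this purpose: a row can only be turned into the all-holes pattern, which is why the delete code sits at the very end of the set.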

The paper tape units came in at
least two versions, the 7-bit and the
8-bit types. The 7-bit type was
common for most communications
uses, but for computers and for
communication channels with a
parity bit the 8-bit units were used.
With computers, paper tapes were
used for off line storage and for
distributing software. For use at
the computer itself, high-speed
paper tape punches and readers were
developed and became so common that
CP/M had input/output channels
reserved for them.

Many  of  the  earlier  and  less  ex¬ 
pensive  teletype  machines  were 
upper  case  only  and  this  is  one  of 
the  reasons  that  BASIC  as  developed 
at  Dartmouth  College  would  work 
with all upper case. Apparently
Bell Labs could afford the more
expensive teletypes with the full
character set: both C and Unix are
case sensitive and tend to use
mostly lower case, although Unix has
a provision for translating upper
case to lower case for upper-case-only
terminals. Some teletypes
could  transmit  either  case  although 
they  printed  upper  case  only,  trans¬ 
lating  lower  case  to  upper  case 
when  receiving. 

Note  that  all  of  the  older  text  edi¬ 
tors  such  as  ed  on  the  Unix  system 
are  line  editors  and  that  only  the 
newer  editors  such  as  vi  are  full 
screen  editors.  That  is  because  with 
the  teletypes  as  terminals  there  was 
no  screen. 

This  is  not  the  first  case  where 
decisions  made  in  the  past  for  valid 
reasons  persist  in  standards  long 
after  the  reasons  for  making  those 
decisions no longer apply. After all,
the gauge of modern railroads was
set by decree of an ancient Roman
emperor and the layout of the
standard typewriter keyboard was made
to discourage fast typing and thus
avoid key pileups. After a standard
has  been  in  use  for  a  while  there  are 
many  things  around  embodying  that 
standard  that  would  be  obsolete  if 
the  standard  were  changed  so  old 
standards  linger  on  long  after  they 
are  obsolete. 

Incidentally  the  slow  teletype  ma¬ 
chines  hooked  to  a  timesharing 
system  seemed  like  a  big  advance  to 
programmers  who  previously  had  to 
punch  their  programs  on  a  card 
punch  and  submit  them  for  batch 



processing  every  time  they  wanted 
to  try  something.  Then  they  would 
wait  hours  or  even  days  for  the  re¬ 
sults  which  might  be  only  a  cryptic 
error  message.  Think  how  that 
would  affect  your  debugging  proce¬ 
dures. 

David  S.  Tilton 

27  Pennacook  St. 

Manchester,  NH  03104 

Interrupts 

Dear  DDJ, 

Thomas  Zimniewicz  wonders  (in  his 
article  "An  Extended  IBM  PC  COM 
Port  Driver,"  June  1987)  why  the  IBM 
PC  serial  ports  use  a  UART  bit  to 
enable  the  interrupt-driver  line. 
They  do  this  so  the  interrupt  can  be 
turned  off.  Normal  computers  (Z- 
80s,  etc.)  often  have  such  lines 
driven  low  by  open-collector  drivers 
when  signaling  an  interrupt,  so  that 
more  than  one  device  can  use  the 
physical  hardware  interrupt  wire 
(the  software,  in  such  a  case,  would 
be  obliged  to  poll  each  possible 
device).  Such  interrupt  lines  are 
"pulled-up"  somewhere  in  the 
system  with  a  single  resistor,  so  even 
if  nobody  is  driving  the  interrupt 
wire,  the  CPU  won’t  see  erroneous 
interrupts. 

The  IBM/Intel  hardware,  on  the 
other  hand,  drives  the  interrupt  line 
high  when  an  interrupt  occurs,  and 
the  hardware  has  no  inherent  mecha¬ 
nism  for  sharing  an  interrupt  line. 
The  serial  cards  ameliorate  this  to 
some  extent  by  providing  a  disabling/ 
enabling  mechanism  attached  to  a 
spare  bit  on  the  UART;  thus,  by  se¬ 
lectively  turning  off  one  guy’s  inter¬ 
rupt  and  turning  on  another,  more 
than  one  card  can  use  the  same 
interrupt  wire;  but  not  really  at  the 
same  time.  The  serial  cards  usually 
also  include  a  DIP  switch  enable/ 
disable,  which  is  basically  in  series 
with  the  electrical  mechanism. 

Here  are  some  of  the  nice  things 
that  can  happen  if  you  forget  about 
the  interrupt  enable  bit. 

Two  or  more  serial  cards — or 
other  cards? — could  have  their  inter¬ 
rupts  enabled  simultaneously  on  the 
interrupt  line.  This  is  an  electrical 
short;  there  probably  won’t  be  any 
smoke  or  anything,  but  the  inter¬ 
rupts  definitely  won’t  work  the  way 




you expected. You could set up the
interrupt controller properly, but
leave the interrupt disabled.
According to my Technical Reference, this
leaves  the  interrupt  line  floating,  i.e., 
your  interrupts  will  become  phase- 
of-moon  dependent. 
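The three cases (one card enabled, several enabled, none enabled) can be modeled in a few lines of Python. The letter does not name the spare bit; the sketch assumes it is the 8250's OUT2 output, bit 3 of the modem control register at offset 4 from the port base. The irq_line function is a toy model of the bus invented for this illustration, not driver code.

```python
MCR_OFFSET = 4      # modem control register, relative to the UART base port
MCR_OUT2 = 0x08     # assumed enable bit: gates the interrupt driver onto the bus


def set_irq_enable(mcr_value, enable):
    """Return a new MCR value with the interrupt-enable bit set or cleared."""
    if enable:
        return mcr_value | MCR_OUT2
    return mcr_value & ~MCR_OUT2 & 0xFF


def irq_line(cards):
    """Resolve a shared line from (mcr_value, requesting) pairs, one per card.

    An enabled card drives the line high or low; a disabled card lets go.
    """
    driven = {requesting for mcr, requesting in cards if mcr & MCR_OUT2}
    if not driven:
        return "floating"       # nobody is driving the wire
    if len(driven) > 1:
        return "contention"     # two drivers disagree: the electrical short
    return "high" if driven.pop() else "low"


com_a = set_irq_enable(0x03, True)     # DTR and RTS set, interrupt enabled
com_b = set_irq_enable(0x03, False)    # second card gated off the line

shared_ok = irq_line([(com_a, True), (com_b, False)])
both_on = irq_line([(com_a, True), (set_irq_enable(com_b, True), False)])
none_on = irq_line([(set_irq_enable(com_a, False), True), (com_b, True)])
print(shared_ok, both_on, none_on)
```

Only the first arrangement gives the CPU a usable signal, which is why software that shares a line has to switch the enable bits deliberately.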

Perhaps  with  PS-17  or  whoever, 
IBM  will  come  up  with  sophisti¬ 
cated  hardware  comparable  to  the 
average  microwave  oven  controller 
of  today.  We  can  but  hope. 

J.G.  Owen 

35  Admiral  St. 

Port  Jefferson  Station,  NY  11776 

Bandwidth  Bottleneck 

Dear  DDJ, 

I have noted the thoughts on
memory and bandwidth in recent
issues of DDJ; they're
forward-looking and pertinent to an
issue that is of increasing concern to
programmers and hardware designers alike.
In  particular,  the  editorial  in  the 
March  issue  got  me  thinking  with 
the  reference  to  commuting.  How 
many  times  have  I  myself  sat  in  my 
car,  creeping  along,  wishing  for 
“more  bandwidth?"  Then,  as  the 
highway  crews  build  in  more  band¬ 
width,  more  cars  come  out  of  the 
woodwork  to  clog  it  all  up  again. 

The  central  question  becomes  not 
how,  but  why.  Why  should  massive 
amounts  of  data  be  thrown  from 
storage  to  processor,  often  in  order 


to  locate  or  operate  on  a  minuscule 
fraction  thereof,  and  then  thrown 
back  again,  or  even  dumped?  As 
long  as  processing  power  is  located 
over  here  and  data  over  there,  data 
volume  will  always  grow  to  over¬ 
whelm  the  physical  channel,  similar 
to  budgets  and  pocketbooks. 

Two  approaches  to  this  problem 
seem  to  point  in  fruitful  directions: 
more  efficient  data  transfer  and  elimi¬ 
nating  data  transfer  entirely.  The 
latter  is  not  necessarily  the  logical 
extension  of  the  first. 

Using  current  technology  and  its 
extrapolations,  it  seems  possible  to 
explore  more  advanced  methods  of 
storing  and  indexing  data  so  that 
the  patterns  inherent  in  the  data  are 
more "visible" to the data user. If
the system will use data for
inferences or "inference-primitives,"
rather than as fragments that must
be assembled into inferences each
time, or if the data will be
transferred, distill it first. The extra
processing time up front would greatly
reduce operations time later.

Eliminating  data  transfer,  however, 
would  require  a  more  fundamental 
change  in  the  way  data  and  proces¬ 
sor  work  together.  Perhaps  we  are 
seeing  the  beginnings  of  this  quan¬ 
tum  leap  in  the  advancing  neural- 
network  research.  Why  must  Mo¬ 
hammed  go  to  the  mountain?  Make 
Mohammed be the mountain! Data
essentially becomes the processor,
and the problems associated with
transfer to and from processing
centers disappear. Analog computers
operate  in  this  fashion;  the  construc¬ 
tion  of  the  computer  is  really  data 
to  be  operated  on.  The  computation 
is  essentially  instantaneous  without 
the  need  to  read  or  otherwise 
handle  data  as  a  separate  entity. 

I  suppose  that  the  issue  revolves 
around  the  old  "chicken-and-the- 
egg”  dilemma.  If  we  can  eliminate 
the  need  to  handle  all  of  the  data, 
perhaps  we  would  know  the  answer 
before  asking  the  question,  or  the 
solutions  might  fail  to  be  general 
enough  to  be  truly  useful.  Additional 
research  on  what  data  really  is  will 
help  us  find  our  way. 

Thanks  for  placing  such  an  impor¬ 
tant  issue  before  the  informed  and 
active  computing  public  of  DDJ  read¬ 
ers.  I'm  sure  that  there  will  be  much 
informed  comment  and  discussion. 
H.  Ward  Silver 
RBR  Engineering 
P.O.  Box  1608 
Vashon,  WA  98070 

DDJ 




PROGRAMMER'S  SERVICES 

OF  INTEREST 


Sigma Designs has announced
SigmaVGA, a high-resolution graphics
board. As its name implies,
SigmaVGA offers compatibility with
IBM’s  new  Video  Graphics  Array 
(VGA)  standard.  It  also  offers  com¬ 
patibility  with  the  IBM  monochrome 
display  mode  and  CGA,  EGA,  and 
the  Hercules  graphics  adapter 
modes,  and  it  supports  single-scan 
and  multisync  monitors  as  well  as 
the  IBM  PS/2  analog  color  and  mono¬ 
chrome  monitors.  SigmaVGA  has 
256K  of  on-board  memory  and  per¬ 
mits  up  to  256  colors  to  be  selected 
from  a  palette  of  262,144  colors. 

According to Sigma Designs,
SigmaVGA supports all the new VGA
BIOS mode and function calls,
including 320 x 200 in 256 colors or 64
gray scales; 640 x 480 in 16 colors or
gray scales; 720 x 400 in 16 colors or
gray scales; and simultaneous
display of two out of a possible eight
character  sets  stored  in  memory.  It 
can  also  save  and  restore  video 
states  for  use  in  multitasking  appli¬ 
cations  and  offers  132-column  sup¬ 
port  with  all  digital  monitors. 

The  SigmaVGA  graphics  board  re¬ 
tails  for  $499  and  comes  bundled 
with  Show  Partner,  Version  3.0,  from 
Brightbill-Roberts  &  Co.  of  Syracuse, 
New  York.  Show  Partner  is  an  appli¬ 
cation  designed  to  capture  screens 
from  any  source,  then  present  them 
using  a  variety  of  special  effects  and 
animation  techniques,  including 
color,  motion,  and  sound.  Reader 
Service  No.  16. 

Sigma  Designs  Inc. 

46501  Landing  Pkwy. 

Fremont,  CA  94538 
(415)  770-0100 


Video  Seven  has  announced  a 
single-chip  EGA  that  is  completely 
hardware  compatible  with  EGA,  CGA, 
MDA,  and  HGC  (Hercules  graphics 
board)  display  modes.  The  chip  is 
contained  in  a  160-pin  package  and 
features  a  complete  implementation 
of  the  EGA  hardware  as  well  as  a 
fully  functional  6845  CRT  controller. 
Advanced  features  include  enhanced 
automatic  switch  support,  dot  clock 
speeds  of  up  to  34  MHz,  double¬ 
scan  capability,  and  built-in  support 
logic.  The  board  is  manufactured  for 
Video  Seven  by  LSI  Logic. 

The  price  has  not  yet  been  deter¬ 
mined.  Reader  Service  No.  17. 
Video Seven Inc.

46335  Landing  Pkwy. 

Fremont,  CA  94538 
(415)  656-7800 

Graphics  Software  Systems  has  re¬ 
leased  the  OS/2  Graphics  Develop¬ 
ment  Toolkit  for  the  advance  release 
of  OS/2.  The  OS/2  GDT  includes 
device  drivers  for  VGA,  EGA,  and 
CGA  graphics  adapters;  the  Micro¬ 
soft  Mouse;  the  IBM  Proprinter, 
Graphics  Printer,  and  Color  Graph¬ 
ics  Printer;  the  Quietwriter  III;  and 
plotters from IBM and HP.
Additional drivers now under
development include the IBM PS/2
Mouse, the HP LaserJet+, and laser
and dot-matrix printers from Epson
and other vendors.

The  OS/2  GDT  advance  release  in¬ 
cludes  language  binding  to  support 
Microsoft  C  and  Macro  Assembler. 
The  final  version  of  OS/2  GDT  will 
support  OS/2  implementations  of  C, 
BASIC,  FORTRAN,  Pascal,  and  Macro 
Assembler  from  IBM  and  other  third- 
party  vendors. 

The  list  price  is  $995,  with  dis¬ 
counts  available  to  registered 
owners  of  the  GSS  Graphics  Develop¬ 
ment  Toolkit  for  DOS.  A  final  version 
of  the  OS/2  GDT  will  be  available 
from  GSS  when  the  final  version  of 
OS/2  is  available  from  Microsoft  and 
IBM.  Reader  Service  No.  18. 

Graphics  Software  Systems  Inc. 

9590  S.W.  Gemini  Dr. 

Beaverton,  OR  97005 
(503)  641-2200 

Atron has announced Windows
Probe, a set of debugging tools for
developers  using  the  Microsoft  Win¬ 
dows  operating  environment.  The 
tools  permit  trapping  programs  that 
are  not  yet  in  memory  but  reside  on 
disk.  With  many  programs  operating 
simultaneously,  program  bugs  have 
a  far  greater  chance  of  corrupting 
other  programs  as  well  as  them¬ 
selves.  Windows  Probe  tracks  the 
program  in  real  time  and  adjusts 
symbolic  and  source-level  debugging 
information  accordingly. 

Atron’s  AT  PROBE  contains  1  mega¬ 
byte  of  hidden  and  write-protected 
memory  that  stores  the  Windows 
Probe  debugger  software  and  the 
symbolic  and  source-level  debugging 
information  without  taking  memory 
space  in  the  lower  1  megabyte  of 
system  memory.  AT  PROBE’s  real¬ 
time  trace  feature  lets  programmers 
see  how  the  program  operated  in 
real  time. 

Windows  Probe  is  compatible 
with  the  C  compiler  that  is  part  of 
the  Windows  development  system. 
Programmers  can  display  and 
change  local  and  complex  data  vari¬ 
ables  as  well  as  do  source-level  de¬ 
bugging. 

Windows  Probe  costs  $495.  Reader 
Service  No.  19. 

Atron 

20665  Fourth  St. 

Saratoga,  CA  95070 
(408)  741-5900 

Foresight Resources Corp. has
recently released a version of its
Drafix 1 computer-aided design and
drafting program for the Atari ST. Drafix
1/Atari  ST  is  the  first  professional- 
quality  CAD  program  available  for 
the  Atari  520ST  and  1040ST  comput¬ 
ers.  The  program  is  menu-driven 
with  available  commands  displayed. 
It  includes  object  drawing  and  edit¬ 
ing,  snap  grid  and  object  snap  draw¬ 
ing  aids,  multiple  fonts,  crosshatch¬ 
ing,  an  automatic  dimensioning 
system,  and  symbol  library  manage¬ 
ment. 

Drafix 1/Atari ST sells for $195.
Reader  Service  No.  20. 

Foresight  Resources  Corp. 

932  Massachusetts 
Lawrence,  KS  66044 
(913)  841-1121 




Raster  Technologies  has  intro¬ 
duced  two  products  in  a  line  of 
GX4000  parallel-processing  graphics 
accelerators  for  use  with  Sun  Mi¬ 
crosystems’  workstations.  Raster's 
GX4330  and  GX4340  plug  directly 
into a Sun-3 or Sun-4 VME
backplane. They use a new parallel
architecture that Raster has developed
to execute the proposed ANSI PHIGS
and PHIGS+ standards at the
fastest possible rate.

The  systems  offer  a  24-bit,  true- 
color  display  and  the  ability  to 
create  and  edit  a  standards-based 
3-D display list structure. They
support Sun's X11/NeWS Window
System and can transform and draw
200,000 to 1,000,000 32-bit
floating-point 3-D vectors per second. The
architecture  consists  of  a  dual- 
ported  Display  List  Module  incorpo¬ 
rating  1-megabit,  static  column  RAM 
chips. 

Reader  Service  No.  21. 

Raster  Technologies  Inc. 

Two  Robbins  Rd. 

Westford,  MA  01886 
(617)  692-7900 


MacMemory  has  released  Turbo  SE, 
a  16-MHz  68000-based  accelerator 
board  for  the  Apple  Macintosh  SE 
that  offers  a  minimum  speed  in¬ 
crease  of  200  percent.  Users  also 
have  the  option  of  moving  the  Macin¬ 
tosh  SE  ROMs  to  the  Turbo  SE  board 
to  double  the  speed  of  all  ROM  op¬ 
erations.  This  increase  is  100  per¬ 
cent  compatible  with  existing  Mac 
applications. 

Turbo  SE  sells  for  $599.  Reader 
Service  No.  22. 

MacMemory  Inc. 

2480  N.  First  St. 

San  Jose,  CA  95131 
(408)  922-0140 

AJS Publishing has released
db/LIB, a database library that
features a set of 20 assembly-language
procedures that give Microsoft's
QuickBASIC (Version 2.0 or later) full
relational database management
capability and dBASE III standard file
compatibility for database, index,
and text files.

In addition to the library files, the
db/LIB software package contains
several executable data management
routines, including Browse, Copy
Structure, and List. Features include
the ability to search and index on
memo fields, direct access to
the keys in index files and to the
database file header, and
access to files down any
subdirectory path.

The  db/LIB  software  internally  man¬ 
ages  its  own  system  of  record  buffer¬ 
ing  and,  by  writing  records  to 
memory  buffers  rather  than  to  disk, 
makes  many  applications  run  appre¬ 
ciably  faster.  It  employs  dynamic 
string  allocation  for  user  variables 
returned  by  the  library,  which 
means  that  variables  do  not  require 
preallocation  of  string  space. 

Db/LIB  sells  for  $139.  Reader  Serv¬ 
ice  No.  23. 

AJS  Publishing 
P.O.  Box  379 

North  Hollywood,  CA  91603 
(818)  985-3383 

Sterling  Castle  Software  has  just 
released  Version  4.0  of  the  BlackStar 
C  Function  Library.  The  new  version 



supports  the  EGA,  allows  terminate- 
and-stay-resident  program  develop¬ 
ment,  and  has  six  new  serial  com¬ 
munication  functions.  The  TSR  func¬ 
tion  enables  the  development  of  pro¬ 
grams  that  can  pop  up  and  be  util¬ 
ized  while  another  program  is  run¬ 
ning.  The  library  supports  the  stan¬ 
dard  ANSI  language;  is  compatible 
with  Microsoft  C,  Version  3.0/4.0,  and 
Lattice  C,  Version  3.0;  and  is  adapt¬ 
able  to  other  versions. 

The product offers more than 300
functions, including device handlers
for screen, graphics, keyboard,
printer, and mouse. It also provides
interrupt, string, menu, date, time,
and system functions and uses
primitive functions written in
assembly language to improve speed
and memory usage. The library
includes complete source code and
small-, medium-, and large-memory
models.

Version  4.0  costs  $129.  Reader  Serv¬ 
ice  No.  24. 

Sterling  Castle  Software 

702  Washington  St.,  Ste.  174 

Marina  del  Rey,  CA  90292 
(213)  306-3020 

Silicon Beach Software has
introduced Super 3D for the Macintosh,
a product that offers 3-D graphics
modeling and animation for
professional engineering and graphic
arts applications. It was designed to
provide artists, architects, and
engineers with dynamic simulation
and modeling capabilities.

Modeling  tools  allow  users  to 
create  shaded  shapes  in  more  than 
16,000  colors,  3-D  animation  for  dy¬ 
namic  visualization  of  complex  struc¬ 
tures,  and  movie-camera-like  tools 
for  precise  visual  control.  The  tool 
palette  offers  ten  basic  tools  for  cre¬ 
ating  graphics  primitives  such  as 
points, lines, arcs, circles, ovals,
rectangles, and polygons. Objects can
revolve about any axis or they can
be extruded, translated, scaled,
flipped, replicated, and mirrored or
arbitrarily reshaped.

Super  3D  is  priced  at  $295.  Reader 
Service  No.  25. 

Silicon Beach Software Inc.

P.O.  Box  261430 

San  Diego,  CA  92126 
(619)  695-6956 

NeXT and Adobe Systems Inc.
have announced a jointly developed
version  of  PostScript  for  workstation 
displays.  The  new  product,  Display 
PostScript,  will  be  independent  of 
windowing  systems  and  include  full 
support  for  outline  fonts,  arbitrary 
line-widths,  rotation,  and  color. 

According to Steve Jobs, "Display
PostScript allows us to achieve true
WYSIWYG from the display to the
printed page since the same
imaging model is now used for both."

No  price  has  been  announced  be¬ 
cause  the  product  is  still  under  de¬ 
velopment.  Adobe  has  scheduled  a 
demonstration  of  the  product  for 
the  summer  of  1988.  Circle  Reader 
Service  No.  26. 

NeXT,  Inc. 

3475  Deer  Creek  Road 

Palo  Alto,  CA  94304 
(415)  424-0200 

DDJ 



FORUM 


SWAINE'S  FLAMES 


What  do  the  following  have  in 
common:  binary  addition,  sub¬ 
traction,  multiplication,  and  division; 
sorting;  graph  connectivity;  multipli¬ 
cation  of  matrices  and  finding  their 
inverses,  determinants,  and  ranks; 
polynomial greatest common
divisors; and context-free languages?

Answer:  they  all  belong  to  a  class 
of  problems  denoted  as  NC,  short 
for  Nick’s  class,  after  one  Nicholas 
Pippenger  who  gave  the  class  formal 
definition  some  eight  years  ago  in  a 
paper called "On Simultaneous
Resource Bounds" in the Proceedings
of the 20th IEEE Symposium on the
Foundations of Computer Science
(IEEE Computer Society, Los Angeles,
1979). The problems that gain
admittance  to  Nick’s  class  are  just 
those  problems  that  can  be  solved 
substantially  faster  and  more  effi¬ 
ciently  using  many  processors 
rather than one. They are the
problems that will drive the growth
and spread of parallel processing
technology.
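To make the flavor of the class concrete: NC is usually defined as the problems solvable in time polylogarithmic in the input size using a polynomial number of processors. Summing n numbers is a small example, since the additions can be paired off so that only about log2(n) rounds are needed. The Python sketch below simulates those rounds on one processor; the function name and the round counting are illustrative assumptions, not anything taken from Pippenger's paper.

```python
def parallel_sum(values):
    """Sum by pairwise combination, counting the rounds.

    Within a round every pair is independent, so with enough processors
    each round would take a single step.
    """
    level = list(values)
    rounds = 0
    while len(level) > 1:
        pairs = [level[i:i + 2] for i in range(0, len(level), 2)]
        level = [sum(pair) for pair in pairs]   # all pairs at once, in principle
        rounds += 1
    return (level[0] if level else 0), rounds


total, rounds = parallel_sum(range(1, 1025))
print(total, rounds)
```

A single processor needs 1,023 additions for 1,024 numbers; the paired scheme needs the same number of additions but only ten rounds.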

Progress  on  parallel  processing  ar¬ 
chitectures  seems  to  be  gaining  mo¬ 
mentum.  Some  of  the  approaches 
include  investigating  problems  such 
as  the  Byzantine  Generals  problem, 
in  which  processors  have  to  reach 
an  agreement  through  message  pass¬ 
ing  even  though  some  of  the  mes¬ 
sages  are  unreliable,  and  research 
on  message  passing  in  a  sparse  net¬ 
work  of  processors.  And  up  in  the 
hills  behind  the  Berkeley  campus  of 
the  University  of  California,  research¬ 
ers  at  the  Mathematical  Sciences  Re¬ 
search  Institute,  the  “Camelot  of  com¬ 
plexity  theory,”  have  developed  sig¬ 
nificant  new  parallel  algorithms  in 
recent  years. 

The Fundamental Algorist, Donald
Knuth, tells of a visit to the
University of Chicago some years ago.
Entering a certain building, he
encountered two signs. One said
"Information Science" and had an
arrow pointing to the right; the other
said "Information" and pointed to the left.
A  real-life  cartoon  of  the  schism  be¬ 
tween  science  and  practice,  or,  as 
Knuth  sees  it,  between  science  and 
art. 

Knuth  has  argued  for  years  that 
computer  programming  should  be 
viewed  more  as  an  art  and  that  the 
toolmakers  ought  to  make  tools  that 
are  a  pleasure  to  use.  Floating-point 
arithmetic  should  satisfy  simple 
mathematical  laws,  he  argues;  then 
using  it  for  serious  purposes  could 
be  pleasant.  He  even  thinks  that  JCL 
can  be  beautiful. 

Beauty  is  in  the  eye  of  the  be¬ 
holder.  Is  reverse  Polish  notation 
beautiful?  How  about  those  Unix  pro¬ 
gram  names?  Awk!  To  some,  beauty 
can  be  structural  elegance,  as  in 
Bach's "Toccata and Fugue in D
Minor";  to  others,  it’s  a  clever  hack, 
like  Tom  Lehrer’s  "Poisoning  Pigeons 
in  the  Park.” 

Alan  Turing,  one  of  program¬ 
ming’s  patriarchs,  loved  a  clever 
hack.  Turing,  as  his  colleague  James 
Wilkinson  put  it,  "was  particularly 
fond  of  little  programming  tricks 
(some  would  say  he  was  too  fond  of 
them  to  be  a  ‘good’  programmer).” 
Because  the  world's  first  working 
program  on  an  electronic  stored- 
program  computer  ran  on  June  21, 
1948,  and  Turing  was  writing  pro¬ 
grams  for  the  machine  a  few  days 
later,  this  particular  aesthetic  dis¬ 
agreement  would  seem  to  have  deep 
roots. 

All of the above items were drawn
from a book that I recommend to
any programmer. It's ACM Turing
Award Lectures: The First Twenty
Years, 1966-1985 (Addison-Wesley,
1987). Credit where credit is due.

The  key  to  the  puzzle  from  last 
month  is  that  the  stated  conditions 
imply  that  only  Mickey  would  get 
any  of  the  loot.  That  column  was  a 
nostalgic  aberration,  by  the  way;  I 
once  did  puzzles  like  it  weekly,  en¬ 
couraged  by  Maggie  Canon,  then 
editor-in-chief  of  InfoWorld,  now  of 
Macintosh  Today.  Thanks,  Maggie. 
Credit  where  credit  is  due. 


Cousin  Corbett’s  Secrets  of  Soft¬ 
ware  Success,  Part  VII:  The  Rule  of 
Point  One.  This  is  a  corollary  to  an 
earlier  Secret,  but  is  important 
enough  to  merit  a  Part  number  of  its 
own. The rule is: use Version i.0 of
your product to debug the changes
since Version i-1. When the product
is relatively stable again, release it as
Version i.1. The user version of this
rule is: don't buy anything until Version
i.1 is released. The authorship
of  this  rule  is  in  dispute.  A  PC  Week 
author  may  already  have  published 
the  rule,  but  he  borrowed  it  from 
our  editor,  Tyler  Sperry,  who  claims 
to  have  invented  it.  My  cousin 
Corbett,  on  the  other  hand,  swears 
that  it’s  his.  All  I  know  for  sure  is 
that  the  inspiration  came  from  Mi¬ 
crosoft.  Credit  where  credit  is  due. 




Michael  Swaine 
editor-in-chief 




Dr.  Dobbs  Journal,  November  1987 



#134  DECEMBER  1987 


$2.95 ($3.95 CANADA)


Dr. Dobb's Journal of

Software  Tools 

FOR  THE  PROFESSIONAL  PROGRAMMER 


Making  Sense 
of  Operating 
Systems 

Dynamic  Linking  in  OS/2 
ROMing  C  Code 
Turbo  C  Graphics 
Languages: 

C,  Assembler,  Forth 


System  v 


DECEMBER  1987 


CONTENTS 


VOLUME  12,  ISSUE  12 


OS/2
RAM-Caching
EXE to ROM
Faster Roots
Text Graphics


Turbo C
Forth


ARTICLES 


Dynamic Linking in OS/2 18

by  D.  E.  Cortesi 

Dave  explains  one  of  the  less  explored  aspects  of  OS/2— the 
built-in  hooks  for  third-party  extensions. 

A  RAM-Cache  Manager  in  C  30 

by  Alan  Deikman 

Alan  provides  a  set  of  caching  routines  for  optimizing  program 
performance. 

Putting  ROM  Code  in  its  Place  38 

by  Rick  Naro 

Rick’s  LOCATE  utility  lets  you  move  code  from  DOS  .EXE  files 
to  ROM  environments. 

Integers  Don’t  Float  48 

by  Ray  Mariella 

Ray  investigates  the  tradeoff  between  speed  and  precision  in 
some  algorithms  used  for  calculating  square  roots. 

A  Graphics  Toolbox  for  Turbo  C — Part  II  54 

by  Kent  Porter 

Kent  builds  upon  the  library  he  offered  last  issue.  This  time 
the  subject  is  text  graphics:  items  like  menu  bars,  pop-up 
windows,  and  pull-down  menus. 


COLUMNS 


C CHEST 126

by Allen Holub

Allen  presents  a  multitasking  kernel  that  allows  a  program  to 
run  several  subroutines  as  independent  tasks. 

THE  FORTH  COLUMN  144 

by  Martin  Tracy 

News  from  the  August  ANSI  Forth  meeting  as  well  as  a  set  of 
Forth-83  words  to  extend  string  handling  capabilities. 


FORUM 


EDITORIAL  6 

by  Tyler  Sperry 

RUNNING  LIGHT  8 

by  Tyler  Sperry 

ARCHIVES  8 

LETTERS 10

by  you 


PROGRAMMER'S 

SERVICES 


ADVERTISER  INDEX:  153 

Where  to  go  for  more  infor¬ 
mation  on  products. 

OF  INTEREST  154 

Products for programmers

SWAINE'S FLAMES 160

by  Michael  Swaine 


About  the  Cover 

We’d  like  to  be  able  to  tell  you 
that  you  could  avoid  all  this  OS 
documentation  just  by  reading 
this  single  issue  of  DDJ.  We’d 
like  to,  but  we  can’t.  The  new 
trend  in  operating  systems 
seems  to  be  an  entry  require¬ 
ment  of  at  least  six  months’ 
study. 


This  Issue 

It’s  the  return  of  DDJ’s  infa¬ 
mous  annual  Operating  Sys¬ 
tems  issue.  So  why  are  you 
reading  this  instead  of  the 
articles  listed  on  the  left? 


Next  Issue 

January  is  DDJ’s  annual  68K 
issue,  and  we’re  starting  the 
new  year  off  right  with  a  new 
column  devoted  to  advanced 
Mac  hacking,  er,  programming. 
Plus  a  couple  of  other  sur¬ 
prises. 


Dr.  Dobb’s  Journal,  December  1987 



FORUM 


EDITORIAL 


DOS Ex Machina


Any  month  now,  we’ll  be  greeted 
with  the  official  debut  of  Micro¬ 
soft’s  OS/2.  It  is,  if  you’ll  believe  Bill 
Gates’  patter,  the  operating  system 
we’ve  all  been  waiting  for.  At  long 
last,  we'll  have  multitasking  and 
large  program  support.  So  why 
aren’t  we  all  more  enthusiastic 
about  it?  Why  is  it  that,  as  we  get 
closer  and  closer  to  actually  having 
OS/2,  the  alternatives  begin  to  look 
so  much  better? 

Some  of  this  may  be  the  result  of 
the  prolonged  vaporware  period. 
And  some  of  it  is  no  doubt  the 
skepticism  we  normally  feel  at  the 
introduction  of  any  new  software. 
In  the  case  of  OS/2,  I  think  it’s  fair  to 
say  that  the  critics  have  had  plenty 
to  work  with.  There  are  all  those 
minor  technical  points  you’ve  heard 
before:  it’s  not  thoroughly  debugged 
yet;  it's  too  fat;  it’s  too  slow;  it's  too 
expensive.  Well,  yes,  true  enough. 
But  I’d  like  to  remind  everyone  of  a 
problem  that  dwarfs  all  those  above. 
Simply  put,  OS/2  is  too  late. 

Note  that  I’m  not  talking  about 
delays  in  shipping  the  product.  Mi¬ 
crosoft  has  been  pretty  good  on  meet¬ 
ing  the  ship  dates  for  its  OS/2  Devel¬ 
oper  Kits.  No,  what  I’m  talking  about 
here  is  a  critical  marketing  problem 
for  OS/2:  it’s  a  dynamite  product  for 
a  market  that  existed  a  couple  of 
years  ago. 

The  problem,  as  you’ve  no  doubt 
noticed,  is  that  the  world  just  didn’t 
stop  moving  when  the  80286  came 
along. As I write this, just weeks
before Fall Comdex, there are several
companies offering 80386 motherboard
upgrades for XT owners.
Intel is scheduled to announce their
new Inboard 386/PC designed to fit
into a normal PC. No toggle
switches, no software tweaking. You
just drop it in and your PC is an
order of magnitude faster. (The RAM
on the 386 card is well-used: some
for faster ROM BIOS access, and
some for caching the hard disk.)
And this for a price cheaper than
you're used to seeing AT-clones sell
for.

So, given that the 386 revolution is
well under way, you have to wonder
about the leadership demonstrated
in Microsoft's introducing a 286-bound
operating system when everyone
is gearing up for the 386 revolution.
Probably the most telling indictment
of Microsoft's strategy is
the  market’s  reaction  to  the  intro¬ 
duction  of  its  own  Windows/386.  Fore¬ 
casting  the  success  and  failure  of 
software  is  always  a  tricky  matter, 
but  it  seems  reasonable  to  presume 
that  Windows/386  will  be  vastly  more 
successful  than  OS/2  over  the  short 
run.  Indeed,  the  release  of  Windows/ 
386  before  OS/2  with  its  Presentation 
Manager  could  be  seen  as  Micro¬ 
soft’s  tacit  admission  that  an  OS/2 
tomorrow  is  no  match  for  a 
DesqView  today. 

I  mention  all  this  because  in  the 
months to come, DDJ will be spending
a great deal of space exploring
the 386 universe. We'll be investigating
a variety of operating systems
and environments. And we'll be covering
OS/2, of course. It's just sad to
realize  that  our  OS/2  coverage,  even 
as  it  begins,  may  be  as  dated  in 
another  year  as  CP/M  Plus  coverage 
is  today. 


Dr. Dobb's Journal of

Software  Tools 

FOR  THE  PROFESSIONAL  PROGRAMMER 


Editorial 

Editor-in-Chief  Michael  Swaine 
Editor  Tyler  Sperry 
Managing  Editor  Vince  Leone 
Associate  Editor  Ron  Copeland 
Assistant  Editor  Sara  Noah  Ruddy 
Technical  Editors  Allen  Holub 

Richard  Relph 

Contributing  Editors  Kent  Porter 

Namir  Shammas 
Ernest  R.  Tello 

Copy  Editor  Rhoda  Simmons 

Production 

Director  Art/Production  Larry  L.  Clay 

Art  Director  Michael  Hollister 
Assoc. Art Director Joe Sikoryak
Technical  Illustrator  Barbara  Mautz 
Typesetter  Mary  E.  Lopez 
Cover  Photographer  Michael  Carr 
Circulation 

Circulation  Director  Maureen  Kaminski 
Fulfillment  Coordinator  Francesca  Martin 
Book  Marketing  Mgr.  Jane  Sharninghouse 
Subscription  Supervisor  Kathleen  Shay 
Newsstand  Sales 

Coordinator  Larry  Hupman 
Administration 
Finance  Director  Kate  Wheat 
Business  Manager  Betty  Trickett 
Accounts Payable Supv. Mayda Lopez-Quintana
Accts.  Receivable  Supv.  Laura  DiLazzaro 
Advertising  Director 
Ferris  Ferdon  (415)  366-3600 
Marketing  Mgr.  Michael  Wiener 
Trafficking  Coordinator  Patricia  Albert 
Account  Managers  see  page  129 
Associate  Publisher 
Michael  Swaine 
Assistant  Sara  Noah  Ruddy 

Dr. Dobb's Journal of Software Tools (USPS 307690)
is published monthly by M&T Publishing Inc., 501
Galveston Dr., Redwood City, CA 94063; (415) 366-3600.
Second-class postage paid at Redwood City and at
additional entry points. DDJ is published under license
from People's Computer Company, 2682 Bishop Dr.,
Suite 107, San Ramon, CA 94583, a nonprofit corporation.

Article Submissions: Send manuscripts and disk (with
article and listings) to the Associate Editor.

DDJ on CompuServe: Type GO DDJ

Address Correction Request: Postmaster: Send Form
3579 to Dr. Dobb's Journal, P.O. Box 27809, San Diego,
CA 92128. ISSN 0888-3076

Customer Service: For subscription problems call:
outside CA (800) 321-3333; in CA (619) 485-9623 or
566-6947. For book/software order problems call (415)
366-3600.

Subscriptions: $29.97 per 1 year; $56.97 for 2 years.
Canada and Mexico add $27 per year airmail or $10 per
year surface. All other countries add $27 per year
airmail. Foreign subscriptions must be prepaid in U.S.
funds drawn on a U.S. bank. For foreign subscriptions,
TELEX: 752-351.

Foreign Newsstand Distributor: Worldwide Media
Service Inc., 386 Park Ave. South, New York, NY 10016;
(212) 686-1520 TELEX 620430 (WUI).

Entire contents copyright © 1987 by M&T
Publishing, Inc., unless otherwise noted
on specific articles. All rights reserved.


M&T  Publishing  Inc. 

Chairman  of  the  Board  Otmar  Weber 
Director  C.F.  von  Quadt 
President and Publisher Laird Foshay




ARCHIVES 


RUNNING  LIGHT 


It’s  been  a  crazy 
month,  no  doubt 
about  it.  I'd  tell  you  all 
about  it,  except  you'd 
probably  be  bored 
with  the  details,  and 
as  the  old  joke  goes,  I 
don’t  want  to  talk 
about  it.  Suffice  to  say 
that  this  month  has 
seen  the  departure  of 
Vince  Leone,  our  long- 
suffering  Managing  Editor,  and  the 
usual  mad  rush  to  get  the  magazine 
out  the  door  has  been  unusually 
mad. 

This month has also seen an abnormal
mix of last-minute code
changes and too little space in the
magazine. The code for Rick Naro's
LOCATE utility, for example, was updated
for MASM version 5.0, but unfortunately
we've had to continue
the listings into next month. That
code will be joined by some last-minute
additions to the articles by Dave
Cortesi and Martin Tracy. Those of you
who use CompuServe won't be affected
by most of this madness since we'll
be updating the code before it goes
online. And before it goes onto the
listings disk.

One last note before I go: the gremlins
obliterated my CompuServe
number last month, but you can
still reach me as 76703,4266. If you're
doing neat things in object-oriented
programming or AI, now's the time
to pitch me an article.


Tyler  Sperry 
editor 


STATEMENT  OF  OWNERSHIP,  MANAGEMENT,  AND  CIRCULATION 

(Act  of  August  12,  1970,  Section  3685,  Title  39,  United  States  Code) 


1. Title of Publication: Dr. Dobb's Journal of Software
Tools, Publication No. 08883076.

2. Date of Filing: October 10, 1987.

3. Frequency of Issue: Monthly (12 issues, $29.97).

4.  Location  of  Known  Office  of  Publication:  501 
Galveston  Dr.,  Redwood  City,  CA  94063. 

5.  Location  of  Headquarters  of  General  Business  Of¬ 
fices  of  the  Publishers:  501  Galveston  Dr.,  Red¬ 
wood  City,  CA  94063. 

6.  Names  and  Addresses  of  Publisher,  Editor,  and 
Managing  Editor:  Publisher,  Laird  Foshay,  501 
Galveston  Dr.,  Redwood  City,  CA  94063.  Editor, 
Michael  Swaine,  501  Galveston  Dr.,  Redwood  City, 
CA  94063.  Managing  Editor,  Vince  Leone,  501 
Galveston  Dr.,  Redwood  City,  CA  94063. 

7.  Owner:  M&T  Publishing,  501  Galveston  Dr.,  Red¬ 
wood  City,  CA  94063. 

8. Known Bondholders, Mortgages, and Other Security
Holders Owning or Holding 1 Percent or More
of Total Amount of Bonds, Mortgages or Other
Securities: Markt & Technik, Hans-Pinsel-Strasse
2, 8013 Haar bei Munich, W. Germany.

9.  Extent  and  Nature  of  Circulation: 

A.  Total  Number  of  copies  printed.  Average 
number  of  copies  each  issue  during  the  pre¬ 
ceding  12  months:  70,701.  Actual  number  of 
copies  of  single  issue  published  nearest  to 
filing  date:  76,069. 

B. Paid Circulation. 1. Sales through dealers and
carriers, street vendors, and counter sales. Average
number of copies each issue during
preceding 12 months: 14,727. Actual number
of copies of single issue published nearest to
filing date: 17,258. 2. Mail subscriptions. Average
number of copies each issue during preceding
12 months: 41,486. Actual number of
copies of single issue published nearest to
filing date: 42,697.

C.  Total  Paid  Circulation.  Average  number  of 
copies  each  issue  during  preceding  12  months: 
56,213.  Actual  number  of  copies  of  single  issue 
published  nearest  to  filing  date:  59,955. 

D. Free distribution by mail, carrier, or other
means, samples, complimentary, and other
free copies. Average number of copies each
issue during preceding 12 months: 2,266.
Actual number of copies of single issue published
nearest to filing date: 1,900.

E.  Total  distribution.  Average  number  of  copies 
each  issue  during  preceding  12  months:  58,479. 
Actual  number  of  copies  of  single  issue  pub¬ 
lished  nearest  to  the  filing  date:  61,855. 

F. Copies not distributed. 1. Office use, left over,
unaccounted, spoiled after printing. Average
number of copies each issue during preceding
12 months: 650. Actual number of copies of
single issue published nearest to filing date:
655. 2. Returns from News Agents. Average
number of copies each issue during preceding
12 months: 11,572. Actual number of copies of
single issue published nearest to filing date:
13,559.

G.  Total.  Average  number  of  copies  each  issue 
during  preceding  12  months:  70,701.  Actual 
number  of  copies  of  single  issue  published 
nearest  to  filing  date:  76,069. 

I certify that the statements made by me above
are correct and complete.


Laird  Foshay,  Publisher 


Ten  Years  Ago  in  DDJ 

“DALLAS,  Texas — The  first  flexible  disk 
drive for 5¼" diskettes to offer double
density  recording  of  250,000  bytes  on 
each  side  of  a  diskette,  was  introduced  at 
the  National  Computer  Conference  here 
today  by  the  Pertec  Division  of  Pertec 
Computer  Corporation.”  News  release 
received  Jun  18,  1977,  DDJ,  November/ 
December  1977. 

Kicking  the  8080  Habit 

"Maybe  it's  just  my  imagination,  but  it 
seems  that  a  lot  of  people  aren’t  utilizing 
the  Z-80  to  its  fullest.  Everyone  is  so  used 
to  writing  code  for  the  8080  that  they 
don't  seem  to  bother  upgrading  their 
software  when  they  upgrade  their  CPU. . . 

“I  would  like  to  see  you  guys. . .  explain 
all  the  nifty  Z-80  tricks.  I  know  I  can’t  be 
the  only  one  that  is  stuck  in  the  rut  of 
8080  code.  (Please!!  Don't  tell  me  I 
swapped  my  CPU  board  JUST  for  speed — 
the  software  potential  is  fantastic.)’’ 
Letters  to  the  Editor,  DDJ,  November/ 
December  1977. 

Exec  with  Extreme  Prejudice 

“In  a  multi-tasking  environment  such 
as  we  expect  to  see  in  MS-DOS  3.0,  this 
function  (function  4BH  -  EXEC)  will  be 
even  more  useful  and  will  undoubtedly 
be  elaborated  with  several  additional 
features.  Under  such  an  operating  system, 
a  parent  task  can  ‘spawn’  any  number  of 
child  tasks,  which  can  execute  concur¬ 
rently  and  asynchronously,  and  commu¬ 
nicate  by  means  of  queues,  semaphores, 
and  pipes. 

"Well,  you  say,  pie  in  the  sky  is  all  very 
nice,  but  why  is  the  EXEC  function  getting 
so  much  attention  in  this  magazine 
column?  The  answer  is,  of  course,  that 
when  I  tried  to  actually  use  the  function 
I  ran  into  any  number  of  glitches  and 
hazy spots in the documentation." Ray
Duncan, "16-bit Software Toolbox," DDJ,
December 1983.

Encryption 

“A  small  but  vital  piece  of  hardware 
containing  a  microelectronic  chip  only  1 
cm  square  has  been  tested  and  validated 
at  the  Commerce  Department’s  National 
Bureau  of  Standards  (NBS) — marking  the 
first  NBS  validation  of  a  commercial  imple¬ 
mentation  of  the  Federal  Data  Encryption 
Standard  published  early  this  year.”  News 
release  received  October  31,  1977,  DDJ, 
November/December  1977. 


Dr. Dobb's Journal of
COMPUTER
Calisthenics & Orthodontia

Running Light Without Overbyte




FORUM 


LETTERS 


Stone  Age  Software 

Dear  DDJ, 

Thank  you  for  an  excellent  editorial 
in  the  September  1987  issue.  The 
point  Tyler  Sperry  makes  about  the 
crippling  of  potentially  powerful  ma¬ 
chines  is  a  sore  point  with  me  also. 

I  work  for  a  large  computer  manu¬ 
facturer  and  have  seen  for  myself 
how  these  limiting  factors  get  incor¬ 
porated  into  machines,  not  only  in 
software  but  also  in  hardware.  I 
don’t  see  an  immediate  solution  but, 
like  you,  I’ll  keep  screaming  for  the 
liberation  of  these  systems  from 
stupid  design  flaws. 

Les  J.  Record 
1201  East  Mesa  Pk.  Dr. 

Round  Rock,  TX  78664 


Dear DDJ,

I don't think there is any
reason to complain about the
architecture of the 80286, at
least not about the inability
to switch back to real mode.
This is like building a V30-based
computer to run CP/M
Plus and then cursing at
those crazy 16-bit data and
I/O buses that make designing
a good 8-bit system such
a mess.

Remember, the 8086/8088
didn't even have an 8080 emulation
mode, so why blame
the designers of the 80286 for
including an 8086 emulation
mode? It is there just to give
an upward migration path,
not to be frantically switched
on and off.

Let's face it, it's not the
chip that is Stone Age but the
software. Switching back from protected
mode itself is Stone Age, not
the messy way it has to be done.
And AboveBoard emulators are a
Stone Age way of reducing a 16-megabyte
address space to an 8-megabyte
collection of memory
chunks -- and there are people
proud of that!

The cry for hardware compatibility --
which means hardware-dependent
software (which equals incompatible
software) -- has made the transition
from today's status quo far
more difficult than the change from
8-bit CP/M to 16-bit MS-DOS has
been. It's like using metal tools to
make better stone axes. It will take
true Bronze Age men to change
things.

Jost Riedel
Am Reservoir 2
P.O. Box 1141
D-3522 Bad Karlshafen 1
West Germany


[Cartoon caption: "Hurry up Jennings -- our very existence
may depend on it!"]


Recursive/Iterative Trade-Off

Dear DDJ,

I really enjoyed the article on backtracking
by Charles F. Bowman
(August 1987). His structure for the
puzzle-solving program, using recursive
techniques, is nice and simple.

I just have one quibble with the
article: In his discussion on ways to
speed up the process, Bowman
apologizes for his use of a recursive
approach and states the conventional
wisdom, "Recursive procedures
are costly because of the considerable
amount of overhead required
for each successive call. Your
program must save registers, store a
return address, allocate local storage,
and so on."

That statement surely sounds plausible.
I've accepted it on faith for
years. It was probably even true on
most of the old mainframe computers,
which is probably the context
in which both Bowman and I heard
it. But it may not be true in the
context of microcomputers. All current
microprocessors support very
fast subroutine call/return mechanisms
as well as stack push/pops.
Because of these operations, sometimes
recursive is better in all ways.

To test the truth of the recursive/
iterative trade-off, I wrote an eight-queens
problem in both forms (the
recursive version is shown in Example
1, page 14). The program was
written in Turbo Pascal for a
PC clone. Much to my pleasant
surprise, the recursive version
turned out to be a full
40 percent faster than the nonrecursive
form. It was also
smaller, of course.

Another concern often
voiced about recursive approaches
is that of limited
stack space. The idea is that,
if your program has to go
many levels of recursion, it
may crash by overflowing the
stack. To test that hypothesis,
I ran the program shown
in Example 2, page 14, again
using Turbo Pascal. The program
did indeed crash (gracefully)
but at a level of more
than 5,400 layers of recursion.
That's 5,400 successive subroutine
calls, folks! That
should be enough for most
of us!

So it appears that the conventional
wisdom "recursion




LETTERS 

(continued  from  page  12) 


{ A skeleton program for the eight-queens
  problem. By altering the dimensions and the
  routines fit, place, and unplace, it also
  serves as a model for any other solution
  involving backtracking. The program assumes a
  data array q[0..maxcount] containing the
  values to be adjusted (row on the chessboard
  in the case of the eight queens). Three
  procedures are assumed:

  Function fit(l) compares the position of the
    queen q[l] with all pieces placed so far,
    and returns TRUE if there is no conflict.
  Procedure place(l) records queen q[l] on the
    board.
  Procedure unplace(l) removes queen q[l].
}

procedure Try(l: integer);
begin
  for q[l] := 0 to maxcount do begin
    if fit(l) then begin
      place(l);
      if l = max then ShowResult
      else Try(l + 1);
      unplace(l);
    end;
  end;
end;

{ Main Program }
begin
  initialize;
  Try(0);
end.


Example  1:  Eight  queens  problem  in  Pascal 
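For readers who work in C rather than Pascal, the same backtracking skeleton can be sketched roughly as follows. This is an illustration, not the letter writer's program: the names count_queens and try_level, the MAXN limit, and the choice to count solutions instead of printing them are all assumptions. Because fit() inspects the array q directly, place and unplace have nothing left to do and are omitted.

```c
#include <assert.h>
#include <stdlib.h>

#define MAXN 12            /* largest board this sketch accepts (assumed) */

static int q[MAXN];        /* q[l] = row of the queen in column l */

/* Returns 1 if queen l conflicts with none of queens 0..l-1. */
static int fit(int l)
{
    int i;

    for (i = 0; i < l; i++) {
        /* same row, or same diagonal */
        if (q[i] == q[l] || abs(q[i] - q[l]) == l - i)
            return 0;
    }
    return 1;
}

/* Counts the ways to complete the placement already made in
   columns 0..l-1; the recursion mirrors Try in Example 1. */
static long try_level(int l, int n)
{
    long found = 0;

    for (q[l] = 0; q[l] < n; q[l]++) {
        if (fit(l))
            found += (l == n - 1) ? 1 : try_level(l + 1, n);
    }
    return found;
}

/* Number of solutions to the n-queens problem. */
long count_queens(int n)
{
    assert(n >= 1 && n <= MAXN);
    return try_level(0, n);
}
```

Calling count_queens(8) walks the full search tree for the standard chessboard; the recursion never goes deeper than n levels, so stack use is not a concern for this particular problem.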


program test;

procedure bump(n: integer);
begin
  writeln(n);
  bump(n + 1);
end;

{ Main Program }
begin
  bump(0);
end.


Example  2:  Program  to  test  levels  of  procedure  nesting 
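A bounded variant of the same experiment can be sketched in C. Unlike Example 2, this version stops at a depth the caller chooses instead of running until the stack overflows, because the overflow behavior of a C program depends on the compiler and system. The function name nest is an assumption made for the illustration.

```c
#include <assert.h>

/* Makes `levels` nested calls and returns how many were made.
   The addition after the call keeps each level's frame alive
   until the levels below it have returned. */
long nest(long levels)
{
    assert(levels >= 0);
    if (levels == 0)
        return 0;
    return 1 + nest(levels - 1);
}
```

Passing 5400 reproduces the depth reported in the letter; how much deeper a given system will go is something to measure, not assume.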


Bytes  8088 Clocks          Instruction

  3    12+ea = 12+9 = 21    mov  ax,word ptr [bp].value
  2    2                    mov  bx,ax
  3    12+ea = 12+9 = 21    mov  ax,word ptr [bp].value[2]
  8    44                   (total)

  3    12+ea = 12+9 = 21    mov  ax,word ptr [bp].value[2]
  3    12+ea = 12+9 = 21    mov  bx,word ptr [bp].value
  6    42                   (total)

  3    24+ea = 24+9 = 33    les  bx,[bp].value
  2    2                    mov  ax,es
  5    35                   (total)

Table 1: Timings for Example 6, page 26, July 1987 DDJ


is  costly"  needs  to  be  put  to  rest.  In 
the  modern  world  of  micros,  the 
simplest  and  most  elegant  solution 
may  also  be  the  smallest  and  fastest. 
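For comparison, here is one way the nonrecursive form of the eight-queens search might look in C. It is a sketch under assumptions: the letter does not show its iterative version, so the explicit level counter, the names count_queens_iterative and conflict_free, and the MAXN limit are all invented here. No timing claim is attached; the 40 percent figure above applies to the writer's Turbo Pascal programs on a PC clone.

```c
#include <assert.h>
#include <stdlib.h>

#define MAXN 12            /* largest board this sketch accepts (assumed) */

/* Returns 1 if queen l conflicts with none of queens 0..l-1. */
static int conflict_free(const int *q, int l)
{
    int i;

    for (i = 0; i < l; i++) {
        if (q[i] == q[l] || abs(q[i] - q[l]) == l - i)
            return 0;
    }
    return 1;
}

/* Counts n-queens solutions with an explicit level counter
   standing in for the call stack. */
long count_queens_iterative(int n)
{
    int q[MAXN];
    int l = 0;             /* column currently being filled */
    long found = 0;

    assert(n >= 1 && n <= MAXN);
    q[0] = -1;
    while (l >= 0) {
        /* advance to the next row in column l that fits */
        do {
            q[l]++;
        } while (q[l] < n && !conflict_free(q, l));

        if (q[l] >= n)
            l--;                   /* column exhausted: backtrack */
        else if (l == n - 1)
            found++;               /* complete placement */
        else
            q[++l] = -1;           /* descend to the next column */
    }
    return found;
}
```

The bookkeeping that the recursive version gets for free (which column is active, where to resume after backing up) has to be spelled out by hand here, which is the size difference the letter alludes to.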

As  an  aside,  the  issue  of  fast  call/ 
return  mechanisms  should  also 
cause  us  to  take  another  hard  look 
at  assembly-language  programming. 
The  conventional  picture  of  an  as¬ 
sembly-language  program  is  that  of 
one  long  string  of  in-line  code  and 
macros.  Many  programmers  tend  to 
think  this  is  necessary  to  achieve 
the  speed  you  expect  of  a  native- 
language  program.  But  because  of 
the  support  provided  by  the  micro 
chips,  the  trade-offs  of  modularity 
vs.  speed  favor  the  former  in  assem¬ 
bly  language,  even  more  than  in  a 
higher-order  language. 

DDJ  Forum  User 

Instruction  Timings 

Dear  DDJ, 

A  reader  called  to  point  out  an  error 
in  my  article  "8088  Assembly-Lan¬ 
guage  Programming  Techniques" 
(July  1987).  It  seems  that  the  early 
versions  of  Intel’s  instruction  timing 
tables  implied  that  no  cycles  are 
used  for  effective  address  calcula¬ 
tion  when  the  AX  register  is  used  for 
moves  to  and  from  memory.  In  fact, 
this  is  the  case  only  when  using 
direct  memory  reference.  For  all 
other  cases,  AX  is  treated  in  the 
same  way  as  any  other  register. 

Table  1,  left,  gives  the  timings  for 
my  Example  6,  on  page  26  of  the 
July  issue.  There  is  no  doubt  that 
the third method is best for real
mode.  Protected  mode  on  the  80286, 
however,  is  another  story. 
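The totals in Table 1 can be checked with a few lines of C. The clock counts below are the ones printed in the table (12 clocks plus effective-address time for a word load, 24 plus effective-address time for LES, 2 for a register move, and 9 clocks for a base-plus-displacement address); the function and constant names are assumptions made for this sketch.

```c
#include <assert.h>

enum {
    EA_BASE_DISP  = 9,     /* effective address for [bp] + displacement */
    MOV_REG_MEM16 = 12,    /* word load from memory, before adding EA   */
    MOV_REG_REG   = 2,     /* register-to-register move                 */
    LES_REG_MEM   = 24     /* far-pointer load, before adding EA        */
};

/* Total 8088 clocks for each of the three instruction sequences
   in Table 1, numbered 1 to 3 in the order they appear there. */
int clocks_for_method(int method)
{
    assert(method >= 1 && method <= 3);
    switch (method) {
    case 1:   /* mov ax,mem / mov bx,ax / mov ax,mem */
        return (MOV_REG_MEM16 + EA_BASE_DISP) + MOV_REG_REG
             + (MOV_REG_MEM16 + EA_BASE_DISP);
    case 2:   /* mov ax,mem / mov bx,mem */
        return 2 * (MOV_REG_MEM16 + EA_BASE_DISP);
    default:  /* les bx,mem / mov ax,es */
        return (LES_REG_MEM + EA_BASE_DISP) + MOV_REG_REG;
    }
}
```

The three results are 44, 42, and 35 clocks, matching the table's totals.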

Tom  Disque 

SAS  Institute  Inc. 

P.O.  Box  8000 

SAS  Circle 

Cary,  NC  27511-8000 

DDJ 




ARTICLES 


Dynamic Linking

in  OS/2 


No operating system is
likely  to  include  all  the 
functions  that  creative 
programmers  will  demand.  It 
appears  that  Microsoft,  bene¬ 
fiting  from  past  experience, 
has  purposely  designed  OS/2 
so  that  third-party  develop¬ 
ers  can  more  easily  add  func¬ 
tions  to  it.  Yet  to  maintain  an  acceptable  degree  of 
quality  assurance,  Microsoft  wanted  to  make  sure  that 
any  additions  would  fit  seamlessly  into  the  whole  system 
without  compromising  its  reliability.  This  article  is  a 
brief  survey  of  the  mechanisms  included  in  OS/2  to  allow 
this,  and  in  particular,  the  concept  of  dynamic  linking. 

Briefly  put,  a  dynamic  link  is  an  external  reference 
that  is  not  resolved  at  the  time  the  program  is  linked. 
Instead,  the  connection  to  the  external  routine  is  made 
either  while  the  program  is  being  loaded  or  sometimes 
even  later  while  it  is  executing. 
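The idea can be modeled in a few lines of C. This is a toy illustration of the concept only, not the OS/2 loader or its file format: the client carries a table of named but unresolved references, and a separate step, standing in for the loader, fills in the addresses from the names a package exports. Every identifier here (import_slot, export_entry, resolve_imports, GoodyDouble, GoodyNegate) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*entry_fn)(int);

/* One entry point exported by a (hypothetical) shared package. */
struct export_entry {
    const char *name;
    entry_fn    fn;
};

/* One external reference the linker left unresolved in the client. */
struct import_slot {
    const char *name;      /* symbolic name recorded at link time */
    entry_fn    fn;        /* NULL until the reference is resolved */
};

static int goody_double(int x) { return 2 * x; }
static int goody_negate(int x) { return -x; }

const struct export_entry package_exports[] = {
    { "GoodyDouble", goody_double },
    { "GoodyNegate", goody_negate },
};
const size_t package_export_count =
    sizeof package_exports / sizeof package_exports[0];

/* Resolves each slot against the export table by name.
   Returns the number of slots that could not be resolved. */
size_t resolve_imports(struct import_slot *slots, size_t nslots,
                       const struct export_entry *exports, size_t nexports)
{
    size_t i, j, unresolved = 0;

    assert(slots != NULL && exports != NULL);
    for (i = 0; i < nslots; i++) {
        slots[i].fn = NULL;
        for (j = 0; j < nexports; j++) {
            if (strcmp(slots[i].name, exports[j].name) == 0) {
                slots[i].fn = exports[j].fn;
                break;
            }
        }
        if (slots[i].fn == NULL)
            unresolved++;
    }
    return unresolved;
}
```

After resolve_imports runs, the client calls through slots[i].fn exactly as if the routine had been linked in statically; the only difference is when the address became known.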

Dynamic  linking  isn’t  a  novel  idea;  it  was  fundamental 
to  the  operation  of  the  Multics  system  (the  Intel  80286 
has  some  intriguing  similarities  to  the  GE  645,  the 
Multics  host)  and  readers  of  the  DDJ  Forum  on  Compu¬ 
Serve  have  told  me  that  the  Prime  operating  system 
PRIMOS  and  UCSD  Pascal  have  similar  features. 

Preparing  a  Dynalink  Library 

Here's how dynamic linking works in OS/2. You design
and build a package of code that will be useful to more
than one program. At first you store its object code in
an object library as usual and test it by linking it in the
usual way with the programs that use it. When your
package is in something like its final form, you run it
alone through the linker in a special pass. You also supply
a description file that tells the linker that this code will
be a dynamic link library, or a dynalink, as it's come to be
called. In the description file you specify the names of all
the entry points that will be publicly available in this
package. You may also specify attributes for individual
segments; I'll come back to those later.

David E. Cortesi, 415 Cambridge St., #18, Palo Alto, CA
94306. Dave is a former DDJ columnist.

The  linker  processes  the  object  code  much  as  it  does 
in  MS-DOS,  merging  segments  by  class  and  group, 
resolving  internal  references  between  segments,  adjust¬ 
ing  offsets  to  account  for  merged  segments.  One  step  of 
linking  in  OS/2  is  different  from  that  with  MS-DOS.  In 
MS-DOS,  the  linker’s  output  is  a  single,  monolithic 
binary  image  in  which  the  input  segments  have  lost 
their  individual  identity.  In  OS/2,  the  linker  keeps  seg¬ 
ments  separate  in  the  executable  file.  That’s  necessary 
so  that  the  system  loader  can  load  a  program  one 
segment  at  a  time,  as  it  must  do  in  order  to  build  a  local 
descriptor  table  (LDT)  for  the  80286  hardware. 
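To see why separate segments matter, consider what a loader has to produce for each one. The sketch below is a simplified model, not OS/2 code: it records only a limit and a type for each segment, whereas a real 80286 descriptor also holds a base address and access rights. The selector layout it uses is the 80286's own (table index in bits 3 to 15, table-indicator bit 2 set for the LDT, requested privilege level in bits 0 and 1). The structure names, LDT_MAX, and the use of privilege level 3 for application segments are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define LDT_MAX 16         /* table size for this sketch (assumed) */

/* One segment as it might appear in the executable file. */
struct segment {
    const char *name;
    uint32_t    length;    /* bytes, 1..65536 on the 80286 */
    int         is_code;
};

/* Simplified descriptor: a real one also has base and access bits. */
struct ldt_entry {
    uint16_t limit;        /* last valid offset = length - 1 */
    int      is_code;
    int      present;
};

struct ldt {
    struct ldt_entry e[LDT_MAX];
    int              used;
};

/* 80286 selector: index << 3, TI bit (0x4) selects the LDT, low 2 bits RPL. */
static uint16_t make_selector(int index, int rpl)
{
    return (uint16_t)((index << 3) | 0x4 | (rpl & 3));
}

/* Gives one segment its own descriptor and returns its selector. */
uint16_t load_segment(struct ldt *t, const struct segment *s)
{
    int idx;

    assert(t->used < LDT_MAX);
    assert(s->length >= 1 && s->length <= 65536UL);
    idx = t->used++;
    t->e[idx].limit   = (uint16_t)(s->length - 1);
    t->e[idx].is_code = s->is_code;
    t->e[idx].present = 1;
    return make_selector(idx, 3);
}
```

A monolithic image of the MS-DOS kind gives the loader nothing to iterate over; with the segments kept apart, each one gets its own descriptor and its own limit check.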

The processed segments of your package of subroutines
go into a file that has the same format as an OS/2
executable (.EXE) file, but by convention a dynalink
library has the file type .DLL. If your package is
THEGOODS, its code will be linked as THEGOODS.DLL.
That  is  all  that  needs  to  be  done  for  code  that  will  be 
bound  to  its  caller  very  late,  during  execution  time,  but 
that’s  a  rare  use. 
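Binding during execution amounts to asking for a routine by library name and entry name while the program is running. The C sketch below models that lookup with an in-memory table standing in for the libraries on disk. It does not use or imitate the actual OS/2 system calls; find_entry, the table layout, and the behavior given to GoodyA and GoodyB are all invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*entry_fn)(int);

struct entry  { const char *name; entry_fn fn; };
struct module { const char *file; const struct entry *entries; size_t count; };

/* Hypothetical behavior for two entry points of the package. */
static int goody_a(int x) { return x + 1; }
static int goody_b(int x) { return x * x; }

static const struct entry the_goods_entries[] = {
    { "GoodyA", goody_a },
    { "GoodyB", goody_b },
};

/* Stand-in for the set of dynalink libraries the system can find. */
static const struct module known_modules[] = {
    { "THEGOODS.DLL", the_goods_entries, 2 },
};

/* Looks up an entry point by library file name and entry name.
   Returns NULL if either name is unknown. */
entry_fn find_entry(const char *file, const char *name)
{
    size_t m, e;

    assert(file != NULL && name != NULL);
    for (m = 0; m < sizeof known_modules / sizeof known_modules[0]; m++) {
        if (strcmp(known_modules[m].file, file) != 0)
            continue;
        for (e = 0; e < known_modules[m].count; e++) {
            if (strcmp(known_modules[m].entries[e].name, name) == 0)
                return known_modules[m].entries[e].fn;
        }
    }
    return NULL;
}
```

Because the caller supplies both names as strings at run time, nothing about the library has to be known when the client is linked, which is why no further preparation step is needed for this style of use.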

Most  dynalink  libraries  need  to  be  known  to  their 
client  programs  at  link  time.  In  that  case,  one  more 
processing  step  has  to  be  done.  A  utility  called  IMPLIB 
reads  the  description  file  and  writes  an  artificial  object 
library  that  can  be  used  at  link  time.  Applied  to  the 


Built-in  facilities  for  the 
third-party  extension  of  OS/2 


18 


Dr.  Dobbs  Journal,  December  1987 

919 


sary  because  it  is  possible  to  tell  the  linker  to  mark  any 
segment  “load  on  demand.”  In  that  case,  the  OS/2  loader 
won’t  load  the  segment  but  will  only  set  up  the  local 
descriptor  table  to  cause  a  hardware  trap  if  the  segment 
is  referenced  by  an  instruction.  When  and  if  that  hap¬ 
pens,  the  segment  will  be  loaded. 

Once  it  has  storage  copies  of  the  program's  segments, 
the  loader  processes  dynamic  links.  The  linker  has  given 
it  the  character  string  file  name  of  the  dynalink  library 
and  the  names  of  the  entry  points  needed.  The  loader 
looks  up  the  file;  it  has  to  be  found  in  a  directory 
designated  in  the  system  configuration  file. 

The  system  may  not  have  to  load  the  segments  of 
dynalink  code.  Dynamically  linked  code  segments  are 
shared  among  all  clients  that  use  them,  so  if  CLIENT.EXE 
is  the  second  instance  of  a  program  that  uses 
THEGOODS,  no  disk  input  will  be  needed  for  code 
segments. 

Because  a  dynalink  library  has  precisely  the  same  file 
format  as  a  .EXE  file,  the  process  of  loading  dynalink 
code  is  really  just  an  extension  of  loading  a  program. 
The  only  extra  step  is  that  the  loader  has  to  fix  up  the 
external  references  in  the  client  code.  Because  the 
process  is  so  similar,  there’s  no  reason  why  code  in  one 
dynalink  library  shouldn't  call  code  in  another  one,  and 
that  in  a  third,  and  so  on. 

Benefits  of  Dynamic  Linking 

Some  of  the  benefits  of  dynamic  linking  are  clear  at 
once.  There’s  an  economy  of  disk  space:  whereas  under 
MS-DOS  every  client  program  contains  a  copy  of  the 
common  code,  OS/2  stores  only  a  single  copy  in  the 
dynalink  library.  There’s  economy  of  memory  because 
only  a  single  copy  of  common  code  is  kept  in  storage. 
There’s  economy  of  load  time  because  only  the  needed 
segments  are  brought  in  from  disk.  (What’s  faster  than  a 
disk  cache?  Not  doing  the  disk  input  at  all!) 


description  file  for  your  package,  this  produces  a  very 
small  file  named  THEGOODS.LIB. 


Using  a  Dynalink  Library 

Now  client  programs  can  begin  using  your  package. 
They  declare  its  entry  points  just  as  they  would  declare 
any  other  external  references,  whether  in  assembly 
language: 


extrn  GoodyA:far


or  in  Pascal: 


Procedure  GoodyB(c:char); 
external; 


or  in  C: 


extern  int  far  GoodyC( ); 


And  the  calls  to  the  package’s  procedures  are  written 
just  as  they  would  be  if  the  package  were  to  be  linked  in 
the  usual  way  (which  it  might  still  be  in  some  cases). 

When  a  client  program  is  linked,  the  import  library, 
THEGOODS.LIB,  is  one  of  the  libraries  input  to  the  link. 
The  object  records  in  it  are  special.  They  don’t  contain 
object  code;  they  only  tell  the  linker  the  name  of  the 
dynalink  library  (THEGOODS.DLL,  remember?)  and  the 
names  of  the  entry  points  that  it  exports  for  use.  The 
linker  writes  a  table  of  these  names  into  the  linked 
program  (let’s  say  it's  CLIENT.EXE)  and  pointers  to 
where  the  references  occur  in  the  linked  segments. 


Load  Time  Linking 

Eventually  CLIENT.EXE  will  be  loaded  for  execution.  The 
linked  segments  of  code  and  data  will  be  brought  in 
from  the  disk  file — if  necessary.  It  might  not  be  neces¬ 



OS/2  DYNAMIC  LINKING 


These  features  would  be  enough  to  justify  extensive 
use  of  dynamic  linking.  And  OS/2  does  use  it  extensively. 
The  whole  interface  to  the  operating  system  is  based  on 
it. All the OS/2 system functions are presented as external
procedures that programs declare and call. And all
of  those  system  procedures  are  defined  by  code  in 
dynalink  libraries  with  names  such  as  DOSCALLS.DLL, 
VIOCALLS.DLL,  and  MOUCALLS.DLL.  As  a  result  (and  in 
sharp  contrast  to  MS-DOS),  it  is  actually  easier  to  use 
OS/2  system  calls  from  a  high-level  language  than  it  is  to 
use  them  from  assembly  language. 

I  promised  this  would  be  a  discussion  of  system 
extensions; here’s where the connection is made. Because
the entire programming interface to OS/2 is
through dynalinks (and because dynalinks may themselves
call dynalinks), any new dynalink library is a
functional  extension  to  OS/2  on  an  equal  footing  with  the 
system  code  itself.  There  aren’t  any  special  interfaces, 
no  magic  incantations  that  only  gurus  may  use;  there 
isn’t the sharp distinction between, say, ordinary programs
and TSR programs that exists in MS-DOS. There
is just the hierarchy of functions available in dynalink
libraries, with the OS/2 kernel dynalinks at the base of
the pyramid.

And  there  is  no  lack  of  function,  either.  Don’t  suppose 
that,  because  a  system  extension  is  restricted  to  the 
same  facilities  that  any  program  might  use,  it  isn’t 
possible  to  write  interesting  system  extensions.  There 
are  a  few  limitations,  and  I’ll  mention  them  shortly.  But 
consider this: all the functional extensions to be supplied
in the IBM Extended Edition of OS/2 (database
facility and multiple communications protocols) as well
as the whole of the Microsoft Presentation Manager
(protect-mode Windows), all of this code will be supplied
as dynalink libraries. There won’t be any features
added  to  OS/2  at  a  kernel  level  to  support  them — they’re 
all  there  in  1.0  and  available  (to  high-level  languages, 
remember)  as  dynalink  calls.  So  if  you  repackage  your 
B-tree  access  method  as  a  dynalink  library,  it  will  use  the 
same  kernel  functions  and  be  exactly  as  accessible  to  its 
clients  as  IBM’s  database  facility  is  to  its. 

Let’s explore some of the subtleties of dynamic linking.
Return to the point at which you are linking your
package of code into a dynalink file. A “definition” file
must be given to the linker to describe it. A definition file
may be used when linking any program; that’s how you
tell the linker to mark a program segment for deferred
loading. But there are two definitions that will most
often be applied to dynalink segments: one is shared
data; the other is IOPL code.

Shared  Data  Segments 

By default, data segments are not shared between programs.
Instead, a new copy of a data segment is loaded
from  the  .EXE  or  .DLL  file  for  each  instance  of  a  program 
that  uses  it.  That  fits  the  expectations  of  most  programs: 
you  don’t  ordinarily  think  of  a  segment  of  data  as  being 
accessible  to  two  or  more  concurrent  programs  at  once. 

You can tell the linker to mark a data segment
“single,” however; that is, there is to be only one
instance of that data segment in storage no matter how
many client programs might have concurrent access to
it. This simple concept forms the basis for some sophisticated
applications.

Recall  that,  although  such  a  data  segment  is  common 
to  all  client  programs,  it  isn’t  directly  accessible  to  them. 
The  data  segment  is  linked  with  your  code,  not  (directly) 
with the client program. If it doesn’t contain any exported
entry points, there is no way in which a client
program  can  form  an  external  reference  to  it.  Though 
there  are  ways  in  which  determined  programmers  could 
find  out  its  segment  address,  there  is  no  way  they  could 
find  out  how  you  arrange  and  manage  its  contents 
unless  you  tell  them.  The  only  convenient  way  that 
client  code  can  access  a  dynalink  data  segment  (shared 
or  not)  is  by  calling  the  procedures  in  the  dynalink 
package. 

Because  a  shared  data  segment  is  dynamically  linked, 
it  will  be  loaded  with  the  first  client  program  to  request 
the  package.  It  will  remain  in  storage  so  long  as  at  least 
one  client  program  does.  (If  it's  important  to  keep  your 
dynalink  package  in  storage  at  all  times,  you  can  write  a 
dummy  client  program  and  start  it  with  a  statement  in 
the  configuration  file.)  The  segment  is  an  island  of  static 
data  that  your  code  can  address  easily  (because  it  was 
linked  with  your  code)  but  that  only  your  code  knows 
how  to  address.  You  can  use  it  for  a  shared  buffer  pool, 
or  for  queues  of  work  (however  your  package  defines 
“work”),  or  generally  for  any  kind  of  pooled  resource 
that  your  package  can  usefully  manage  on  behalf  of 
multiple  concurrent  client  programs. 

You  can't  predict  how  many  clients  may  be  executing 
in  your  package  at  any  instant;  there  might  be  none  or 
there  might  be  dozens!  But  the  OS/2  kernel  has  plenty  of 
functions  to  help  you  manage  shared  data  in  a 
multiprogramming  environment.  There’s  a  complete, 
and  quite  efficient,  set  of  semaphore  operators,  so  you 
can  serialize  access  to  the  data.  There’s  a  generalized 
queueing facility so that multiple “writer” threads can
queue data (using a variety of queue disciplines) for a
single “reader” to process. There’s a storage suballocation
facility that has built-in serialization so multiple
threads can allocate and free pooled storage concurrently.
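
As an illustration of a pooled resource kept in a shared segment, the following sketch manages a small buffer pool behind a lock. It is a model under assumptions, not OS/2 code: the C11 `atomic_flag` spin lock stands in for a system semaphore, plain static data stands in for the shared segment, and `pool_get` and `pool_put` are hypothetical entry points.

```c
#include <stdatomic.h>
#include <stddef.h>

#define POOL_BUFFERS 4
#define BUFFER_SIZE  64

/* In the article's scheme all of this would live in the package's single
   shared data segment; in this portable model it is plain static data. */
static atomic_flag   pool_lock = ATOMIC_FLAG_INIT;  /* stand-in for a semaphore */
static unsigned char pool[POOL_BUFFERS][BUFFER_SIZE];
static int           in_use[POOL_BUFFERS];

static void lock_pool(void)
{
    while (atomic_flag_test_and_set(&pool_lock))
        ;   /* spin; a real package would block on a system semaphore */
}

static void unlock_pool(void)
{
    atomic_flag_clear(&pool_lock);
}

/* Hand out one free buffer, or NULL if the pool is exhausted. */
void *pool_get(void)
{
    void *buf = NULL;
    lock_pool();
    for (int i = 0; i < POOL_BUFFERS; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            buf = pool[i];
            break;
        }
    }
    unlock_pool();
    return buf;
}

/* Return a buffer to the pool: 0 on success, -1 if it is not an
   outstanding buffer from this pool. */
int pool_put(void *buf)
{
    int rc = -1;
    lock_pool();
    for (int i = 0; i < POOL_BUFFERS; i++) {
        if (buf == (void *)pool[i] && in_use[i]) {
            in_use[i] = 0;
            rc = 0;
            break;
        }
    }
    unlock_pool();
    return rc;
}
```

Clients never see `pool` or `in_use`; they only call the two entry points, which is the arrangement the shared-segment discussion describes.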

I/O  Privilege  and  Devices 

In  OS/2,  application  code  runs  at  privilege  level  3,  in  the 
80286  scheme  of  things.  The  I/O  privilege  level  is  set  at  2, 
so  application  code  that  tries  to  do  an  I/O  instruction 
will trap out and be terminated. Any code segment,
however, dynalink or not, can be marked at link time as
being  eligible  for  I/O  privilege.  Code  in  such  segments 
may  request  access  to  a  range  of  I/O  port  numbers  from 
the  kernel  and  then  may  do  I/O  instructions,  including 
setting  and  clearing  the  interrupt  flag. 

There  are  serious  restrictions  on  this  I/O  privilege, 
however.  There  is  no  way  that  application  code  can  get 
control  on  an  I/O  interrupt.  There  is  no  way  that 
application  code  can  lock  a  segment  in  storage  so  as  to 
use  it  for  a  DMA  buffer.  And  because  OS/2  is  slicing  time 
among potentially many programs, polling an I/O port is
impractical as well.

As a result, a package based on a specific piece of
hardware  will  have  to  include  a  device  driver.  Device 
drivers  have  access  to  a  set  of  kernel  functions  to  assist 
them  in  translating  between  virtual  and  real  addresses, 
in  locking  storage,  in  fielding  interrupts,  and  so  forth. 

Unfortunately,  OS/2  device  drivers  are  even  harder  to 
write  than  MS-DOS  device  drivers.  The  reason  is  the 
uncertainty  about  the  machine  state  at  the  time  an 
interrupt  occurs.  The  system  might  be  operating  in 
real-address mode in the simulated DOS 3.3 environment,
or it might be operating in protect mode. The
interrupt-handling code of a device driver has to be able
to operate in either mode. Furthermore, the split between
the “strategy” and “interrupt” halves of a device
driver,  a  split  that  was  only  formal  in  MS-DOS,  is  real  in 
OS/2.  The  strategy  routine  has  to  start,  or  queue,  the 
work  to  be  done  and  then  get  back  to  the  kernel  pronto, 
and  the  interrupt  routine  has  to  be  able  to  operate 
asynchronously. 

The  expanded  support  for  the  IOCtl  system  call  might 
make  some  device-centered  packages  easier  to  design. 
Application  code  can  exchange  data  with  device  driver 
code  through  this  call.  (The  distributed  OS/2  device 
drivers  support  an  elaborate  scheme  of  “generic”  IOCtl 
calls  that  bring  a  degree  of  device  independence  to  this 
very  low  level  of  the  system.)  If  I  were  designing  a 
package  based  around  a  piece  of  hardware,  I  would  try 
very  hard  to  put  as  little  function  in  the  device  driver  as 
possible  and  reserve  as  much  as  possible  to  dynalinked 
code.  I’d  use  the  device  driver  to  lock  segments  in 
storage  and  to  field  interrupts.  Everything  else  could  be 
done  from  dynalinked  code  that  used  the  device  driver 
as  its  private  resource. 

Dynalink  Descriptors 

Because  I’m  into  technical  details  here,  I  might  as  well 
explore  a  strange  addressing  problem.  This  item  is 
merely  a  sidetrack  intended  for  those  who  know  the 
80286  hardware  well;  others  can  just  skip  ahead  to  the 
next  section. 

You  might  suppose  that,  as  dynalink  segments  are 
shared  across  clients,  they  must  be  addressed  through 
the  Global  Descriptor  Table  (GDT).  Not  so.  OS/2  appears 
to  reserve  the  GDT  for  kernel  code  and  system  data 
objects.  Application  code  segments  and  dynamic  link 
segments  all  go  in  the  LDT  of  each  task.  It’s  also  possible 
to  share  dynamically-allocated  data  segments  between 
programs, and these shared segments also are addressed
through the LDT, not the GDT as you might at
first  expect. 

The  reason  is  probably  security.  Dynalink  code  isn’t 
meant  to  be  a  global  resource;  it's  linked  to  a  specific 
client  program  or  programs.  Shared  segments  are  only 
for  the  use  of  the  clique  of  programs  sharing  them.  By 
putting  their  descriptors  in  the  LDTs  of  the  using 
programs, the system ensures that only the right programs
can use them.

Those who know the 80286 really well will immediately
spot a problem. Loaded code often contains far
pointers to its own segments, and far pointers are
constants embedded in the code. It follows that a
dynalink segment must use the identical segment address
in every LDT in which it is entered. If it didn’t, its
embedded  far  pointers  would  be  correct  for  a  call  from 
some  tasks  but  not  from  others. 

In  order  to  make  that  happen,  OS/2  has  to  treat  the 
8,191  possible  entry  indexes  into  an  LDT  as  a  global 
pool — even  though  an  LDT  is  a  private  resource  of  one 
program.  When  a  dynalink  library  is  first  loaded,  it  is 
assigned  a  set  of  descriptor  table  entry  numbers  from 
the  pool.  It  will  use  those  segment  numbers  in  whatever 
program’s  table  it  appears,  and  no  other  segment  can 
use  those  numbers  until  this  dynalink  library  loses  its 
last  client  and  is  unloaded.  There’s  clearly  a  potential  for 
creating  large,  sparse  LDTs.  It  remains  to  be  seen  if  that 
will  be  a  problem  in  production  systems. 
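
The arithmetic behind this can be sketched in C. On the 80286 a selector holds a 13-bit table index in its upper bits, a table-indicator bit (1 selects the LDT), and a 2-bit requested privilege level. The allocator here is only a guess at the bookkeeping involved; `reserve_slot`, `release_slot`, and `ldt_selector` are hypothetical names, and nothing in the sketch is taken from OS/2 itself.

```c
#include <stdint.h>

#define LDT_SLOTS  8192       /* 13-bit index field: indexes 0..8191 */
#define SEL_TI_LDT 0x0004u    /* table-indicator bit: 1 = LDT, 0 = GDT */

/* Build an 80286 selector from an LDT index and a requested privilege level. */
static uint16_t ldt_selector(unsigned index, unsigned rpl)
{
    return (uint16_t)((index << 3) | SEL_TI_LDT | (rpl & 3u));
}

/* One system-wide map of which LDT indexes are spoken for. Because the map
   is global, a library gets the same index (and so the same selector value)
   in the LDT of every program that uses it. */
static unsigned char slot_taken[LDT_SLOTS];

/* Reserve the lowest free index; returns -1 if the pool is exhausted. */
static int reserve_slot(void)
{
    for (int i = 0; i < LDT_SLOTS; i++) {
        if (!slot_taken[i]) {
            slot_taken[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Give an index back when the library loses its last client. */
static void release_slot(int index)
{
    if (index >= 0 && index < LDT_SLOTS)
        slot_taken[index] = 0;
}
```

The sparse-table worry follows directly: an index reserved for one library is unavailable in every LDT, including those of programs that never load that library.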

Errors  and  Exit  Lists 

Very  well,  a  client  program  called  your  InitGoody  entry 
point,  then  called  your  StartGood  entry  point  to  begin  a 
complex  operation.  And  now  the  client  program  has 
done  something  foolish  and  has  been  terminated  by  the 
system,  leaving  your  shared  data  segment  or  I/O  device 
in  a  halfway  condition,  storage  allocated,  semaphores 
uncleared.  Other  clients  will  suffer.  That's  not  good;  it’s 
a  principle  of  OS/2  design  that  one  program’s  disasters 
shouldn’t  affect  other  programs. 

For  every  program,  the  OS/2  kernel  maintains  an  exit 
list — a  list  of  code  addresses  that  want  to  get  control 
before  the  program  is  terminated.  There’s  a  system  call 
to  enroll  an  entry  point  in  the  exit  list.  You’d  use  it  from 
the  InitGoody  entry  point,  enrolling  your  GoodyByBy 
procedure  as  an  exit  procedure.  Now  if  the  client  makes 
a  serious  mistake,  OS/2  will  call  your  code  and  you  can 
clean  up  the  debris. 

The  reason  it's  a  list,  not  just  a  single  address,  is  that 
the  OS/2  design  anticipates  that  an  application  might  be 
using  several  dynalink  libraries  and  every  one  of  them 
might  want  to  register  its  own  exit  procedure.  There’s 
no  promise  about  which  one  will  be  called  first,  but  that 
shouldn’t  matter.  Each  one  should  only  be  concerned 
with  resources  that  are  in  its  private  domain  and  none 
should  even  be  aware  of  the  others. 
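
The mechanism resembles the standard C `atexit` facility, except that the kernel keeps the list and runs it even after a fatal error. Here is a minimal model, assuming hypothetical names (`exit_list_add`, `exit_list_run`, `init_goody`, `goody_bye_bye`) and not the actual system call:

```c
#include <stddef.h>

#define MAX_EXIT_PROCS 16

typedef void (*exit_proc)(void);

/* Per-program exit list kept by the (simulated) kernel. */
struct exit_list {
    exit_proc procs[MAX_EXIT_PROCS];
    size_t    count;
};

/* Enroll a procedure; returns 0 on success, -1 if the list is full. */
int exit_list_add(struct exit_list *list, exit_proc proc)
{
    if (proc == NULL || list->count >= MAX_EXIT_PROCS)
        return -1;
    list->procs[list->count++] = proc;
    return 0;
}

/* Run by the simulated kernel when the program ends, normally or not.
   Exit procedures must not depend on the order used here. */
void exit_list_run(struct exit_list *list)
{
    while (list->count > 0)
        list->procs[--list->count]();
}

/* Example package state and its clean-up routine. */
static int goody_semaphore_held = 0;

static void goody_bye_bye(void)
{
    goody_semaphore_held = 0;   /* release whatever the client left behind */
}

/* What an InitGoody-style entry point might do on the client's behalf. */
int init_goody(struct exit_list *client_exit_list)
{
    goody_semaphore_held = 1;
    return exit_list_add(client_exit_list, goody_bye_bye);
}
```

Each package enrolls only its own routine and touches only its own state, so several packages can share one list without knowing about each other.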

Device  Monitors  and  Replacement  Functions 

The OS/2 named devices (PRN, LPTn, COMn, SCREEN$,
and KBD$) can be opened as files and used via file
handles, just as under MS-DOS. Or they can be addressed
as device drivers using the generic IOCtl system
calls. The keyboard, screen, and mouse are also supported
by dozens of system calls with names such as
KbdCharIn, VioWriteCells, and MouGetPtr. All these are
directly available to dynalink code, as they are to applications.

There  are  yet  two  more  levels  of  control  over  device 
I/O.  Although  available  to  any  code,  they’re  sufficiently 
complex  that  they'll  probably  only  be  used  in  dynalink 
packages.  These  are  device  monitors  and  replacement 
functions. 

A device monitor is a piece of code that monitors the
stream of data bytes that flow between a device and a
program.  The  monitor  receives  a  data  packet  with  a  call 
to  the  DosMonRead  dynalink  function  and  is  expected 
to  pass  the  packet  on  with  a  call  to  DosMonWrite.  But 
there’s  no  rule  that  it  has  to  pass  on  every  packet  or  that 
it  pass  on  the  same  number  of  packets.  A  monitor  is  free 
to  censor  the  data  stream,  to  generate  new  packets,  or 
to  substitute  packets. 

Consider  a  monitor  for  the  keyboard  data  stream.  It 
sees all keystrokes, each neatly wrapped up in a “data
packet” with flags for the current shift state and a
millisecond time stamp. It can interpret the keystrokes
as it wishes and substitute for them. In short, a keystroke
monitor may be a keyboard “macro” program,
and it doesn’t have to field interrupts or be loaded in
any special way.

A  keyboard  monitor  is  created  by  system  calls.  A 
program  could  set  up  its  own  keystroke  monitor,  but 
more  likely  one  would  be  set  up  in  dynalink  code  (and 
torn  down  in  an  orderly  way  in  an  exit  procedure). 
Because  it’s  easy  to  set  one  up,  there  can  be  more  than 
one.  OS/2  permits  a  whole  pipeline  of  keystroke  monitors 
to  exist,  each  one  getting  the  output  of  the  last  and 
providing  input  to  the  next,  and  they  don’t  have  to  be 
aware  of  each  other. 
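
A pipeline of monitors can be modeled in a few lines of C. This is a sketch only: the packet is reduced to a single character, and `monitor_fn`, `mon_drop_x`, `mon_macro`, and `run_pipeline` are made-up names that play the roles of code built around DosMonRead and DosMonWrite.

```c
#include <stddef.h>

#define MAX_PACKETS 64

/* A much-simplified keystroke packet; the real one also carries
   shift-state flags and a time stamp. */
struct key_packet { char ch; };

/* A monitor reads one packet and writes zero or more packets to `out`.
   It returns the number written, never more than `max`. */
typedef size_t (*monitor_fn)(struct key_packet in,
                             struct key_packet *out, size_t max);

/* Censoring monitor: swallows every 'x', passes everything else. */
static size_t mon_drop_x(struct key_packet in,
                         struct key_packet *out, size_t max)
{
    if (in.ch == 'x' || max < 1)
        return 0;
    out[0] = in;
    return 1;
}

/* Macro monitor: expands '@' into the two keystrokes "hi". */
static size_t mon_macro(struct key_packet in,
                        struct key_packet *out, size_t max)
{
    if (in.ch == '@' && max >= 2) {
        out[0].ch = 'h';
        out[1].ch = 'i';
        return 2;
    }
    if (max < 1)
        return 0;
    out[0] = in;
    return 1;
}

/* Feed the `n` packets in `buf` through each monitor in turn. `buf` must
   have room for MAX_PACKETS entries; the new packet count is returned. */
size_t run_pipeline(const monitor_fn *mons, size_t nmons,
                    struct key_packet *buf, size_t n)
{
    struct key_packet next[MAX_PACKETS];
    for (size_t m = 0; m < nmons; m++) {
        size_t produced = 0;
        for (size_t i = 0; i < n; i++)
            produced += mons[m](buf[i], next + produced,
                                MAX_PACKETS - produced);
        for (size_t i = 0; i < produced; i++)
            buf[i] = next[i];
        n = produced;
    }
    return n;
}
```

Neither monitor knows the other exists; each simply consumes what the previous stage emitted, which is the property that lets a whole chain be assembled from independent packages.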

There  is  a  separate  logical  keyboard  for  each  of  the  16 
“screen  groups,”  and  a  keystroke  monitor  sees  only  the 
strokes  from  the  keyboard  of  one  screen  group  (not 
necessarily  the  one  its  code  is  running  in!).  The  printer 
devices,  however,  are  global  to  all  screen  groups.  A 
printer  monitor  sees  all  the  data  passing  to  its  printer 
from  all  processes  in  the  system.  Data  packets  contain 
the  process  ID  that  produced  them,  and  “open”  and 
“close”  packets  are  visible.  This  is  the  basic  function 
needed to build a print spooler: a program that intercepts
multiple data streams, saves the data on disk, and
sends files on to the printer in orderly sequence. An
extremely rudimentary spooler is built on this facility
and distributed with OS/2, but there is plenty of scope
for better ones to be built. There are intriguing possibilities
for building completely transparent support for
network printers and for building filters that translate
printer control codes for one make of printer to another’s.
All of this happens, remember, at the level of
application code, without any requirement for I/O privilege
or hardware dependency.

The  dynalink  libraries  distributed  with  OS/2  contain, 
as  I  mentioned,  dozens  of  system  calls  for  using  the 
keyboard,  the  mouse,  and  the  screen.  The  screen  calls 
contain  all  that's  necessary  for  a  full-screen  editor,  for 
example. 

The  distributed  code,  however,  assumes  the  presence 
of  an  IBM-compatible  keyboard,  an  IBM-compatible 
screen  adapter,  and  one  of  a  small  number  of  mouse 
devices.  That  isn’t  enough.  The  OEMs  that  sell  OS/2 
systems  will  have  the  know-how  and  development  tools 
to build replacement device drivers and dynalink libraries
that support their particular hardware under the
standard calls. But there will be times when the code for
a keyboard, mouse, or screen operation should be
replaced at the level of an individual program or an
individual screen group (multiple programs may run in
a single screen group).

This  need  is  provided  for.  It  is  possible  for  a  program 
to  “register”  a  replacement  procedure  for  almost  any  of 
the  supplied  mouse,  screen,  or  keyboard  calls.  From  the 
moment  of  registration,  whenever  the  supplied  function 
is  called,  control  will  be  transferred  to  the  registered 
replacement.  That  procedure  gets  the  same  stacked 
parameters  as  the  original  function  would  have  seen.  It 
may choose to interpret them in a different way (permitting
access to a wider screen, for example), or apply
them  to  the  hardware  in  a  different  way,  or  censor  or 
expand  on  them  in  the  manner  of  a  device  monitor,  or 
perhaps  just  record  them  for  performance  monitoring. 
The  Presentation  Manager  will  probably  use  this  facility 
to  take  over  the  mouse  and  screen  calls  in  the  screen 
group  that  it  controls. 
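
In outline, registration amounts to routing a call through a pointer that a program is allowed to change. The sketch assumes invented names (`screen_write`, `register_write`, `counting_write`); it models the idea in portable C and is not the OS/2 registration interface.

```c
#include <stddef.h>

typedef int (*write_fn)(const char *text, size_t len);

/* The routine supplied with the system (simulated): it just reports
   how many characters it was asked to write. */
static int default_write(const char *text, size_t len)
{
    (void)text;
    return (int)len;
}

static write_fn registered_write = NULL;   /* NULL = no replacement */

void register_write(write_fn fn) { registered_write = fn; }
void deregister_write(void)      { registered_write = NULL; }

/* The entry point clients call. The replacement, if any, receives exactly
   the parameters the supplied routine would have seen. */
int screen_write(const char *text, size_t len)
{
    write_fn fn = registered_write ? registered_write : default_write;
    return fn(text, len);
}

/* A replacement that records calls for performance monitoring and then
   hands the work to the supplied routine. */
static unsigned long write_calls = 0;

static int counting_write(const char *text, size_t len)
{
    write_calls++;
    return default_write(text, len);
}
```

The client code does not change at all; only the routine behind `screen_write` does, which is why a replacement can be installed for programs that were written with no knowledge of it.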

Run-Time  Linking 

I  alluded  to  the  ability  to  do  a  very  late  dynamic  link, 
after  the  program  has  been  loaded  and  while  it  is 
executing.  There  is  a  system  call  that  will  take  the  file 
name  of  a  dynalink  library  and  load  it,  returning  a 
handle.  There  is  another  call  that  takes  such  a  handle 
and  the  character  name  of  an  entry  point  and  returns  a 
far  pointer  to  the  entry  point.  The  calling  program  may 
then  call  the  dynalink  entry  point. 

This  isn’t  genuine  linking  because  calls  to  the  external 
code  can’t  be  freely  embedded  in  the  caller's  code.  Calls 
have  to  be  indirect  by  way  of  pointers  in  storage,  and  the 
linkage  must  be  established  with  the  explicit  system 
calls.  Still,  it’s  a  facility  with  some  important  uses. 
Basically,  it  gives  a  program  the  ability  to  configure  its 
own  contents  at  run  time.  The  components  of  a  large 
subsystem  may  be  linked  independently,  each  as  a 
dynalink  library.  The  kernel  of  the  subsystem  may  decide 
at  execution  time  which  of  its  components  ought  to  be 
loaded.  A  communications  subsystem,  for  example, 
might  dynamically  load  its  components  for  different  line 
protocols  in  this  way,  as  needed. 
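
The two-call pattern can be modeled in portable C. In this sketch a static table stands in for the dynalink files on disk, and `load_module`, `get_proc_addr`, PROTOA, and PROTOB are hypothetical names chosen for the example; the real system calls and their parameters are not shown here.

```c
#include <stddef.h>
#include <string.h>

typedef int (*proc_addr)(int);

struct entry  { const char *name; proc_addr addr; };
struct module { const char *file; const struct entry *entries; size_t count; };

/* Two imaginary line-protocol components with the same entry-point name. */
static int proto_a_frame_size(int payload) { return payload + 4; }
static int proto_b_frame_size(int payload) { return payload + 8; }

static const struct entry proto_a_entries[] = {
    { "FrameSize", proto_a_frame_size },
};
static const struct entry proto_b_entries[] = {
    { "FrameSize", proto_b_frame_size },
};

/* Stand-in for the libraries that could be found on disk. */
static const struct module library_dir[] = {
    { "PROTOA", proto_a_entries, 1 },
    { "PROTOB", proto_b_entries, 1 },
};

/* First call: library file name in, handle out (NULL if not found). */
const struct module *load_module(const char *file)
{
    size_t n = sizeof library_dir / sizeof library_dir[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(library_dir[i].file, file) == 0)
            return &library_dir[i];
    return NULL;
}

/* Second call: handle and entry-point name in, callable pointer out. */
proc_addr get_proc_addr(const struct module *handle, const char *name)
{
    if (handle == NULL)
        return NULL;
    for (size_t i = 0; i < handle->count; i++)
        if (strcmp(handle->entries[i].name, name) == 0)
            return handle->entries[i].addr;
    return NULL;
}
```

A subsystem kernel would pick the library name at run time, fetch the pointer once, and from then on make every call indirectly through it, for example `frame_size(100)` after `frame_size = get_proc_addr(h, "FrameSize")`.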

Coding  Dynalinks 

Between  dynamic  linking,  exit  lists,  device  monitors,  and 
replacement  device  functions,  it  ought  to  be  possible  to 
extend  OS/2  in  about  any  direction.  Some  extensions  are 
more  obvious  than  others.  For  instance,  a  dynamic  link 
library  would  appear  to  be  about  the  ideal  way  to 
package  a  compiler’s  run-time  library.  Think  what  a 
reduction  that  would  make  in  the  size  of  compiled 
programs!  Unfortunately,  it  looks  as  if  the  languages 
released with Version 1.0, at least, will have their run-time
code in the old-fashioned kind of object library.

Something  else  obvious  is  that  it  would  be  very 
desirable to write dynalink modules in high-level languages.
Here, alas, you run into a major glitch in the
system design. The code generated by the IBM/Microsoft
C, Pascal, and FORTRAN compilers is not suitable for
dynamic linking!

The reasons aren’t too hard to understand. A dynamically
linked module can be entered from different programs.
It has to be prepared to use its caller's stack, and
it cannot make any assumptions about the way its caller
uses registers. Unfortunately, the code from most compilers
does make assumptions about registers, lots of
them. The IBM/Microsoft compilers are particularly dependent
on two assumptions that just aren't true on
entry to dynalink code: that ss = ds and that ds bases
DGROUP.

Neither  of  these  is  a  big  problem.  The  assumption  that 
the  stack  and  data  segments  are  identical  was  a  lazy 
coding  ploy  (don’t  bother  saving  ds,  just  reload  it  by 
pushing  ss  and  popping  ds)  that  ought  never  to  have 
been  allowed.  This  assumption  isn’t  true  in  large-model 
C anyway, so it shouldn’t be too difficult to eradicate it
from all the code generators and all the run-time libraries.

The assumption that ds = DGROUP is more subtle.
Every compiled module has a DGROUP, a segment of
static data. When object modules are linked, the linker
merges all DGROUPs into one segment and adjusts the
offsets in all instructions as required. This works just
fine for an application program because all its parts are
linked in one run. Whatever object code it includes
references the same DGROUP. But the code of a dynalink
library is linked in a separate run. The pieces of its
DGROUP are formed then, and its DGROUP is nothing
like the DGROUP formed when its clients are linked.

But  a  compiler’s  code  generator  may  assume  that 
there  is  only  one  DGROUP  and  that  ds  addresses  it  at  all 
times  after  the  initialization  of  the  module.  That  ain't 
true  when  a  dynalink  procedure  is  entered;  then  DS 
addresses  the  caller’s  DGROUP.  What’s  lacking  is  that 
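the prologue to any public procedure should contain
the sequence familiar to any assembly-language programmer:

push    ds                 ; save the caller's data segment
mov     ax, seg DGROUP     ; fetch this module's own DGROUP
mov     ds, ax             ; DS cannot be loaded from an immediate value
assume  ds:DGROUP          ; tell the assembler what DS now addresses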

And,  of  course,  the  exit  code  of  any  public  procedure 
would  have  to  contain  pop  ds  to  restore  the  register. 

Current  compilers  do  not  generate  this  code,  and  as  a 
result  the  code  they  produce  can’t  be  used  as  dynalink 
procedures!  Not,  at  least,  unless  it  has  an  assembly- 
language  front  end  to  switch  DGROUPs  for  it. 

One other irritating problem gets in the way of high-level
dynalinks. Existing compilers are prone to generating
automatic checks of one kind or another: for stack
overflow, for subscript ranges, for whatever. Not all of
these  can  be  disabled.  If  even  one  of  them  is  left,  some 
kind  of  error-handling  module  will  be  included  in  the 
link.  But  the  only  way  such  modules  have  to  handle  an 
error  is  to  issue  a  message  to  the  console,  and  the  only 
way  they  can  seem  to  find  to  issue  a  console  message  is 
to use the compiler’s file I/O mechanism, so the presence
of even one stack check causes a major part of the
compiler’s  run-time  library  to  cascade  into  the  dynalink 
code.  And  that  often  brings  with  it  modules  that  won’t 
link  properly  in  a  dynalink. 

Microsoft  has  been  thoroughly  beaten  up  on  for  these 
problems  at  developers’  conferences  and  claims  to  be 
working on solutions. It’d better be working hard. Dynamic
link libraries are an extremely attractive facility of
OS/2,  and  the  first  programming  languages  that  support 
them  properly  will  have  a  competitive  edge  in  a  system 
that  (Bill  Gates  confidently  says)  will  have  ten  million 
users  by  1992. 

DDJ 


ARTICLES 


A  RAM-Cache 
Manager  in  C 

by  Alan  Deikman 


To an application program in
most operating systems, a
read/write request appears to
take place instantaneously; throughput
is primarily dependent on the
I/O speed of the devices being accessed.
For some devices this is an
acceptable limitation. After all,
during the interval a program is waiting
to receive user input from the
keyboard, there usually isn’t anything
for the program to do but
wait.

For  random-access  disk  requests, 
however,  the  data  being  sought, 
which  may  not  arrive  for  another  80 
milliseconds  or  so,  may  actually 
have  been  in-hand  only  moments 
before. If a program had the intelligence
and foresight to save that
block  of  data  in  RAM,  there  might 
not  be  any  need  for  disk  access  at 
all  and,  as  such,  no  need  for  that 
80-millisecond  delay. 

When  accessing  disk  blocks,  it’s  a 
common  practice  to  access  the 
same  blocks  over  and  over  again 
and  to  access  other  blocks  only  once 
in a long while. The operating
system portion of the disk is typically
the most often accessed; in
MS-DOS, this section holds the directories
and the file allocation
tables (FATs).

In a multiuser operating system,
ideally, CPU time is diverted to work
on some other task when a program
requests a disk block, with control
passing back to the primary program
as soon as the required data
becomes available. Overall system
throughput is thus enhanced at the
expense of the individual program.
Even so, there’s no advantage in letting
an application program force
the operating system to access the
disk more than is necessary.

Alan Deikman, Software Services,
Menlo Park, CA 94026-2106. Alan is a
management, marketing, and computer
consultant. He has been consulting
since 1978 and has had an
independent practice since 1985.

Prevent disk bashing
with RAM caching

If such a program is used frequently,
the capacity of the entire
system can be greatly increased if
each program is optimized for fewer
time-intensive disk requests. For I/O-intensive
applications, this becomes
a critical issue. An accounting or
database system does very little computation
as a rule, and most of the
time spent (after the user has entered
input) is spent waiting to store
or retrieve data.

Within this context, there are two
primary options for increasing program
performance:

1.  Buy  faster  hardware. 

2.  Reduce  the  number  of  actual  disk 
accesses  required. 

Disk caching is one of the least expensive
solutions available to the
programmer  to  optimize  program 
performance. 

Cache  Theory 

To  make  a  disk  cache,  an  area  of 
RAM  is  set  aside  to  hold  the  most 
recently used blocks of data. Each
block is identified by block number.
Whenever a request for a disk block
read occurs, the cache area is
checked first to see if the required
block is available. If it is available,
no disk access occurs, and the calling
program uses the RAM copy of
the block. That disk block is then
tagged “most recently used” (MRU).
When this happens it is called a
cache hit.

If it turns out that a disk access is
required (because a cache miss occurred),
the block is obtained from
the operating system as usual.
Before the block is used, however,
it’s copied into the cache area, replacing
the “least recently used”
(LRU) block in the cache area, if any
exists. It is then marked MRU.

When  a  disk  block  is  written,  it 
also  is  placed  in  the  cache  as  the 
MRU block. In most applications it
is desirable to write the block physically
to the disk at that time, but
sometimes  it  is  better  to  wait  until 
the  block  becomes  the  LRU  block 
and  is  about  to  be  overwritten.  It  is 
possible  that  the  block  needn’t  be 
written  in  the  first  place. 
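
The hit, miss, and deferred-write cases just described can be sketched as follows. This is an illustrative model, not the code from this article: it finds the LRU slot with a use counter instead of a linked chain, simulates the disk with an array, and all names (`cache_read`, `cache_write`, `claim`) are assumptions.

```c
#include <string.h>

#define BLOCK_SIZE   16
#define CACHE_BLOCKS 2
#define DISK_BLOCKS  8
#define NO_BLOCK     (-1)

static unsigned char disk[DISK_BLOCKS][BLOCK_SIZE];   /* simulated disk */
static unsigned long disk_reads, disk_writes;         /* physical accesses */

struct slot {
    int           blockno;      /* NO_BLOCK while the slot is empty */
    int           dirty;        /* written to the cache but not yet to disk */
    unsigned long last_used;    /* larger = more recently used */
    unsigned char data[BLOCK_SIZE];
};

static struct slot   cache[CACHE_BLOCKS];
static unsigned long clock_tick;

void cache_init(void)
{
    for (int i = 0; i < CACHE_BLOCKS; i++) {
        cache[i].blockno = NO_BLOCK;
        cache[i].dirty = 0;
        cache[i].last_used = 0;
    }
    clock_tick = disk_reads = disk_writes = 0;
}

/* Find the slot holding blockno (hit), or take over the LRU slot (miss),
   flushing its old contents first if a write was deferred. */
static struct slot *claim(int blockno, int *hit)
{
    struct slot *lru = &cache[0];
    for (int i = 0; i < CACHE_BLOCKS; i++) {
        if (cache[i].blockno == blockno) { *hit = 1; return &cache[i]; }
        if (cache[i].last_used < lru->last_used) lru = &cache[i];
    }
    *hit = 0;
    if (lru->blockno != NO_BLOCK && lru->dirty) {
        memcpy(disk[lru->blockno], lru->data, BLOCK_SIZE);
        disk_writes++;
    }
    lru->blockno = blockno;
    lru->dirty = 0;
    return lru;
}

void cache_read(int blockno, unsigned char *dst)
{
    int hit;
    struct slot *s = claim(blockno, &hit);
    if (!hit) {                                   /* cache miss: go to disk */
        memcpy(s->data, disk[blockno], BLOCK_SIZE);
        disk_reads++;
    }
    s->last_used = ++clock_tick;                  /* tag as MRU */
    memcpy(dst, s->data, BLOCK_SIZE);
}

void cache_write(int blockno, const unsigned char *src, int write_through)
{
    int hit;
    struct slot *s = claim(blockno, &hit);
    memcpy(s->data, src, BLOCK_SIZE);
    s->last_used = ++clock_tick;                  /* tag as MRU */
    if (write_through) {
        memcpy(disk[blockno], src, BLOCK_SIZE);
        disk_writes++;
    } else {
        s->dirty = 1;                             /* write when evicted */
    }
}
```

With deferred writes, a block that is rewritten several times before it reaches the LRU position costs only one physical write; the price is that the disk copy is stale until then.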

The routines in this article maintain
a two-way linked list, called the
LRU/MRU chain. The cacallo() routine
sets up the original linked list
to include all the blocks allocated,
as illustrated in Figure 1, page 31. A
-1 (0xFFFF) flags the end of the
chain.

Whenever a block is designated
the MRU, it is taken out of the linked
list and reinserted at the end. This
operation is performed by the function
cacnew( ). If this operation were
to be performed on block 2 of the
initial chain, the result would be as
shown in Figure 2, page 31.
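The relinking step can be sketched in a few lines of C. This is only an illustration of the idea, not the code from Listing Two: the structure layout, the fixed size NBLK, and the names chain_init( ) and chain_make_mru( ) are assumptions made for the sketch. Links are kept as slot indices, with -1 flagging the end of the chain as described above.

```c
#define NBLK 4          /* hypothetical cache size for this sketch */
#define NIL  (-1)       /* end-of-chain flag */

struct chain {
    int prev[NBLK];     /* link toward the LRU end */
    int next[NBLK];     /* link toward the MRU end */
    int lru;            /* head of chain: least recently used slot */
    int mru;            /* tail of chain: most recently used slot */
};

/* Link slots 0..NBLK-1 in order, slot 0 being the LRU. */
void chain_init(struct chain *c)
{
    int i;

    for (i = 0; i < NBLK; i++) {
        c->prev[i] = (i == 0) ? NIL : i - 1;
        c->next[i] = (i == NBLK - 1) ? NIL : i + 1;
    }
    c->lru = 0;
    c->mru = NBLK - 1;
}

/* Take slot n out of the chain and reinsert it at the MRU end. */
void chain_make_mru(struct chain *c, int n)
{
    if (n == c->mru)
        return;                         /* already the most recent */

    /* unlink n; next[n] is valid because n is not the MRU */
    if (c->prev[n] == NIL)
        c->lru = c->next[n];
    else
        c->next[c->prev[n]] = c->next[n];
    c->prev[c->next[n]] = c->prev[n];

    /* append n after the current MRU */
    c->prev[n] = c->mru;
    c->next[n] = NIL;
    c->next[c->mru] = n;
    c->mru = n;
}
```

Starting from the initial chain 0, 1, 2, 3 and making slot 2 the MRU leaves the chain as 0, 1, 3, 2, which is the situation the second figure depicts.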




Dr.  Dobb's  Journal,  December  1987 



Cache  Size  and  Application 

There's  a  trade-off  between  cache 
size  and  net  efficiency.  If  the  cache 
is  too  small,  the  likelihood  of  hits  is 
small.  On  the  other  hand,  if  the 
cache  is  too  large,  the  CPU  spends 
more  time  than  necessary  looking 
up  disk  blocks  in  the  cache.  (The 
MS-DOS  manual  warns  against  this, 
in  the  discussion  on  setting  up  the 
BUFFERS  command  in  CONFIG.SYS; 
the  BUFFERS  command  sets  up  a 
system  cache  of  128-byte  records.) 

Figure  3,  below,  shows  the  curve 
representing  an  application  that 
uses  caching  with  perfectly  random 
disk  accesses.  The  left  side  of  the 
curve  represents  no  caching,  and  for 
small  caches  a  decrease  in  perform¬ 
ance  occurs.  Performance  increases 
to  some  theoretical  optimum  and 
then  falls  off.  Diminishing  returns 
ultimately  make  disk  caching  more 
of  a  burden  than  a  benefit. 

Most  applications,  however,  are 
not  truly  random  in  the  way  in 
which  they  access  the  disk.  There 
are  cases  in  which  disk  caching  is 
misapplied.  Consider  the  situation 
in  which  a  file  is  read  from  begin¬ 
ning  to  end.  No  block  is  ever  read 
twice,  so  applying  disk  caching  is 
merely  adding  overhead.  If  the  file 
is  one  block  larger  than  the  cache 
area,  and  must  be  rewound  and 
read  again,  then  each  read  on  the 
second  pass  will  result  in  a  cache 
miss.  Thus  it  is  never  desirable  to 
apply  disk  caching  to  sequentially 
read  files  of  an  unknown  size. 
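The rewind case is easy to confirm with a small simulation. The sketch below is not taken from the listings; it is a stand-alone LRU model in which NSLOT, struct lru, and scan( ) are names invented for the illustration, and time stamps stand in for the linked chain.

```c
#define NSLOT 8                 /* hypothetical cache size in blocks */

struct lru {
    long blk[NSLOT];            /* block held by each slot, -1 if empty */
    long stamp[NSLOT];          /* time of last use, 0 if never used */
    long clock;
    long hits, misses;
};

void lru_init(struct lru *c)
{
    int i;

    for (i = 0; i < NSLOT; i++) {
        c->blk[i] = -1;
        c->stamp[i] = 0;
    }
    c->clock = c->hits = c->misses = 0;
}

/* Look up one block; on a miss, replace the least recently used slot. */
void lru_access(struct lru *c, long blk)
{
    int i, victim = 0;

    c->clock++;
    for (i = 0; i < NSLOT; i++) {
        if (c->blk[i] == blk) {         /* cache hit */
            c->stamp[i] = c->clock;
            c->hits++;
            return;
        }
        if (c->stamp[i] < c->stamp[victim])
            victim = i;                 /* oldest slot seen so far */
    }
    c->blk[victim] = blk;               /* cache miss */
    c->stamp[victim] = c->clock;
    c->misses++;
}

/* Read nblocks sequentially, npasses times; return the number of hits. */
long scan(long nblocks, int npasses)
{
    struct lru c;
    long b;
    int p;

    lru_init(&c);
    for (p = 0; p < npasses; p++)
        for (b = 0; b < nblocks; b++)
            lru_access(&c, b);
    return c.hits;
}
```

With these definitions, scan(NSLOT, 2) scores a hit on every read of the second pass, while scan(NSLOT + 1, 2) scores none at all: each block is evicted just before it is needed again.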

In  yet  another  case,  when  the 
cache  memory  is  bigger  than  the 
disk  file,  disk  caching  becomes  su¬ 
perfluous  because  it  would  have 
been  better  to  read  the  whole  file 
into  RAM  initially,  then  access  the 
blocks  directly,  than  to  put  up  with 
the  storage  overhead  of  keeping  MRU/ 
LRU  information. 

Because there are so many variables
associated with the efficiency
of caches, I've provided the cacstat( )
routine to obtain the statistics of the
cache, allowing the user to adjust
the size of the cache for optimum
performance. The values returned
are cache hits, cache misses, and
cache adds. The sum of hits and
misses represents the total number
of times the cache was searched for
a block. The optimum ratio of hits
to total accesses depends on a
number of variables such as the
ratio of CPU/RAM speed to disk average
access time.
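One way to put the counters to work is sketched below. The two helpers are hypothetical and not part of the cache library, and the cost model (every access pays a search cost, every miss additionally pays a disk cost) is a simplifying assumption chosen only to show how the hit ratio trades off against the two speeds.

```c
/* Fraction of cache searches that were hits; 0.0 if nothing was searched. */
double hit_ratio(long hits, long misses)
{
    long total = hits + misses;

    return total ? (double) hits / (double) total : 0.0;
}

/* Estimated average cost of one access under the assumed model:
   t_search is paid on every access, t_disk only on a miss. */
double avg_access(double ratio, double t_search, double t_disk)
{
    return t_search + (1.0 - ratio) * t_disk;
}
```

For example, 75 hits and 25 misses give a ratio of 0.75; with a search cost of 1 time unit and a disk cost of 40, the estimated average is 11 units per access, against 40 with no cache.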

Multiple  Caches 

One  of  the  most  glaring  failings  of 
having  an  operating  system  perform 
all  the  cache  work  for  disk  I/O  is  that 
the  operating  system  isn’t  in  a  posi¬ 


tion  to  discriminate  between  differ¬ 
ent  types  of  accesses.  Also,  most 
systems  can’t  allocate  cache 
memory  on  a  dynamic  basis.  As  a 
result,  the  operating  system  cache 
can  be  wiped  clean  of  useful  re¬ 
cords  just  because  the  application 
made  one  pass  of  an  input  file,  fill¬ 
ing  the  cache  with  records  that  are 


Figure 2: After block 2 is made MRU



never  read  again.  If  the  system  is 
multitasking  (or  multiprocess),  the 
situation  would  be  even  worse  be¬ 
cause  other  programs  could  domi¬ 
nate  the  available  cache  space. 

It’s  usually  obvious  to  the  pro¬ 
gram  designer  which  disk  files 
should  be  cached  and  which 
shouldn’t.  A  good  application  for  a 
disk  cache  is  a  compiler’s  temporary 
tables,  where  using  a  disk  scratch 
file  is  considered  only  after  RAM 
memory  runs  out.  In  B-tree  indexing 
subroutine  libraries,  caching  is  par¬ 
ticularly  effective  in  processing  look¬ 
ups  and  node  additions. 

For  this  reason,  all  the  routines 
provided  in  this  article  operate  on  a 
structure  that  is  pointed  to  by  a 
single  variable  kept  by  the  calling 
program.  This  approach  allows  any 
number  of  separate  caches  of  vari¬ 
able  sizes  to  be  managed  concur¬ 
rently. 

The cacallo( ) routine uses the standard
library malloc( ) routine to allocate
all memory necessary for the
cache. It returns a pointer to the
root structure of the cache that is
used as a parameter to all the other
cache routine calls. All the global
parameters, and pointers to other
objects, are contained in this structure,
which is typedef'd to be
CACDS.

Four  parameters  are  required  to 
set  up  a  cache: 

•  the  number  of  records  in  the 
cache 

•  the  length  of  each  record 

•  a  pointer  to  an  external  function 
for  processing  free  records 

•  an  identifier  word  (long)  to  pass 
to  the  external  function 

If  the  application  interface  is  not 
going  to  defer  the  writing  of  disk 
blocks,  a  free  record-processing  rou¬ 
tine  is  not  necessary.  In  this  case, 
the  third  parameter  provided  should 
be  a  null  pointer  (char  *)  0.  The 
identifier  word  is  used  when  a 
single  routine  is  being  used  to 
handle  the  freed  blocks  from  multi¬ 
ple  caches.  This  value  can  be  used 
by  the  called  routine  to  identify  from 
which  cache  the  record  is  being 
transmitted. 


Using  the  Cache  Routines 

All the routines and the typedef for
the cache structure are in the file
cache.h, shown in Listing One, page
62. It's not really necessary for the
calling routine to have access to
the typedef because the calling routines
never need direct access to the
CACDS.

Cache.h declares the structure in
typedef, however, so that the calling
routine can conform to ANSI specifications
by declaring an external or
local variable of type CACDS. (With
most compilers it is acceptable for
the calling routine to store the returned
value of cacallo( ) in a character
pointer.)

The  cache.h  file  also  provides  the 
function  declarations  necessary  for 
function  type  checking  by  the  com¬ 
piler.  Listing  Two,  page  62,  shows  all 
the  cache-processing  routines, 
which  may  be  compiled  separately 
and/or  incorporated  into  a  subrou¬ 
tine  library.  Listing  Three,  page  67, 
is  a  simple  test  program  for  these 
routines. 

After the cache has been set up, it
will have to be checked to see if the
desired record is already stored
there. To do this, the cacfind( ) routine
is called. If the cache record
designated by the num parameter
isn't found, a null pointer is returned
and the calling routine can
then take the appropriate action. Usually
this means issuing a read( ) request
to get the desired block, then
adding the record to the cache according
to the procedure that follows.

If  the  cache  record  is  found  by 
cacfind( ),  a  character  pointer  to  the 
desired  record  is  returned  to  the 
calling  routine.  That  record  is  also 
taken  out  of  its  current  position  in 
the  LRU  chain  and  added  at  the 
MRU  end.  For  example,  suppose  the 
calling  routine  is  accessing  blocks  of 
512-byte  records  and  wants  to  read 
the  record  whose  number  is  stored 
in  the  variable  recn.  The  external 
character  pointer  cache  is  initialized 
with  the  return  value  of  an  earlier 
call  to  cacallo( ).  The  code  (sans  any 
error  checking)  might  appear  as 
shown  in  Example  1,  right. 

This example doesn't add any records
that were read from the disk
to the cache. A record produced by
the read( ) call should be added to
the cache at the MRU end because,
after all, it's the most recently used
record. Each time a record has to
be added to the cache, the
cacnum( ) routine is called. It is important
to note that the cacnum( )
routine does not check to see if the
added record is already in the
cache. Thus each call to cacnum( )
should be preceded by a cacfind( )


char buffer[512];
long recn;
char *rec;

{
    if ((rec = cacfind(cache, recn)) == NULL) {

        /* record not in cache, must be read from disk */

        lseek(fd, recn * 512, 0);
        read(fd, buffer, 512);
        rec = buffer;
    }

    /* rec points to record to process */
}


Example  1:  Accessing  blocks  of  512-byte  records 


char buffer[512];
long recn;      /* record number to write */
char *rec;      /* pointer to block in cache area */

{
    if ((rec = cacfind(cache, recn)) == NULL)
        rec = cacnum(cache, recn);
    memcpy(rec, buffer, 512);   /* copy data into cache area */

    lseek(fd, 512 * recn, 0);
    write(fd, rec, 512);
}


Example 2: Writing blocks of 512-byte records



call. 

The cacnum( ) routine does the
following:

1.  Finds  the  LRU  record  in  the  cache 
area. 


2.  If  that  record  has  been  marked  for 
processing,  calls  the  external  free 
block-processing  routine. 

3.  Makes  that  record  the  MRU. 

4.  Numbers  that  record  with  a  new 
number  supplied  by  the  calling  rou¬ 
tine. 

5.  Returns  a  character  pointer  to 
that  record. 


To avoid having to copy blocks from
place to place in RAM, and to eliminate
the need to allocate an external
buffer, the calling routines should
access the records through a character
pointer returned by cacfind( )
and cacnum( ).

To  complete  the  previous  exam¬ 
ple  for  reading  records,  the  follow¬ 
ing  code  (again  without  error  check¬ 
ing  for  simplicity)  can  be  used: 

long recn;
char *rec;

{
    if ((rec = cacfind(cache, recn)) == NULL) {
        rec = cacnum(cache, recn);
        lseek(fd, recn * 512, 0);
        read(fd, rec, 512);
    }

    /* rec points to record to process */
}

Writing data records undergoes a
similar process. A record that is written
also becomes the MRU record
via the cacfind( ) and the cacnum( )
routines. If the record is already in
the cache, however, there is no need
to call cacnum( ). The complementary
routine to the previous one
might look like that shown in Example
2, page 33.

Processing  Freed  Records 

As  mentioned  earlier,  it  is  some¬ 
times  advantageous  to  enter  outgo¬ 
ing  records  (those  to  be  written  to 
disk)  into  the  cache  without  actually 
performing  the  disk  I/O  operation, 
then  write  them  to  disk  only  when 
the  block  within  the  cache  that  the 
record  is  sitting  on  needs  to  be  used 
for  some  other  purpose.  Typically, 
this  would  be  the  case  for  a  scratch 
file  (such  as  a  symbol  table),  where 
the  file  would  not  even  be  created 
unless  the  data  overflowed  the 
cache.  In  most  other  cases,  it  is  best 
to  write  the  data  records  immedi¬ 
ately,  as  in  Example  2. 

If  the  writing  of  data  records  is  to 
be  deferred,  the  third  and  fourth 
parameters to the cacallo( ) function
are used and the cacproc( ) function
is  used  to  mark  records  that  are  to 
be  processed  before  the  space  they 


char buffer[512];
long recn;      /* record number to write */
char *rec;      /* pointer to block in cache area */

{
    if ((rec = cacfind(cache, recn)) == NULL)
        rec = cacnum(cache, recn);
    memcpy(rec, buffer, 512);   /* copy data into cache area */

    cacproc(cache, recn);       /* mark the block for processing */
}

Example  3:  Code  to  defer  writing  records 

write_cache(idnt, recn, recb)
long idnt;      /* cache identifier */
long recn;      /* record number */
char *recb;     /* record buffer */
{
    lseek(fd, recn * 512, 0);
    write(fd, recb, 512);
    return;
}

Example  4:  Example  function  to  write  a  record 


occupy  is  to  be  overwritten.  The 
function  specified  by  the  calling  rou¬ 
tine  is  called  with  three  parameters: 

1. the cache identifier (the fourth
parameter to cacallo( ))

2.  the  record  identifier  (generally  the 
block  number) 

3.  a  character  pointer  to  the  record 
to  process 

To implement this, the previous
record-writing routine would be
changed to the code in Example 3,
page 34.

The calling routine must provide
an external function to write the
record, specified by the initial call
to cacallo( ). When a block marked
for processing is needed for some
other purpose, the routine is called.
An example is shown in Example 4,
page 34.

In some applications, a block
marked for processing will become
obsolete. In this case, the routine
cacuprc( ) is called. This routine will
keep a block from being sent to the
external function that writes the records.
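The mark, unmark, and process-on-reuse cycle can be pictured with a single cache slot. The sketch below is not the library code: struct slot, slot_mark( ), slot_unmark( ), slot_reuse( ), and the counting callback are hypothetical names, and only the callback's three parameters (identifier, record number, record pointer) follow the description above.

```c
#include <stddef.h>

#define RECLEN 16       /* hypothetical record length for the sketch */

/* Shape of the external routine: identifier, record number, record. */
typedef void (*freefn)(long ident, long recn, char *rec);

struct slot {
    long recn;          /* record number held by the slot */
    int  marked;        /* nonzero if the record awaits processing */
    char data[RECLEN];
};

void slot_mark(struct slot *s)   { s->marked = 1; }  /* cf. cacproc( ) */
void slot_unmark(struct slot *s) { s->marked = 0; }  /* cf. cacuprc( ) */

/* Give the slot to a new record, processing the old one first if marked. */
char *slot_reuse(struct slot *s, long newrecn, freefn fn, long ident)
{
    if (s->marked && fn != NULL)
        fn(ident, s->recn, s->data);
    s->marked = 0;
    s->recn = newrecn;
    return s->data;
}

/* Demonstration callback: counts calls instead of writing to disk. */
static long flushed_count = 0;
static long last_flushed = -1;

void count_flush(long ident, long recn, char *rec)
{
    (void) ident;
    (void) rec;
    flushed_count++;
    last_flushed = recn;
}
```

A marked slot is handed to the callback exactly once, at the moment its space is reclaimed; a slot that was marked and then unmarked is reclaimed silently.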

Closing  Down  the  Cache 

Because cacallo( ) uses malloc( ) to
allocate all the memory the cache
uses, it's possible to free that
memory with cacfree( ). All the records
that are due for postprocessing
are processed at this time
by using the cacflsh( ) routine, before
the library routine free( ) is called.

Variable  Length  Records 

The  cache  routines  presented  in  this 
article  are  best  suited  to  small-  and 
medium-size  caches  of  large,  fixed- 
length  records.  Typically,  the  “re¬ 
cords"  cached  are  disk  blocks, 
which  contain  smaller,  logical  re¬ 
cords.  When  the  records  are  small, 
the  linear  search  would  become  in¬ 
efficient  in  short  order.  Figure  4, 
right,  shows  a  block  diagram  of  a 
complete  application  in  which  inter¬ 
face  programs  process  requests  from 
the  applications. 

Conclusion 

The routines provided herein can
be applied to almost any application.
They were originally implemented
on a Unix System III, 68000-based
system and were subsequently
ported to other environments.
They work well under MS-DOS
using all the memory models
of the Microsoft C compiler, although
a cache of more than 63K
does not work.

Availability 

All  the  source  code  for  articles  in 
this  issue  is  available  on  a  single 


disk.  To  order,  send  $14.95  to  Dr. 
Dobb’s  Journal,  501  Galveston  Dr., 
Redwood  City,  CA  94063,  or  call  (415) 
366-3600,  ext.  216.  Please  specify  the 
issue  number  and  format  (MS-DOS, 
Macintosh,  Kaypro). 

DDJ 

(Listings  begin  on  page  62.) 



Figure  4:  Application  interface 




ARTICLES 


Putting  ROM  Code 
in  Its  Place 

by  Rick  Naro 


Until  recently,  developers  pro¬ 
gramming  for  EPROM-based 
applications  didn’t  have  a  util¬ 
ity  that  would  take  the  standard 
output  of  MS-DOS  tools  and  spit  out 
a  file  suitable  for  burning  into 
EPROM.  After  spending  the  last  few 
months  longing  to  use  the  latest  in 
compilers,  such  as  Microsoft’s  C  or 
Borland’s  Turbo  C,  I  decided  that 
perhaps  the  best  tack  to  take  would 
be  to  develop  my  own  locate  utility. 

Although  it’s  an  easy  matter  to 
put  the  contents  of  a  .COM  file  into 
an  EPROM,  programs  in  .EXE  format 
are  a  different  animal  altogether. 
Unlike  the  binary  image  of  a  pro¬ 
gram  found  in  a  .COM  format,  a 
.EXE  file  is  a  relocatable  object 
module,  which  requires  that  the  seg¬ 
ment  references  in  a  program  be 
relocated  or  adjusted. 

Before  the  .EXE  file  can  be  run,  a 
loader  must  convert  the  relocatable 
format  into  an  executable,  absolute 
object  module.  In  the  typical  MS- 
DOS  system,  program  loading  and 
segment  fix-up  is  performed  by  COM- 
MAND.COM  and  as  such  is  transpar¬ 
ent  to  the  user.  The  COM- 
MAND.COM  loader  simply  reads  the 
relocatable  object  module  into 
memory,  performs  the  segment  fix¬ 
up  by  adjusting  the  segment  refer¬ 
ences  relative  to  the  base  segment 
of  the  load  module,  then  transfers 
control  to  the  program. 

Rick Naro is a part-time programmer
who is interested in software
development tools for embedded systems.
He can be reached c/o Paradigm
Systems, P.O. Box 152, Milford,
MA 01757 or as naro on BIX.


A DOS Locate utility


To support relocation, the .EXE
file is partitioned into two components:
a header containing the relocation
information and the actual
binary load module. Both the
memory requirements for the program
and the initial register values
are found in the header, so the first
step of loading the binary image is
easy.

COMMAND.COM  simply  requests 
a  suitably  sized  memory  block  in 
which  to  place  the  load  module, 
then  reads  the  binary  image  into 
that  block  of  memory.  Once  loaded 
into  memory,  segment  references 
are  fixed  relative  to  the  base  seg¬ 
ment,  using  the  segment  fix-up  re¬ 
cords.  When  a  programmer  codes 
the  following  instruction  sequence, 
for  example,  the  assembler  and 
linker  cannot  determine  the  final 
segment  value  to  be  used  for  the 
data  segment,  and  the  fix-up  is  left 
for  the  loader  to  perform: 

mov ax, data    ; Load ax with the data segment
mov ds, ax      ; Store in ds

Instead,  the  linker  inserts  the  offset 
(or  virtual  segment)  of  the  segment 
data  from  the  base  of  the  load 
module  into  the  binary  object 
module  and  inserts  an  entry  in  the 
segment  fix-up  record  pointing  to 
the  segment  reference  requiring  fix¬ 
up.  If  the  program  base  is  segment 
3000h  and  the  virtual  segment  of 


data  is  1234h,  the  loader  will  per¬ 
form  the  fix-up  by  adding  the  two 
and  overwriting  the  segment  offset 
with  the  sum  4234h  (which  is  the 
physical  segment  for  the  segment 
data  in  this  instance).  Once  all  seg¬ 
ment  fix-ups  have  been  processed, 
the  loader  can  transfer  control  to 
the  new  program. 
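The loader's arithmetic can be sketched as follows. This is a simplified illustration rather than the MS-DOS loader: fix-ups are given here as plain byte offsets into the image (an assumption made for brevity; real .EXE relocation entries are segment:offset pairs), and the function names are invented for the sketch.

```c
#include <stddef.h>

/* Read and write a little-endian 16-bit word in the load image. */
static unsigned get16(const unsigned char *p)
{
    return (unsigned) p[0] | ((unsigned) p[1] << 8);
}

static void put16(unsigned char *p, unsigned v)
{
    p[0] = (unsigned char) (v & 0xFF);
    p[1] = (unsigned char) ((v >> 8) & 0xFF);
}

/* Add the base segment to every word named by a fix-up offset. */
void apply_fixups(unsigned char *image, const unsigned long *fixoff,
                  size_t nfix, unsigned base_seg)
{
    size_t i;

    for (i = 0; i < nfix; i++) {
        unsigned char *p = image + fixoff[i];

        put16(p, (get16(p) + base_seg) & 0xFFFFu);
    }
}
```

Applied to the bytes B8 34 12 8E D8 (mov ax, 1234h followed by mov ds, ax) with one fix-up at offset 1 and a base segment of 3000h, the operand word becomes 4234h, matching the worked example in the text.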

An  MS-DOS  Locator 

For  embedded  systems,  a  special 
type  of  loader  called  a  locator  is 
required.  A  loader  is  distinguished 
from  a  locator  in  two  ways:  by 
output  destination  and  by  the  or¬ 
ganization  of  the  absolute  object 
module.  Although  a  loader  is  de¬ 
signed  to  write  the  absolute  object 
module  directly  to  memory  (for  im¬ 
mediate  execution),  the  output  of  a 
locator  is  an  Intel  extended  hex  file 
suitable  for  EPROM  burning. 
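For readers unfamiliar with the format, each line of an Intel hex file is a record of the form :LLAAAATTdd...CC, where LL is the data length, AAAA the 16-bit offset, TT the record type, and CC a checksum that makes all the bytes sum to zero modulo 256. The routine below is a minimal sketch of a record formatter; the name hex_record( ) is an assumption and it is not taken from the LOCATE listings.

```c
#include <stdio.h>

/* Format one Intel hex record into out (which must be large enough:
   11 + 2 * len characters plus the terminator). Returns the checksum. */
unsigned hex_record(char *out, unsigned type, unsigned addr,
                    const unsigned char *data, unsigned len)
{
    unsigned sum = len + ((addr >> 8) & 0xFF) + (addr & 0xFF) + type;
    unsigned i, cks;

    out += sprintf(out, ":%02X%04X%02X", len, addr & 0xFFFF, type);
    for (i = 0; i < len; i++) {
        out += sprintf(out, "%02X", (unsigned) data[i]);
        sum += data[i];
    }
    cks = (0x100 - (sum & 0xFF)) & 0xFF;    /* two's complement of the sum */
    sprintf(out, "%02X", cks);
    return cks;
}
```

Record type 00 carries data, type 01 marks end of file, and type 02 (the "extended" part of the format) sets a segment base so that addresses above 64K can be reached. A type 02 record selecting segment FC00h comes out as :02000002FC0000, and the end-of-file record as :00000001FF.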

Another  important  feature  of  a  lo¬ 
cator  is  its  ability  to  rearrange  seg¬ 
ments  at  arbitrary  addresses  to  re¬ 
flect  the  physical  organization  of  the 
target  system.  A  typical  embedded 
system  normally  contains  EPROM  at 
the  upper  addresses  for  the  pro¬ 
gram  code  and  RAM  at  the  lower 
addresses  for  data,  interrupt  vectors, 
and  the  stack.  Because  this  organiza¬ 
tion  is  incompatible  with  the  con¬ 
tiguous  MS-DOS  absolute  object 
module,  relocating  the  segments  to 
new  addresses  is  crucial  to  the  op¬ 
eration  of  the  locator.  When  the  lo¬ 
cator  has  finished  processing  a  .EXE 
file,  ROMable  code  and  data  will  be 
fixed  at  addresses  in  the  EPROM 
address  space  and  volatile  data  and 
the  stack  will  be  fixed  in  the  RAM 
address  space  and  the  segment  fix¬ 
ups  adjusted  to  reflect  the  rearrange¬ 
ment  of  segments. 




By  itself,  the  .EXE  file  header  con¬ 
tains  insufficient  information  for  re¬ 
location,  so  the  segment  map  of  the 
program  along  with  instructions  on 
where  the  segments  will  be  placed 
in  the  target  system  is  required.  The 
segment  map  is  prepared  by  the 
linker  and  identifies  each  segment 
by  name,  class,  length,  and  its  posi¬ 
tion  within  the  binary  load  module. 
The user must also prepare a configuration
file describing the characteristics
of the target system and the
physical addresses to which the program
segments will be bound. Using both the
map and configuration files, the locator
can extract and physically relocate
the segments to build the ROMable
load module.

Although  a  programmer  is  nor¬ 
mally  concerned  with  segments, 
they  are  far  too  numerous  and 
varied  in  name  to  be  of  much  use  to 
the  locator.  Instead,  the  locator 
works  with  classes.  A  class  is  simply 
a  tag  applied  to  a  segment  by  the 
assembler  or  compiler.  This  tag  per¬ 
mits  the  linker  to  group  a  segment 
with  other  related  segments. 

For  example,  each  separately  com¬ 
piled  source  file  in  the  large-memory 
model  will  generate  a  uniquely 
named  code  segment,  but  all  such 
segments  will  belong  to  the  code 
class.  Using  the  locator  directives,  a 
programmer  can  fix  the  address  of 
any  class  and  specify  the  order  of  a 
set  of  classes,  configuring  the  abso¬ 
lute  object  code  for  any  target  hard¬ 
ware. 

The  locator  also  needs  to  process 
the  segment  fix-ups  but  in  a  slightly 
different  manner  from  the  way  the 
loader  does.  Each  segment  listed  in 
the  segment  map  is  given  an  entry 
in  a  linked  list  that  contains  its  seg¬ 
ment  name,  length,  virtual  segment 
number,  and  physical  segment 
number  organized  by  class  name. 
When  the  configuration  file  is  read, 
the  base  segment  used  for  a  class  is 
fixed  and  all  physical  segment  num¬ 
bers  are  adjusted  by  adding  this 
value  to  the  segment  offset  within 
the  class.  Segment  fix-up  then  can 
proceed  with  the  virtual  segment 
number  from  the  fix-up  record  used 
to  scan  the  linked  list  looking  for  a 
match.  If  found,  the  corresponding 
physical  segment  number  is  re¬ 
turned  and  used  in  the  fix-up;  other¬ 
wise,  an  unresolved  segment  error 


is  reported. 
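The translation just described might be sketched as below. The structure and function names are hypothetical and do not come from the LOCATE listings, and the sketch assumes that the segments of a class appear in the list in ascending virtual-segment order, so the first match found for a class is its base.

```c
#include <stddef.h>
#include <string.h>

struct seg {
    const char *name;   /* segment name from the map file */
    const char *cls;    /* class name from the map file */
    unsigned virt;      /* virtual segment number */
    unsigned phys;      /* physical segment number, once located */
    struct seg *next;
};

/* Fix a class at base: each segment keeps its offset within the class. */
void class_locate(struct seg *list, const char *cls, unsigned base)
{
    unsigned first = 0;
    int seen = 0;

    for (; list != NULL; list = list->next) {
        if (strcmp(list->cls, cls) != 0)
            continue;
        if (!seen) {
            first = list->virt;
            seen = 1;
        }
        list->phys = base + (list->virt - first);
    }
}

/* Translate a virtual segment from a fix-up record.
   Returns 1 and stores the physical segment, or 0 if unresolved. */
int seg_translate(const struct seg *list, unsigned virt, unsigned *phys)
{
    for (; list != NULL; list = list->next) {
        if (list->virt == virt) {
            *phys = list->phys;
            return 1;
        }
    }
    return 0;
}
```

With three code-class segments at virtual segments 0000h, 0003h, and 0008h and the class fixed at FC00h, a fix-up that refers to 0008h is rewritten as FC08h, while a reference to a segment that is not in the list is reported as unresolved.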

Although  the  location  process 
sounds  simple,  there  are  two  pitfalls 
that  must  be  avoided.  As  noted  pre¬ 
viously,  an  MS-DOS  .EXE  file  is  de¬ 
signed  to  be  executed  from  a  con¬ 
tiguous  block  of  memory  whereas 
embedded  systems  typically  have  a 
fragmented  address  space  with  pock¬ 
ets  of  RAM  and  EPROM  placed  at 
the  whim  of  the  hardware  designer. 
A  potential  problem  exists  in  that 
two  adjacent  segments  in  different 
classes  can  share  a  common  virtual 
segment  and  then  be  located  non- 
contiguously  when  the  segments  are 
extracted.  Because  the  virtual-to- 
physical-segment  translation  in  this 
instance  is  ambiguous,  a  situation 
known  as  segment  aliasing  results. 
Segment  aliasing  can  be  avoided  by 
guaranteeing  that  two  segments  in 
different  classes  never  share  a 
common  virtual  segment.  This  is 
easily  accomplished  by  verifying  that 
the  first  segment  in  a  class  is  para¬ 
graph-aligned  or  that  each  segment 
spans  a  paragraph  boundary. 

There is also a potential problem
in using groups. A group is a collection
of unrelated segments that are
organized  to  fit  within,  and  be  ad¬ 
dressed  as,  a  single  physical  seg¬ 
ment.  Some  linkers,  such  as  the  Mi¬ 
crosoft  linker,  don’t  include  suffi¬ 
cient  information  for  the  locator  to 
reconstruct  a  group.  If  groups  are 
used,  the  user  must  have  informa¬ 
tion  on  the  organization  of  the 
group  and  include  instructions  in 
the  configuration  file  to  permit  its 
reconstruction.  This  is  accomplished 
by  using  the  locate  directives  to  fix 
the  address  of  the  first  class  in  the 
group  and  then  order  the  remaining 
classes  in  the  group  as  specified  by 
the  compiler  vendor. 

LOCATE 

LOCATE  is  an  MS-DOS  utility  that 
accepts  a  relocatable  .EXE  file  and 
outputs  an  absolute  object  module 
suitable  for  burning  in  EPROM.  The 
source  code  and  a  make  file  for 
building  LOCATE  can  be  found  in 
the  accompanying  listings  (begin¬ 
ning  on  page  68). 

Because  each  application  is 
unique,  LOCATE  uses  several  direc¬ 
tives  to  control  the  location  process. 
These  directives  are  used  to  identify 
ROMable  classes,  assign  physical  seg¬ 


ments  to  classes,  and  specify  the 
order  of  classes  in  the  absolute  load 
module. Some directives accept a
list of one or more operands. The [
and ] characters are used whenever
an operand is optional and can be
repeated zero or more times. Unless
otherwise specified, directives and
operands are delimited by white
space.

The  default  configuration  file  has 
the  file  name  of  the  input  .EXE  file 
with  an  extension  of  .CFG.  Using  a 
command-line  option,  the  default 
file  name  can  be  overridden  and  any 
file  can  be  specified  to  contain  the 
configuration  instructions.  This 
option  allows  multiple  load  mod¬ 
ules  to  share  a  common  configura¬ 
tion  file. 

Class  Directive 

The  class  directive  assigns  a  physi¬ 
cal  segment  to  a  class.  The  first 
segment  in  the  specified  class  is 
assigned  the  base  segment  number 
and  the  remaining  segments  in  the 
class  are  assigned  segments  relative 
to  the  first  segment  in  the  class. 
These  segments  depend  on  the 
length  of  the  preceding  segments 
and  the  segment  alignment. 

The  class  directive  uses  the  follow¬ 
ing  syntax: 

class  class  =  seg 

where  class  is  the  name  of  the  class 
and  seg  is  the  16-bit  physical  seg¬ 
ment  where  the  class  will  be  lo¬ 
cated.  For  example: 

class code = 0xfc00

assigns the class code to segment
fc00h and therefore the physical
address fc000h.
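The step from segment to physical address is the usual 8086 rule: the 16-bit segment is shifted left four bits (multiplied by 16) and the offset is added. A one-line helper, with a name chosen just for this illustration, makes the arithmetic explicit:

```c
/* Physical address of seg:off on the 8086: seg * 16 + off. */
unsigned long seg_to_phys(unsigned seg, unsigned off)
{
    return ((unsigned long) seg << 4) + off;
}
```

Segment fc00h with offset 0 gives fc000h, and segment 0100h gives 01000h.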

Order  Directive 

The  order  directive  is  used  to  spec¬ 
ify  the  ordering  of  two  or  more 
classes.  It  is  important  because  it 
allows  unrelated  classes  to  be  made 
contiguous  without  firsthand  knowl¬ 
edge  of  the  size  and  number  of  seg¬ 
ments  in  the  class. 

The  order  directive  uses  the  fol¬ 
lowing  syntax: 

order  class  [class] 

where  the  first  class  in  the  list  was 



specified  in  a  class  directive.  Any 
class  names  specified  after  the  first 
are  located  contiguously  and  aligned 
to  the  segment  boundary  of  the  first 
segment  in  each  class.  For  example: 

order  code  data  bss 

orders  the  classes  data  and  bss  im¬ 
mediately  following  the  class  code. 

Dup  Directive 

The  dup  directive  is  used  to  make  a 
copy  of  the  specified  class.  If  used, 
the  dup  directive  should  appear 
before  any  other  directives. 

The dup directive uses the syntax:

dup class dup_class

where class is an existing class and
dup_class is the name given to the
copy of class. For example, the directive:

dup  data  const 

makes  a  copy  of  the  data  class 
named  const.  This  command  is 
used  in  conjunction  with  the  order 
directive  to  locate  the  data  and  bss 
classes  in  RAM  but  force  a  copy  of 
the  class  data  to  be  included  in 
EPROM  for  power-on  initialization 
of  any  initialized  data. 

If the class data contains the initialized
data from a compiler, the
following commands will locate data
at address 1000h and create a copy
of data called const to be placed
after the code class. The start-up
code can then initialize the class
data by copying the const class to
the data class.

dup data const          ; Copy the class and call it const
class data = 0x100      ; Fix data at address 01000h
class code = 0xfc00     ; Fix code at address fc000h
order code const        ; Const to immediately follow code
                        ; And read by the startup code

Rom Directive

The rom directive is used to specify
which classes are ROMable. Classes
containing program code and constant
data need to be located and
placed in ROM to be available when
the system is powered up and initialized.
Other classes such as uninitialized
data and the stack require
only to be located at a physical
address and do not need to be
placed in the output file.

The rom directive uses the following
syntax:

rom class [class]

For example:

rom code const

forces the classes code and const to
be placed in the output object file.


/* This program demonstrates the use of the LOCATE utility.  It
   contains all of the components of a typical C program to
   exercise the startup code and locate utility.
*/

char *ptr = "class DATA" ;          /* Initialized data */
int array[10][10] ;                 /* Uninitialized data */

main()
{
    int i, j ;                      /* Automatics */
    static char *s = [...] ;        /* Static initialized data */

    for (i = 0; i < 10; i++) {
        for (j = 0; j < 10; j++)
            array[i][j] = i * j ;
    }
    strcpy(s, ptr) ;                /* Bring in a library function */
}


Example 1: The demonstration program


C>masm /MX tc, tc, tc ;
Microsoft (R) Macro Assembler Version 4.00
Copyright (C) Microsoft Corp 1981, 1983, 1984, 1985.  All rights reserved.

49272 Bytes symbol space free
0 Warning Errors
0 Severe Errors

C>tcc -c -ml demo
Turbo C Version 1.0 Copyright (c) 1987 Borland International
demo.c:

        Available memory 293342


Example 2: Compiling the C source code and the MASM start-up module


C>type demo.map

 Start   Stop    Length  Name          Class

 00000H  00035H  00036H  TEXT          CODE
 00036H  0007FH  0004AH  DEMO_TEXT     CODE
 00080H  000A8H  00029H  STRCPY_TEXT   CODE
 000B0H  000BFH  00010H  ETEXT         CODEEND
 000C0H  000D3H  00014H  DATA          DATA
 000E0H  001A7H  000C8H  BSS           BSS
 001A8H  001A8H  00000H  BSSEND        BSSEND
 001B0H  003AFH  00200H  STACK         STACK

  Address     Publics by Value

 0000:0000    START
 0003:0006    MAIN
 0008:0000    STRCPY
 000B:0010    TEND
 000C:0000    PTR
 000C:0000    IDATA
 000C:0020    ARRAY
 000C:0020    BDATA
 000C:00E8    EDATA
 001B:0200    TOS

Program entry point at 0000:0000


Example 3: The linker map file


Comments 

To  aid  in  documenting  the  location 
process,  comments  can  be  added  to 
the  tail  of  any  command  line  or  as  a 
separate  line.  Comments  begin  with 
a  semicolon  (;)  and  continue  to  the 
end  of  the  line.  Blank  lines  and  com¬ 
ments  can  appear  freely  within  the 
configuration  file  for  documentation 
and  readability. 
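As a rough illustration of the comment rule just described, the following C sketch trims a semicolon comment from one configuration line. The helper name strip_comment is hypothetical and is not taken from the LOCATE source; it only assumes that a semicolon never appears inside a directive.

```c
#include <string.h>

/* Remove a trailing ';' comment and trailing white space from one
   configuration line, in place. Returns 1 if something is left to
   parse and 0 for a blank or comment-only line.
   (Hypothetical helper; not part of LOCATE itself.) */
static int strip_comment(char *line)
{
    char *semi = strchr(line, ';');
    size_t len;

    if (semi != NULL)
        *semi = '\0';                 /* cut the line at the comment */

    len = strlen(line);
    while (len > 0 && (line[len - 1] == ' '  || line[len - 1] == '\t' ||
                       line[len - 1] == '\n' || line[len - 1] == '\r'))
        line[--len] = '\0';           /* drop trailing white space */

    return len > 0;
}
```

Called on "rom CODE CONST ; ROM only the code", the buffer is left holding "rom CODE CONST"; a line holding only a comment comes back empty and can be skipped.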

Options 

In  order  to  provide  a  degree  of  flexi¬ 
bility,  the  LOCATE  utility  can  accept 
command-line  options  (or  switches 
if  you  prefer)  that  influence  the  op¬ 
eration  of  the  locator.  Command¬ 
line  options  are  lowercase  letters 
introduced  with  a  leading  dash  (-) 
with  no  white  space  between  the 
option  letter  and  the  argument. 
Some  examples  of  LOCATE  com¬ 
mand  lines  are: 

locate  -b  hello 

locate  -b  -ccommon.cfg  hello 
locate  -hhello.hx  -b  hello 

A command line begins with locate, is followed by zero or more options, and is terminated with the path name of the file to be located. In the following descriptions, left and right brackets ([ and ]) are used to denote mandatory arguments.
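The command-line shape described here (options first, each a dash, a letter, and an argument with no space between them, then one path name) could be parsed along the following lines. This is only a sketch: the struct and function names are invented for illustration, and the option letters are the four documented in this section (-b, -c, -h, -p).

```c
#include <stddef.h>

/* Hypothetical container for the settings described in the article. */
struct locate_opts {
    int boot_jump;          /* nonzero if -b was given          */
    const char *cfg_name;   /* -c[filename], NULL means default */
    const char *hex_name;   /* -h[filename], NULL means default */
    const char *map_name;   /* -p[filename], NULL means default */
    const char *input;      /* path name of the file to locate  */
};

/* Returns 0 on success, -1 on an unknown option, a missing option
   argument, or a missing or extra path name. */
static int parse_locate_args(int argc, char **argv, struct locate_opts *o)
{
    int i;

    o->boot_jump = 0;
    o->cfg_name = o->hex_name = o->map_name = o->input = NULL;

    for (i = 1; i < argc && argv[i][0] == '-'; i++) {
        const char *arg = argv[i] + 2;   /* text right after the letter */

        switch (argv[i][1]) {
        case 'b': o->boot_jump = 1; break;
        case 'c': if (*arg == '\0') return -1; o->cfg_name = arg; break;
        case 'h': if (*arg == '\0') return -1; o->hex_name = arg; break;
        case 'p': if (*arg == '\0') return -1; o->map_name = arg; break;
        default:  return -1;
        }
    }
    if (i != argc - 1)
        return -1;                       /* exactly one path name follows */
    o->input = argv[i];
    return 0;
}
```

For the second example command line above, the sketch would report boot_jump set, cfg_name pointing at "common.cfg", and input pointing at "hello".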

-b: The default setting for LOCATE is to generate an Intel extended hex start address record containing the entry point of the program. By specifying the -b option, LOCATE will create an absolute segment at address FFFF0H and place an intersegment jump instruction to the entry point of the program.

-c[filename]: specifies a different configuration file. The default configuration file is filename.CFG, where filename is the name of the load module. One use of this option is to allow different object modules to be located using a shared configuration file.

-h[filename]: changes the name of the Intel extended hex output file. Normally, the output is placed in a file with the same name as the .EXE input file and a default extension of .HEX.

-p[filename]: changes the name of the locate map file containing the segment assignments and public symbols. Normally, the locate statistics are placed in a file with the same name as the .EXE input file and a default extension of .LOC.

LOCATE Example

To demonstrate the use of LOCATE, I'll now discuss an example that uses the Turbo C compiler from Borland. Using the large-memory model, the program in Example 1, page 42, loads several different code and data segments that can then be processed by LOCATE. The compiled C source and Turbo C run-time routines are linked with the Turbo C assembly-language start-up code, TC.ASM. The start-up code together with the order and dup directives demonstrates how the locator can initialize the data class and zero out the bss class.

;
; This configuration file is used with Turbo C to build a
; ROMable image. It defines physical addresses for three
; classes, makes a copy of the initialized data class to
; keep in ROM and instructs the locator the order of the
; different classes.
;
dup    DATA CONST          ; Make a copy of the initialized data class
                           ; and name it CONST
class  CODE = 0xfc00       ; Start code at address FC000H
class  STACK = 0x0080      ; The stack at address 00800H
class  DATA = 0x0100       ; DGROUP at address 01000H
order  DATA BSS BSSEND     ; Define the order of DGROUP
order  CODE CODEEND CONST  ; And the order of classes in ROM
rom    CODE CONST          ; ROM only the program code and the copy
                           ; of the initialized data that the startup
                           ; code copies from ROM to DGROUP

Example 4: The configuration file

As shown in Example 2, page 42, you begin by compiling the C source and the MASM start-up module. The Turbo C options used are to disable linking (-c) and select the large-memory model (-ml).

I disable the automatic link following the compile so that I can substitute a ROMable version of the C start-up. The Turbo C large-memory-model library is searched to satisfy the external reference to the strcpy( ) function, as follows:

C>tlink /m tc demo, demo, demo, \turboc\lib\cl
Turbo Link Version 1.0 Copyright (c) 1987 Borland International

For reference, the linker map file for this example is reproduced in Example 3, page 42. Note the segment and class assignments and watch how the locator processes and converts the executable image to a ROMable image.

The  configuration  file  for  the  ex¬ 
ample  (see  Example  4,  page  45)  must 
be  able  to  handle  the  group  and  the 
initialized  data  generated  by  Turbo 
C.  LOCATE  is  instructed  to  make  a 
copy  of  the  data  class  that  contains 
the  program  initialized  data.  Next 
the  base  segments  of  the  three  inde¬ 
pendent  classes  (code,  data,  and 
stack)  are  specified  using  the  class 
directive.  The  order  directive  is  used 
to  recreate  the  Turbo  C  dgroup  and 
fix  the  copy  of  the  initialized  data 
segment  immediately  following  the 
codeend  class.  With  it  tucked  nicely 
in  EPROM  and  its  physical  address 
determined  by  the  tend  label,  the 
start-up  code  can  copy  the  class 
const  to  the  data  class  before  calling 
main( ). 
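The segment values given to the class directive are 8086 paragraph numbers, which is why the configuration file's 0xfc00 is commented as address FC000H. A minimal C sketch of that arithmetic follows; the function name phys_addr is an illustrative choice, not something LOCATE defines.

```c
/* 8086 real-mode address arithmetic: physical = segment * 16 + offset,
   wrapped to the 20-bit address bus. (Illustrative helper only.) */
static unsigned long phys_addr(unsigned int seg, unsigned int off)
{
    return ((((unsigned long)seg & 0xFFFFUL) << 4) + (off & 0xFFFFUL))
           & 0xFFFFFUL;
}
```

With the values from this example, phys_addr(0xFC00, 0) is FC000H for the start of the code class, and phys_addr(0x0100, 0x00E8) is 010E8H, the end of the initialized and uninitialized data.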

LOCATE  is  then  executed  to  proc¬ 
ess  the  .EXE  file  and  output  the 
absolute  load  module,  as  follows: 

C>locate  -b  demo 
MS-DOS  Locate  Utility 
Copyright  (C)  1987  Paradigm  Sys¬ 
tems.  All  rights  reserved 

The  output  from  the  locator,  shown 
in  Example  5,  page  46,  is  an  Intel 
extended  hex  file  and  a  segment 
map  detailing  the  new  physical  ad¬ 
dress  assignments.  The  locate  map 
also  contains  a  list  of  public  sym¬ 


bols  for  use  in  debugging  the  target 
system.  Note  how  the  segments  and 
classes  have  been  relocated  accord¬ 
ing  to  the  instructions  in  the  con¬ 
figuration  file  and  correspond  to  the 
addresses  of  the  target  hardware. 
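For readers unfamiliar with the Intel extended hex format, each record is a colon followed by a byte count, a 16-bit address, a record type, the data bytes, and a checksum that makes all the bytes sum to zero modulo 256. The C sketch below formats one record; the function names are ours, not LOCATE's, and the sketch assumes at most 255 data bytes per record.

```c
#include <stdio.h>

/* Two's-complement checksum over the length, address, type, and data bytes. */
static unsigned char ihex_checksum(const unsigned char *bytes, int count)
{
    unsigned int sum = 0;
    int i;

    for (i = 0; i < count; i++)
        sum += bytes[i];
    return (unsigned char)((0x100u - (sum & 0xFFu)) & 0xFFu);
}

/* Format one record as text. out must hold at least 2 * len + 12
   characters; data is not read when len is 0. */
static void ihex_record(char *out, int len, unsigned int addr, int type,
                        const unsigned char *data)
{
    unsigned char raw[4 + 255];
    int i, n = 0;

    raw[n++] = (unsigned char)len;
    raw[n++] = (unsigned char)((addr >> 8) & 0xFFu);   /* address, high byte */
    raw[n++] = (unsigned char)(addr & 0xFFu);          /* address, low byte  */
    raw[n++] = (unsigned char)type;
    for (i = 0; i < len; i++)
        raw[n++] = data[i];

    *out++ = ':';
    for (i = 0; i < n; i++)
        out += sprintf(out, "%02X", raw[i]);
    sprintf(out, "%02X", ihex_checksum(raw, n));
}
```

Under these assumptions, the -b boot jump for this example (a far JMP, opcode EAh, to FC00:0000) would be emitted as a type 02 segment record for FFFFh followed by a five-byte data record at offset 0, and the file would end with the type 01 end-of-file record.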

The  file  DEMO.HEX  (in  Example 
6,  page  46)  is  now  ready  to  be  sent 
to  the  EPROM  programmer.  In  an 
8-bit  system,  the  data  is  burned  di¬ 
rectly  into  one  or  more  EPROMs.  In 
a  16-bit  bus  system,  the  EPROM  pro¬ 
grammer  must  be  used  to  split  the 
load  module  into  upper  and  lower 
bytes  for  programming  the  upper 
and  lower  bytes  in  separate 
EPROMs. 
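If the EPROM programmer cannot do the split itself, the same result can be had with a small host-side filter. The sketch below assumes the usual 8086 arrangement, in which even addresses sit on the low byte of the bus and odd addresses on the high byte; the function name is hypothetical.

```c
#include <stddef.h>

/* Split a binary image into the bytes destined for the even-address
   (low) EPROM and the odd-address (high) EPROM. Each output buffer
   must hold at least (n + 1) / 2 bytes. */
static void split_even_odd(const unsigned char *image, size_t n,
                           unsigned char *even, unsigned char *odd)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (i & 1)
            odd[i >> 1] = image[i];     /* odd address: high byte lane */
        else
            even[i >> 1] = image[i];    /* even address: low byte lane */
    }
}
```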

Summary 

Although a simple example, the sample program demonstrates the power and flexibility of turning a low-cost PC into a powerful embedded-system development tool for the NEC and Intel microprocessors. With access to a wide range of popular software development tools, program development for embedded systems has never been easier.

DDJ 

(Listings  begin  on  page  68.) 



C>type demo.loc

MS-DOS Locate Utility Version 1.0

Input File:          DEMO.EXE
Output File:         DEMO.HEX
Configuration File:  DEMO.CFG
Invoked by:          C:\BIN\LOCATE.EXE -b demo
Date/Time:           Mon Aug 03 14:40:17 1987

Segment Information

Name         Class       Address  Length
TEXT         CODE        FC000H   0036H
DEMO_TEXT    CODE        FC036H   004AH
STRCPY_TEXT  CODE        FC080H   0029H
ETEXT        CODEEND     FC0B0H   0010H
DATA         DATA        01000H   0014H
BSS          BSS         01020H   00C8H
BSSEND       BSSEND      010E8H   0000H
STACK        STACK       00800H   0200H
DATA         NEWDATA     FC0C0H   0014H
77BOOT       (ABSOLUTE)  FFFF0H   0005H

Public Symbols

FC00:0000  START
FC03:0006  MAIN
FC08:0000  STRCPY
FC0B:0010  TEND
0100:00E8  EDATA
0100:0020  BDATA
0100:0020  ARRAY
0100:0000  IDATA
0100:0000  _PTR
0080:0200  TOS

Entry Point = FC00:0000

Example 5: The output from the locator

: 0200O002FC0O00 

: 10000 OOOFAB88OOO8EDOBCOOO2B8 000 18EC0B80BD8 
: 100010 OOFC408ED8BE00008BFEB920002BCFF3A48D 
: 10002000061F32COBF2000B9E8002BCFF3AAFB9AUD 
:  060 030000 60003FCEBCA10 
: 02000002FC03FD 

: 10000 6005 65733F6EB2433FFEB1A8BC6F7E7508BC 4 
: 10001 600C6BA1400F7E28BD88BC7D1E003D858894B 
: 10002 6008720004 783FF0A7CE14683FE0A7CD7FFD0 
: 10003 6003 60200FF360000FF3 60 600FF3 604 00 9A3F 
: OA004600000008FC83C4085F5ECBD5 
: 02000002FC08F8 

: 100000005 657558BECFC1EC47E0E8BF732C0B9FFE1 
: 10001000FFF2AEF7D18CC38EDBC47EOAF3A41F8B34 
: 090020005 60C8B460A5D5F5ECBB5 
: 02000002FCOCF4 

:100000000800000113000001436C 6173732 0444138 
:  04001000544 1000057 
: 02000002FFFFFE 
:  05000000EA000000FO15 
:  04OOOOO3FCOO000OFD 
:00000001FF 


Example  6:  The  file  DEMO.HEX 




ARTICLES 


Integers  Don’t  Float 

by  Ray  Mariella 


To  do  graphics  programming, 
you  need  a  fast,  integer 
square  root  routine — the 
faster  the  better.  Unfortunately, 
many  published  square  root  rou¬ 
tines  seem  to  be  transported  floating¬ 
point  routines,  and  the  demands  for 
accuracy  are  quite  different  for  float¬ 
ing-point  numbers  and  integers.  In 
fact,  a  single  pass  of  Newton's 
method  may  give  you  sufficient  ac¬ 
curacy  and  still  be  faster  than  the 
routines  borrowed  from  floating¬ 
point  experience. 

The use of Newton's method is familiar: given a target integer, N, the first step is to get an estimate X0 of the square root of N, then apply Newton's first derivative method to iterate, which is:

$$f(X_1) = f(X_0) + (X_1 - X_0)\,\frac{df(X)}{dX}$$

If $\sqrt{N}$ is desired, let $f(X) = X \times X$ and $df(X)/dX = 2X$. Then, if X1 is the square root of N, $f(X_1) = N$. Using this definition of f(X), the equation can be written $N = (X_0 \times X_0) + (X_1 - X_0)(2X_0)$, or to find X1 from a first guess X0, use:

$$X_1 = (X_0 + N/X_0)/2$$

If X1 isn't accurate enough, substitute X1 for X0 in this second equation to get X2, and so on. The catch comes from the implementation of "and so on."

In algorithms that have been transferred from floating-point procedures, in which 64 bits and more are used for accuracy, many iterations may be needed and an error value must be calculated and monitored. When the error is small enough, the routine stops and returns the latest value, Xi, as the square root. Because multiple iterations are needed and an error value is monitored on each pass, the usual code does not use the second equation on its first pass but rather uses X1 = N/X0 for convenience. You can see this in the typical code fragment for 32-bit integers on the 8086, shown in Example 1, below.

Ray Mariella, 872 El Qpanito Ct., Danville, CA 94526. Ray works at the Lawrence Livermore National Laboratory in Livermore, California, growing compound semiconductor films for ultra-high-speed circuits and optical devices.

Calculating square roots without penalties

Even with a good first guess of X0, such as the average of the maximum lower bound and the minimum upper bound by factors of 2, a single pass of ISQRT does not usually suffice. For example, to find the square roots of all the integers from 1 to 65,535, an average of 1.6 divisions per root is needed, or an extra 42,150 iterations. Using the same first guess, ROOT.C (the C code in Example 2, page 50) allows the accurate calculation of the integer square roots using a single pass. The assembly-language code for one pass of Newton's method is much shorter than that for ISQRT, as shown in Example 3, page 52.

At this point it's useful to define accuracy. With most integers, an integer square root is not an exact root. When I speak of the "exact" square root, I refer to rounding the exact floating-point square root to the nearest integer. My assembly-language code is written for the 8086 line of CPUs, and my "exact" answers come from the 8087 numeric coprocessor. (See Listings One through Four, beginning on page 98.)

; the 32-bit integer N is in DI:SI
; initial guess X0 is in BX
ISQRT:  MOV   DX, DI        ;prepare for division
        MOV   AX, SI        ;DX:AX / BX
        DIV   BX            ;N/X0
        SUB   AX, BX        ;error term
        CMP   AX, 1         ;check if > +1
        JG    ISQ1          ;if above 1, keep on
        CMP   AX, -1        ;check for -1, 0, +1
        JGE   done          ;if OK, get out
ISQ1:   SAR   AX, 1         ;(N/X0 - X0)/2
        ADD   BX, AX        ;(N/X0 + X0)/2 = X1
        JMP   short ISQRT   ;use X1 as X0

Example 1: A code fragment for calculating square roots of 32-bit integers on the 8086


Applications vary, and if you demand that the integer square root either agree with the exact root or differ by -1, then ROOT.C works up to N = 127,087 (the square root of 127,088 is 356.49 and a single pass gives 357). If you're willing to accept 0, or +1, or -1 as an error, then a single pass of Newton's method can be used up to 1,941,799 (for 1,941,800 the square root is 1393.48 and a single pass gives 1,395).

/* ROOT.C  a square root algorithm by RPM */
/* long integers, single pass of Newton   */

#include <stdio.h>

int main(void)
{
    long int N, guess2, sqrrt;
    register int infi, guess1;

    printf("\n square root of what number = ");
    scanf("%ld", &N);

    {   guess1 = infi = 1;          /* first guess: bracket the root by powers of 2 */
        guess2 = N;
logit:  infi <<= 1;
        if (infi < guess2)
        {   guess2 >>= 1;           /* div by 2 */
            guess1 = infi;
            goto logit;
        }
        guess1 += guess2;           /* sum */
        guess1 >>= 1;               /* avg */

        /* newton's method */
        infi = N / guess1;
        sqrrt = infi + guess1;
        sqrrt >>= 1;
    }
    printf(" square root = %ld", sqrrt);
    return 0;
}

Example 2: C code to calculate integer square roots using a single pass
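To check the limits quoted above without typing numbers in by hand, the body of ROOT.C can be recast as a function. The version below is a sketch rather than the author's listing: the name isqrt_newton, the passes parameter, the unsigned long types, and the guard for N = 0 are all additions made for illustration.

```c
/* Integer square root by Newton's method, using the same power-of-two
   first guess as ROOT.C. passes is the number of Newton passes to
   apply (1 for the single-pass routine, 2 for the two-pass variant). */
static unsigned long isqrt_newton(unsigned long n, int passes)
{
    unsigned long lo = 1, probe = 1, hi = n, x;

    if (n == 0)
        return 0;                   /* avoid dividing by a zero guess */

    for (;;) {                      /* bracket the root by factors of 2 */
        probe <<= 1;
        if (probe >= hi)
            break;
        hi >>= 1;
        lo = probe;
    }
    x = (lo + hi) >> 1;             /* first guess: average of the bounds */

    while (passes-- > 0)
        x = (n / x + x) >> 1;       /* X1 = (X0 + N/X0) / 2 */
    return x;
}
```

Traced by hand, a single pass gives 356 for 127,087 and 357 for 127,088, and 1,395 for 1,941,800, which agrees with the figures in the text.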

Skipping  all  those  compare,  condi¬ 
tional  jump  statements,  and  multi¬ 
ple  passes  can  save  a  lot  of  CPU 
time.  On  my  8-MHz  PC6300  with 
V30,  the  assembly-language  program 
RALL16  (Listing  Four)  which  is  opti¬ 
mized  for  16-bit  integers,  finds  all 
the  square  roots  from  1  to  65,535  in 
2.5  seconds,  with  an  empty  loop 
time  of  0.2  seconds,  or  about  35 
microseconds  per  root.  This  is 
slightly  faster  than  the  speed  at 
which  the  8087  performs  the  same 
task.  I  found  that  29,776  of  the  roots 
were  1  less  than  the  exact  root  and 
the  rest  agreed  with  the  exact  root. 
The  Microsoft  C  4.0  version  of 
RALL16  took  4.6  seconds  for  all  inte¬ 
gers  from  1  to  60,000,  or  about  73 
microseconds  per  root. 

For  32-bit  integers  up  to 
4,294,967,295  (FFFF:FFFF),  two  passes 
of  the  second  equation  are  needed. 
My program code, ISQRT32.ASM (in Listing One), for the 8086 line is cumbersome, but it does work for all these integers (it takes approximately 153 hours to compare the V30 and 8087 square roots). Again, the result after two passes of Newton's method either agrees with the exact integer or is 1 less. This code executes in about 110 microseconds per root on my machine. (Here the 8087 is more than twice as fast as the V30 and looks very attractive indeed!)


Again with Microsoft C 4.0, the two-pass version of ROOT.C takes an average of 360 microseconds per root for integers from 10,000 to 100,000,000 in steps of 10,000 (using register variables). This V30 performance is about the same speed as the same program on a Macintosh Plus with Lightspeed C. The maximum integer that ROOT.C can handle depends upon the compiler and its acceptance of such things as unsigned register variables.

The decision whether to use the single-pass or double-pass version of Newton's procedure can usually be made by the programmer before starting, based on the maximum integer that the program will see. In either event, my measurements have shown that using the unaltered Newton's method provides greater speed and somewhat better accuracy than can be attained using the more common algorithms that came from the floating-point literature.

; the 32-bit integer N is in DI:SI
; initial guess X0 is in BX
NEWTON: MOV   DX, DI        ;prepare for division
        MOV   AX, SI        ;DX:AX / BX
        DIV   BX            ;N/X0
        ADD   BX, AX        ;(N/X0 + X0)/2 = X1
        RCR   AX, 1         ;(N/X0 - X0)/2

Example 3: Assembly-language code for one pass of Newton's method

Along the same lines, in the March 1986 issue of DDJ, Richard Campbell described both a C routine and an assembly-language version for the NS320xx processor for calculating square roots. (eds.)

Availability 

Most of the source code for articles in this issue is available on a single disk. To order, send $14.95 to Dr. Dobb's Journal, 501 Galveston Dr., Redwood City, CA 94063, or call (415) 366-3600, ext. 216. Please specify the issue number and format (MS-DOS, Macintosh, Kaypro).

DDJ 

(Listings  begin  on  page  98.) 



ARTICLES 


A  Graphics  Toolbox 
for  Turbo  C — Part  2 


In  Part  1  of  “A  Graphics  Toolbox 
for  Turbo  C”  ( DDJ ,  November 
1987),  I  developed  a  library  of 
low-level  graphics  routines  for  Turbo 
C  and  explored  ways  to  incorporate 
them  into  high-level  drawing  rou¬ 
tines  for  APA  (all-points  addressable, 
or  pixel-oriented)  graphics.  In  this 
article,  I  use  the  library  routines  de¬ 
veloped  in  Part  1  to  create  text  graph¬ 
ics  such  as  menu  bars,  pop-up  win¬ 
dows,  pull-down  submenus,  and  the 
like. 

Popping  up  a  visual  object  that 
obscures  a  portion  of  the  screen, 
then  restoring  that  overlaid  area 
after  removing  that  object,  is  a  visu¬ 
ally  exciting  if  relatively  simple  pro¬ 
gramming  trick.  Before  writing  an 
object  to  screen,  you  simply  copy 
display  memory  into  a  save  buffer; 
to  make  the  object  go  away — that  is, 
to  restore  the  screen  to  its  former 
appearance — you  just  write  back  the 
screen  previously  saved  to  buffer 
into  display  memory.  In  this  way, 
the  text  obscured  by  the  pop-up 
reappears  magically  intact. 

Finding display memory can be a bit more challenging, however. The display memory location depends on the kind of adaptor you're using. To get that information from the operating system, I constructed a function called videomode( ). It determines which type of video adaptor is active: if videomode( ) returns 7, there's an MDA with display memory at B000h:0; otherwise, display memory begins at segment B800h.

Kent Porter, 1909-4 Montecito Rd., Mountain View, CA 94043. Kent has written 17 books about programming and hundreds of magazine articles on computer hardware and software. He is a technical editor for DDJ.

by Kent Porter

Easy menu bars, pop-up windows, and pull-down submenus

Text screens for IBM-standard adaptors occupy 4,096 bytes, whereas for the Hercules the display memory is 16,384 bytes. You can find out the video buffer size from 40h:4Ch using Turbo C's peek( ) function, which returns an integer indicating screen memory size.

Some non-MDA adaptors provide for more than one video page. That is, they subdivide the total display memory into 4K segments (pages) and allow you to select which is the active display buffer. That's what the video library function setpage( ) does, and activepage( ) tells which it is. Therefore, with a non-MDA adaptor, you have to determine the active page and multiply by 4,096 to determine the starting offset of the current video buffer within segment B800h. (Or you can fetch this number directly from 40h:4Eh.)
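The address arithmetic in the last few paragraphs is small enough to show directly. In this sketch the mode and page numbers are passed in as plain integers, so nothing here touches the BIOS or the video library; the function names are hypothetical, and the 4,096-byte page size is the 80-column figure used in the text.

```c
#define MDA_MODE    7        /* video mode reported for a monochrome adaptor */
#define PAGE_BYTES  4096u    /* one 80-column text page, per the article     */

/* Segment of text display memory for the given video mode. */
static unsigned int video_segment(int mode)
{
    return (mode == MDA_MODE) ? 0xB000u : 0xB800u;
}

/* Starting offset of a video page within that segment. The MDA has
   only one page, so its offset is always 0. */
static unsigned int page_offset(int mode, int page)
{
    return (mode == MDA_MODE) ? 0u : (unsigned int)page * PAGE_BYTES;
}
```

For example, page 2 in an 80-column color text mode starts at B800h:2000h.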

Once you know the size and location of the active page's video buffer, you can allocate a node of that size on the heap and use Turbo C's movedata( ) function to save the display image. You can then create a pop-up. Later, simply reverse the source and destination parameters in movedata( ) to restore the screen and make the pop-up vanish.
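The save-and-restore idea can be tried on any machine by standing an ordinary array in for display memory. The sketch below does that, with memcpy( ) playing the part that movedata( ) plays in real mode; the array, the function names, and the fixed 4,096-byte size are assumptions made for illustration and are not the saveScrn( ) and restScrn( ) code from the listings.

```c
#include <stdlib.h>
#include <string.h>

#define SCREEN_BYTES 4096

/* Stand-in for the video buffer so the sketch runs anywhere. */
static unsigned char display[SCREEN_BYTES];

/* Copy the screen to a heap node; returns NULL if the heap is full. */
static unsigned char *save_screen(void)
{
    unsigned char *copy = malloc(SCREEN_BYTES);

    if (copy != NULL)
        memcpy(copy, display, SCREEN_BYTES);
    return copy;
}

/* Copy a saved image back (source and destination reversed) and
   release the heap node. */
static void restore_screen(unsigned char *copy)
{
    if (copy == NULL)
        return;
    memcpy(display, copy, SCREEN_BYTES);
    free(copy);
}
```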

Restoring  the  screen  is  prob¬ 
lematic  with  the  IBM-brand  CGA, 


which  has  a  flawed  design  that  gen¬ 
erates  “snow.”  This  occurs  when 
you  write  directly  to  screen  memory 
in  text  mode.  Fortunately,  this  isn’t 
a  problem  with  the  Compaq  and 
most  non-IBM  CGA  clones.  The  solu¬ 
tion  with  the  IBM  board  involves 
synchronizing  character  movements 
with  the  video  controller.  (For  more 
detail  on  this  procedure,  see  Ray 
Duncan’s  Advanced  MS-DOS,  page 
79). 

The saveScrn( ) and restScrn( ) functions near the end of POPWIN.I (Listing One, page 106) furnish somewhat simplified versions of this discussion. They assume that the active page is 0 (the default) and that an IBM-standard adaptor is in use. Also, they don't deal with snow.

You  might  wish  to  take  a  more 
sophisticated  approach  to  saving 
screens.  In  the  demo  developed  for 
this  article,  I  never  save  more  than 
one  screen  image,  and  thus  I  simply 
copy  the  active  screen  buffer  to  and 
from  an  object  of  the  same  size  on 
the  heap. 

Real-world  situations  often  have  a 
hierarchy  of  pop-ups,  in  which  one 
leads  to  another,  to  another,  and  so 
on.  In  that  case,  you  build  a  stack 
(LIFO)  structure  on  the  heap  so  that 
you  can  retreat  back  down  the  hier¬ 
archy  in  an  orderly  manner.  A  circu¬ 
lar  doubly  linked  list  is  ideal  for  this 
purpose. 
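A minimal way to get the LIFO behavior is a singly linked stack of saved images, sketched below. This deliberately swaps in a simpler structure than the circular doubly linked list suggested above, and it again uses an ordinary array as a stand-in for display memory; every name in it is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define SCREEN_BYTES 4096

static unsigned char display[SCREEN_BYTES];   /* stand-in for the video buffer */

struct saved_screen {
    struct saved_screen *below;               /* next older image on the stack */
    unsigned char image[SCREEN_BYTES];
};

static struct saved_screen *top = NULL;

/* Save the current screen before drawing a new pop-up. Returns -1 if
   the heap is exhausted. */
static int push_screen(void)
{
    struct saved_screen *s = malloc(sizeof *s);

    if (s == NULL)
        return -1;
    memcpy(s->image, display, SCREEN_BYTES);
    s->below = top;
    top = s;
    return 0;
}

/* Remove the newest pop-up by restoring the image saved before it was
   drawn. Returns -1 if nothing is saved. */
static int pop_screen(void)
{
    struct saved_screen *s = top;

    if (s == NULL)
        return -1;
    memcpy(display, s->image, SCREEN_BYTES);
    top = s->below;
    free(s);
    return 0;
}
```

Each pop-up calls push_screen( ) before it draws and pop_screen( ) when it is dismissed, so the screens come back in the reverse of the order in which they were covered.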

In  summary,  then,  the  steps  for 
handling  pop-ups  are: 

•  Save  the  display  buffer  on  the 
heap. 

•  Write  the  pop-up. 

•  On  a  signal  from  the  user,  copy 
the  heap  image  back  to  the  display 
buffer,  thus  restoring  the  screen. 




Now  that  we’ve  solved  one  of  the 
big  mysteries  surrounding  pop-ups, 
let’s  discuss  the  abstraction  of  com¬ 
plexity. 

Visual  Objects  as 
Data  Structures 

Although the screen is a free-form area that can contain literally anything you want it to, special objects such as pop-ups and menu bars always conform to a prescribed set of attributes. That is, they have a defined number, size, location, color combination (foreground/background), border, text content, and so on.

The  point  is  that  any  object  can 
be  defined  in  terms  of  two  charac¬ 
teristics:  its  appearance  and  its  text 
content.  Sometimes  it’s  appropriate 
to  combine  them  into  one,  and  some¬ 
times  it’s  not.  I’ll  consider  both 
cases. 

This  approach  suggests  the  use  of 
data  structures  and  accompanying 
routines  that  interpret  those  struc¬ 
tures  on  the  display.  On  that  basis, 
you  can  describe  visual  objects  by 
building  structures  and  leave  it  to 
the  routines  to  translate  them  into 
reality.  Let’s  prove  the  point  with  a 
discussion  of  menu  bars. 

Building  Menu  Bars 

The  menu  bar  has  become  a  staple 
of  interactive  software.  You  see  it 
everywhere:  Reflex,  Paradox,  Lotus 
1-2-3,  Turbo  C  itself.  It’s  that  row  of 
selections  across  the  top  of  the 
screen,  telling  you  the  categories  of 
things  you  can  do  with  the  program. 
Usually,  when  you  pick  a  selection, 
that  choice  leads  to  others  via  an¬ 
other  menu  bar  (the  Lotus  approach) 
or  a  pull-down  submenu,  which  I 
discuss  later  in  this  article. 

In  software  with  a  consistent  user 
interface,  the  menu  bar  always  has 
a  predefined  place  on  the  screen. 
Here  I  assume  it’s  at  the  top — row 
0 — but  your  own  application  might 
dictate  another  location,  in  which 
case  you’ll  have  to  modify  the 
menubar ( )  routine  to  suit.  A  menu 
bar  also  has  a  fixed  text  content.  If 
you  want  to  add  or  delete  selec¬ 
tions,  that’s  a  different  menu. 

The advantage of a routine such as menubar( ) (Listing Two, page 106) is that it interprets the structure passed to it. This enables you to pass different structures to a common function, obtaining the desired results in each case. The previous menu bar is replaced by the new one, effecting an instantaneous change in context.

Given that all menu bars in an application occupy the same display row, any given menu bar can be described in terms of its color scheme, number of selections, and text content. That leads to the definition of a menu bar specification as the C structure:

typedef struct {
    int background, foreground;   /* colors */
    int nsels;                    /* no. of selections */
    char *sel;                    /* pointer to text content */
} MENUBARSPEC;

You might want to use different color schemes to identify various menu bar levels: for example, top level = red background, middle level = magenta, and low level = blue. Also, it's likely that each menu will have its own number of potential selections. A generalized routine such as menubar( ) can easily handle these variations, simply plucking them from the structure.

The menubar( ) function first identifies its environment by getting the active page and the number of text columns available (40 or 80), and then it builds the background/foreground attribute byte with a call to the video library's chattr( ) function (a service function that does not call the ROM BIOS directly or indirectly). Next, it clears the current menu bar row and sets the attribute by writing out a stream of spaces for the full screen width. Finally, it writes the text of the menu selections.

The menu selections are specified in a separate string, a pointer to which appears in the menu specification. For convenience, the string has the form:

menutext = "sel1\0sel2\0sel3\0seln\0"

in which each text element is null-terminated. This is easier to handle than the more conventional two-dimensional array of characters, as the last loop in menubar( ) illustrates.

Using this approach, you can define a menu bar by first initializing its text content, then its appearance and a pointer to the text content in a structure of type MENUBARSPEC. If you had two menu levels, separately defined, you could display the top level with menubar(&firstLevel); and, on the appropriate signal from the user, switch to the next with menubar(&secondLevel);.

Because  menu  bars  tend  to  have 
a  fixed  appearance  and  content,  it’s 
easy  to  implement  them  using  this 
scheme.  Pop-ups  are  more  flexible 
and  consequently  a  little  more  de¬ 
manding.  Still,  you  can  create  a  wide 
variety  of  pull-down  menus  and  pop¬ 
up  windows  using  a  common  set  of 
routines  and  structures,  as  dis¬ 
cussed  next. 

Pop-Ups  and  Pull-Downs 

In  practice,  there’s  no  difference  be¬ 
tween  the  two.  A  pop-up  is  a 
window  that  can  appear  anywhere 
on  the  display.  A  pull-down  is  a 
pop-up  that  looks  as  though  it 
dropped  down  from  a  menu  bar 
selection.  The  distinction  is  purely 
one  of  location. 

Pop-ups rely on the ROM BIOS window routine (function 6 under interrupt 10h). Despite its comprehensive-sounding name, the BIOS routine is useful for only two things: scrolling a subset of the screen and filling that subset with a background attribute to give it a color. User-written routines that want to do more than this will have to provide code to overcome its limitations.

POPWIN.I  (Listing  One)  furnishes 
generalized  routines  for  managing 



pop-up  windows,  relying  on  the  low- 
level  routines  from  the  video  library. 
This  is  an  # include  file,  and  there’s 
little  advantage  to  making  it  into  a 
linkable  library  because  any  window¬ 
ing  program  needs  all  the  functions 
it  contains. 

Like  a  menu  bar,  a  pop-up  can  be 
described  in  terms  of  its  characteris¬ 
tics,  which  in  this  case  are: 

•  the  coordinates  of  its  opposite  cor¬ 
ners  (giving  both  location  and  size) 

•  the  text  attribute  (background/fore¬ 
ground  colors) 

•  the  border  type  (single-  or  double- 
score) 

The  POPDESCR  structure  in 
POPWIN.I  provides  a  window  de¬ 
scriptor. 

The static array bord[ ][ ] gives the ASCII values for single- and double-score text characters that form the window border. The border choices are 0 for no border and 1 and 2 for single- and double-scores, respectively.


POPWIN.I  contains  four  functions 
for  directly  manipulating  windows: 

•  popMake( ) creates and displays the blank window described by the structure passed to it as an argument

•  popScroll( ) scrolls the window's contents upward by one row

•  popxy( ) positions the cursor relative to the window's upper-left corner

•  popPuts( ) writes a string starting at the specified cursor position

The other two functions, saveScrn( ) and restScrn( ), save and restore the display image.

To  use  these  routines,  you  must 
first  initialize  a  POPDESCR  structure 
for  each  window  you  plan  to  use. 
Thereafter,  simply  pass  a  structure 
address  to  the  pop  routines  as  you 
call  them.  Because  the  routines  do 
not  preserve  the  caller’s  cursor  posi¬ 
tion,  your  code  must  do  so  before 
the  calls. 

Note that, unlike the menu bar structure, POPDESCR doesn't contain a
pointer to the window's text contents. This allows you to use a common
descriptor for several windows that contain variable text:
context-sensitive help windows, for example. Use popPuts( ) to write to
the window.

The popMake( ) function opens a window, fills it with the assigned text
attribute, draws a border around it (unless the structure's border
member is zero), and positions the cursor in the upper-left corner.
When creating a new window, call this function first. Note that the
border is outside the window's defined text area; that is, if the
window's upper-left corner is at (12, 15), the left side of the box is
at column 11 and the top is at row 14. The right side and bottom are
similarly one unit beyond the text area. Thus, don't put a window at
any extreme position on the display, lest the border be partly
invisible or partially visible in an unexpected place.
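The placement rule can be checked before a window is opened. The sketch below is not part of POPWIN.I; it assumes 0-based coordinates, an 80 x 25 text screen, and a hypothetical function name.

```c
#define SCREEN_ROWS 25
#define SCREEN_COLS 80

/* Returns 1 if a window whose text area spans columns left..right and
   rows top..bottom (0-based), plus its border if any, lies wholly on
   the screen; returns 0 otherwise. */
int window_fits(int left, int top, int right, int bottom, int border)
{
    int margin = border ? 1 : 0;   /* the border sits one cell outside */

    if (left > right || top > bottom)
        return 0;
    return left - margin >= 0 &&
           top - margin >= 0 &&
           right + margin <= SCREEN_COLS - 1 &&
           bottom + margin <= SCREEN_ROWS - 1;
}
```

With the example from the text, a bordered window whose text area starts at column 12, row 15 draws its box at column 11 and row 14, so it fits; the same window moved to column 0 would not.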

The popScroll( ) routine shifts the window's contents upward one row by
calling the winScroll( ) function in the video library, passing it the
boundaries of the area to be scrolled and the text attribute to fill
the newly opened row at the bottom. PopPuts( ) uses this routine to
guard against text overrunning the bottom of the window.

Use popxy( ) to position the cursor within the window, using the
upper-left corner as the origin. No matter where the window is
physically located, popxy(0,0,win) refers to its upper-left corner,
effecting viewport text coordinates. This function is therefore a
window-oriented analog to the video library's gotoxy( ).
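The translation behind such viewport coordinates is a simple offset. The following sketch shows the arithmetic only; the type and function names are hypothetical, and the real popxy( ) also moves the hardware cursor.

```c
/* A screen cell in absolute text coordinates. */
typedef struct { int col, row; } CELL;

/* Convert window-relative (x, y) to absolute screen coordinates,
   given the upper-left cell of the window's text area. */
CELL win_to_screen(int win_left, int win_top, int x, int y)
{
    CELL c;
    c.col = win_left + x;
    c.row = win_top + y;
    return c;
}
```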

The popPuts( ) function writes text to the window, starting at the
specified coordinates with respect to the window's upper-left corner.
It contains safeguards to ensure that the text does not get outside the
window, wrapping if the text tries to go beyond the right side and
scrolling if it tries to go past the bottom. The function accommodates
a new line (\n) embedded in the text, but it does not have formatting
capabilities like printf( ) does. If you want to write variables to a
window, use sprintf( ) to set up the text string, then pass the results
to popPuts( ).
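The wrap-and-scroll behavior can be sketched against an in-memory buffer instead of the display. Everything here (the TEXTWIN type, the tw_ names, the tiny 3 x 8 window) is a hypothetical stand-in so the logic can run anywhere; it is not the code of Listing One.

```c
#include <string.h>

#define WIN_ROWS 3
#define WIN_COLS 8

/* A small in-memory stand-in for a window's text area. */
typedef struct {
    char cell[WIN_ROWS][WIN_COLS + 1];  /* one NUL-terminated string per row */
    int  x, y;                          /* cursor, relative to upper-left    */
} TEXTWIN;

/* Blank the window and home the cursor. */
void tw_clear(TEXTWIN *w)
{
    int r;
    for (r = 0; r < WIN_ROWS; r++) {
        memset(w->cell[r], ' ', WIN_COLS);
        w->cell[r][WIN_COLS] = '\0';
    }
    w->x = w->y = 0;
}

/* Shift every row up by one and blank the bottom row. */
static void tw_scroll(TEXTWIN *w)
{
    memmove(w->cell[0], w->cell[1], (WIN_ROWS - 1) * sizeof w->cell[0]);
    memset(w->cell[WIN_ROWS - 1], ' ', WIN_COLS);
}

/* Move to the start of the next row, scrolling if already at the bottom. */
static void tw_newline(TEXTWIN *w)
{
    w->x = 0;
    if (w->y == WIN_ROWS - 1)
        tw_scroll(w);
    else
        w->y++;
}

/* Write s at the cursor: '\n' starts a new row, text wraps at the
   right edge, and the window scrolls rather than overflow. */
void tw_puts(TEXTWIN *w, const char *s)
{
    for (; *s; s++) {
        if (*s == '\n') {
            tw_newline(w);
            continue;
        }
        if (w->x == WIN_COLS)       /* past the right edge: wrap first */
            tw_newline(w);
        w->cell[w->y][w->x++] = *s;
    }
}
```

To show a variable, format it into a character buffer with sprintf( ) first and hand that buffer to the output routine, exactly as the text recommends for popPuts( ).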

The functions for saving and restoring the video buffer work
unpredictably with IBM display adaptors (MDA, CGA, and EGA) but work
normally with Hercules and other third-party video boards. You can make
them more intelligent with the techniques in last month's installment.
As written, these functions assume that if the video mode is 7, an MDA
(or an EGA emulating an MDA) is present and using screen memory at
segment B000 and that otherwise a color device is using the video
buffer at segment B800.

In either case, because these routines service text displays, they
assume a 4,096-byte display memory. SaveArea( ) allocates a node of
that size on the heap and copies the video buffer to it, returning a
pointer to the node. Afterward, you can create a pop-up. To make the
pop-up disappear later, pass the node pointer to restScrn( ), which
moves the saved image back into the video buffer and frees the heap
space.
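The save/restore cycle amounts to a copy out and a copy back. In the sketch below the function names are hypothetical, and the display buffer is passed in as an ordinary pointer so the code can run on any system; on a PC it would be a far pointer built from the segment that video_segment( ) selects.

```c
#include <stdlib.h>
#include <string.h>

#define TEXT_BUF_SIZE 4096   /* display memory size assumed in the text */

/* Segment of the text buffer for a given BIOS video mode: mode 7 is
   monochrome (B000h); anything else is treated as color (B800h). */
unsigned video_segment(int mode)
{
    return (mode == 7) ? 0xB000u : 0xB800u;
}

/* Copy the display image to a heap node; returns NULL if allocation fails. */
void *save_image(const void *video)
{
    void *node = malloc(TEXT_BUF_SIZE);
    if (node != NULL)
        memcpy(node, video, TEXT_BUF_SIZE);
    return node;
}

/* Copy a saved image back to the display and release the heap node. */
void restore_image(void *video, void *node)
{
    if (node == NULL)
        return;
    memcpy(video, node, TEXT_BUF_SIZE);
    free(node);
}
```

Because the restore step frees the node, each saved image can be restored only once; a caller that wants to flash the same pop-up repeatedly must save again each time.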

Now let's put these two visual subsystems, menu bars and pop-ups, to
work in a demonstration.

Tying  It  Together 

MENUDEMO.C  (Listing  Three,  page 
106)  writes  a  menu  bar  across  the 
top  of  the  screen  and  provides  two 
pull-down  submenus  on  successive 
keypresses.  Some  text  appears  under 
the  pull-downs  so  that  you  can  see 
how  overlaid  information  is  restored 
when  a  window  is  removed  from 
the  screen. 

The pull-downs are described by the filePop and editPop structures and
their text contents by the strings fileMenu and editMenu. For
convenience, the strings contain embedded new lines that will be
processed by popPuts( ) to give the pull-downs their proper appearance.

The initialization steps in main( ) use the video library's chattel( )
function to complete the filePop structure, then copy the structure to
editPop. The location of editPop is computed with reference to the
second entry on the menu bar, and the size of the box is changed. The
program is now ready to run.

Setmode(3) has no effect on an MDA, but it places other adaptors in
color mode (an EGA with a monochrome monitor displays the colors as
intensities). After clearing the screen, the program writes the menu
bar and prints out some information about the display.

The program uses oldx and oldy to retain the cursor position before
reporting it, so that a later output line can overlay the cursor
position report.

When the user presses a key, the first pull-down appears, created by
popFileMenu( ). On the second keypress, the file pull-down disappears,
the background is restored, and popEditMenu( ) flashes up the edit
pull-down. Two additional keypresses restore the original screen and
end the program.

Note that, except for the structure initialization, it takes only one
statement to display a menu bar and only two to create a pop-up
(popMake( ) and popPuts( )). These calls exercise a good deal of
underlying code, of course, but this code remains out of sight in the
#include files, where it doesn't needlessly clutter the program's
listing.

Full  Circle 

Obviously, the graphics library I've built here wasn't intended to be a
full-blown graphics toolkit; rather, it was meant to provide you with a
skeletal framework to build upon. Use the techniques illustrated here
to build more powerful visual subsystems in Turbo C that satisfy your
specific application development needs.

Availability 

All  the  source  code  for  articles  in 
this  issue  is  available  on  a  single 
disk.  To  order,  send  $14.95  to  Dr. 
Dobb’s  Journal,  501  Galveston  Dr., 
Redwood  City,  CA  94063,  or  call  (415) 
366-3600,  ext.  216.  Please  specify  the 
issue  number  and  format  (MS-DOS, 
Macintosh,  Kaypro). 

DDJ 

(Listings  begin  on  page  106.) 



RAM-CACHE  MANAGER 


Listing One (Text begins on page 30.)

/* cache header */

typedef struct CACHE {
    int   recl;         /* record length */
    int   maxr;         /* number of records in cache */
    long  hits;         /* number of true finds for cacfind() */
    long  miss;         /* number of misses for cacfind() */
    long  adds;         /* number of calls to cacnew() */
    int   (*proc)();    /* pointer to function for processing */
    long  idnt;         /* identifier passed to proc() */
    long  *nums;        /* numbers of records */
    char  *recs;        /* pointer to first record */
    short *next;        /* array of next pointers */
    short *prio;        /* array of prior pointers */
    char  *mark;        /* array of process-record markers */
    short lru;          /* oldest index */
    short mru;          /* newest index */
} CACDS;

long  cacamem;          /* amount of ... */
CACDS *cacallo(int, int, int (*)(), long);
char  *cacmem(int);
char  *cacnum(struct CACHE *, long);
char  *cacold(struct CACHE *);
int   cacflsh(struct CACHE *);
int   cacnew(struct CACHE *, short);
char  *cacfind(struct CACHE *, long);
int   cacproc(struct CACHE *, long);
int   cacunpc(struct CACHE *, long);
int   cacstat(struct CACHE *, long *, long *, long *);
int   cacfree(struct CACHE *);


End  Listing  One 


Listing Two

/*  cache.c - memory cache
    alan deikman 12/86

    These routines handle a memory cache of fixed sized records.
    All memory is allocated through the standard library routine
    malloc().

    Each routine other than cacallo() takes as the first parameter
    a character pointer that is originally returned by the cacallo()
    routine.

    cacallo()   Allocate cache
    cacold()    Get oldest record
    cacnum()    Number oldest record and make it the newest
    cacflsh()   Process all marked records
    cacfind()   Find record n and make it newest
    cacproc()   Mark record for processing when freed
    cacunpc()   Unmark record for processing when freed
    cacstat()   Get cache statistics
    cacfree()   Free cache
*/

#include <stdio.h>
#include <malloc.h>
#include "cache.h"


/* cacallo() - Allocate cache */

CACDS *cacallo(num, recl, extf, idnt)
int  num;           /* number of records to allocate */
int  recl;          /* length of each record */
int  (*extf)();     /* pointer to external processing function */
long idnt;          /* parameter passed to external function */
{
    CACDS *cac;
    char  *cacmem();
    int   i = 0;

    /* set up cac structure with initial values */

    if (num < 2) num = 2;
    cac = (CACDS *) cacmem(sizeof(CACDS));
    cac->recl = recl;
    cac->maxr = num;
    cac->proc = extf;
    cac->idnt = idnt;
    cac->hits =
    cac->miss =
    cac->adds = 0L;
    cac->nums = (long *)  cacmem(num * sizeof(long));
    cac->next = (short *) cacmem(num * sizeof(short));
    cac->prio = (short *) cacmem(num * sizeof(short));
    cac->mark = cacmem(num);
    cac->recs = cacmem(num * recl);

    /* initial lru/mru chain */

    while (i < num) {
        cac->next[i] = i + 1;
        cac->prio[i] = i - 1;
        cac->nums[i] = -1;
        cac->mark[i] = 0;
        i++; }

    cac->next[num - 1] = cac->prio[0] = -1;
    cac->mru = 0;
    cac->lru = num - 1;

    /* return cache pointer */

    return cac; }


/* allocate memory with error checking */

char *cacmem(siz)
int siz;
{
    char *c;

    c = malloc(siz);
    if (c == (char *) 0) {
        fprintf(stderr, "cacallo: Can't allocate memory.\n");
        fprintf(stderr, "Tried to get %d bytes on top of %ld bytes already\n",
            siz, cacamem);
        exit(1); }
    cacamem += siz;
    return c; }

/* number oldest record and make it the newest.  if the record was
   marked for exit processing, call external processing function.
   return pointer to record */

char *cacnum(cac, num)
CACDS *cac;     /* cache header */
long  num;      /* number new MRU record */
{
    char *rec = cac->recs + (cac->lru * cac->recl);

    /* call external function */

    if (cac->mark[cac->lru] && cac->proc)
        (*(cac->proc))(cac->idnt, cac->nums[cac->lru], rec);

    /* unmark record and make it newest */

    cac->mark[cac->lru] = 0;
    cac->adds++;
    cac->nums[cac->lru] = num;
    cacnew(cac, cac->lru);

    /* return record; ready for usage */

    return rec; }


/* get pointer to oldest record without altering age */

char *cacold(cac)
CACDS *cac;     /* cache header */
{
    return cac->recs + (cac->lru * cac->recl); }


/* if an exit processing routine has been defined, process all marked
   records */

cacflsh(cac)
CACDS *cac;     /* cache header */
{
    int  i;
    char *rec;

    if (!(cac->proc)) return;
    for (i = 0; i < cac->maxr; i++) if (cac->mark[i]) {
        rec = cac->recs + (i * cac->recl);
        (*(cac->proc))(cac->idnt, cac->nums[i], rec);
        cac->mark[i] = 0; }
    return; }


/* make record newest */

cacnew(cac, rec)
CACDS *cac;     /* cache header */
short rec;      /* record to make newest */
{
    /* if this record is already the newest, just return */

    if (rec == cac->mru) return;

    /* change prior's next */

    cac->next[cac->prio[rec]] = cac->next[rec];

    /* change next's prior - if there was no next that means this was the
       lru record, change lru */

    if (cac->next[rec] != -1)
        cac->prio[cac->next[rec]] = cac->prio[rec];
    else {
        if (rec != cac->lru) {
            fprintf(stderr, "cacnew: panic\n");
            exit(1); }
        cac->lru = cac->prio[rec]; }

    /* now the record is out of the chain, stick it back in at the mru end. */

    cac->prio[cac->mru] = rec;
    cac->next[rec] = cac->mru;
    cac->prio[rec] = -1;
    cac->mru = rec;

    /* done */

    return; }

/* find record and make it newest */

char *cacfind(cac, num)
CACDS *cac;     /* cache header */
long  num;      /* record number to look for */
{
    int i;

    for (i = 0; i < cac->maxr; i++) if (cac->nums[i] == num) {
        cac->hits++;
        cacnew(cac, i);
        return cac->recs + (i * cac->recl); }

    cac->miss++;
    return (char *) 0; }

/* mark record for external processing */

cacproc(cac, num)
CACDS *cac;     /* cache header */
long  num;      /* record to mark */
{
    int i;

    for (i = 0; i < cac->maxr; i++) if (cac->nums[i] == num) {
        cac->mark[i] = 1;
        return; }

    return; }

/* un-mark record for external processing */

cacunpc(cac, num)
CACDS *cac;     /* cache header */
long  num;      /* record to unmark */
{
    int i;

    for (i = 0; i < cac->maxr; i++) if (cac->nums[i] == num) {
        cac->mark[i] = 0;
        return; }

    return; }

/* get statistics of the cache */

cacstat(cac, hit, mis, add)
CACDS *cac;                 /* cache header */
long  *hit, *mis, *add;     /* values to return */
{
    *hit = cac->hits;
    *mis = cac->miss;
    *add = cac->adds;
    return; }

/* free cache */

cacfree(cac)
CACDS *cac;     /* cache header */
{
    cacflsh(cac);
    cacamem -= sizeof(CACDS) + (cac->maxr * sizeof(long)) +
        (cac->maxr * 2 * sizeof(short)) + (cac->maxr) +
        (cac->maxr * cac->recl);
    free(cac->recs);
    free(cac->mark);    /* the printed listing omits this call, leaking mark[] */
    free(cac->prio);
    free(cac->next);
    free(cac->nums);
    free(cac);
    return; }

End  Listing  Two 
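The LRU/MRU chain that cacallo() builds and cacnew() maintains can be studied on its own. The sketch below restates that bookkeeping in a self-contained form; the CHAIN type, the chain_ names, and the fixed size of four slots are assumptions for illustration, not part of the listing.

```c
#define NREC 4

/* Doubly linked chain held in parallel arrays, as in the listing:
   next[] leads toward the oldest slot, prio[] toward the newest. */
typedef struct {
    short next[NREC], prio[NREC];
    short mru, lru;
} CHAIN;

/* Link the slots 0, 1, ... NREC-1 from newest to oldest. */
void chain_init(CHAIN *c)
{
    int i;
    for (i = 0; i < NREC; i++) {
        c->next[i] = (short)(i + 1);
        c->prio[i] = (short)(i - 1);   /* prio[0] becomes -1 */
    }
    c->next[NREC - 1] = -1;
    c->mru = 0;
    c->lru = NREC - 1;
}

/* Unlink slot r and relink it at the newest end. */
void chain_touch(CHAIN *c, short r)
{
    if (r == c->mru)
        return;                         /* already newest */
    c->next[c->prio[r]] = c->next[r];   /* bypass r going forward */
    if (c->next[r] != -1)
        c->prio[c->next[r]] = c->prio[r];
    else
        c->lru = c->prio[r];            /* r was the oldest slot */
    c->prio[c->mru] = r;
    c->next[r] = c->mru;
    c->prio[r] = -1;
    c->mru = r;
}
```

Storing the links as array indices rather than pointers keeps each link to a short and lets -1 serve as the end-of-chain marker in both directions.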


Listing Three

#include <stdio.h>
#include "cache.h"

CACDS *cache;

main() {
    printf("Cache test routine\n");
    cache = cacallo(8, 128, (char *) 0, 1L);
    printf("Memory allocated = %ld\n\n", cacamem);
    while (ctest()) cprint();
    exit(0); }

int ctest() {
    int  opt, rec;
    long num;

    printf("\n1=old, 2=num, 3=find, 4=proc: ");
    scanf("%d", &opt);

    switch (opt) {
    case 0: return 0;
    case 1: printf("cacold returns %lx\n", cacold(cache)); return 1;
    case 2: printf("enter record: ");
            scanf("%ld", &num);
            printf("cacnum returned %lx\n", cacnum(cache, num));
            return 1;
    case 3: printf("input record to find: ");
            scanf("%ld", &num);
            printf("cacfind returned %lx\n", cacfind(cache, num));
            return 1;
    case 4: printf("input record to process: ");
            scanf("%ld", &num);
            cacproc(cache, num);
            printf("cacproc called\n");
            return 1;
    default: return 1; }    /* printed as the label "otherwise:" */
    return 1; }

cprint() {
    register int i;

    printf("cache print: hits=%ld miss=%ld adds=%ld\n",
        cache->hits, cache->miss, cache->adds);
    printf("Block Numbers  Next Prior Mark  LRU=%d MRU=%d\n",
        cache->lru, cache->mru);
    printf("----- ------- ----- ----- ----\n");
    for (i = 0; i < cache->maxr; i++)
        printf("%5d %7ld %5d %5d %4d\n", i, cache->nums[i],
            cache->next[i],
            cache->prio[i],
            cache->mark[i]);
    return; }

End Listings




DOS  LOCATE  UTILITY 

Listing One (Text begins on page 38.)

BOOT.C

#include <stdio.h>
#include <io.h>
#include <stdlib.h>
#include <string.h>
#include <malloc.h>

#include "loc.h"
#include "externs.h"


void create_bootstrap(seg_list, entry)
SEG_DESCRIPTOR *seg_list ;
unsigned char *entry ;
{
    unsigned int count ;
    unsigned char *ptr ;

    SEG_DESCRIPTOR *p, *q ;

    /*
        This function sets up a new class which contains the bootstrap
        code to the program entry point.  The bootstrap segment is
        always located at physical address FFFF0H and is ROMable.

        The bootstrap record is appended to the load module for the
        purpose of locate processing.
    */

    /* Traverse the linked list to the end */
    p = seg_list ;
    while (p->next != NULL)
        p = p->next ;

    /* Allocate the memory for the bootstrap record */
    if ((q = (SEG_DESCRIPTOR *) malloc(sizeof(*p))) == NULL) {
        perror(__FILE__) ;
        exit(1) ;
    }

    /* Append the bootstrap record to the end of the list */
    p->next = q ;

    /* Initialize the bootstrap segment descriptor */
    strcpy(q->name, "??BOOT") ;
    strcpy(q->class, "(ABSOLUTE)") ;
    q->vseg = 0xffff ;
    q->pseg = 0xffff ;
    q->offset = 0x0000 ;
    q->len = 5 ;
    q->inited = TRUE ;
    q->romable = TRUE;
    q->symbols = 0 ;
    q->symbol_list = NULL ;
    q->next = NULL ;

    /* Allocate RAM and build the reset vector code using a far jump */
    ptr = malloc(q->len) ;
    *ptr = 0xEA ;
    *((unsigned char **)(ptr + 1)) = entry ;

    /* Append the bootstrap code on tail of the load module */
    q->position = lseek(tmp_file, 0L, SEEK_END) ;
    count = write(tmp_file, ptr, q->len) ;
    if (count != q->len) {
        perror(__FILE__) ;
        exit(1) ;
    }

    free(ptr) ;
    return ;
}

End  Listing  One 
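The five bytes written by create_bootstrap() are an 8086 direct far jump: the opcode EAh followed by the 16-bit offset and then the 16-bit segment, each stored low byte first. The listing gets this layout by storing a far pointer, which depends on the compiler's memory model. The sketch below spells the bytes out explicitly; the function names are hypothetical.

```c
/* Build the 5-byte 8086 "JMP FAR segment:offset" instruction. */
void make_far_jump(unsigned char code[5], unsigned segment, unsigned offset)
{
    code[0] = 0xEA;                                  /* direct intersegment JMP */
    code[1] = (unsigned char)(offset & 0xFF);        /* offset, low byte   */
    code[2] = (unsigned char)((offset >> 8) & 0xFF); /* offset, high byte  */
    code[3] = (unsigned char)(segment & 0xFF);       /* segment, low byte  */
    code[4] = (unsigned char)((segment >> 8) & 0xFF);/* segment, high byte */
}

/* Physical address of segment:offset on a real-mode 8086 (20 bits). */
unsigned long phys_addr(unsigned segment, unsigned offset)
{
    return (((unsigned long)(segment & 0xFFFFu) << 4)
            + (offset & 0xFFFFu)) & 0xFFFFFUL;
}
```

The descriptor's segment FFFFh with offset 0000h gives physical address FFFF0h, the location where the 8086 begins execution after reset.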

Listing Two

FILES.C

#include <stdio.h>
#include <fcntl.h>
#include <sys\types.h>
#include <sys\stat.h>
#include <io.h>
#include <string.h>

#include "loc.h"
#include "externs.h"

#define F_OPEN      (O_RDONLY | O_BINARY)
#define F_CREATE    (O_CREAT | O_TRUNC | O_RDWR | O_BINARY)


void open_file_system(input_file)
char *input_file ;
{
    char *create_str = "Can't create %s" ;
    char *open_str = "Can't open %s" ;
    char errmsg[MAX_LINE] ;
    char *filename_ext ;

    /*
        This module is responsible for opening or creating all of the
        files used by this utility.

    */


    /* Perform all the filename processing */
    strcpy(module_name, strupr(input_file)) ;
    strcpy(exe_fname, module_name) ;
    strcat(strcpy(map_fname, exe_fname), ".MAP") ;

    if ((config == FALSE) || (strlen(config_fname) == 0))
        strcat(strcpy(config_fname, exe_fname), ".CFG") ;
    else
        strupr(config_fname) ;

    if ((hex_name == FALSE) || (strlen(abs_fname) == 0))
        strcat(strcpy(abs_fname, exe_fname), ".HEX") ;
    else
        strupr(abs_fname) ;

    strcat(strcpy(print_fname, exe_fname), ".LOC") ;
    strcat(exe_fname, ".EXE") ;

    /* Create the temporary file used for segment fixups and ... */
    strcpy(tmp_fname, "LOCATE.$$$") ;
    tmp_file = open(tmp_fname, F_CREATE, S_IWRITE) ;
    if (tmp_file == -1) {
        sprintf(errmsg, create_str, tmp_fname) ;
        perror(errmsg) ;
        exit(1) ;
    }

    /* Create the absolute output file */
    abs_file = open(abs_fname, F_CREATE, S_IWRITE) ;
    if (abs_file == -1) {
        sprintf(errmsg, create_str, abs_fname) ;
        perror(errmsg) ;
        exit(1) ;
    }

    /* Open the .EXE file */
    exe_file = open(exe_fname, F_OPEN) ;
    if (exe_file == -1) {
        sprintf(errmsg, open_str, exe_fname) ;
        perror(errmsg) ;
        exit(1) ;
    }

    /* Create the locate map output file */
    print_file = fopen(print_fname, "wt") ;
    if (print_file == NULL) {
        sprintf(errmsg, open_str, print_fname) ;
        perror(errmsg) ;
        exit(1) ;
    }

    /* Open the configuration file for reading */
    config_file = fopen(config_fname, "rt") ;
    if (config_file == NULL) {
        sprintf(errmsg, open_str, config_fname) ;
        perror(errmsg) ;
        exit(1) ;
    }

    /* Open the linker map file for reading */
    map_file = fopen(map_fname, "rt") ;
    if (map_file == NULL) {
        sprintf(errmsg, open_str, map_fname) ;
        perror(errmsg) ;
        exit(1) ;
    }

    return ;
}


void close_file_system()
{
    char errmsg[MAX_LINE] ;
    char *close_str = "Unable to close %s" ;
    char *delete_str = "Unable to delete %s" ;

    /*
        This function is responsible for shutting down the file system
        and cleaning up the temporary files.  All files opened for
        reading are closed and all files open for writing are closed
        (normal exit) and possibly deleted (control-C abort event).
    */

    /* Close the link map */
    if (fclose(map_file) != 0) {
        sprintf(errmsg, close_str, map_fname) ;
        perror(errmsg) ;
    }

    /* Close the locate configuration file */
    if (fclose(config_file) != 0) {
        sprintf(errmsg, close_str, config_fname) ;
        perror(errmsg) ;
    }

    /* Close the .EXE file */
    if (close(exe_file) == -1) {
        sprintf(errmsg, close_str, exe_fname) ;
        perror(errmsg) ;
    }



    /* Close the locate map */
    if (fclose(print_file) != 0) {
        sprintf(errmsg, close_str, print_fname) ;
        perror(errmsg) ;
    }

    /* Close the absolute or hex object module */
    if (close(abs_file) == -1) {
        sprintf(errmsg, close_str, abs_fname) ;
        perror(errmsg) ;
    }

    if (user_abort == TRUE) {
        /* Delete the locate map */
        if (remove(print_fname) == -1) {
            sprintf(errmsg, delete_str, print_fname) ;
            perror(errmsg) ;
        }

        /* Delete the object file */
        if (remove(abs_fname) == -1) {
            sprintf(errmsg, delete_str, abs_fname) ;
            perror(errmsg) ;
        }
    }

    /* Close and then delete the temporary file */
    if (close(tmp_file) == -1) {
        sprintf(errmsg, close_str, tmp_fname) ;
        perror(errmsg) ;
    }

    if (remove(tmp_fname) == -1) {
        sprintf(errmsg, delete_str, tmp_fname) ;
        perror(errmsg) ;
    }

    return ;
}

End Listing Two

Listing Three

LOADEXE.C

#include <stdio.h>
#include <io.h>
#include <stdlib.h>
#include <string.h>
#include <malloc.h>
#include <dos.h>

#include "loc.h"
#include "externs.h"


char *warn_str = "Warning: Unable to locate virtual segment %04X\n" ;

char *load_exe_file()
{
    char buf[128] ;
    int count, i ;
    long seek_pos ;
    unsigned int module_size, pseg, *reloc_ptr, segment ;
    unsigned int read_size, mem_size ;
    char *load_addr, *entry, *str ;

    EXE_HEADER header;
    SEG_DESCRIPTOR *p ;

    /*
        This function reads in the .EXE file and performs the fixup of any
        segment references.
    */

    /* Read in the .EXE file header information */
    count = read(exe_file, (char *) &header, sizeof(header)) ;
    if (count != sizeof(header)) {
        perror(__FILE__) ;
        exit(1) ;
    }

    /* Exit if not a valid .EXE file */
    if (header.signature != 0x5A4D) {
        perror("Not an .EXE file") ;
        exit(1) ;
    }

    /* Seek to the start of the load module */
    if (lseek(exe_file, (long) header.header_size * 16, SEEK_SET) == -1L) {
        perror(__FILE__) ;
        exit(1) ;
    }

    /* Compute how much memory can be allocated for reads */
    mem_size = 32 * 1024 ;
    if (mem_size > _memavl())
        mem_size = _memavl() ;

    /* Allocate the memory */
    if ((load_addr = malloc(mem_size)) == NULL) {



        perror(__FILE__) ;
        exit(1) ;
    }

    while (1) {
        /* Read in a segment of the load module */
        read_size = read(exe_file, load_addr, mem_size) ;
        if (read_size == 0) {
            free(load_addr) ;
            break ;
        }

        /* Write it back out to the temporary file */
        count = write(tmp_file, load_addr, read_size) ;
        if (count != read_size) {
            perror(__FILE__) ;
            exit(1) ;
        }
    }

    /* Find the relocation list */
    if (lseek(exe_file, (long) header.first_reloc_item, SEEK_SET) == -1L) {
        perror(__FILE__) ;
        exit(1) ;
    }

    /* Perform the segment fixups on the temporary file */
    for (i = 0; i < header.reloc_items; i++) {
        /* Read in a relocation item */
        count = read(exe_file, (char *) &reloc_ptr, sizeof(reloc_ptr)) ;
        if (count != sizeof(reloc_ptr)) {
            perror(__FILE__) ;
            exit(1) ;
        }

        /* Compute the position of the fixup in the temporary file */
        seek_pos = (long) FP_SEG(reloc_ptr) ;
        seek_pos = seek_pos * 16 + FP_OFF(reloc_ptr) ;
        if (lseek(tmp_file, seek_pos, SEEK_SET) == -1L) {
            perror(__FILE__) ;
            exit(1) ;
        }

        /* Read in the virtual segment from the fixup */
        count = read(tmp_file, (char *) &segment, sizeof(segment)) ;
        if (count != sizeof(segment)) {
            perror(__FILE__) ;
            exit(1) ;
        }

        /* Perform the location */
        if (locate_virtual_segment(segment, &pseg) == ERROR)
            fprintf(stderr, warn_str, segment) ;
        segment = pseg ;

        /* Re-seek back to the fixup */
        if (lseek(tmp_file, seek_pos, SEEK_SET) == -1L) {
            perror(__FILE__) ;
            exit(1) ;
        }

        /* Write the physical segment number to the fixup */
        count = write(tmp_file, (char *) &segment, sizeof(segment)) ;
        if (count != sizeof(segment)) {
            perror(__FILE__) ;
            exit(1) ;
        }
    }

    /* Process the program entry point */
    if (locate_virtual_segment(header.code_seg_disp, &pseg) == ERROR)
        fprintf(stderr, "Warning: Unable to locate entry point\n") ;

    FP_SEG(entry) = pseg ;
    FP_OFF(entry) = header.initial_pc ;
    return entry ;
}


End  Listing  Three 
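Each pass of the fixup loop in load_exe_file() does the same thing: turn a relocation item's segment:offset into a byte position within the load module, read the 16-bit virtual segment stored there, and write back its physical replacement. The sketch below performs that step on an in-memory image rather than a file. The function names are hypothetical, and add_base() is only a stand-in for the utility's locate_virtual_segment() lookup.

```c
#include <stddef.h>

/* Byte offset, within the load module, of the word named by a
   relocation item (segment:offset relative to the module start). */
unsigned long fixup_pos(unsigned seg, unsigned off)
{
    return (unsigned long)seg * 16UL + off;
}

/* Example mapping only: relocate every segment upward by 1000h. */
unsigned add_base(unsigned vseg)
{
    return vseg + 0x1000u;
}

/* Patch one 16-bit little-endian segment word in an in-memory image.
   Returns 0 on success, -1 if the fixup lies outside the image. */
int apply_fixup(unsigned char *image, size_t size,
                unsigned seg, unsigned off,
                unsigned (*map_seg)(unsigned))
{
    unsigned long pos = fixup_pos(seg, off);
    unsigned vseg, pseg;

    if (pos + 2 > size)
        return -1;
    vseg = image[pos] | ((unsigned)image[pos + 1] << 8);
    pseg = map_seg(vseg) & 0xFFFFu;
    image[pos]     = (unsigned char)(pseg & 0xFF);
    image[pos + 1] = (unsigned char)(pseg >> 8);
    return 0;
}
```

Working byte by byte keeps the sketch independent of the host's word size and byte order, which matters when the tool is built on a machine other than the 8086 target.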


Listing Four

LOCATE.C

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>

#include "loc.h"
#include "globals.h"
#include "externs.h"

/*
    LOCATE *** MS-DOS ROM Utility
    Copyright (C) 1987 Rick Naro.  All rights reserved.
*/

int main(argc, argv)
int argc;


char *argv[];
{
    char *s, *input_file ;
    unsigned char *entry_point ;
    int i ;

    /*
        This is the root module of the locate utility and it controls the
        sequencing of the entire location process.
    */

    /* Install a Control-C interrupt handler */
    if (signal(SIGINT, break_handler) == (int (*)()) -1) {
        fprintf(stderr, "Failure to install break handler\n") ;
        abort() ;
    }

    /* Build a command line string using argv[0] through argv[argc-1] */
    command_line[0] = '\0' ;
    for (i = 0; i < argc; i++) {
        strcat(command_line, argv[i]) ;
        strcat(command_line, " ") ;
    }

    /* Test if the user needs help in running this utility */
    if (argc == 1)
        help = TRUE ;

    config_fname[0] = abs_fname[0] = print_fname[0] = '\0' ;

    /* Process each argument in sequence until all are processed */
    while (--argc > 0 && (*++argv)[0] == '-') {
        for (s = argv[0] + 1; *s != '\0'; s++) {
            switch (*s) {
                case 'b':
                    boot_rec = TRUE ;
                    break ;

                case 'c':
                    config = TRUE ;
                    if (*++s)
                        strcpy(config_fname, s) ;
                    *s-- = '\0' ;
                    break ;

                case 'h':
                    hex_name = TRUE ;
                    if (*++s)
                        strcpy(abs_fname, s) ;
                    *s-- = '\0' ;
                    break ;

                default:
                    help = TRUE ;
                    argc = 0 ;
                    break ;
            }
        }
    }
    input_file = argv[0] ;

    if (help == TRUE) {
        fprintf(stderr, "\nUsage is\n\n") ;
        fprintf(stderr, "\tlocate switches exefile\n\n") ;
        fprintf(stderr, "The valid switches are:\n\n") ;
        fprintf(stderr, "\t%-14s create bootstrap record\n", "-b") ;
        fprintf(stderr, "\t%-14s configuration filename\n", "-c[name]") ;
        fprintf(stderr, "\t%-14s hex filename\n", "-h[name]") ;
        exit(1) ;
    }

    fprintf(stderr, "MS-DOS Locate Utility - Version 1.0\n") ;
    fprintf(stderr, "Copyright (C) 1987 Rick Naro.  ") ;
    fprintf(stderr, "All rights reserved\n\n") ;

    /* Open and create the files used in the location process */
    open_file_system(input_file) ;

    /* Install the routine to shutdown the utility gracefully in the
       event of an error. */
    onexit(close_file_system) ;

    /* Build the segment descriptor list using the link map */
    seg_list = build_seg_list();

    /* Process the locate configuration file */
    if (process_locate_file(seg_list) == ERROR) {
        fprintf(stderr, "Error(s) reading the locate map\n") ;
        exit(1) ;
    }

    /* Convert any public symbols to their new physical addresses */
    read_symbol_table(seg_list) ;

    /* Read the load module and perform the segment fixups */
    entry_point = load_exe_file() ;

    /* Add a bootstrap record if enabled on the command line */
    if (boot_rec == TRUE)
        create_bootstrap(seg_list, entry_point) ;

    /* Output the load module in the specified format */
    output_hex_OMF(abs_file, seg_list, entry_point) ;

    /* Make the locate map containing the new segment assignments */


78 


Dr.  Dobb’s  Journal,  December  1987 

    print_statistics(map_fname, print_fname, command_line, exe_fname,
                     abs_fname, config_fname, entry_point) ;

    exit(0) ;
}


void break_handler()
{
    /*
       The break handler is provided to catch Ctrl-C interrupts from the
       user and perform a shutdown of the program in a graceful manner.
    */

    /* Set the user abort flag for the file system close routine */
    user_abort = TRUE ;
    exit(1) ;
}

End  Listing  Four 
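
The switch handling in Listing Four, where a filename may be glued directly onto the switch letter (-cname, -hname), can be tried out in isolation. What follows is only a minimal sketch in ANSI C, not the author's code: the options structure, its field names and parse_switches() are assumptions made for illustration, and unlike the listing it accepts one switch per argument rather than several letters behind a single dash.

```c
#include <string.h>

/* Hypothetical option record; the field names are assumptions. */
struct options {
    int  boot_rec;            /* -b was seen */
    char config_fname[64];    /* name attached to -c, may be empty */
    char hex_fname[64];       /* name attached to -h, may be empty */
    int  help;                /* unknown switch or no input file */
};

/* Parse leading "-x[name]" switches.  Returns the index of the first
   argument that is not a switch (the input file, if one is present). */
int parse_switches(int argc, char *argv[], struct options *opt)
{
    int i;

    memset(opt, 0, sizeof *opt);    /* also guarantees NUL termination */
    for (i = 1; i < argc && argv[i][0] == '-'; i++) {
        const char *s = argv[i] + 1;

        switch (*s) {
        case 'b':
            opt->boot_rec = 1;
            break;
        case 'c':
            strncpy(opt->config_fname, s + 1, sizeof opt->config_fname - 1);
            break;
        case 'h':
            strncpy(opt->hex_fname, s + 1, sizeof opt->hex_fname - 1);
            break;
        default:
            opt->help = 1;
            break;
        }
    }
    if (i >= argc)
        opt->help = 1;              /* nothing left to use as input file */
    return i;
}
```

Called with the arguments locate -b -cmy.cfg -hout.hex prog.exe, the sketch returns 4, the index of prog.exe, and leaves the two filenames in the option record.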

Listing Five

MISC.C

#include <stdio.h>
#include <string.h>
#include <malloc.h>
#include <dos.h>
#include <errno.h>

#include "loc.h"
#include "externs.h"

extern int errno ;

char *get_mem(size)
unsigned long size ;
{
    union REGS regs ;
    char *p ;

    /*
       This function is a substitute for allocation of huge arrays.  It
       uses DOS system calls to directly allocate up to 64K for a memory
       block.
    */



DOS  LOCATE  UTILITY 


Listing  Five  (Listing  continued,  text  begins  on  page  38.) 

    regs.x.bx = (unsigned int) (size / 16 + 1) ;
    regs.h.ah = 0x48 ;
    FP_SEG(p) = intdos(&regs, &regs) ;
    if (regs.x.cflag) {
        errno = ENOMEM ;
        p = NULL ;
    }
    else
        FP_OFF(p) = 0 ;

    return p ;
}


void free_mem(p)
char *p ;
{
    union REGS regs ;
    struct SREGS sregs ;

    /*
       This function is the complement of get_mem() in that it releases
       any memory previously acquired with get_mem().
    */

    sregs.es = FP_SEG(p) ;
    regs.h.ah = 0x49 ;
    intdosx(&regs, &regs, &sregs) ;

    return ;
}


int assign_physical_segment(class, seg)
char *class ;
unsigned int seg ;
{
    int error = ERROR ;
    SEG_DESCRIPTOR *p ;

    /*
       This function assigns the specified class name the physical
       segment number.  The first segment within a named class will
       have an offset of zero.  All other segments have a segment
       offset relative to the first segment in the class.
    */

    p = seg_list ;
    while (p != NULL) {
        if (strcmp(p->class, class) == 0) {
            p->pseg += seg ;
            p->inited = TRUE ;
            error = OK ;
        }
        p = p->next ;
    }
    return error ;
}


int get_next_segment(pclass, cclass, seg)
char *pclass ;
char *cclass ;
unsigned int *seg ;
{
    int error = ERROR ;
    BOOLEAN found = FALSE ;
    SEG_DESCRIPTOR *p, *q, *last ;

    /*
       This function returns the next physical segment address
       available for use by CCLASS (current class) after PCLASS
       (previous class).  A typical use is to force the concatenation
       of independent classes.
    */

    /* Search the class list for the occurrence of the CCLASS */
    p = q = seg_list ;
    while (q != NULL) {
        if (strcmp(q->class, cclass) == 0) {
            found = TRUE ;
            break ;
        }
        q = q->next ;
    }

    if (found == FALSE)
        return ERROR ;          /* Error if it can't be found */

    /* Search for PCLASS and then to the end of PCLASS. */
    while (p != NULL) {
        if (strcmp(p->class, pclass) == 0) {
            last = p ;
            /* Stop at the end of the list as well as at a class change */
            while (p != NULL && strcmp(p->class, pclass) == 0) {
                last = p ;
                p = p->next ;
            }

            /* Return the next available segment and adjust the segment
               value for an overflow if necessary. */
            *seg = last->pseg + last->len / 16 ;
            if (q->offset < (last->len + last->offset) % 16)
                *seg += 1 ;

            return OK ;
        }
        p = p->next ;
    }
    return ERROR ;
}


int dup_class(old_class, new_class)
char *old_class ;
char *new_class ;
{
    int error = ERROR ;
    SEG_DESCRIPTOR *p, *q, *prev, *head ;

    /*
       Copies the contents of the OLD_CLASS entry to the newly created
       NEW_CLASS entry.
    */

    p = seg_list ;
    while (p != NULL) {
        if (strcmp(p->class, old_class) == 0) {
            prev = head = NULL ;
            while (p != NULL) {
                if (strcmp(p->class, old_class) != 0)
                    break ;

                /* Create the new segment descriptor */
                if ((q = (SEG_DESCRIPTOR *) malloc(sizeof(*q))) == NULL) {
                    perror(__FILE__) ;
                    exit(1) ;
                }

                if (prev == NULL)
                    head = q ;
                else
                    prev->next = q ;

                /* Copy the contents and add the new entry to the list */
                *q = *p ;
                strcpy(q->class, new_class) ;
                q->next = NULL ;

                prev = q ;
                if (p->next == NULL)
                    break ;
                else
                    p = p->next ;
            }

            while (p->next != NULL)
                p = p->next ;

            p->next = head ;
            return OK ;
        }
        p = p->next ;
    }
    return ERROR ;
}


int rom_class(rom_class)
char *rom_class ;
{
    int error = ERROR ;
    SEG_DESCRIPTOR *p ;

    /*
       Sets the romable field for the specified class to TRUE permitting
       the output of the segment in the absolute object file.
    */

    p = seg_list ;
    while (p != NULL) {
        if (strcmp(p->class, rom_class) == 0) {
            p->romable = TRUE ;
            error = OK ;
        }
        p = p->next ;
    }
    return error ;
}


int locate_virtual_segment(vseg, pseg)
unsigned int vseg ;
unsigned int *pseg ;
{
    SEG_DESCRIPTOR *p ;

    /*
       Finds an initialized segment with the specified virtual segment
       number and returns the corresponding physical segment number.
    */

    p = seg_list ;
    while (p != NULL) {
        if (p->inited == TRUE && p->vseg == vseg && p->len != 0) {
            *pseg = p->pseg ;
            return OK ;
        }
        p = p->next ;
    }
    return ERROR ;
}

End  Listing  Five 
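
The paragraph arithmetic behind get_next_segment() is easy to get wrong by one, so it may help to see it stated on its own. The sketch below is a simplified restatement under stated assumptions, not the listing's exact formula: next_free_paragraph() is a hypothetical helper that takes a block starting at paragraph pseg, byte offset within that paragraph, and length in bytes, and returns the first 16-byte paragraph lying entirely past the block.

```c
/* First paragraph (16-byte unit) at or after the end of a block.
   pseg   - paragraph number where the block's segment starts
   offset - byte offset of the block within that paragraph (0..15)
   len    - length of the block in bytes
   The function name and parameters are assumptions for illustration. */
unsigned next_free_paragraph(unsigned pseg, unsigned offset,
                             unsigned long len)
{
    /* Linear address of the first byte after the block */
    unsigned long end = (unsigned long) pseg * 16 + offset + len;

    /* Round up to the next paragraph boundary */
    return (unsigned) ((end + 15) / 16);
}
```

A 20H-byte block at paragraph 1000H ends exactly on a boundary, so the next class may start at 1002H; one more byte pushes it to 1003H.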

Listing Six

OUTHEX.C

#include <stdio.h>
#include <stdlib.h>
#include <io.h>
#include <string.h>
#include <malloc.h>
#include <dos.h>

#include "loc.h"
#include "externs.h"


void output_hex_OMF(hex_file, seg_list, entry_point)
int hex_file ;
SEG_DESCRIPTOR *seg_list ;
unsigned char *entry_point ;
{
    unsigned int offset, i, count ;
    unsigned char *seg_start, *text ;
    SEG_DESCRIPTOR *p ;

    /*
       This function controls the sequencing of the Intel extended hex
       output using the Intel hex output routines.
    */

    /* Run through the segment list and output all ROMable segments */
    p = seg_list ;
    while (p != NULL) {
        if (p->romable == TRUE) {

            /* Allocate enough memory to hold the segment (up to 64K) */
            if ((text = get_mem((unsigned long) p->len)) == NULL) {
                perror(__FILE__) ;
                exit(1) ;
            }

            /* Locate the position of the segment in the load module file */
            if (lseek(tmp_file, p->position, SEEK_SET) == -1L) {
                perror(__FILE__) ;
                exit(1) ;
            }

            /* Read in the segment and pad with zero if necessary */
            count = read(tmp_file, text, p->len) ;
            if (count != p->len) {
                if (count == -1) {
                    perror(__FILE__) ;
                    exit(1) ;
                }
                else
                    memset(text + count, '\0', p->len - count) ;
            }

            /* Write the segment number out in an address record */
            write_ADDR_record(hex_file, p->pseg) ;

            /* Output the segment as a series of 16 byte data records */
            offset = p->offset ;
            seg_start = text ;
            for (i = 0; i < p->len / 16; i++) {
                write_DATA_record(hex_file, offset, seg_start, 16) ;
                offset += 16 ;
                seg_start += 16 ;
            }

            /* Handle any remaining data */
            if ((p->len % 16) != 0)
                write_DATA_record(hex_file, offset, seg_start, p->len % 16) ;

            free_mem(text) ;
        }
        p = p->next ;
    }

    /* Write the START and EOF records */
    write_START_record(hex_file, entry_point) ;
    write_EOF_record(hex_file) ;

    return ;
}


void write_ADDR_record(file, usba)
int file ;
unsigned int usba ;
{
    unsigned char buf[16], *p ;
    unsigned char len_field = 2 ;

    /*
       This function writes an Intel extended hex Address record to the
       output file.  Inputs are the file handle and the USBA (segment
       base address).
    */

    p = buf ;
    *p++ = high_byte(usba) ;
    *p++ = low_byte(usba) ;

    output_hex_record(file, ADDR_RECORD, 0, buf, p - buf) ;

    return ;
}


void write_EOF_record(file)
int file ;
{
    /*
       This function writes an Intel extended hex EOF record to the
       output file.
    */

    output_hex_record(file, EOF_RECORD, 0, NULL, 0) ;
    return ;
}


void write_DATA_record(file, offset, text, len)
int file ;
unsigned int offset ;
unsigned char *text ;
unsigned int len ;
{

    /*
       This function writes an Intel extended hex Data record to the
       specified output file.
    */

    output_hex_record(file, DATA_RECORD, offset, text, len) ;

    return ;
}


void write_START_record(file, entry)
int file ;
unsigned char *entry ;
{
    unsigned char *buf, *p ;
    unsigned int count ;

    unsigned char len_field = 4 ;
    unsigned int addr_field = 0 ;
    unsigned char rec_type = START_RECORD ;

    /*
       This function writes an Intel extended hex Start record to the
       output file.
    */

    /* Allocate some memory to build the data field in */
    if ((p = buf = (unsigned char *) malloc(32)) == NULL) {
        perror(__FILE__) ;
        exit(1) ;
    }

    /* Store the start address in the data field (segment first) */
    *p++ = high_byte(FP_SEG(entry)) ;
    *p++ = low_byte(FP_SEG(entry)) ;

    /* And then the offset */
    *p++ = high_byte(FP_OFF(entry)) ;
    *p++ = low_byte(FP_OFF(entry)) ;

    /* Output the record */
    output_hex_record(file, START_RECORD, 0, buf, p - buf) ;

    free(buf) ;
    return ;
}


void output_hex_record(file, type, addr, data, length)
int file ;
unsigned char type ;
unsigned int addr ;
unsigned char *data ;
unsigned char length ;
{
    char *p, *buf ;
    unsigned int size, count ;
    unsigned char chksum, digit ;

    /*
       This function does all of the work of writing an Intel extended
       hex output record.  The inputs to this routine are:
           file      output file handle
           type      record type
           addr      address field value
           data      data field contents
           length    size of the data field
    */

    /* Allocate some memory to build the output record in */
    if ((p = buf = malloc(550)) == NULL) {
        perror(__FILE__) ;
        exit(1) ;
    }

    /* Build the prefix for the data field */
    p += sprintf(p, ":%02X%02X%02X%02X", length, high_byte(addr),
                 low_byte(addr), type) ;

    /* Compute the checksum on the prefix */
    chksum = length + high_byte(addr) + low_byte(addr) + type ;

    /* Build the data field byte by byte */
    while (length--) {
        digit = (*data >> 4) & 0x0f ;
        *p++ = (digit > 9) ? digit + 0x37 : digit + '0' ;
        digit = *data & 0x0f ;
        *p++ = (digit > 9) ? digit + 0x37 : digit + '0' ;
        chksum += *data++ ;
    }

    /* Compute the complement of the checksum and output */
    chksum = ~chksum + 1 ;
    p += sprintf(p, "%02X\r\n", chksum) ;

    /* Compute the size of the output record and output */
    size = p - buf ;
    count = write(file, buf, size) ;
    if (count != size) {
        perror(__FILE__) ;
        exit(1) ;
    }

    free(buf) ;
    return ;
}

End Listing Six
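
The record layout that output_hex_record() produces (colon, length, address, type, data, then the two's complement of the byte sum) can be checked against published Intel hex examples without any DOS file handles. The following is a minimal ANSI C sketch, not a replacement for the listing: format_hex_record() is a hypothetical name, it writes into a caller-supplied buffer instead of a file, and it omits the trailing carriage return and line feed.

```c
#include <stdio.h>
#include <stddef.h>

/* Format one Intel hex record into out, which must hold at least
   2 * len + 12 characters.  Returns the number of characters written,
   not counting the terminating NUL.  (Hypothetical helper.) */
int format_hex_record(char *out, unsigned char type, unsigned addr,
                      const unsigned char *data, unsigned char len)
{
    unsigned char sum;
    char *p = out;
    unsigned i;

    /* Checksum covers length, both address bytes, type and all data */
    sum = (unsigned char) (len + (addr >> 8) + (addr & 0xFF) + type);

    p += sprintf(p, ":%02X%02X%02X%02X", (unsigned) len,
                 (addr >> 8) & 0xFF, addr & 0xFF, (unsigned) type);

    for (i = 0; i < len; i++) {
        p += sprintf(p, "%02X", (unsigned) data[i]);
        sum = (unsigned char) (sum + data[i]);
    }

    /* Two's complement, so that all bytes of the record sum to zero */
    sum = (unsigned char) (~sum + 1);
    p += sprintf(p, "%02X", (unsigned) sum);

    return (int) (p - out);
}
```

An end-of-file record (type 01, no data) comes out as :00000001FF, and an extended segment address record (type 02) for segment 1200H comes out as :020000021200EA.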

Listing Seven

PRINTLOC.C

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#include "loc.h"
#include "externs.h"


int print_statistics(map_filename, stat_filename, command_line,
                     exename, output_file, configname, entry_point)
char *map_filename ;
char *stat_filename ;
char *command_line ;
char *exename ;
char *output_file ;
char *configname ;
unsigned char *entry_point ;
{
    unsigned long temp ;
    long ltime ;
    int i ;
    SEG_DESCRIPTOR *p ;
    SYMBOL_LIST *q ;

    /*
       This function generates the locate map file.  The locate map
       file contains the segment information with the physical
       segment addresses.
    */

    fprintf(print_file, "MS-DOS Locate Utility Version 1.0\n\n") ;
    fprintf(print_file, "Input File: %s\n", exename) ;
    fprintf(print_file, "Output File: %s\n", output_file) ;
    fprintf(print_file, "Configuration File: %s\n", configname) ;
    fprintf(print_file, "Invoked by: %s\n", command_line) ;

    time(&ltime) ;
    fprintf(print_file, "Date/Time: %s\n", ctime(&ltime)) ;

    /* Display the located segment information */
    fprintf(print_file, "Segment Information\n") ;
    fprintf(print_file, "%-16s%-16s%-12s%-12s\n", "Name", "Class",
            "Address", "Length") ;

    p = seg_list ;
    while (p != NULL) {
        temp = (unsigned long) p->pseg ;
        temp = temp * 16 + p->offset ;
        fprintf(print_file, "%-16s%-16s%05lXH%10.04XH\n", p->name,
                p->class, temp, p->len) ;
        p = p->next ;
    }

    /* Display the symbol information */
    fprintf(print_file, "\n\nPublic Symbols\n") ;
    i = 0 ;
    p = seg_list ;
    while (p != NULL) {
        q = p->symbol_list ;
        while (q != NULL) {
            fprintf(print_file, "%04X:%04X %-16s%s", p->pseg,
                    q->value, q->name, i++ ? "\n" : "\t\t") ;
            i %= 2 ;
            q = q->next ;
        }
        p = p->next ;
    }

    fprintf(print_file, "%sEntry Point = %p\n", (i == 1) ? "\n\n" : "\n",
            entry_point) ;

    return ;
}

End  Listing  Seven 
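
The Address column of the locate map is the 8086 physical address of each segment, formed as the paragraph number times 16 plus the byte offset. A small sketch of that calculation follows; physical_address() is a hypothetical helper name, and like the listing it does not mask the result to 20 bits.

```c
/* Physical (linear) address of segment:offset on an 8086-class CPU.
   The helper name is an assumption; the arithmetic mirrors the
   temp = pseg * 16 + offset computation in print_statistics(). */
unsigned long physical_address(unsigned seg, unsigned off)
{
    return ((unsigned long) seg << 4) + off;
}
```

For example, 1234:0005 maps to 12345H and F000:FFF0, the 8086 reset vector, maps to FFFF0H.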

Listing Eight

READCFG.C

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "loc.h"
#include "externs.h"


int process_class_keyword(), process_order_keyword(), process_null() ;
int process_rom_keyword(), process_dup_keyword(), process_comment() ;

static struct CFG_COMMANDS {
    char *cmd ;
    int (*command)() ;
} cfg_cmds[] = {
    "CLASS", process_class_keyword,
    "ORDER", process_order_keyword,
    "ROM",   process_rom_keyword,
    "DUP",   process_dup_keyword,
    ";",     process_comment
} ;


int process_locate_file(seg_list)
SEG_DESCRIPTOR *seg_list ;
{
    int i, error = OK ;
    char *tok, *buf ;

    /*
       This function reads the configuration file and performs the parsing
       and control transfer to routines which perform the desired action.
    */

    /* Allocate some memory for the line buffer */
    if ((buf = malloc(256)) == NULL) {
        perror(__FILE__) ;
        exit(1) ;
    }

    /* Read and categorize a token from the configuration file */
    while (fgets(buf, 256, config_file) != NULL) {

        /* Extract the first token (read the next line if none is found) */
        if ((tok = strtok(buf, " \t\n")) == NULL)
            continue ;

        for (i = 0; i < dim(cfg_cmds); i++) {
            if (stricmp(cfg_cmds[i].cmd, tok) == 0) {
                error = (*cfg_cmds[i].command)() ;
                break ;
            }
        }

        if (i == dim(cfg_cmds)) {
            fprintf(stderr, "Illegal input - <%s>\n", tok) ;
            exit(1) ;
        }
    }
    free(buf) ;
    return error ;
}


int process_comment()
{
    return OK ;
}


int process_class_keyword()
{
    char *tok, name[32], *p ;
    unsigned int seg ;

    /*
       This function parses the remainder of the CLASS directive.
    */

    /* Read the class name */
    strcpy(name, strupr(strtok(NULL, " \t\n="))) ;

    /* Verify that an equal sign is present */
    if (strcmp((tok = strtok(NULL, " \t\n")), "=") != 0) {
        fprintf(stderr, "= expected instead found <%s>\n", tok) ;
        return ERROR ;
    }

    /* Read the segment number for the class */
    tok = strtok(NULL, " \t\n") ;
    seg = (unsigned int) strtol(tok, &p, 0) ;
    if (*p) {
        fprintf(stderr, "Unrecognized token <%s>\n", p) ;
        return ERROR ;
    }

    /* Assign the physical segment number to the specified class */
    if (assign_physical_segment(name, seg) == ERROR) {
        fprintf(stderr, "Undefined class <%s>\n", name) ;
        return ERROR ;
    }

    return OK ;
}


int process_order_keyword()
{
    char *tok, pclass[32], class[32] ;
    unsigned int next_seg ;
    BOOLEAN found = FALSE ;

    /*
       This function processes the ORDER directive.
    */

    /* Read the leading class name from the command */
    strcpy(pclass, strupr((tok = strtok(NULL, " \t\n")))) ;

    /* Process the remaining class names in the command */
    while ((tok = strtok(NULL, " \t\n")) != NULL) {
        if (*tok == ';')
            break ;

        found = TRUE ;
        strcpy(class, strupr(tok)) ;

        /* Compute the segment address for this class to be made
           contiguous with the previous class */
        if (get_next_segment(pclass, class, &next_seg) == ERROR) {
            fprintf(stderr, "Undefined class <%s>\n", pclass) ;
            return ERROR ;
        }

        /* Assign the computed segment number to the class */
        if (assign_physical_segment(class, next_seg) == ERROR) {
            fprintf(stderr, "Undefined class <%s>\n", class) ;
            return ERROR ;
        }

        /* Setup to process the next class */
        strcpy(pclass, class) ;
    }

    return (found == FALSE) ? ERROR : OK ;
}


int process_dup_keyword()
{
    char *tok, old_class[32], new_class[32] ;

    /*
       This function is responsible for processing the DUP directive.
    */

    /* Read the existing class name */
    strcpy(old_class, strupr((tok = strtok(NULL, " \t\n")))) ;

    /* Read the name of the class to be created */
    strcpy(new_class, strupr((tok = strtok(NULL, " \t\n")))) ;

    /* Duplicate the existing class */
    if (dup_class(old_class, new_class) == ERROR) {
        fprintf(stderr, "Undefined class <%s>\n", old_class) ;
        return ERROR ;
    }

    return OK ;
}


int process_rom_keyword()
{
    char *tok, class[32] ;

    /*
       This function processes the ROM keyword and marks all specified
       classes as ROMable.
    */

    /* Read all of the tokens on the line */
    while ((tok = strtok(NULL, " \t\n")) != NULL) {
        if (*tok == ';')
            break ;

        strcpy(class, strupr(tok)) ;

        if (rom_class(class) == ERROR) {
            fprintf(stderr, "Undefined class <%s>\n", class) ;
            return ERROR ;
        }
    }

    return OK ;
}

End Listing Eight
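
Listing Eight drives the configuration parser from a table that pairs each keyword with a handler function. The same idea is sketched below in portable ANSI C, since stricmp() and the dim() macro are compiler or project specific. Everything here is illustrative: the handlers on_class() and on_rom(), their return values, equal_nocase() and dispatch() are assumptions and not part of the utility.

```c
#include <ctype.h>
#include <stddef.h>

typedef int (*handler_fn)(void);

/* Stand-in handlers; a real one would parse the rest of the line. */
static int on_class(void) { return 1; }
static int on_rom(void)   { return 2; }

static const struct {
    const char *cmd;
    handler_fn  fn;
} table[] = {
    { "CLASS", on_class },
    { "ROM",   on_rom   },
};

/* Case-insensitive string equality (portable stand-in for stricmp) */
static int equal_nocase(const char *a, const char *b)
{
    while (*a && *b) {
        if (toupper((unsigned char) *a) != toupper((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;    /* equal only if both strings ended together */
}

/* Look the token up and call its handler; -1 means unknown keyword. */
int dispatch(const char *tok)
{
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (equal_nocase(table[i].cmd, tok))
            return table[i].fn();
    return -1;
}
```

Adding a directive then means adding one table row and one handler, which is the property the listing's cfg_cmds[] array relies on.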


Listing Nine

READMAP.C

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <malloc.h>

#include "loc.h"
#include "externs.h"

#define BUFSIZE 256


SEG_DESCRIPTOR *build_seg_list()
{
    int count ;
    unsigned long start_seg, end_seg, length ;
    char seg_name[32], class[32] ;
    char *buf ;
    SEG_DESCRIPTOR *p, *previous, *list_start, *class_start ;

    /*
       This function is responsible for the processing of the link map.
       The link map is read and the segment information such as segment
       name, segment length and class name are recorded.
    */

    /* Seek to the beginning of the file */
    if (fseek(map_file, 0L, SEEK_SET) != 0) {
        perror(__FILE__) ;
        exit(1) ;
    }

    /* Allocate some memory for the line buffer */
    if ((buf = malloc(BUFSIZE)) == NULL) {
        perror(__FILE__) ;
        exit(1) ;
    }

    /* Search thru the file looking for the start of the segment information */
    while (1) {
        if (fgets(buf, BUFSIZE, map_file) == NULL) {
            fprintf(stderr, "Unable to find the segment list in %s\n", map_fname) ;
            exit(1) ;
        }
        if (strstr(strupr(buf), "START") != NULL)
            break ;
    }

    /* Scan to the start of the first segment record */
    while (fgets(buf, BUFSIZE, map_file) != NULL) {
        count = sscanf(buf, " %lxH %lxH %lxH %s %s", &start_seg, &end_seg,
                       &length, seg_name, class) ;
        if (count == 5)
            break ;
    }

    /* Check if EOF was detected and an error message should be printed */
    if (feof(map_file)) {
        fprintf(stderr, "Unable to find the segment list in %s\n", map_fname) ;
        exit(1) ;
    }

    /* Begin processing the list of segments */
    p = previous = NULL ;
    while (count == 5) {

        /* Allocate some memory to hold the data structure */
        if ((p = (SEG_DESCRIPTOR *) malloc(sizeof(*p))) == NULL) {
            perror(__FILE__) ;
            exit(1) ;
        }

        if (previous == NULL)
            list_start = p ;
        else
            previous->next = p ;

        strcpy(p->name, strupr(seg_name)) ;
        strcpy(p->class, strupr(class)) ;
        p->vseg = (unsigned int) start_seg / 16 ;
        p->offset = (unsigned int) start_seg % 16 ;
        p->len = (unsigned int) length ;
        p->position = start_seg ;
        p->inited = FALSE ;
        p->romable = FALSE ;
        p->symbols = 0 ;
        p->symbol_list = NULL ;
        p->next = NULL ;

        /* Check if the class name has changed and reset the offset.
           The first segment always starts a class; class_start is not
           yet set at that point. */
        if (previous == NULL || strcmp(p->class, class_start->class) != 0) {
            p->pseg = 0 ;
            class_start = p ;
        }
        else
            p->pseg = p->vseg - class_start->vseg ;
        previous = p ;

        /* Read the next line of segment information */
        fgets(buf, BUFSIZE, map_file) ;
        count = sscanf(buf, " %lxH %lxH %lxH %s %s", &start_seg, &end_seg,
                       &length, seg_name, class) ;
    }

    free(buf) ;
    return (list_start) ;
}

End  Listing  Nine 
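
build_seg_list() recognizes a segment line of the link map purely by whether sscanf() can convert all five fields. That test can be exercised on its own, as in the sketch below. The map_entry structure and parse_map_line() are hypothetical names, the sample lines are made up in the usual Start/Stop/Length/Name/Class layout, and the %31s field widths are an addition to keep the 32-byte buffers from overflowing.

```c
#include <stdio.h>

/* One segment line of a linker map (hypothetical record layout) */
struct map_entry {
    unsigned long start, stop, length;
    char name[32];
    char class_name[32];
};

/* Returns 1 if line is a segment entry such as
   " 00000H 0123FH 01240H _TEXT CODE", otherwise 0. */
int parse_map_line(const char *line, struct map_entry *e)
{
    int n = sscanf(line, " %lxH %lxH %lxH %31s %31s",
                   &e->start, &e->stop, &e->length,
                   e->name, e->class_name);

    return n == 5;
}
```

A heading line such as " Start  Stop   Length Name Class" fails on the very first conversion, which is why the listing can simply keep reading lines until the count reaches five.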


Listing Ten

SIEVE.C

/*
   Sieve Benchmark - ROM Version

   Copyright (C) Recycled Software 1987.  All rights reserved.

   Executes 100 iterations of the sieve algorithm for microprocessor
   benchmarking purposes.
*/

#define TRUE    1
#define FALSE   0
#define SIZE    8190

char flags[SIZE + 1] ;

main()
{
    int i, prime, k, count, iter ;

    for (iter = 1; iter <= 100; iter++) {
        count = 0 ;
        for (i = 0; i <= SIZE; i++)
            flags[i] = TRUE ;

        for (i = 0; i <= SIZE; i++) {
            if (flags[i]) {
                prime = i + i + 3 ;
                for (k = i + prime; k <= SIZE; k += prime)
                    flags[k] = FALSE ;
                count++ ;
            }
        }
    }
}

End  Listing  Ten 
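
Because the ROM version of the benchmark has no console to print to, it is worth confirming the algorithm on a host machine first. The sketch below wraps a single pass of the same sieve in a function that returns the count; sieve_count() is a hypothetical name and not part of the listing. Index i of the flags array stands for the odd number 2i + 3, so the even numbers are never stored.

```c
#define SIZE 8190

static char flags[SIZE + 1];

/* One pass of the sieve from Listing Ten; returns the number of
   primes found among the odd numbers 3, 5, ..., 2 * SIZE + 3. */
int sieve_count(void)
{
    int i, k, prime, count = 0;

    for (i = 0; i <= SIZE; i++)
        flags[i] = 1;

    for (i = 0; i <= SIZE; i++) {
        if (flags[i]) {
            prime = i + i + 3;              /* the number that index i stands for */
            for (k = i + prime; k <= SIZE; k += prime)
                flags[k] = 0;               /* strike out its odd multiples */
            count++;
        }
    }
    return count;
}
```

With SIZE set to 8190 one pass should report 1899 primes, the figure traditionally quoted for this benchmark.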

Listing Eleven

STABLE.C

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dos.h>

#include "loc.h"
#include "externs.h"


char *errstr = "Unable to locate global symbol \"%s\"\n" ;

void read_symbol_table(seg_list)
SEG_DESCRIPTOR *seg_list ;
{
    char buf[128] ;
    int count, found ;
    unsigned int vseg, off ;
    char symbol[32], attrb[10] ;
    SEG_DESCRIPTOR *p ;
    SYMBOL_LIST *q ;

    /*
       This function reads the linker map file and extracts the
       symbol information.
    */

    /* Seek to the beginning of the file */
    if (fseek(map_file, 0L, SEEK_SET) != 0) {
        perror(__FILE__) ;
        exit(1) ;
    }

    /* Search thru the file for the symbol tables */
    while (1) {
        if (fgets(buf, sizeof(buf), map_file) == NULL)
            return ;
        if (strstr(strupr(buf), "ADDRESS") != NULL)
            break ;
    }

    /* Read each of the symbol entries */
    while (1) {
        count = fscanf(map_file, " %4x:%4x%5c %s", &vseg, &off, attrb, symbol) ;
        if (count != 4)
            break ;
        else {
            p = seg_list ;
            found = FALSE ;
            while (p != NULL) {
                if ((p->vseg != vseg) || (p->len == 0)) {
                    p = p->next ;
                    continue ;
                }

                q = (SYMBOL_LIST *) malloc(sizeof(*q)) ;
                strcpy(q->name, symbol) ;
                q->value = off ;
                q->type = 1 ;

                q->next = p->symbol_list ;
                p->symbol_list = q ;
                p->symbols++ ;
                found = TRUE ;
                break ;
            }

            if (found == FALSE)
                fprintf(stderr, errstr, symbol) ;
        }
    }
}


End  Listing  Eleven 
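
The fscanf() format in read_symbol_table() leans on a detail of %5c: it reads exactly five characters, blanks included, so it swallows the attribute column of a map line whether that column is empty or holds a tag such as Abs. The sketch below tries the same format with sscanf() on a single line. parse_symbol_line() is a hypothetical helper, the sample line layout is an assumption about the linker's map format, and the %31s width is an addition to protect the name buffer.

```c
#include <stdio.h>

/* Parse one public-symbol line such as " 0010:0123       _main".
   Returns 1 on success and fills in the segment, offset and name.
   (Hypothetical helper; name must have room for 32 characters.) */
int parse_symbol_line(const char *line, unsigned *seg, unsigned *off,
                      char *name)
{
    char attr[6];   /* %5c stores 5 characters and no terminating NUL */
    int n = sscanf(line, " %4x:%4x%5c %31s", seg, off, attr, name);

    return n == 4;
}
```

Note that %c never appends a terminator, so the attribute field must not be treated as a C string unless one is added by hand.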

(Listings  will  continue  next  month) 




INTEGERS  DONT  FLOAT 


Listing  One  (Text  begins  on  page  48.) 

Listing  1 


; ISQRT32.ASM - 32 bit integer test program for no 8087
; By Ray Mariella, March 87

        page    ,96

crlf    macro
        mov     dl,13
        call    char_out
        mov     dl,10
        call    char_out
        endm

time_print macro byte_var, byte_string
        local   plenty
        mov     dl,byte_string          ;output a colon or period
        call    char_out
        cmp     byte_var,9
        ja      plenty
        mov     dl,'0'                  ;space holder if var<10
        call    char_out
plenty:
        mov     al,byte_var             ;minutes, secs, or hnds
        xor     ah,ah
        call    dec_out
        endm

data    segment word public
base    dw      10                      ;base for dec_out
uper    db      ?                       ;for time_print routine
secs    db      ?
hnds    db      ?
announc db      ' 60000 32 bit square roots ',13,10,'$'
data    ends

stack   segment stack
        dw      64 dup(?)
stack   ends

code    segment word public 'code'
        assume  cs:code, ds:data, ss:stack

sqrt:
        mov     ax,data
        mov     ds,ax
        crlf
        mov     dx,offset announc
        mov     ah,9                    ;print string function
        int     21h                     ;DOS interrupt
        mov     dl,13
        call    char_out
herald:
        crlf
rolling:
        xor     di,di                   ;upper 16
        mov     si,32767                ;lower 16
goodies:
        call    update

; square root procedure of DI
start:
        mov     bx,1                    ;initial value for infimum
        mov     dx,di                   ;initial supremum, upper 16
        mov     ax,si                   ;initial supremum, lower 16
biggest:
        or      dx,dx                   ;test if upper 16 = 0 yet
        jz      words                   ;if yes, we don't need upper 16 now
        rcr     dx,1                    ;supr. upper 16/2
        rcr     ax,1                    ;supr. lower 16/2 + carry from upper
        shl     bx,1                    ;infim.*2
        jmp     short biggest
words:                                  ;now infim. and supr. are 16 bits
        or      bx,bx                   ;if BX was made 0, correct it!
        jnz     checkem                 ;if not, O.K. to continue
        mov     bx,0ffffh               ;correction for the largest 32 bitters
checkem:
        mov     dx,ax                   ;supr. in ax,dx
        mov     cx,bx                   ;infim. in bx,cx
logit:
        shl     cx,1                    ;infimum*2
        jc      average                 ;necessary for large integers
        cmp     cx,dx                   ;infimum*2 > supremum?
        jae     average                 ;if so, ready to average
        shr     dx,1                    ;if not, supr/2
        mov     ax,dx                   ;store latest values
        mov     bx,cx
        jmp     short logit
average:                                ;ready for averaging
        add     bx,ax                   ;(infim. + supr.)
        rcr     bx,1                    ;average value for first guess
Newton:
        REPT    2
        mov     ax,si                   ;lower 16 of target in ax,
        mov     dx,di                   ;upper 16 of in dx, for division
        cmp     bx,dx                   ;this is for near FFFE:0000 and up
        je      cont                    ;but not needed for FFFD:0000 and less
        div     bx                      ;N/(g1) in AX, now get g2
        add     bx,ax                   ;Newton's method g2 = (g1 + N/g1)/2
        rcr     bx,1                    ;bx now has g2
        endm
cont:
        inc     di
        cmp     di,60000
        ja      quit
        jmp     start
quit:
        call    update




Dr. Dobb's Journal, December 1987


        xor     al,al
        mov     ah,4Ch
        int     21h

; output a hex word in decimal
; CX,AX,DX destroyed

dec_out proc    near
        xor     cx,cx
another:
        inc     cx
        xor     dx,dx
        div     base                    ;base is 10 decimal!
        push    dx                      ;remainder is less sig digits
        or      ax,ax                   ;is the quotient zero?
        jnz     another                 ;if not, more number to convert
print_dig:
        pop     dx                      ;retrieve digit from stack
        add     dl,'0'                  ;ascii offset
        call    char_out
        loop    print_dig               ;do all of the digits
        ret
dec_out endp

; output a single character

char_out proc   near
        mov     ah,2                    ;output char function
        int     21h
        ret
char_out endp

update  proc    near
        mov     ah,2ch                  ;get dos time
        int     21h                     ;hour in ch, mins in cl, secs in dh
        mov     uper,cl
        mov     secs,dh
        mov     hnds,dl
        mov     al,ch
        xor     ah,ah
        call    dec_out
        time_print uper,':'
        time_print secs,':'
        time_print hnds,'.'
        crlf
        ret
update  endp

code    ends
        end     sqrt

End  Listing  One 
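The bracketing-and-averaging scheme of Listing One (double the infimum, halve the supremum, average the two for a first guess, then apply Newton's method twice) can be restated in C. The following is a minimal sketch, not the author's code: the name isqrt32 is hypothetical, 32-bit C arithmetic stands in for the DI:SI and DX:AX register pairs, and the two correction loops at the end are an added safeguard that the assembly listing does not contain.

```c
#include <stdint.h>

/* Hypothetical C restatement of the integer square root idea.
 * Returns the largest r such that r*r <= n. */
uint32_t isqrt32(uint32_t n)
{
    uint32_t lo, hi, g;
    int i;

    if (n < 2)
        return n;

    /* Bracket the root: double the lower bound and halve the upper
     * bound until doubling again would pass the upper bound. */
    lo = 1;
    hi = n;
    while (lo * 2 < hi) {
        lo *= 2;
        hi /= 2;
    }

    /* First guess: the mean of the two bounds. */
    g = (lo + hi) / 2;

    /* Two Newton steps, g2 = (g1 + N/g1)/2, as in the REPT 2 block. */
    for (i = 0; i < 2; i++)
        g = (g + n / g) / 2;

    /* Added safeguard (not in the listing): nudge to the exact floor. */
    while ((uint64_t)g * g > n)
        g--;
    while ((uint64_t)(g + 1) * (g + 1) <= n)
        g++;

    return g;
}
```

Without the final two loops the result is only as good as two Newton steps from the bracketed guess allow. The loops trade a few extra multiplications for an exact floor value.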

Listing  Two 

Listing  2 

; R32COMP.ASM - 32 bit sqr
; compares CPU, NDP
; By Ray Mariella, 30 March 87
; increments the upper 16 bits
; 0000:7FFF to FFFF:7FFF
; requires 8087 or 80287

        page    ,96
        .8087

crlf    macro
        mov     dl,13
        call    char_out
        mov     dl,10
        call    char_out
        endm

time_print macro byte_var, byte_string
        local   plenty
        mov     dl,byte_string          ;output a colon or period
        call    char_out
        cmp     byte_var,9
        ja      plenty
        mov     dl,'0'                  ;space holder if var<10
        call    char_out
plenty:
        mov     al,byte_var             ;minutes, secs, or hnds
        xor     ah,ah
        call    dec_out
        endm

data    segment public 'DATA'
        even
base    dw      10                      ;base to print the numbers in
BIGGUN  dq      ?
rootp   dd      ?
uper    db      ?
secs    db      ?
hnds    db      ?
announc db      ' 65535 increments of upper 16, CPU then 8087 ',13,10,'$'
data    ends

stack   segment stack
        dw      64 dup(?)
stack   ends

code    segment public 'CODE'
        assume  cs:code, ds:data, ss:stack

sqrt:
        push    bp



        mov     ax,data
        mov     ds,ax
        mov     bp,offset biggun
        crlf
        mov     dx,offset announc
        mov     ah,9                    ;print string function
        int     21h                     ;DOS interrupt
        mov     si,32767
        xor     di,di
goodies:
        call    update

; square root procedure via 8086, DI:SI
CPU:
        mov     bx,1                    ;guess1
        mov     dx,di                   ;guess2
        mov     ax,si
biggest:
        or      dx,dx
        jz      words
        rcr     dx,1                    ;upper 16/2
        rcr     ax,1                    ;lower 16/2 + carry from upper
        shl     bx,1                    ;guess1*2
        jmp     short biggest
;next is for guess1 and guess2 16 bits
words:
        or      bx,bx
        jnz     checkem                 ;if all 32 were used, CF is set
        mov     bx,65535                ;in case all 32 bits were used
checkem:
        mov     dx,ax                   ;guess2 ax,dx
        mov     cx,bx                   ;guess1 bx,cx
logit:
        shl     cx,1                    ;guess1
        jc      average                 ;necessary for very large integers
        cmp     cx,dx                   ;larger than guess2?
        jae     average
        shr     dx,1                    ;if not, guess2/2
        mov     ax,dx
        mov     bx,cx
        jmp     short logit
;ready for averaging
average:
        add     bx,ax
        rcr     bx,1                    ;average value
Newton:
        REPT    2
        mov     ax,si                   ;lower 16
        mov     dx,di                   ;prepare for division, upper 16 in dx
        cmp     bx,dx                   ;needed for really BIG ints
        je      quit                    ; "
        div     bx                      ;ax still has target, bx first guess
        add     bx,ax                   ;newton
        rcr     bx,1
        endm
quit:
        inc     di
        jz      done
        jmp     CPU
done:
        call    update

; square root via 8087
; if you need roots of 7FFF:FFFF and less, BIGGUN can be 32 bits,
; and ROOTP can be 16 bits. The extra length is needed here because
; the 8087 does not expect unsigned integers.

        xor     di,di
        mov     ds:[bp],si              ;8087 loads from memory,
NDP:    mov     ds:[bp+2],di            ;not regs directly
        fild    biggun                  ;put integer into 8087 stack
        fsqrt
        fistp   rootp                   ;store to memory, too
        fwait

; we now have an 8087 square root - rootp

        inc     di
        jnz     NDP
        call    update
        pop     bp
        xor     al,al
        mov     ah,4Ch
        int     21h

; output a hex word in decimal
; CX,AX,DX destroyed

dec_out proc    near
        xor     cx,cx
another:
        inc     cx
        xor     dx,dx
        div     base                    ;base is 10 decimal!
        push    dx                      ;remainder is less sig digits
        or      ax,ax                   ;is the quotient zero?
        jnz     another                 ;if not, more number to convert
print_dig:
        pop     dx                      ;retrieve digit from stack
        add     dl,'0'                  ;ascii offset
        call    char_out
        loop    print_dig               ;do all of the digits
        ret
dec_out endp

; output a single character



char_out proc   near
        mov     ah,2                    ;output char function
        int     21h                     ;do it
        ret
char_out endp

update  proc    near
        mov     ah,2ch                  ;get dos time
        int     21h                     ;hour in ch, mins in cl, secs in dh
        mov     uper,cl
        mov     secs,dh
        mov     hnds,dl
        mov     al,ch
        xor     ah,ah
        call    dec_out
        time_print uper,':'
        time_print secs,':'
        time_print hnds,'.'
        crlf
        ret
update  endp

code    ends
        end     sqrt

End  Listing  Two 

Listing  Three 

Listing  3 

; R32fail.ASM - 32 bit integer test program REPT macros
; looks for the first time that Newton fails 0,-1
; By Ray Mariella, April 87

        page    ,96
        .8087

crlf    macro
        mov     dl,13
        call    char_out
        mov     dl,10
        call    char_out
        endm

time_print macro byte_var, byte_string
        local   plenty
        mov     dl,byte_string          ;output a colon or period
        call    char_out
        cmp     byte_var,9
        ja      plenty
        mov     dl,'0'                  ;space holder if var<10
        call    char_out
plenty:
        mov     al,byte_var             ;minutes, secs, or hnds
        xor     ah,ah
        call    dec_out
        endm

data    segment word public 'DATA'
base    dw      10                      ;base to print the numbers in
BIGGUN  dq      ?
rootp   dd      ?
contwd  dw      ?
uper    db      ?
secs    db      ?
hnds    db      ?
announc  db     'incr. lower 16 from 1 ',13,10,'$'
intermed db     ' passed 65535 ',13,10,'$'
data    ends

stack   segment stack
        dw      64 dup(?)
stack   ends

code    segment word public 'code'
        assume  cs:code, ds:data, ss:stack

sqrt:
        push    bp
        mov     ax,data
        mov     ds,ax
        mov     bp,offset biggun
        fclex                           ;clear 8087 exceptions, if any
        fstcw   contwd                  ;get control word
        and     contwd,1111001111111111B ;round to nearest
        fldcw   contwd                  ;load changed control word
        crlf
        mov     dx,offset announc
        mov     ah,9                    ;print string function
        int     21h                     ;DOS interrupt
        mov     dl,13
        call    char_out
herald:
        crlf
rolling:
        xor     bx,bx
        xor     ax,ax
        mov     si,1                    ;lower 16, will vary
        mov     ds:[bp],si              ;BIGGUN lower 16
        mov     di,0                    ;upper 16
        mov     ds:[bp+2],di



goodies:
        call    update
        crlf
start:
        FILD    BIGGUN

; square root procedure via 8086/V30
        mov     bx,1                    ;guess1
        mov     dx,di                   ;guess2
        mov     ax,si
        REPT    8
        or      dx,dx
        jz      halfway
        rcr     dx,1                    ;upper 16/2
        rcr     ax,1                    ;lower 16/2 + carry from upper
        shl     bx,1                    ;guess1*2
        endm
        jmp     short rest
halfway:
        jmp     short words
rest:
        REPT    8
        or      dx,dx
        jz      words
        rcr     dx,1                    ;upper 16/2
        rcr     ax,1                    ;lower 16/2 + carry from upper
        shl     bx,1                    ;guess1*2
        endm
;next is for guess1 and guess2 16 bits
words:
        FSQRT
        or      bx,bx                   ;if all 32 were used, CF is set
        jnz     checkem
        mov     bx,65535                ;in case all 32 bits were used
checkem:
        mov     dx,ax                   ;guess2 ax,dx
        mov     cx,bx                   ;guess1 bx,cx
logit:
        REPT    8
        shl     cx,1                    ;guess1
        jc      average                 ;necessary for 2000:4000 and up
        cmp     cx,dx                   ;larger than guess2?
        jae     average
        shr     dx,1                    ;if not, guess2/2
        mov     ax,dx
        mov     bx,cx
        endm
;ready for averaging
average:
        FISTP   rootp
        add     bx,ax
        rcr     bx,1                    ;average value
Newton:
        REPT    2
        mov     ax,si                   ;lower 16
        mov     dx,di                   ;prepare for division, upper 16 in dx
        cmp     bx,dx                   ;for FFFE:0000 and up
        je      quit
        div     bx                      ;ax still has target, bx first guess
        add     bx,ax                   ;newton
        rcr     bx,1
        endm
        FWAIT
done:
        mov     ax,bx                   ;ax and bx have approx. root
        mov     dx,word ptr ds:[bp+8]   ;bp+8 is rootp, 8087 root
        xor     ax,dx                   ;see if rootp agrees
        jz      cont
        cmp     bx,dx
        ja      quit
belo:
        inc     bx
        cmp     bx,dx
        jnz     quit
cont:
        inc     si                      ;lower 16
        jnz     notyet
        inc     di
        mov     ax,di
        call    dec_out
        mov     dx,offset intermed
        mov     ah,9
        int     21h
        crlf
notyet:
        mov     ds:[bp+2],di            ;biggun upper 16
        mov     ds:[bp],si              ;biggun lower 16
        jmp     start
quit:
        call    update
        mov     ax,di
        call    dec_out
        mov     dl,' '
        call    char_out
        mov     ax,si
        call    dec_out
        crlf
        pop     bp
        xor     al,al
        mov     ah,4Ch
        int     21h
        ret

; output a hex word in decimal
; CX,AX,DX destroyed

dec_out proc    near
        xor     cx,cx
another:
        inc     cx



        xor     dx,dx
        div     base                    ;base is 10 decimal!
        push    dx                      ;remainder is less sig digits
        or      ax,ax                   ;is the quotient zero?
        jnz     another                 ;if not, more number to convert
print_dig:
        pop     dx                      ;retrieve digit from stack
        add     dl,'0'                  ;ascii offset
        call    char_out
        loop    print_dig               ;do all of the digits
        ret
dec_out endp

; output a single character from dl

char_out proc   near
        mov     ah,2                    ;output char function
        int     21h                     ;do it
        ret
char_out endp

update  proc    near
        mov     ah,2ch                  ;get dos time
        int     21h                     ;hour in ch, mins in cl, secs in dh
        mov     uper,cl
        mov     secs,dh
        mov     hnds,dl
        mov     al,ch
        xor     ah,ah
        call    dec_out
        time_print uper,':'
        time_print secs,':'
        time_print hnds,'.'
        crlf
        ret
update  endp

code    ends
        end     sqrt


End  Listing  Three 


Listing  Four 

Listing  4 


; RALL16 - square roots of 1 to 65535
; By Ray Mariella

        page    ,96

crlf    macro
        mov     dl,13
        call    char_out
        mov     dl,10
        call    char_out
        endm

time_print macro byte_var, byte_string
        local   plenty
        mov     dl,byte_string          ;output a colon or period
        call    char_out
        cmp     byte_var,9
        ja      plenty
        mov     dl,'0'                  ;space holder if var<10
        call    char_out
plenty:
        mov     al,byte_var             ;minutes, secs, or hnds
        xor     ah,ah
        call    dec_out
        endm

data    segment word public 'DATA'
        even
base    dw      10                      ;base to print the numbers in
uper    db      ?
secs    db      ?
hnds    db      ?
announc db      'square roots of 1 - 65535 ',13,10,'$'
data    ends

stack   segment stack
        dw      64 dup(?)
stack   ends

code    segment word public 'CODE'
        assume  cs:code, ds:data, ss:stack
        even
sqrt:
        crlf
        mov     ax,data
        mov     ds,ax
        mov     dx,offset announc
        mov     ah,9                    ;print string function
        int     21h                     ;DOS interrupt
        mov     dl,13
        call    char_out



        call    update
        mov     cx,65535

; square root procedure via 8086, integer in CX

boundit:
        mov     ax,1                    ;infimum - will be lower bound
        mov     dx,cx                   ;supremum - will be upper bound
; the next section gets the max lower bound and the min upper bound
        REPT    8
        shl     ax,1                    ;mpy infimum by 2
        cmp     ax,dx                   ;above supremum ?
        ja      root                    ; then done
        shr     dx,1                    ;if not, div supremum by 2
        mov     bx,dx                   ;store supr.
        mov     si,ax                   ;store infim.
        ENDM
root:
        add     bx,si                   ;get avg. of bounds for
        shr     bx,1                    ;first guess of root
        mov     ax,cx
; Newton's Method -> X = (Xo + N/Xo)/2
Newton:
        xor     dx,dx                   ;prepare for division
        div     bx                      ;ax still has target
        add     bx,ax                   ;newton
        shr     bx,1
; we now have a square root in bx
;to print, remove the next 8 semicolons
;       mov     ax,cx
;       call    dec_out
;       mov     dl,' '
;       call    char_out
;       mov     ax,bx
;       call    dec_out
;       crlf
didit:
        loop    boundit
done:
        call    update
        crlf
        xor     al,al
        mov     ah,4Ch
        int     21h

; output a hex word in decimal
; CX,AX,DX destroyed

dec_out proc    near
        xor     cx,cx
another:
        inc     cx
        xor     dx,dx
        div     base                    ;base is 10 decimal!
        push    dx                      ;remainder is less sig digits
        or      ax,ax                   ;is the quotient zero?
        jnz     another                 ;if not, more number to convert
print_dig:
        pop     dx                      ;retrieve digit from stack
        add     dl,'0'                  ;ascii offset
        call    char_out
        loop    print_dig               ;do all of the digits
        ret
dec_out endp

; output a single character

char_out proc   near
        mov     ah,2                    ;output char function
        int     21h                     ;do it
        ret
char_out endp

update  proc    near
        mov     ah,2ch                  ;get dos time
        int     21h                     ;hour in ch, mins in cl, secs in dh
        mov     uper,cl
        mov     secs,dh
        mov     hnds,dl
        mov     al,ch
        xor     ah,ah
        call    dec_out
        time_print uper,':'
        time_print secs,':'
        time_print hnds,'.'
        crlf
        ret
update  endp

code    ends
        end     sqrt

End  Listings 
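Listing Four differs from the 32-bit listings in that it unrolls the bounding step eight times and applies Newton's method only once. A minimal C sketch of that flow follows, as an illustration rather than a translation: the name isqrt16_one_step is hypothetical, the saved bounds are initialized explicitly (an assumption, since the assembly leaves BX and SI as they were when the first comparison exits), and zero is handled separately because the listing's loop never reaches it.

```c
#include <stdint.h>

/* Hypothetical sketch: bounded first guess plus a single Newton step. */
uint16_t isqrt16_one_step(uint16_t n)
{
    unsigned inf = 1;   /* running lower bound (AX in the listing) */
    unsigned sup = n;   /* running upper bound (DX in the listing) */
    unsigned lo = 1;    /* last stored lower bound (SI) */
    unsigned hi = n;    /* last stored upper bound (BX) */
    unsigned g;
    int i;

    if (n == 0)
        return 0;

    /* At most eight bounding steps, matching REPT 8. */
    for (i = 0; i < 8; i++) {
        inf <<= 1;
        if (inf > sup)
            break;
        sup >>= 1;
        hi = sup;
        lo = inf;
    }

    g = (hi + lo) >> 1;         /* first guess: average of the bounds */
    g = (g + n / g) >> 1;       /* one Newton step: X = (Xo + N/Xo)/2 */
    return (uint16_t)g;
}
```

With only one Newton step the result is an approximation. Under these assumptions 10000 gives 100 and 1000 gives 31, while 65535 gives 256, one more than the exact integer root of 255.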




GRAPHICS  TOOLBOX 


Listing  One  (Text  begins  on  page  54.) 

Listing  1 

/* POPWIN.I: Routines for pop-up windows                              */
/* Externally defined are gotoxy(), wrtcha(), window(), chattr(),     */
/* and videomode().                                                   */
/* Notes: Programmer is responsible for assuring that the pop-up      */
/*        defined in the POPDESCR structure is large enough to        */
/*        contain the text.                                           */
/*        Save your cursor position before calling these routines.    */
/*        They don't preserve or restore caller's cursor.             */
/*        Unlike menubar, the POPDESCR structure doesn't contain a    */
/*        pointer to the text elements. Thus, you can use the         */
/*        routines to create dialog boxes and help windows.           */
/* ------------------------------------------------------------------ */

typedef struct {                  /* window descriptor/control structure */
    int  left, top, right, bottom;        /* window location, inclusive */
    char textAttr;                        /* text attribute in window */
    int  border;                          /* 0 = none, 1 = single, 2 = double */
} POPDESCR;

static bord[][6] = {                      /* border characters */
    { 196, 179, 218, 191, 217, 192 },
    { 205, 186, 201, 187, 188, 200 }
};

void popMake (POPDESCR *win)              /* create pop-up window */
/* Must initialize the POPDESCR structure before calling. */
/* Does not write text to the window, but only creates it. After */
/* this fcn returns, you can write text using popPuts() and */
/* control the cursor with popxy(), both given below. */
{
    int x, y, style, a;

    a = win->textAttr;
    window (win->left, win->top, win->right, win->bottom,
            win->textAttr);                         /* open window */
    if ((style = win->border-1) >= 0) {             /* draw border outside */
        gotoxy (win->left-1, win->top-1, 0);        /* corners: upper left */
        wrtcha (bord [style][2], a, 0);
        gotoxy (win->right+1, win->top-1, 0);       /* upper right */
        wrtcha (bord [style][3], a, 0);
        gotoxy (win->right+1, win->bottom+1, 0);    /* lower right */
        wrtcha (bord [style][4], a, 0);
        gotoxy (win->left-1, win->bottom+1, 0);     /* lower left */
        wrtcha (bord [style][5], a, 0);
        for (x = win->left; x <= win->right; x++) { /* horizontals */
            gotoxy (x, win->top-1, 0);
            wrtcha (bord [style][0], a, 0);
            gotoxy (x, win->bottom+1, 0);
            wrtcha (bord [style][0], a, 0);
        }
        for (y = win->top; y <= win->bottom; y++) { /* verticals */
            gotoxy (win->left-1, y, 0);
            wrtcha (bord [style][1], a, 0);
            gotoxy (win->right+1, y, 0);
            wrtcha (bord [style][1], a, 0);
        }
    }
    gotoxy (win->left, win->top, 0);
} /* ------------------------------ */

void popScroll (POPDESCR *win)            /* scroll pop up one line */
{
    winScroll (win->left, win->top, win->right, win->bottom,
               win->textAttr);
    gotoxy (wherex (0), wherey (0) - 1, 0);
} /* ------------------------------ */

void popxy (int x, int y, POPDESCR *w)    /* gotoxy for popup window */
/* Allows you to express text coordinates relative to upper left */
/* corner of the window in video page 0 */
{
    int row, col;

    row = w->top + y;
    col = w->left + x;
    gotoxy (col, row, 0);
} /* ------------------------------ */

void popPuts (int x, int y, char string[],    /* write string to */
              POPDESCR *win)                  /* specified window */
{
    int len, ch, p;

    popxy (x, y, win);                        /* position cursor in window */
    len = strlen (string);
    for (ch = 0; ch < len; ch++) {
        if (string [ch] == '\n') {            /* handle newline */
            x = 0;
            ++y;
            popxy (x, y, win);
            if ((y + win->top) > win->bottom) {     /* scroll if required */
                popScroll (win);
                --y;
            }
        } else {
            if ((x + win->left) > win->right) {     /* outside window */
                x = 0;                              /* so wrap cursor */
                ++y;
                popxy (x, y, win);
                if ((y + win->top) > win->bottom) { /* scroll if required */
                    popScroll (win);
                    --y;
                }
            }
            wrtcha (string [ch], win->textAttr, 0); /* write next char */
            ++x;
            popxy (x, y, win);                      /* advance cursor */
        }
    }
} /* ------------------------------ */

char *saveScrn (void)                     /* saves screen image */
/* Call this routine before popMake(). It saves the current screen */
/* image at a location pointed to by the returned value. You must */
/* pass the same pointer back to restScrn() later in order to */
/* make the pop-up go away. */
{
    int c;
    unsigned srcSeg;
    char *saveArea;

    saveArea = (char *) malloc (4096);              /* allocate space */
    if (videomode (&c) == 7)
        srcSeg = 0xB000;                            /* monochrome buffer */
    else
        srcSeg = 0xB800;                            /* text graphics buffer */
    movedata (srcSeg, 0, _SS, (unsigned) saveArea, 4096);   /* save */
    return (saveArea);                              /* return pointer */
} /* ------------------------------ */

void restScrn (char *saveArea)            /* restores screen image */
/* Call this routine when you want the pop up window to go away. */
/* It restores the screen to its appearance before the window. It */
/* does NOT restore the cursor. That is your responsibility. */
/* This routine does not worry about snow on IBM's poorly designed */
/* CGA board. Snow may result when restoring the screen. */
{
    int c;
    unsigned destSeg;

    if (videomode (&c) == 7)
        destSeg = 0xB000;                           /* monochrome buffer */
    else
        destSeg = 0xB800;                           /* text graphics buffer */
    movedata (_SS, (unsigned) saveArea, destSeg, 0, 4096);  /* restore */
    free (saveArea);                                /* de-allocate space */
} /* ------------------------------ */


End  Listing  One 
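The wrap-and-scroll bookkeeping in popPuts() can be followed without any video hardware by tracking only the window-relative cursor. The sketch below is a hypothetical illustration: the types Box and Cursor and the function trace_puts are invented for this example, Box carries only the four coordinate fields of POPDESCR, and a scroll is counted rather than performed.

```c
/* Hypothetical subset of POPDESCR: inclusive window bounds only. */
typedef struct { int left, top, right, bottom; } Box;

/* Window-relative cursor plus the number of scrolls that occurred. */
typedef struct { int x, y, scrolls; } Cursor;

/* Replays the cursor logic of popPuts() for string s, starting at the
 * window-relative position (x, y). Nothing is drawn. */
Cursor trace_puts(const Box *win, const char *s, int x, int y)
{
    Cursor c;
    int scrolls = 0;

    for (; *s != '\0'; s++) {
        if (*s == '\n') {                           /* newline */
            x = 0;
            ++y;
            if (y + win->top > win->bottom) {       /* past last row */
                ++scrolls;
                --y;
            }
        } else {
            if (x + win->left > win->right) {       /* past right edge */
                x = 0;                              /* wrap to next row */
                ++y;
                if (y + win->top > win->bottom) {
                    ++scrolls;
                    --y;
                }
            }
            ++x;                                    /* character occupies a cell */
        }
    }
    c.x = x;
    c.y = y;
    c.scrolls = scrolls;
    return c;
}
```

For a window eight columns wide and four rows high, the four-line text "Open\nSave\nNew file\nAbandon" ends at column 7 of row 3 with no scrolling, because "New file" fills its row exactly and the wrap test only fires when a further character arrives.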


Listing  Two 

Listing  2 

/* MENUBAR.I: Constructs a menu bar per MENUBARSPEC structure        */
/* Externally defined are activepage(), videomode(), wrtstra()       */
/* Notes: Preserve your cursor position before calling this fcn.     */
/*        It does not save or restore caller's cursor.               */
/*        The 'sel' pointer points to a solid string of menu         */
/*        selections in the form "sel1\0sel2\0sel3\0...seln\0"       */

typedef struct {
    int  background, foreground;  /* colors used in menu bar */
    int  nsels;                   /* number of selections */
    char *sel;                    /* pointer to static selections (see above) */
} MENUBARSPEC;                    /* caller sets up as many as needed */

void menubar (MENUBARSPEC *spec)  /* does not remember previous cursor position */
{
    int  p, i, ncols, start, interval, page;
    char attr;

    page = activepage ();                               /* get active page */
    videomode (&ncols);                                 /* get screen width */
    gotoxy (0, 0, page);                                /* go home */
    attr = chattr (spec->foreground, spec->background); /* attributes */
    for (p = 0; p < (ncols-1); p++) {                   /* blank menu bar */
        wrtcha (' ', attr, page);
        gotoxy (wherex (page)+1, 0, page);              /* advance */
    }
    interval = ncols / spec->nsels;                     /* spacing between entries */
    start = i = 0;
    for (p = 0; p < spec->nsels; p++) {                 /* write selections */
        gotoxy (start, 0, page);                        /* place cursor */
        wrtstra (&(spec->sel[i]), attr, page);
        i += strlen (&(spec->sel[i])) + 1;              /* find next string */
        start += interval;                              /* next position for entry */
    }
}


End  Listing  Two 
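The 'sel' member packs all menu entries into one buffer, each terminated by its own NUL, and menubar() steps from one entry to the next with strlen() plus one. A minimal sketch of that traversal and of the column spacing follows. The helper names nth_selection and menu_column are hypothetical and do not appear in the listing.

```c
#include <string.h>

/* Hypothetical helper: return the n-th entry (0-based) of a packed
 * list of the form "sel1\0sel2\0...seln\0". The caller must ensure
 * that n is less than the number of entries. */
const char *nth_selection(const char *sel, int n)
{
    int i = 0;

    while (n-- > 0)
        i += (int) strlen(&sel[i]) + 1;     /* skip entry and its NUL */
    return &sel[i];
}

/* Hypothetical helper: starting column of entry n when ncols screen
 * columns are divided evenly among nsels entries. */
int menu_column(int ncols, int nsels, int n)
{
    return n * (ncols / nsels);
}
```

With the five-entry string used in the demo program and an 80-column screen, the entries start at columns 0, 16, 32, 48, and 64.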


Listing  Three 

Listing  3 

/* MENUDEMO.C: Demonstrates principles of menu bars and pop-down */
/* menus in Turbo C */
/* ------------------------------ */

/* INCLUDES */
#include <dos.h>
#include <video.i>
#include <colors.h>
#include <menubar.i>
#include <popwin.i>

/* LOCAL FUNCTION PROTOTYPES */
void popFileMenu (POPDESCR *mfile, char *text);
void popEditMenu (POPDESCR *medit, char *text);



/* STATICS FOR PROGRAM */
char entries[] = {"File\0Edit\0Browse\0Reports\0Exit\0"};
MENUBARSPEC menuspec = {{BLUE}, {WHITE}, {5}, {&entries}};
char fileMenu[] = {"Open\nSave\nNew file\nAbandon"};
char editMenu[] = {"Select record\nRemove field\nAdd field"};
/* ------------------------------ */

main ()
{
    static POPDESCR filePop = {{1}, {2}, {8}, {5}, {0}, {2}};
    POPDESCR editPop;
    char *prevScrn;
    int  n, oldx, oldy, cols;

    /* SET UP POP-DOWN MENU FOR FILE SELECTION */
    filePop.textAttr = chattr (YELLOW, BLUE);

    /* SET UP POP-DOWN MENU FOR EDIT SELECTION */
    editPop = filePop;
    editPop.left = 80 / menuspec.nsels + 1;
    editPop.right = editPop.left + 12;
    editPop.bottom = editPop.top + 2;

    /* NOW SHOW THE TWO MENUS IN SUCCESSION */
    setmode (3);
    cls ();
    menubar (&menuspec);                                /* write menu */
    printf ("\nActive page is %d", activepage ());      /* show video info */
    printf ("\nVideo mode is %d", videomode (&cols));
    printf ("\nNumber of columns on screen is %d", cols);
    oldx = wherex (activepage ());
    oldy = wherey (activepage ());
    printf ("\nCursor is at x = %d, y = %d", oldx, oldy);
    gotoxy (oldx, oldy, activepage ());
    getch ();                                           /* wait for keypress */
    cursoff ();
    popFileMenu (&filePop, fileMenu);                   /* pop down file menu */
    popEditMenu (&editPop, editMenu);
    setcursor (0, cursend ());                          /* make a block cursor */
    gotoxy (oldx, oldy, activepage ());
    puts ("\nPress any key to end demo...");
    getch ();
    setcursor (cursend ()-1, cursend ());               /* restore underline cursor */
    cls ();
} /* ------------------------------ */

void popFileMenu (POPDESCR *mfile, char *text)
{
    char *prevScrn;

    prevScrn = saveScrn ();
    popMake (mfile);
    popPuts (0, 0, text, mfile);
    getch ();
    restScrn (prevScrn);
} /* ------------------------------ */

void popEditMenu (POPDESCR *medit, char *text)
{
    char *prevScrn;

    prevScrn = saveScrn ();
    popMake (medit);
    popPuts (0, 0, text, medit);
    getch ();
    restScrn (prevScrn);
} /* ------------------------------ */


End  Listings 




C  CHEST 


Listing One (Text begins on page 126.)

Listing 1 - kernel.h. Printed 9/11/1987

#ifndef NULL
#include <stdio.h>
#endif
#include <tools/pq.h>

                                /* Error codes                                 */
#define TE_NOERR        0       /* No error                                    */
#define TE_TOOMANY      -1      /* Maximum number of tasks (32) already exists */
#define TE_NOMEM        -2      /* Insufficient memory available               */
#define TE_BADARG       -3      /* Illegal Argument                            */
#define TE_TIMEOUT      -4      /* Timeout                                     */
#define TE_QFULL        -5      /* Queue is full                               */
#define TE_NOTASKS      -6      /* No tasks to send message                    */
#define TE_INTERNAL     -7      /* Internal error                              */
#define TE_DEADLOCK     -8      /* Delete would have caused a deadlock.        */
#define TE_STACK        -9      /* Stack overflow                              */
#define TE_KILL         -10     /* Ctrl-Break encountered                      */

#define TS_NORMAL       0       /* Must be 0 */
#define TS_WAIT         1
#define TS_TIMEOUT      2

#define T_MAXTASK       32
#define TQ_SIG          0xa5a5

/* T_PRIORITY(a,b) evaluates to a negative number if task a is lower priority
 *      than task b, to 0 if they're equal, to a positive number
 *      if task a is higher priority than task b. If priorities
 *      are the same, the timestamps are compared and the routine
 *      with the smaller (older) time stamp is assumed to be the
 *      higher priority.
 */

#define T_PRIORITY(a,b) ( ((a)->priority != (b)->priority)    \
                                ? (a)->priority - (b)->priority \
                                : (b)->timestamp - (a)->timestamp )

/* Task Control Block. Do not change the register-save area
 * (ax, bx, cx ...) without also changing the code in swap.asm.
 * Don't change anything without changing the offset to the stack
 * base in chkstk.asm.
 *
 * I'm assuming the small model here. That is, I'm assuming that
 * the only segment register that can change is the extra segment
 * and that the stack and data segments always have the same value.
 *
 * Be sure to block() if you're going to modify the CS, DS, or SS
 * registers.
 *
 * A context swap is done by pushing the registers in the following
 * order:
 *
 *      flags, cs, ip, ax, bx, cx, dx, si, di, bp, ds, es
 *
 * Then, the current stack pointer is saved in the TCB. Context is
 * restored by popping es, ds, bp, di, si, dx, cx, bx, and ax, and then
 * restoring the flags, cs, and ip with an iret instruction.
 */


typedef struct tcb
{
    void           **sp;          /* Must be first and must be 16 bits    */
    unsigned       ss;            /* Must be second & must be 16 bits     */

    unsigned       priority;      /* priority 0=lowest, 65,535=highest    */

    unsigned long  timestamp;     /* Clock tick when task was preempted.  */
    unsigned       wait;          /* Counting semaphore used by tasks that
                                   * are waiting at a queue. Set to initial
                                   * timeout value and decremented on each
                                   * clock tick. Task is put back into
                                   * the active list if semaphore gets to
                                   * 0. If wait < 0, task will not time out.
                                   */

    struct tcb     *next;         /* Pointer to next task waiting at queue. */

    int            status;

    void           *msg;          /* ... Suspended by wait
                                   * for a message. NULL if task timed out.
                                   */

    char           *tag;          /* Identifying string of some sort */
    void           **initial_sp;

    void           *stack[1];     /* ... as pointer-sized for t_create(). */
}
TCB;

typedef struct t_queue
{
    int             signature;    /* Signature                         */
    struct t_queue  *next;        /* Next queue in chain.              */
    TCB             *task_h;      /* Head (start) of task list.        */
    TCB             *task_t;      /* Tail (end) of task list.          */
    int             q_size;       /* Maximum number of elements        */
    int             numele;       /* # of elements currently in queue  */
    void            **headp;      /* Head pointer                      */



void 

void 


**tailp;  /*  Tail  pointer 

*queue[l];  /*  First  cell  of  actual  queue. 

*  Must  be  at  the  bottom  of  the 

*  structure. 

*/ 


Global  variables.  Actually  declared  in  globals.c.  I'm  assuming 
the  default  initialization  to  0  here.  These  may  be  used  by 
your  programs  (T_clock  and  T_numtask's  are  useful)  but  should 
never  be  modified  by  them.  It's  safest  to  block  while 
accessing  them. 


♦ifdef  ALLOC 


* 

i 

♦else 

♦ 

♦ 

♦endif 


define  CLASS 
define  I (x)  x 


define  CLASS  extern 
define  I (x) 


CLASS  PQ  *T  tasks  I(-0); 


/*  Priority  queue  of  tasks  that  are  waiting  */ 
/*  for  service.  See  /src/tools/pq. c  for  priority  */ 
/*  queue  routines  and  definition  of  PQ.  */ 


CLASS TCB *T_active I(=0);  /* Pointer to currently active task. NULL if
                             * multitasking is off or if no tasks are active
                             * (this latter is a deadlock).
                             */

CLASS unsigned long T_clock I(=0);
                    /* Incremented on each system clock tick. If you
                     * assume the default 18.2 ticks/second,
                     * the clock will roll over after about
                     * 65552 hours (about 7.47 years):
                     *
                     *      ((0xffffffff/18.2) /60) /60  =  65552
                     *      65552 / 24 / 365.25          =  7.47798
                     *
                     * Of course, this number will scale with faster
                     * tick rates but the resolution should be ok for
                     * all reasonable tick rates.
                     */

CLASS T_QUEUE *T_queues   I(=0);  /* Pointer to head of linked list of queues. */
CLASS int      T_numtasks I(=0);  /* Total # of tasks that have been created.  */

/*----------------------------------------------------------------
 * Function prototypes. You should never call any of the _t_xxxx
 * functions directly.
 */


extern T_QUEUE *t_makequeue    (int size                          );
extern int      t_send         (T_QUEUE *q, void *msg             );
extern void    *t_wait         (T_QUEUE *q, int timeout           );
extern int      t_yield        (void                              );
extern int      t_perror       (char *str, int errcode            );
extern int      t_start        (int factor                        );
extern TCB     *t_create       (int (*subr)(), char *tag, unsigned pri,
                                int stk_size, ...                 );
extern int      t_chg_priority (TCB *tp, int new_priority         );
extern int      t_delete       (TCB *task                         );
extern int      t_print        (TCB *task                         );
extern void     t_stop         (int exit_code                     );
extern int      t_second       (void                              );
extern void     _t_swap        (TCB *old, TCB *new                );
extern void     _t_install     (TCB *new                          );
extern void     _t_shazam      (void                              );


End  Listing  One 


Listing  Two 


Listing 2 -- schedule.asm, Printed 9/11/1987



        PAGE    56,132
        TITLE   SPEEDUP.ASM: System-clock-modification routines

DEBUG   equ     1       ; Set to 1 to make internal symbols public for
                        ; debugging.
DOSPEED equ     1       ; Set to 0 to disable everything except
                        ; global-variable initialization in speedup.

; Public Subroutines are:
;
; t_cli()               Disable Interrupts
; t_sti()               Enable Interrupts
;
; t_speedup( factor )
; int factor;
;




;       Speed up the system clock by the indicated factor
;       (1, 2, 3, 4, etc.). Call the scheduler
;       on every timer interrupt and call the default clock
;       routine as well every "factor" ticks. For example,
;       speedup( 2 ) speeds up the system clock by
;       a factor of two; the normal interrupt-service
;       routine that's used by DOS will be called every
;       other tick. A speedup factor of 1 or 0 doesn't modify
;       the clock rate.
;
; _t_slowdown()
;
;       Restore the clock to the normal speed and disconnect
;       the routine installed with a previous speedup() call
;       (if one is installed).
;
; t_block()
; t_release()
;
;       Disable the scheduler but not the normal clock interrupt.
;       The default system interrupt-service routine is processed
;       in the normal way, on every Nth clock tick. Use these
;       routines carefully. If they're used in a tight loop, it's
;       possible that ALL timer interrupts will be ignored, even
;       if the processor is released for a while inside the loop.
;       In this situation sti() and cli() are better.
;
; long numint();        Number of unblocked interrupts.
; long numblk();        Number of blocked interrupts.


_TEXT   SEGMENT BYTE PUBLIC
_TEXT   ENDS
_DATA   SEGMENT WORD PUBLIC
_DATA   ENDS
CONST   SEGMENT WORD PUBLIC
CONST   ENDS
_BSS    SEGMENT WORD PUBLIC
_BSS    ENDS

DGROUP  GROUP   CONST, _BSS, _DATA
        ASSUME  CS: _TEXT, DS: DGROUP, SS: DGROUP, ES: DGROUP

        EXTRN   __chkstk:NEAR
        EXTRN   _t_reschedule:NEAR
        EXTRN   _T_active:WORD


TIMR_CTRL       = 43H           ; address of timer control port
TIMR_0_DATA     = 40H           ; address of counter 0 data port
TIMR_0_LOAD     = 36H           ; control word for timer

                                ; Number of bytes on interrupt-
                                ; service-routine stack


;  Misc.  variables.  Note  that  I'm  putting  all  these  in  the 
;  code  (_TEXT)  segment  so  that  I  can  find  them  when  an 
;  interrupt  comes  along.  The  PUBLIC  statements  are  just  for 
;  debugging. 


old_int  equ 
old_off  dw 
old_seg  dw 

tick  reset  dw 
num ticks  dw 


Offset  of  old  timer  interrupt  routine. 
Segment  address  of  same. 


numticks  dw  ?  ;  Initialized  to  tick_reset,  decre- 

;  mented  on  each  timer  interrupt, 

;  reset  to  the  speedup  factor  (and  the 
;  old  service  routine  is  called)  when 
;  it  reaches  zero. 

stack  db  STKSIZE  dup  (0)  ;  Local  stack  for  service  routine 

;  30  bytes  are  used  by  real  service 
;  routine,  the  rest  is  available 
;  for  the  user  service  routine. 

stack_end  dw  ? 


;  total  number  of  interrupts 
;  number  of  times  user  routine  blocked 


;  don't  execute  user  routine  if  true 


IF DEBUG
        PUBLIC  old_int, old_off, old_seg, old_ds
        PUBLIC  tick_reset, numticks, stack, stack_end, old_sp
        PUBLIC  old_ss, old_ax, old_ip, old_cs, old_fl
        PUBLIC  blocked, serv, numint, numblk
ENDIF

; statistics stuff.

        PUBLIC  _t_numint, _t_numblk

_t_numint PROC  NEAR
        mov     ax, WORD PTR cs:numint
        mov     dx, WORD PTR cs:numint+2
        ret
_t_numint ENDP

_t_numblk PROC  NEAR
        mov     ax, WORD PTR cs:numblk
        mov     dx, WORD PTR cs:numblk+2
        ret
_t_numblk ENDP


; t_cli(); t_sti();     Disable and enable interrupts.

        PUBLIC  _t_cli, _t_sti

_t_cli  PROC    NEAR
        cli
        ret
_t_cli  ENDP

_t_sti  PROC    NEAR
        sti
        ret
_t_sti  ENDP

; t_block();            Disable and enable user interrupt service
; t_release();          routine but not the real interrupt service
;                       routine (or the interrupt itself).

        PUBLIC  _t_block, _t_release

_t_block PROC   NEAR
        mov     byte ptr cs:blocked, 1
        ret
_t_block ENDP

_t_release PROC NEAR
        mov     byte ptr cs:blocked, 0
        ret
_t_release ENDP


; t_speedup( factor )
; int factor;

        PUBLIC  _t_speedup
_t_speedup PROC

factor  = [bp+4]
routine = [bp+6]

        push    bp
        mov     bp, sp
        xor     ax, ax
        call    __chkstk

        mov     _TEXT:old_ds, ds        ; remember current DS.
        mov     ax, [bp+4]              ; AX = factor
        mov     _TEXT:tick_reset, ax    ; tick_reset = factor;
        mov     _TEXT:numticks, ax      ; numticks = factor;

IF DOSPEED
        mov     ax, [bp+4]              ; if( factor && factor != 1 )
        cmp     ax, 01H
        je      noload
        cmp     ax, 00H
        je      noload

        mov     al, TIMR_0_LOAD         ; Set up timer for load
        out     TIMR_CTRL, al

        mov     ax, 00000H              ; Number of ticks
        mov     dx, 00001H              ;       = 65536/factor
        mov     bx, [bp+4]              ; BX = factor.
        div     bx                      ; AX = number of ticks

        cli                             ; Send new count to timer
        out     TIMR_0_DATA, al
        mov     al, ah
        out     TIMR_0_DATA, al
        sti

        mov     ah, 35H                 ; Get the old vector
        mov     al, 08H
        int     21H



_TEXT:old_off,bx 

_TEXT:old_seg,es 


ah, 25H 
al, 08H 

dx,  OFFSET  TEXT:  serv 
ds 


set  up  the  new  vector 


        PUBLIC  _t_slowdown
_t_slowdown PROC NEAR

        push    bp
        mov     bp, sp
        xor     ax, ax
        call    __chkstk

        mov     ax, _TEXT:old_off       ; See if the interrupts have
        or      ax, ax                  ; changed.
        jz      no_int                  ; No, don't fix them then

                                        ; restore old timer interrupt
        push    ds
        mov     ah, 25H
        mov     al, 08H
        mov     ds, _TEXT:old_seg
        mov     dx, _TEXT:old_off
        int     21H
        pop     ds

no_int:
        mov     al, TIMR_0_LOAD         ; Restore default system
        out     TIMR_CTRL, al           ; clock tick rate
        mov     al, 0
        out     TIMR_0_DATA, al
        out     TIMR_0_DATA, al

        mov     sp, bp
        pop     bp
        ret

_t_slowdown ENDP

; Actual interrupt service routine.
; Note that the flags, cs, and ip are pushed on entry (because
; of the interrupt)

serv    PROC    NEAR

        push    ax
                                        ; if( servicing blocked )
        mov     al, byte ptr cs:blocked
        or      al, al                  ; {
        jz      serv1

        add     WORD PTR cs:numblk, 1   ;       ++numblk;
        adc     WORD PTR cs:numblk+2, 0
        jmp     servexit                ; }
serv1:                                  ; else
        add     WORD PTR cs:numint, 1   ; {
        adc     WORD PTR cs:numint+2, 0 ;       ++numint;
        push    bx
        push    cx                      ; Save rest of
        push    dx                      ; current context
        push    si
        push    di
        push    bp
        push    ds
        push    es

        mov     ds, _TEXT:old_ds        ; Restore Data segment
        mov     bx, _T_active           ; BX = T_active
        mov     [bx], sp                ; T_active->sp = SP
        mov     [bx+2], ss              ; T_active->ss = SS

        push    cs                      ; Set up local stack
        pop     ss
        mov     sp, offset _TEXT:stack_end
                                        ; in task.c
        call    _t_reschedule
                                        ; BX = new T_active.
        mov     bx, _T_active
        mov     ss, [bx+2]              ; will be the same as
        mov     sp, [bx]                ; the old T_active if
        pop     es                      ; no change is reqd.
        pop     ds
        pop     bp
        pop     di
        pop     si
        pop     dx
        pop     cx
        pop     bx

servexit:                               ; }
                                        ; if( --numticks > 0 )
        dec     _TEXT:numticks
        jle     serv3                   ; {
        mov     al, 20h                 ;       send EOI
        out     20h, al
        pop     ax
        iret                            ; }
                                        ; else
serv3:                                  ; {
                                        ;       numticks = tick_reset;
        mov     ax, _TEXT:tick_reset
        mov     _TEXT:numticks, ax
        pop     ax                      ;       jmp to old vector
        jmp     dword ptr _TEXT:old_int ; }

serv    ENDP

_TEXT   ENDS
        END

End  Listing  Two 


Listing  Three 

Listing 3 -- swap.asm, Printed 9/11/1987


41 

118| 

51 

TITLE 

swap.c 

119| 

6| 

NAME 

swap 

1201 

7| 

121| 

8| 

TEXT 

SEGMENT 

WORD  PUBLIC 

•CODE' 

1221 

91 

TEXT 

ENDS 

1231 

10| 

DATA 

SEGMENT 

WORD  PUBLIC 

•DATA' 

124| 

HI 

DATA 

ENDS 

125| 

121 

CONST 

SEGMENT 

WORD  PUBLIC 

'CONST' 

126| 

13| 

CONST 

ENDS 

127| 

14| 

B3S 

SEGMENT 

WORD  PUBLIC 

'BSS ' 

128| 

20| 

save  bx  dw  0  ;  1 

211 

221 

BSS 

ENDS 

23| 

DGROUP 

GROUP  CONST,  BSi 

24| 

ASSUME  CS:  TEXT, 

25| 

26| 

DATA 

SEGMENT 

27| 

DATA 

ENDS 

28| 

29| 

IF  1 

30| 

PUBLIC 

stack  err,  mychkstk, 

31| 

PUBLIC 

op,  segm,  off 

32| 

ENDIF 

33| 

351 

TEXT 

SEGMENT 

361 

37| 

ASSUME  CS:  TEXT 

38| 

39| 

401 

PUBLIC 

PUBLIC 

t  swap  in 
t  install 

411 

PUBLIC 

t  shazam 

421 

PUBLIC 

t  stop 

43  1 

PUBLIC 

44| 

PUBLIC 

451 

46| 

EXTRN 

chkstkrNEAR 

47| 

EXTRN 

“free: NEAR 

48| 

EXTRN 

t  slowdown: NEAR; 

491 

EXTRN 

t  Block: NEAR  ;  1 

501 

EXTRN 

t  release: NEAR  ;  1 

511 

EXTRN 

t  iserr : NEAR  ;  : 

521 

EXTRN 

T  active: WORD  ;  ] 

531 

;  1 

Routine to do context swaps. Everything in
this file is VERY compiler dependent and
VERY nonportable.


op 

segm 

off 


Used  by  rst_chkstk  and  chg  chkstk 
(below) . 


;  used  by  t  swap  in 


Swaps  two  tasks 

Installs  a  task  when  none  active 


Stops  multitasking. 


in  standard  library 
In  standard  library 
In  schedule. asm 


to  currently  active  task 


save_sp dw      0
save_ss dw      0
chk_on  dw      0               ; No stack probes while nonzero

_t_shazam PROC  NEAR

        ; Get the ball rolling. Save the current context,
        ; then start up the first task. On entry,
        ; T_active must point at the first task to activate.
        ; Stack on entry (from top to bottom) is:
        ;
        ;       return address from _t_shazam call
        ;       old bp saved by t_start
        ;       t_start's return address
        ;       speedup_factor passed to t_start

        add     sp, 2           ; Discard return addr to _t_shazam
                                ; Uncovering bp from main that
                                ; was saved by t_start()
        push    si
        push    di
        push    ds              ; push this last

        mov     WORD PTR cs:save_sp, sp
        mov     WORD PTR cs:save_ss, ss

        call    chg_chkstk
        mov     bx, WORD PTR _T_active
        jmp     shazam

_t_shazam ENDP


_t_stop PROC    NEAR

        ; t_stop( errcode )
        ;
        ; This routine deletes the current task, causes
        ; multitasking to be turned off, and passes control
        ; back to the routine that called t_start()
        ; (immediately following the t_start call).
        ; Errcode is passed back to the calling routine as the
        ; return value of t_start().
        ; It can be called directly by a running task; it's
        ; called automatically when the last task is deleted,
        ; or when the only running task deletes itself.

        cli                             ; Just to make sure



        add     sp, 2                   ; Discard return address
        pop     ax                      ; return value = errcode

        mov     ss, WORD PTR cs:save_ss ; Restore initial stack...
        mov     sp, WORD PTR cs:save_sp
        pop     ds                      ; ...and data segment.

        push    ax
        call    rst_chkstk              ; Put back original __chkstk
        call    _t_slowdown             ; Disable weird timer int.
        pop     ax                      ; get back return value

        push    ax
        or      ax, ax                  ; if( errcode == NOERR )
        jnz     t_stop1                 ; {
        push    _T_active               ;       free( T_active );
        call    _free
        add     sp, 2                   ; }
t_stop1:
        pop     ax                      ; return( errcode )

        pop     di                      ; restore si and di saved
        pop     si                      ; by _t_shazam,
        pop     bp                      ; and bp saved by t_start.
        ret

_t_stop ENDP


_t_install PROC NEAR

        ; _t_install(new)
        ; TCB *new;
        ;
        ; Delete the current task and replace it with the
        ; new one. This routine saves some space (and execution
        ; time) by jumping into the middle of swap() to install
        ; the new task. The scheduler must be blocked when
        ; this routine is called. This routine does not
        ; return.

        add     sp, 2                   ; discard return address
        pop     bx                      ; bx = new

        mov     ss, WORD PTR [bx+2]     ; ss:sp = new task's stack;
        mov     sp, WORD PTR [bx]

        push    bx
        push    _T_active               ; free( T_active )
        call    _free
        add     sp, 2                   ; Discard arg to free()
        pop     bx                      ; get back bx.

        jmp     shazam

_t_install ENDP


_t_swap_in PROC NEAR

        ; _t_swap_in( new )
        ; TCB *new;
        ;
        ; Do a context swap. Replace T_active with new. This
        ; routine returns only when the original context is
        ; restored. Swapping MUST be blocked when this subroutine
        ; is called. Release() is called once the new context
        ; is installed and T_active is modified to point at the
        ; new task.

        cli
        mov     WORD PTR cs:save_bx, bx
        pop     bx                      ; Bx = return address
        pushf                           ; Save current context
        push    cs
        push    bx                      ; (Push return address as new ip)
        push    ax
        push    WORD PTR cs:save_bx
        push    cx
        push    dx
        push    si
        push    di
        push    bp
        push    ds
        push    es

        ; Stack now looks like this:
        ;
        ;       new     [sp + 24]
        ;       flags   [sp + 22]
        ;       cs      [sp + 20]
        ;       ip      [sp + 18]
        ;       ax      [sp + 16]
        ;       bx      [sp + 14]
        ;       cx      [sp + 12]
        ;       dx      [sp + 10]
        ;       si      [sp + 8]
        ;       di      [sp + 6]
        ;       bp      [sp + 4]
        ;       ds      [sp + 2]
        ;       es      [sp]            (top of stack)

        mov     bx, WORD PTR _T_active
        mov     WORD PTR [bx+2], ss
        mov     WORD PTR [bx], sp
        mov     bx, sp
        mov     bx, WORD PTR [bx+24]    ; bx = new

shazam:                                 ; _t_shazam and _t_install
                                        ; come here to do the swap


        mov     WORD PTR _T_active, bx  ; T_active = new;
        mov     ss, WORD PTR [bx+2]     ; Switch to new task's stack
        mov     sp, WORD PTR [bx]

        pop     es
        pop     ds
        pop     bp
        pop     di
        pop     si
        pop     dx
        pop     cx
        pop     bx
        pop     ax

        call    _t_release
        sti
        iret

_t_swap_in ENDP


; _t_chg_chkstk()       Normal stack checking off
; _t_rst_chkstk()       back on again
;
; _t_sus_chkstk()       Suspend stack checking temporarily
; _t_rst_chkstk()       restore it again.
;
; Turn off Microsoft stack checking by overwriting the first
; 5 bytes of __chkstk with an absolute jump to mychkstk.
; This is a kludge but I can't get the Microsoft compiler
; to link my own version of __chkstk, even when I use the
; source file that they supply.


chg_chkstk PROC NEAR

        mov     bx, OFFSET __chkstk

        mov     ah, BYTE PTR cs:[bx+0]
        mov     BYTE PTR op, ah

        mov     ax, WORD PTR cs:[bx+1]
        mov     WORD PTR off, ax

        mov     ax, WORD PTR cs:[bx+3]
        mov     WORD PTR segm, ax

        mov     BYTE PTR cs:[bx+0], 0EAH                ; EA = JMP
        mov     WORD PTR cs:[bx+1], OFFSET mychkstk     ; offset
        mov     WORD PTR cs:[bx+3], cs                  ; segment

        mov     cs:chk_on, 1            ; Enable stack checking
        ret

chg_chkstk ENDP

rst_chkstk PROC NEAR

        mov     bx, OFFSET __chkstk

        mov     ah, BYTE PTR op
        mov     BYTE PTR cs:[bx+0], ah

        mov     ax, WORD PTR off
        mov     WORD PTR cs:[bx+1], ax

        mov     ax, WORD PTR segm
        mov     WORD PTR cs:[bx+3], ax
        ret

rst_chkstk ENDP

_t_sus_chkstk PROC NEAR
        mov     cs:chk_on, 0
        ret
_t_sus_chkstk ENDP

_t_rst_chkstk PROC NEAR
        mov     cs:chk_on, 1
        ret
_t_rst_chkstk ENDP


; On entry AX holds the number of bytes required for local
; variables. Chkstk normally checks the stack and, at the
; same time, finishes setting up the stack frame by
; subtracting the contents of ax from the stack pointer.


mychkstk PROC   NEAR

        mov     cx, cs:chk_on           ; If stack checking disabled at
        or      cx, cx                  ; run time, skip past the actual
        jz      nocheck                 ; test.

        mov     cx, _T_active           ; Offset to stack base + 4
        add     cx, 34
        cmp     sp, cx                  ; if( sp <= stack_base )
        jbe     stack_err

nocheck:
        pop     cx                      ; cx = return address
        sub     sp, ax                  ; finish setting up stack frame
        jmp     cx                      ; ret to caller w/o modifying stack.

stack_err:                              ; Return address still on the stack
        mov     ax, -9
        push    ax
        call    _t_stop                 ; Shouldn't return

        mov     cx, 0                   ; Panic abort to DOS
        jmp     cx

mychkstk ENDP

_TEXT   ENDS


End  Listing  Three 


Listing  Four 


Listing  4  --  globals.c,  Printed  9/11/1987 


#define ALLOC 1

#include "kernel.h"


End  Listing  Four 


Listing  Five 


Listing 5 -- queue.c, Printed 9/11/1987

#include "kernel.h"

/*----------------------------------------------------------------*/

T_QUEUE *t_makequeue( size )
int     size;
{
    /* Make a queue for intertask communication and link it
     * into the queue chain. Return values are pointer
     * to queue on success or TE_NOMEM if insufficient memory.
     * Queues are searched in order of creation so it's best
     * to create the most active queues first. Be sure that
     * multitasking is blocked when tailp is modified (because
     * it's static).
     */

    static T_QUEUE  *tailp;
    T_QUEUE         *p;
    void            *malloc();

    t_block();
    p = (T_QUEUE *) malloc( sizeof(T_QUEUE)
                                + ((size-1) * sizeof(void *)) );
    t_release();

    if( !p )
        return (T_QUEUE *) TE_NOMEM;

    p->signature = TQ_SIG;
    p->next      = NULL;
    p->task_h    = NULL;
    p->task_t    = NULL;
    p->numele    = 0;
    p->q_size    = size;
    p->headp     = p->queue;
    p->tailp     = p->queue;

    t_block();

    if( !T_queues )
        T_queues = tailp = p ;
    else
    {
        tailp->next = p;
        tailp       = p;
    }

    t_release();
    return p;
}

/*----------------------------------------------------------------*/


int t_send( q, msg )
T_QUEUE *q;     /* pointer to queue               */
void    *msg;   /* Pointer to message to enqueue  */
{
    /* Send a message and reschedule if necessary.
     *
     * Return Values:
     *          TE_NOERR        No error;
     *          TE_BADARG       Bad q argument.
     *          TE_QFULL        Queue is full
     *
     * The message is always enqueued in the indicated queue.
     * Then, if a task is waiting, the message at the head of
     * the queue is dequeued and attached to the task, which
     * is put back into the active list. Finally, if a task
     * was activated, the current process yields. Note,
     * however, that the current task will still be the
     * active task if it's higher priority than the one to
     * which you send a message. The sending task should wait()
     * somewhere to make room for the lower-priority task.
     */

    TCB *task;

    if( q->signature != TQ_SIG )
        return( TE_BADARG );

    t_block();

    if( q->numele == q->q_size )
    {
        /* Queue is full */

        t_release();
        return( TE_QFULL );
    }

    /* Enqueue the message. */

    ++ q->numele;

    if( ++q->tailp >= q->queue + q->q_size )
        q->tailp = q->queue ;

    *(q->tailp) = msg ;

    if( q->task_h )
    {
        /* A task is waiting, dequeue both it and the message,
         * attach the message to the task, and reschedule
         */

        task      = q->task_h;
        q->task_h = task->next;

        --q->numele;

        if( ++q->headp >= q->queue + q->q_size )
            q->headp = q->queue;

        task->msg    = *(q->headp);
        task->status = TS_WAIT;
        pq_ins( T_tasks, &task );

        t_yield();
    }

    t_release();
    return TE_NOERR;
}

/*----------------------------------------------------------------*/



void *t_wait( q, timeout )
T_QUEUE *q;
int     timeout;
{
    /* Wait for a message to arrive at the queue. If several
     * tasks are waiting at the same queue, the first task
     * in the queue gets the message. Return if no message
     * arrives within timeout system clock ticks. Maximum
     * timeout is 32,767 ticks. A 0 timeout value
     * means that the subroutine returns immediately
     * (without a reschedule) if no message is waiting
     * in the queue.
     *
     * Message requests are queued up in order received,
     * without regard to priority. I've done this both
     * because it's easy and because, in most applications,
     * tasks with different priorities will not be pending
     * on the same queue.
     *
     * If a message is present, the routine returns it
     * immediately without yielding, otherwise the current
     * task is removed from the active list and yield()
     * is called.
     *
     * Hints: Use this routine to suspend a task for a
     * limited amount of time (as compared to deleting
     * the task). Just pend on a queue that will never have
     * a message sent to it.
     *
     * Normally, ... return values
     *
     *  TE_TIMEOUT      on a timeout or if the input value of
     *                  timeout is 0 and no message is waiting
     *
     *  TE_NOTASKS      ... existence. This is a guaranteed deadlock.
     */

    TCB *new;

    if( q->signature != TQ_SIG )
        return( TE_BADARG );

    t_block();

    if( q->numele )
    {
        /* There's a message waiting in the queue. Dequeue
         * the message and return it immediately. Strictly
         * speaking, we don't have to attach the message
         * to the task, but it's convenient to do it for
         * debugging reasons.
         */

        -- q->numele;

        if( ++q->headp >= q->queue + q->q_size )
            q->headp = q->queue ;

        T_active->msg    = *(q->headp);
        T_active->status = TS_WAIT ;

        t_release();



        return T_active->msg;
    }
    else                        /* No messages waiting */

    if( timeout == 0 )          /* Immediate time out  */
    {
        t_release();
        return TE_TIMEOUT;
    }
    else
    {
        /* Enqueue the current task to wait for an
         * incoming message. The pq_del call
         * gets a task to preempt the current one.
         */

        if( !pq_del( T_tasks, &new ) )
        {
            t_release();        /* No tasks to activate */
            return TE_NOTASKS ;
        }
        else
        {
            T_active->wait      = timeout;
            T_active->next      = NULL;
            T_active->timestamp = T_clock;

            if( !q->task_h )
                q->task_h = T_active;
            else
                q->task_t->next = T_active;

            q->task_t = T_active;

            _t_swap_in( new );  /* Returns on either a message being */
                                /* sent to this queue or a timeout.  */
        }
    }

    return ( T_active->msg ? T_active->msg : TE_TIMEOUT );
}

/*----------------------------------------------------------------*/



t_yield()
{
    /* Yield to the highest priority task (if there is one).
     * You can't yield to tasks waiting at queues, only to
     * active ones.
     *
     * Returns:
     *          TE_NOERR        successful yield
     *          TE_NOTASKS      Current task is the only active task
     */

    TCB *new, *old ;

    t_block();

    if( !pq_del( T_tasks, &new ) )
    {
        t_release();
        return TE_NOTASKS;
    }

    old = T_active;
    old->timestamp = T_clock ;
    pq_ins( T_tasks, &old );

    _t_swap_in( new );  /* _t_swap_in() changes T_active to new; */
    return TE_NOERR ;
}


End  Listing  Five 


Listing  Six 


Listing 6 -- task.c, Printed 9/11/1987

#include <dos.h>
#include <stdarg.h>
#include <signal.h>
#include <tools/hardware.h>     /* #define for TIMR_CLK */
#include "kernel.h"

static int  Speedup_factor = 0;
static long Executed, Timed_out, Did_swap;

#define max(a,b) ((a) > (b) ? (a) : (b))



/*----------------------------------------------------------------
 * Strings for error codes. Note that these are upside down
 * to compensate for the negative indexes
 */

static char *msgs[] =
{
    "Ctrl-Break or Ctrl-C",                             /* -10 */
    "Stack overflow",                                   /*  -9 */
    "Delete would have caused a deadlock",              /*  -8 */
    "Internal error",                                   /*  -7 */
    "No tasks to send message to",                      /*  -6 */
    "Queue is full",                                    /*  -5 */
    "Timeout",                                          /*  -4 */
    "Illegal Argument",                                 /*  -3 */
    "Insufficient memory available",                    /*  -2 */
    "Maximum number of tasks (32) already exists",      /*  -1 */
    "No error"                                          /*   0 */
};

char **t_errlist = msgs + ( (sizeof(msgs)/sizeof(*msgs)) -1);


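/*----------------------------------------------------------------*/

t_iserr( x )
{
    return( -10 <= (x) && (x) < 0 );
}

/*----------------------------------------------------------------*/

t_perror( str, errcode )
char *str;
{
    if( !t_iserr(errcode) && errcode != 0 )
        t_printf( "%s status %d\n", str, errcode );
    else
        t_printf( "%s %s\n", str, t_errlist[errcode] );
}

/*----------------------------------------------------------------*/

static intr()
{
    t_stop( TE_KILL );
}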

/* - 


int t_start( speedup_factor )
{
    /* Start multitasking. At least one task must have
     * been created prior to this call. The speedup factor
     * determines the system clock-tick rate. A value of 1
     * gives the default rate of roughly 18.2 times a second
     * (once every 55 milliseconds, more or less). A
     * speedup_factor of 2 gives twice that speed: 36.4 ticks
     * per second, one every 27 milliseconds or thereabouts.
     * Speeding up the clock rate shouldn't affect the DOS
     * clock; nonetheless, it's safest if the speedup factor
     * is a power of two.
     *
     * If the speedup factor is 0 then the system is
     * nonpreemptive. You'll have to use t_yield(), t_send(),
     * and t_wait() to change contexts.
     * Control passes immediately to the highest priority task.
     * Control will automatically pass back to the calling
     * subroutine when all tasks have been deleted. The task
     * is actually started in _t_shazam(), declared in swap.asm.
     *
     * Normal return values:
     *
     *  TE_NOTASKS   No tasks exist, multitasking not started.
     *
     *  TE_DEADLOCK  A task deleted itself and it's the only
     *               active task in the system. Other tasks
     *               exist but they're all pending on queues.
     *
     *  TE_NOERR     All tasks have been deleted normally,
     *               no tasks are waiting on queues.
     *
     *  TE_STACK     Task stack overflow.
     *
     * If TE_NOERR is returned, then all memory allocated to
     * tasks will have been freed; otherwise, if one of the
     * above errors was returned, T_active will point at the
     * TCB of the offending task.
     *
     * Other return values are possible if a task calls t_stop()
     * directly. The argument passed to t_stop() is returned by
     * t_start(). The process is analogous to the value of
     * exit(), which doesn't return and whose argument is
     * passed back to a wait() call in the parent process. Note
     * that the TCB pointed to by T_active will not be free()ed
     * unless TE_NOERR (0) is returned.
     */

    Speedup_factor = speedup_factor;

    if( !pq_del( T_tasks, &T_active ) )
        return TE_NOTASKS;

    if( speedup_factor > 0 )
    {
        t_cli();
        _t_speedup( speedup_factor );
    }

    signal( SIGINT, intr );
    _t_shazam();
    return TE_INTERNAL;     /* Shouldn't ever get here */
}

/*----------------------------------------------------------------------*/


int t_second()
{
    /* Returns the number of system clock ticks in a


     * second, given the speedup factor passed to t_start().
     * Returns 0 if the speedup factor was 0. In this case
     * a t_wait() call will never time out.
     */

    return (TIMR_CLK / 65536) * Speedup_factor;
}


TCB *t_create( subr, tag, priority, stack_size, ... )

int       (*subr)();    /* Subroutine that forms main module */
char      *tag;         /* String used to identify TCB       */
unsigned  priority;     /* Priority                          */
int       stack_size;   /* Stack size (in 2-byte words)      */
{
    /* Creates a new task. Subr is a pointer to the main()
     * subroutine for the task.
     *
     * Priorities must be in the range 0-255. 255 is the
     * highest. If more than one task has the same priority,
     * they are executed in a round-robin
     * fashion. Forces a reschedule if tasking is active.
     *
     * Arguments may be passed to the subroutine at startup.
     * That is, a NULL-terminated list of pointer-sized
     * arguments follows stack_size in the t_create() call.
     * These are passed to the subroutine in the normal way.
     * For example:
     *
     *      foo( a, b, c ) int a, b, c; {}
     *
     *      t_create( foo, "foo", 10, 128, doo, wha, ditty, NULL );
     *
     * starts up foo() as a task at priority 10 with a
     * 128-word (256-byte) stack. Doo, wha, and ditty are
     * passed to foo as arguments a, b, and c. Note that the
     * arguments use 6 of the 256 bytes in the stack
     * (two for each argument). The "foo" tag is just
     * used for identification purposes in debugging.
     * It can be any string.
     *
     * Note that a few Microsoft functions (like printf)
     * use up inordinate amounts of stack. If you're going
     * to call Microsoft library routines, you'll need
     * at least 1K bytes of stack (stack_size=512) per
     * task.
     *
     * A pointer to the created TCB is returned normally.
     * Error return values are:
     *
     *  TE_TOOMANY  Maximum number of tasks already exists
     *  TE_NOMEM    Insufficient memory available
     */

    TCB           *t;
    struct SREGS  segs;
    int           pq_cmp();
    int           pq_swap();
    va_list       argptr;
    void          *arg;
    void          *malloc();

    va_start( argptr, stack_size );

    if( ++T_numtasks > T_MAXTASK )
        return (TCB *) TE_TOOMANY;

    t_block();

    /* Allocate the stack, converting stack_size to bytes.
     * I'm requesting one more cell than specified in order
     * to make room for the error return address, below
     * [stack[0] is included in the sizeof(TCB)]. The minimum
     * stack size is 20 words; this gives us enough for a
     * context swap plus a little slop.
     */

    stack_size = max( stack_size, 20 );

    if( !(t = (TCB *) malloc( sizeof(TCB) + (stack_size * sizeof(void*)) )) )
    {
        t_release();
        return (TCB *) TE_NOMEM;
    }

    /* Create the active queue if necessary, then initialize
     * the TCB. The stack pointer is initialized to point
     * just past the end of the stack (rather than to the
     * last cell) because a push uses a predecrement. The
     * PC points at the subroutine. Uninitialized registers
     * are unimportant, but will contain 0. The stack area
     * is initialized to the pattern a5a5a5a5... so that
     * we can look at it with a debugger and see what's been
     * used. 0 is no good for this purpose because 0 is
     * a likely thing to be pushed on the stack.
     */

    if( !T_tasks )
        T_tasks = pq_create( T_MAXTASK, sizeof(TCB *),
                                        pq_cmp, pq_swap, NULL );

    segread( &segs );
    memset( t,        0x0,  sizeof(TCB) - sizeof(void*) );
    memset( t->stack, 0xa5, stack_size * sizeof(void*)  );

    t->sp         = t->stack + ++stack_size;
    t->ss         = segs.ds;        /* Stacks are in data seg */
    t->initial_sp = t->sp;
    t->priority   = priority;
    t->timestamp  = T_clock;
    t->tag        = tag;

(continued  on  next  page) 




C  CHEST 


Listing  Six 

(Listing continued, text begins on page 126.)



    /* Initialize the stack. First pretend that we've
     * already called the initial subroutine by pushing
     * the arguments, and t_stop as a dummy return address.
     * The task shouldn't return, but if it does, t_stop
     * seems like a reasonable thing to call, even though
     * it will return garbage.
     * The last 11 things are the initial context.
     * They'll be popped as part of the context swap.
     */

    while( arg = va_arg(argptr, void*) )
        *--(t->sp) = arg;

    *--(t->sp) = t_stop;        /* Vector */
    *--(t->sp) = 0;             /* flags  */
    *--(t->sp) = segs.cs;       /* cs     */
    *--(t->sp) = subr;          /* ip     */
    *--(t->sp) = 0;             /* ax     */
    *--(t->sp) = 0;             /* bx     */
    *--(t->sp) = 0;             /* cx     */
    *--(t->sp) = 0;             /* dx     */
    *--(t->sp) = 0;             /* si     */
    *--(t->sp) = 0;             /* di     */
    *--(t->sp) = 0;             /* bp     */
    *--(t->sp) = segs.ds;       /* ds     */
    *--(t->sp) = segs.es;       /* es     */

    if( T_active && T_PRIORITY( t, T_active ) <= 0 )
    {
        pq_ins( T_tasks, &T_active );
        _t_swap_in( t );
    }
    else
    {
        pq_ins( T_tasks, &t );
        t_release();
    }

    return t;
}


/*----------------------------------------------------------------------*/

static TCB *del_fm_queues( task )
TCB *task;
{
    /* Traverse the queue list and if the task is waiting
     * for a message, delete it from the queue
     * and return a pointer to it, otherwise return NULL.
     */

    T_QUEUE *q;
    TCB     *t, **prev;

    for( q = T_queues; q; q = q->next )
    {
        prev = &q->task_h;
        for( t = q->task_h; t; prev = &t->next, t = t->next )
        {
            if( t == task )
            {
                *prev = t->next;
                return t;
            }
        }
    }

    return NULL;
}


int t_chg_priority( tp, new_priority )
TCB *tp;
int new_priority;
{
    /* Change priority for indicated task. Forces a reschedule.
     * If the task was waiting on a message, it is immediately
     * timed out and put back on the active list. A task may
     * change its own priority.
     *
     * Return values:
     *
     *  TE_NOERR
     *  TE_BADARG   Task doesn't exist
     */

    int rval = TE_NOERR;
    TCB *deleted;
    int pq_rm_cmp();

    if( new_priority > 255 )
        return TE_BADARG;

    t_block();

    if( tp == T_active )
    {
        T_active->priority = new_priority;
        t_yield();
    }
    else
    {
        if( !pq_remove( T_tasks, &deleted, pq_rm_cmp, tp ) )
            if( deleted = del_fm_queues( tp ) )
            {
                deleted->status = TS_TIMEOUT;
                deleted->msg    = NULL;
            }
            else
                return TE_BADARG;

        deleted->priority = new_priority;
        pq_ins( T_tasks, &deleted );
        t_yield();
    }
}

/*----------------------------------------------------------------------*/

int t_delete( task )
TCB *task;
{
    /* Delete task created with previous t_create() call and
     * free all memory associated with task. Note that
     * malloc()ed memory is not freed, only the memory that
     * t_create() allocated to begin with. A task may delete
     * itself. Forces a reschedule.
     *
     * Return values:
     *
     *  TE_BADARG   Task doesn't exist
     *  TE_NOERR
     *
     * t_stop is called automatically when the only active
     * task deletes itself. See t_start() for an explanation
     * of the return status values.
     * For convenience, a task may delete itself with a
     * t_delete(NULL) call.
     */

    TCB        *deleted;
    static TCB garbage;
    int        pq_rm_cmp();

    t_block();

    if( T_active == task || task == NULL )
    {
        /* Delete the current task.
         * Note that the _t_install() call below will not
         * return. It replaces the current task with the new
         * one, a pointer to which is in "deleted."
         */

        if( !pq_del( T_tasks, &deleted ) )
            t_stop( T_numtasks <= 1 ? TE_NOERR : TE_DEADLOCK );
        else
        {
            --T_numtasks;
            _t_install( deleted );
        }
    }
    else
    {
        /* Delete a task that's not active. pq_remove tries
         * to get it from the active list. If that's not
         * successful, del_fm_queues() scans the queues looking
         * for it. If that's not successful, TE_BADARG is
         * returned.
         */

        if( pq_remove( T_tasks, &deleted, pq_rm_cmp, task ) )
            free( deleted );

        else if( deleted = del_fm_queues(task) )
            free( deleted );
        else
        {
            t_release();
            return TE_BADARG;
        }

        if( --T_numtasks <= 0 )     /* Deleted the only task */
            t_stop( TE_NOERR );
    }

    t_release();
    return TE_NOERR;
}

/*----------------------------------------------------------------------*/

static pq_cmp( task1, task2 )
TCB **task1, **task2;
{
    return T_PRIORITY( *task1, *task2 );
}

static pq_swap( task1, task2 )
TCB **task1, **task2;
{
    TCB *tmp;

    tmp    = *task1;
    *task1 = *task2;
    *task2 = tmp;
}

static pq_rm_cmp( task1, item )
TCB **task1;
TCB *item;
{
    return !( *task1 == item );
}

/*----------------------------------------------------------------------*/

static outc( c )
{
    if( c == '\n' )
        putch( '\r' );

    putch( c );
}




t_printf( fmt )
char *fmt;
{
    /* The doprnt() function used here is from: Allen Holub,
     * The C Companion (Englewood Cliffs: Prentice-Hall,
     * 1987). You can also use the ANSI vprintf(). Note
     * that the Microsoft version of vprintf() uses A LOT
     * of stack. If you're using vprintf, your tasks should ...
     */

    va_list args;
    va_start( args, fmt );
    t_block();
    doprnt( outc, 0, fmt, args );   /* vprintf( fmt, args ); */
    t_release();
}

t_sstats()
{
    /* Print various scheduler related statistics. This routine
     * should not be called from a task (because it uses printf).
     */

    printf( "\nScheduler called %ld times: %ld Tasks timed_out, "
            "%ld context swaps\n",
            Executed, Timed_out, Did_swap );
}


#pragma check_stack-

TCB *t_reschedule()
{
    static T_QUEUE *q;
    static TCB     *t, **prev;

    /* Workhorse function called by the scheduler
     * (timer-interrupt service routine).
     *
     * Scan all the queues, checking for timed-out tasks. If
     * you find one, remove it from the queue and add it to
     * the active list.
     *
     * Modify T_active to point at the next task to activate
     * (the original T_active if no change).
     */

    ++Executed;

    _t_sus_chkstk();    /* stack probes off for the nonce */

    for( q = T_queues; q; q = q->next )
    {
        prev = &(q->task_h);

        for( t = q->task_h; t; )
        {
            if( --(t->wait) <= 0 )
            {
                ++Timed_out;
                *prev     = t->next;
                t->msg    = NULL;
                t->status = TE_TIMEOUT;
                pq_ins( T_tasks, &t );
            }

            prev = &(t->next);
            t    = t->next;
        }
    }

    /* Check the highest-priority element of the queue.
     * If it's not higher than the current task, do nothing.
     * Otherwise do a context swap. pq_replace will
     * extract the highest priority object from the active
     * list and put it into t, simultaneously putting
     * T_active into the list.
     */

    T_active->timestamp = ++T_clock;
    if( t = *( (TCB **) pq_look(T_tasks) ) )
        if( T_PRIORITY(t, T_active) >= 0 )
        {
            ++Did_swap;
            pq_replace( T_tasks, &t, &T_active );
            T_active = t;
        }

    _t_rst_chkstk();    /* stack probes on again */
}

#pragma check_stack+

End  Listing  Six 


Listing  Seven 


Listing 7 - tdebug.c. Printed 9/11/1987

#include <stdio.h>
#include "kernel.h"

/* TDEBUG.C  Various routines that are useful for debugging but
 *           probably won't end up in the final system.
 */

t_tprint( t )
TCB *t;
{
    /* Print a TCB */

    char **p, *str;
    int  i;

    printf( "---------- <%s> at %04x ----------\n",
                    t->tag, t );
    printf( "stack %04x:%04x, priority %d, timestamp %d, ",
                    t->ss, t->sp, t->priority, t->timestamp );
    printf( " wait %d, status %d\n", t->wait, t->status );
    printf( "next %04x, initial sp %04x, msg %04x ",
                    t->next, t->initial_sp, t->msg );
    pstr( t->msg, 10 );
    printf( "stack[ 0] @ %04x = %04x, ",
                    &(t->stack)[0], (t->stack)[0] );
    printf( "stack[ 1] @ %04x = %04x\n\n",
                    &(t->stack)[1], (t->stack)[1] );

    /* Print the top 15 elements of the stack */

    i = 15;
    for( p = (char **)t->sp; p < t->initial_sp && --i >= 0; p++ )
    {
        printf( "  sp[%2d] @ %04x = %04x = ",
                    (char **)t->sp - p, p, *p );
        pstr( *p, 32 );
    }

    if( i < 0 )
        printf( "(Stack dump truncated at 15 elements)\n" );
}

static pstr( str, len )
char *str;
{
    /* Print a string with dots instead of nonprinting
     * characters
     */

    putchar( '<' );
    for( len = 32; --len >= 0 && *str; str++ )
        putchar( ' ' <= *str && *str < 0x7f ? *str : '.' );
    printf( ">\n" );
}

/*----------------------------------------------------------------------*/

t_qprint( q )
T_QUEUE *q;
{
    /* Print out the contents of a queue */

    TCB *t;
    int i;

    if( q->signature != TQ_SIG )
    {
        printf( "Queue is invalid (bad signature)\n" );
        return;
    }

    printf( "---------- Queue at %04x ----------\n", q );
    printf( "%d/%d messages in queue, next queue at %04x\n",
                    q->numele, q->q_size, q->next );

    printf( "Waiting tasks: " );
    if( !(t = q->task_h) )
        printf( "(none)\n" );
    else
    {
        for( ; t; t = t->next )
        {
            printf( "%s, ", t->tag );
            if( t == q->task_t )
                printf( "(end)\n" );
        }
        printf( "\n" );
    }

    printf( "head = &queue[%d], tail = &queue[%d], queue is:\n",
                    q->headp - q->queue, q->tailp - q->queue );

    for( i = 0; i < q->q_size; i++ )
    {
        printf( "queue[%d]: %04x ", i, q->queue[i] );
        pstr( q->queue[i], 32 );
    }

    printf( "----------\n" );
}


End  Listings 




THE  FORTH  COLUMN 


Listing  One  (Text  begins  on  page  144.) 

(  LOAD  screen  for  DDJ  Standard  Prelude  and  String  Extension) 

(  MJT  Aug  30  1987  for  DDJ  December  1987) 

(  2  LOAD  (  Standard  prelude) 

3  LOAD  (  Augmented  interpretation) 

4  5  THRU  (  Controlled  words) 

6  13  THRU  (  Strings) 

(  FORTH-83  functions —  typical  definitions) 

(  Adjust  these  words  for  your  Forth.  See  DDJ  Oct  1987.) 

(  Note:  functions  already  provided  need  not  be  redefined.) 

:  RECURSE  [COMPILE]  MYSELF  ;  IMMEDIATE 
:  INTERPRET  INTERPRET  ; 

: I> ( - 'data) COMPILE R> ; IMMEDIATE
: >I ( - 'data) COMPILE >R ; IMMEDIATE
( Used for alignment: )
: ALIGN ( HERE 1 AND ALLOT) ;
: REALIGN ( a - a' ) ( DUP 1 AND +) ;
2 CONSTANT CELL : CELL+ 2+ ; : CELLS 2* ;
: UNDO I> R> R> 2DROP >I ; \ Undoes a DO ... LOOP.

(  Required  definitions  -  used  to  support  further  compilation) 

:  THRU  (  n  n2)  1+  SWAP  DO  I  LOAD  LOOP  ; 

\  LOADS  screens  n  through  n2 . 

:  \  >IN  @  64  +  -64  AND  >IN  !  ;  IMMEDIATE 
\  comment  to  end  of  line.  For  use  in  screens  only. 

:  \\  1024  >IN  !  ;  IMMEDIATE 

\  stops  interpreting  or  compiling  screen  immediately. 

: \IF ( f ) 0= IF [COMPILE] \ THEN ; IMMEDIATE
\ conditional interpretation or compilation.
: NEED ( - f) 32 ( ie blank) WORD FIND SWAP DROP 0= ;
\ true if the following word is in the search order.

\ FORTH-83 Controlled Words
NEED 2*   \IF : 2*   DUP + ;
NEED D2*  \IF : D2*  2DUP D+ ;
NEED HEX  \IF : HEX  16 BASE ! ;
NEED C,   \IF : C,   ( n ) HERE 1 ALLOT C! ;

NEED BL \IF 32 CONSTANT BL
NEED ERASE \IF : ERASE ( a n) 0 FILL ;
NEED BLANK \IF : BLANK ( a n) BL FILL ;
NEED .R \IF : .R ( n width) >R DUP 0< R> D.R ;

\ DDJ Forth Column Controlled Words
NEED 2>R
\IF : 2>R COMPILE SWAP COMPILE >R COMPILE >R ; IMMEDIATE
NEED 2R>
\IF : 2R> COMPILE R> COMPILE R> COMPILE SWAP ; IMMEDIATE
NEED @EXECUTE \IF : @EXECUTE @ EXECUTE ;
NEED AGAIN
\IF : AGAIN 0 [COMPILE] LITERAL [COMPILE] UNTIL ; IMMEDIATE
NEED DLITERAL
DUP \IF : DLITERAL SWAP [COMPILE] LITERAL [COMPILE] LITERAL ;
\IF IMMEDIATE
NEED S>D \IF : S>D ( n - d) DUP 0< ;
NEED WITHIN \IF : WITHIN ( n n2 n3 - f) OVER - >R - R> U< ;
NEED TRUE \IF -1 CONSTANT TRUE

\  String  primitives 

:  /STRING  (  a  n  n2  -  a+n2  n-n2)  ROT  OVER  +  ROT  ROT  -  ; 

\  truncates  leftmost  n  chars  of  string,  n  may  be  negative. 

VARIABLE  CTEMP 

:  CTO""  (  c  -  a  1)  CTEMP  C!  CTEMP  1  ; 

\  converts  character  to  string. 


\  SKIP  and  SCAN 
:  SKIP  (  a  1  c  -  a2  12) 

\  returns  shorter  string  from  first  position  unequal  to  byte. 
>R  BEGIN  DUP 

WHILE OVER C@ R@ - IF R> DROP EXIT THEN 1 /STRING
REPEAT  R>  DROP  ; 




:  SCAN  (  a  1  byte  -  a2  12) 

\  returns  shorter  string  from  first  position  equal  to  byte. 

>R  BEGIN  DUP 

WHILE OVER C@ R@ = IF R> DROP EXIT THEN 1 /STRING
REPEAT  R>  DROP  ; 


\  String  compilation 

:  PLACE  (  a  n  a2)  2DUP  !  1+  SWAP  CMOVE  ; 

\  moves  string  (an)  to  be  a  packed  string  at  a2. 

: ASCII ( - c) \ value of following character.
BL WORD 1+ C@ STATE @ \ STATE-smart ASCII
IF [COMPILE] LITERAL THEN ; IMMEDIATE

: ," \ compiles following string as packed string at HERE
ASCII " WORD COUNT DUP >R HERE PLACE R> 1+ ALLOT ALIGN ;


\ String literals
: (") I> COUNT 2DUP + >I ;
: " ( - a n) STATE @ \ string literal.
IF COMPILE (")
ELSE ASCII " WORD COUNT >R PAD I CMOVE PAD R> THEN ;
IMMEDIATE 


\  Number  conversion  operator 
VARIABLE  DPL  \  punctuation  locator. 

:  VAL?  (  a  n  -  d  2  ,  n2  1  ,  0) 

\  string  to  number  conversion  primitive.  True  if  d  is  valid. 
\  Returns  d  if  number  contains  ",-./:"  and  sets  DPL  -  0 

\  Returns  n  if  no  punctuation  present  and  sets  DPL  *  0< 

PAD OVER - SWAP OVER >R CMOVE
BL PAD C! PAD DPL ! 0 0 R> DUP C@ ASCII - = DUP >R - 1-
BEGIN CONVERT DUP C@ DUP ASCII : =
SWAP ASCII , ASCII / 1+ WITHIN OR
WHILE DUP DPL ! REPEAT R> SWAP >R IF DNEGATE THEN
PAD 1- DPL @ - DPL ! R> PAD = ( valid?)
IF DPL @ 0< IF DROP 1 ELSE 2 THEN ELSE 2DROP 0 THEN ;


\  -TEXT  and  COMPARE 
: -TEXT ( a n a2 - -1 , 0 , 1)
\ returns -1 if string a n < a2 n , 0 if equal, and 1 if >.
OVER 0= IF ROT 2DROP EXIT THEN
SWAP 0 DO OVER C@ OVER C@ - ( these chars <> ?)
IF UNDO C@ SWAP C@ > 2* 1+ EXIT THEN 1 1 D+
LOOP 2DROP 0 ;

: COMPARE ( a n a2 n2 - -1 , 0 , 1)
\ returns -1 if a n < a2 n2 , 0 if equal, and 1 if >.
ROT 2DUP ( lengths ) 2>R MIN SWAP -TEXT DUP
IF 2R> 2DROP
ELSE DROP 2R> 2DUP = ( lengths = ?)
IF 2DROP 0 ELSE > 2* 1+ THEN
THEN ;

\ IN
: -MATCH ( a n a2 n2 - ???? -1 , offset 0)
\ returns the position of string a2 n2 in ( a n) .
\ Offset is zero if ( a n ) is found in first char position.
\ Returns true with invalid offset if ( a n) isn't in a2 n2 .
2SWAP 2 PICK DUP ( len1 ) >R OVER SWAP - DUP 0< R> 0= OR
IF 2DROP 2DROP TRUE EXIT THEN
0 TRUE ( index match? ) ROT 1+ 0
DO DROP ( index ) >R
2OVER 2OVER DROP -TEXT 0= ( equal? )
IF R> 0 LEAVE THEN 1 /STRING R> 1+ TRUE
LOOP
2>R 2DROP 2DROP 2R> ;


\  Useful  string  operators 
: VAL ( a n - d f) VAL? DUP 3 < AND
\ converts string to double number. True if number is valid.
DUP IF 1 = IF S>D THEN TRUE EXIT THEN DUP DUP ;

: EVAL ( a n)
\ evaluates ("text interprets") a string.
DUP >R TIB SWAP CMOVE R@ #TIB !
0 >IN ! 0 BLK ! INTERPRET R> >IN ! ;

End Listings




COLUMNS 


C  CHEST 


A Preemptive Multitasking Kernel and More Mean Subroutines

by Allen Holub

The main topic of both this and next month's columns is a small preemptive multitasking kernel for the IBM PC (see Listings One-Seven, beginning on page 110). Most of the kernel is written in C, so it shouldn't be too difficult to port it to another environment, including a ROMed environment. The kernel lets you do multitasking within a program; that is, it lets you run subroutines as independent tasks. It also supports a message-passing system that allows for intertask communication (and time-outs when a message isn't received within a specified time). It doesn't let you multitask programs, however, and it uses DOS as its I/O system. This month I'll show you how it works; next month I'll dissect the code itself.

This month, I'm also passing on a neat routine for computing a running mean, sent in by Kevin Jennings.

What is Multitasking?

Let's start with some definitions, just so everybody is starting from the same base.

An operating system is a collection of programs that lets you execute other programs. Typically, some of these programs are resident (they stay in the computer's memory), others are nonresident (they stay on the disk until they're needed, whereupon they're read into main memory and executed), and some are somewhere in between (they stay in memory until the space is needed for something else, whereupon they are overwritten, but they'll be read back into memory once space is available).

The main parts of the operating system are the kernel (or executive), the I/O system, and the shell. The kernel is in charge of actually reading and executing other programs; the I/O system takes care of all communications with the outside world (the disk, printers, terminals, and so forth); and the shell is the part that talks to you. The shell typically uses both the I/O system and the kernel. This month's column does not present an operating system. In fact, it only looks at a small part of the kernel: the part that takes care of multitasking.

Multitasking is a method for making a computer appear to be doing several things at once. In reality, it's swapping back and forth between various tasks very quickly. Nonetheless, from a human user's perspective, several things seem to be going on at the same time. You can actually do multitasking without a kernel. Consider an interrupt-driven terminal multiplexer, a device that collects lines of text from several terminals and passes them (accompanied by some sort of identifying information) along one communications channel to a single mainframe computer. You might have eight input channels, each of which would trigger a unique interrupt in the processor. There would also be eight interrupt-service routines, each of which would collect characters until an end of line was found and then send that line to the mainframe. Each routine would maintain its own input buffer and attach a unique identifier to each line. The multiplexer would appear to be doing eight things at once, reading from eight input sources simultaneously, but in reality it would be performing eight independent tasks when stimulated by distinct external events (characters arriving). If eight characters were to arrive simultaneously, the multiplexer might fail because of insufficient time to react to all eight inputs before more characters come along.

A different approach to the same problem is to have a single interrupt, usually created by a timer chip of some sort, that starts up a scheduler subroutine. The scheduler activates a particular subroutine, just as if it had been activated by the interrupt itself. When the next timer interrupt comes along, the scheduler suspends the current subroutine and activates a new one, just as if a second, higher-priority interrupt had come along. The scheduler also takes care of some of the overhead involved with the interrupt service. In particular, it does a context swap: it saves all the registers (including the instruction pointer) and preserves the running subroutine's stack. (Each task has its own stack, just as a well-behaved interrupt-service routine should have its own stack.) A task, then, is different from a normal subroutine in that it is activated only by the scheduler, not by a subroutine call in the common sense, and it has its own stack and a place to save registers during a swap.

Note that a task is not the same as a process in normal operating system parlance. A process usually implies a program that is read in from the disk and executed by the operating system. A task, however, is any independently executing code. In the case of the kernel presented here, a task is a subroutine (and the subroutines that it calls) that has its own stack and register set. That is, it is the stack and register-save area that define the task, not the code itself. Several tasks can share the same code, provided that the code doesn't use static variables. Because each task has its own stack, and because the program counter is saved as part of the context swap, the local variables (on the stack) will be distinct and each task will remember where it was in the code when it's interrupted. Much of the context-swapping code assumes that everything is contained in a single, 8086, small-model program, however. You'll have to play with the program a little to make it support other models or external program execution.

Several strategies are used to determine which task gets control when
the timer interrupt comes along. In round-robin scheduling, the various
tasks are activated one at a time until all of them have been executed
once, then the processor goes around the circle again. In a
priority-based system, each task is assigned a unique priority. All
nonrunning tasks are organized into an active list. When an interrupt
comes along, the scheduler compares the priority of the running task
with that of the highest-priority task in the active list and, if
necessary, puts the running task to sleep and activates the top task in
the list. It's possible to have a system that combines both of these
strategies: tasks with the same priority are executed in round-robin
fashion, but a higher-priority task takes precedence over
lower-priority ones.
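The decision the scheduler makes on each timer interrupt can be sketched in a few lines of C. This is only an illustration of the combined strategy described above, not the kernel's own code: the task structure, the sorted-array form of the active list, and pick_next() are all hypothetical names invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task record; the kernel's real TCB is not shown here. */
typedef struct task {
    const char *tag;
    unsigned    priority;      /* larger number = higher priority */
} task;

/* Assumption: the active list is kept sorted, highest priority first,
 * and holds every task that is ready but not running.
 * Returns the task that should own the CPU after a timer tick.
 */
const task *pick_next(const task *running, const task *const *active, size_t n)
{
    assert(n == 0 || active != NULL);

    if (n == 0)
        return running;        /* nobody else is ready to run */
    if (running == NULL)
        return active[0];      /* CPU is idle, take the top task */

    /* ">=" rather than ">" is what gives round-robin behavior among
     * tasks of equal priority: an equal-priority task gets its turn.
     */
    if (active[0]->priority >= running->priority)
        return active[0];

    return running;            /* running task still outranks the list */
}
```

With a strict ">" comparison the same routine would implement a pure priority scheduler in which equal-priority tasks never preempt one another.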

There are also two approaches to multitasking that are useful in
different situations. The method I've just described is preemptive
multitasking: the timer interrupt causes the next task to preempt the
current one. You can also have a nonpreemptive system (such as
Microsoft Windows) in which individual tasks voluntarily give up
control when they don't need the CPU anymore. There are two advantages
to the nonpreemptive approach: you don't need a timer interrupt, and
CPU time is not wasted figuring out that the scheduler doesn't need to
do a context swap. On the down side, a task retains control until it
yields (gives up control). If the running task crashes, the whole
system stops.

Because tasks aren't subroutines in the normal sense of the word, they
can't pass information to each other using arguments and return values.
Intertask communication is done instead with a message-passing system.
The operating system lets you create a set of message queues, or
mailboxes, to which messages can be sent. Other tasks can then wait at
the queue for a message to arrive. Several tasks can wait at the same
queue; they're just given messages as they arrive, and each task gets
one message. If no tasks are waiting, an incoming message is queued up
until a task comes along to fetch it. A task that's waiting for a
message is suspended: it will not be activated by the scheduler until a
message arrives at the queue. It's also possible for a task to
time-out, in which case the task is put back into the active list if
the message hasn't arrived within a specified time. Once the message
arrives, if the receiving task is of higher priority than the running
task, it gets control immediately; otherwise, the task (and the
message) are put back into the active list and will be reactivated in
the normal way. Note that a mailbox is a data structure that contains
two queues: a queue of messages and a queue of waiting tasks.
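The two-queue mailbox can be modeled without any scheduler at all, which makes the bookkeeping easier to see. The following is a minimal sketch under stated assumptions: the fixed capacities, the integer task ids, and the names mailbox, mb_send(), and mb_wait() are hypothetical and are not the kernel's T_QUEUE or its routines.

```c
#include <assert.h>
#include <stddef.h>

#define MAXMSG     4     /* assumed capacity of the message queue */
#define MAXWAIT    4     /* assumed capacity of the waiting-task queue */
#define MB_QUEUED  (-1)  /* message stored, nobody was waiting */
#define MB_FULL    (-2)  /* message queue is full */

typedef struct mailbox {
    void *msgs[MAXMSG];     int nmsgs;      /* pending messages, oldest first */
    int   waiters[MAXWAIT]; int nwaiters;   /* waiting task ids (>= 0), oldest first */
} mailbox;

/* Post a message. If a task is waiting, the message is handed to the
 * longest-waiting task: its id is returned and *delivered is set.
 * Otherwise the message is queued (MB_QUEUED) or refused (MB_FULL).
 */
int mb_send(mailbox *mb, void *msg, void **delivered)
{
    int i, id;

    assert(mb != NULL && delivered != NULL);
    if (mb->nwaiters > 0) {
        id = mb->waiters[0];
        for (i = 1; i < mb->nwaiters; i++)
            mb->waiters[i - 1] = mb->waiters[i];
        mb->nwaiters--;
        *delivered = msg;
        return id;
    }
    if (mb->nmsgs == MAXMSG)
        return MB_FULL;
    mb->msgs[mb->nmsgs++] = msg;
    return MB_QUEUED;
}

/* Ask for a message on behalf of task `id`. Returns the oldest queued
 * message, or NULL after recording the task as a waiter (a real kernel
 * would suspend the task at this point).
 */
void *mb_wait(mailbox *mb, int id)
{
    int   i;
    void *msg;

    assert(mb != NULL && id >= 0);
    if (mb->nmsgs > 0) {
        msg = mb->msgs[0];
        for (i = 1; i < mb->nmsgs; i++)
            mb->msgs[i - 1] = mb->msgs[i];
        mb->nmsgs--;
        return msg;
    }
    assert(mb->nwaiters < MAXWAIT);
    mb->waiters[mb->nwaiters++] = id;
    return NULL;
}
```

At most one of the two queues is ever nonempty: a message is queued only when no task is waiting, and a task waits only when no message is queued.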

A problem with this communications method is deadlock, a situation in
which two tasks are both waiting for messages that can only be sent by
the other task. That is, task A is waiting for a message that can only
be sent by task B, and at the same time, task B is waiting for a
message that can only be sent by task A. A similar situation can arise
when every task in the system is waiting for a message simultaneously.

Another problem is resource blocking. The DOS I/O system is a good
example of why this blocking is necessary. DOS is not reentrant, which
means that a DOS function cannot be called reliably from an
interrupt-service routine. Put another way, once you call DOS, you
can't call DOS a second time until the first DOS call has finished.
Because task activation can happen at any time, even when you're inside
DOS, some method is needed to turn off the scheduler when a task is
doing a DOS call. A task blocks to get control of the CPU: no other
task will be activated as long as scheduling is blocked. Note that the
system will halt if scheduling is blocked and the running task crashes.

An alternate approach to blocking that doesn't have this disadvantage
is an exclusion semaphore. An exclusion semaphore is just a queue of
length 1. A message is waiting at the queue when a resource (such as
DOS) is available. To use the resource, a task gets the message, uses
the resource, and then reposts the message to free up the resource.
This approach has its disadvantages, too. If the task that has the
resource is suspended for some reason, no other task can use the
resource until the task is reactivated.
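The "queue of length 1" idea reduces to a single slot that either holds a token or doesn't. The sketch below shows only that bookkeeping; the names exclusion, ex_init(), ex_acquire(), and ex_release() are hypothetical, and where this sketch returns 0 a real kernel would suspend the caller on the queue instead.

```c
#include <assert.h>
#include <stddef.h>

/* A one-slot queue: the slot holds a token while the resource is free. */
typedef struct exclusion {
    void *token;
} exclusion;

static int the_token;    /* any unique address will do as the "message" */

/* The resource starts out available, so the token starts out posted. */
void ex_init(exclusion *ex)
{
    assert(ex != NULL);
    ex->token = &the_token;
}

/* Try to take the token. Returns 1 if the caller now owns the
 * resource, 0 if some other task already holds it.
 */
int ex_acquire(exclusion *ex)
{
    assert(ex != NULL);
    if (ex->token == NULL)
        return 0;
    ex->token = NULL;
    return 1;
}

/* Repost the token, freeing the resource for the next task. */
void ex_release(exclusion *ex)
{
    assert(ex != NULL);
    assert(ex->token == NULL);   /* length-1 queue: never two tokens */
    ex->token = &the_token;
}
```

A task would bracket each use of the resource (a DOS call, say) with ex_acquire() and ex_release(). The disadvantage noted above is visible here too: nothing reposts the token if the owner is suspended before it reaches ex_release().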

Note that, though you can block by disabling interrupts, that's not
usually a good idea. First of all, on the IBM PC you'll mess up the
system clock, which is interrupt-driven. The other problem is the
console I/O and disk subsystems, which are also interrupt-driven. You
won't be able to send or receive characters when interrupts are off.
You're also likely to mess up your disk transfers. If you do disable
interrupts, do it for the shortest time possible. Also note that tasks
must be treated as interrupt-service routines. In practice this means
that you must always block before you call DOS.

Kernel  Users’  Manual 

This section is just a users' manual for the kernel subroutines. I
haven't gone into any implementation details, all of which I'll discuss
next month.

Several error codes are declared in kernel.h and are returned by the
various multitasking subroutines. The codes are shown in Table 1,
below. TE_NOERR has the value 0, and the other error codes are small
negative numbers. In addition, kernel.h holds typedefs for the TCB and
T_QUEUE. The former is a task control block, used to hold a task's
stack and so forth; the latter is a message queue. I'll discuss all of
these codes in greater depth later.

Various global variables and subroutines are used internally. These are
listed in Table 2, below. You shouldn't use these names in your own
programs and, for the most part, the variables are not useful to an
application program. The possible exception is T_clock, the system
clock. T_clock is incremented on every system clock tick. It is not
incremented while scheduling is blocked, however. Because T_clock is an
unsigned long, rollover is not a problem. If you assume the default of
18.2 ticks/second, the clock will roll over after about 65,552 hours
(about 7.47 years):

((0xffffffff/18.2)/60)/60 = 65,552 hours
65,552/24/365.25 = 7.47798 years

Of course, this number will scale with faster tick rates, but the
resolution should be OK for all reasonable tick rates. The other
variable of possible interest is T_numtasks, which holds the number of
tasks that have been created so far.
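The rollover arithmetic for T_clock is easy to redo for other tick rates. The helpers below are a sketch rather than part of the kernel; rollover_hours() and hours_to_years() are hypothetical names, and the code assumes a 32-bit unsigned long, as on the 8086 compilers discussed here.

```c
#include <assert.h>

/* Hours until a 32-bit tick counter wraps around, given the tick rate.
 * 4294967295 is 0xffffffff, the largest 32-bit unsigned value.
 */
double rollover_hours(double ticks_per_second)
{
    assert(ticks_per_second > 0.0);
    return (4294967295.0 / ticks_per_second) / 60.0 / 60.0;
}

/* Convert hours to years, using 365.25 days per year. */
double hours_to_years(double hours)
{
    return hours / 24.0 / 365.25;
}
```

At the default 18.2 ticks/second this gives roughly 65,552 hours, or about 7.48 years; a speedup factor of 2 (36.4 ticks/second) halves both figures.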

There are a few additional considerations. The Microsoft compiler
inserts a call to a subroutine called _chkstk() at the head of every
subroutine. This can cause problems because a task is running on its
own stack, not on the one that _chkstk() expects it to be using. The
problem is solved by t_start(), which replaces the default _chkstk()
with its own version. The new version checks the current stack pointer
against the base of the task's local stack. If a stack overflow
happens, multitasking is terminated and t_start() will return an error
code (see later).

TE_NOERR      No error
TE_TOOMANY    Maximum number of tasks (32) already exists
TE_NOMEM      Insufficient memory available
TE_BADARG     Illegal argument
TE_TIMEOUT    Time-out
TE_QFULL      Queue is full
TE_NOTASKS    No tasks to send message
TE_INTERNAL   Internal error
TE_DEADLOCK   Delete would have caused a deadlock
TE_STACK      Stack overflow
TE_KILL       Ctrl-Break encountered

Table 1: Error codes declared in kernel.h


TCB            *T_active;
unsigned long  T_clock;
int            T_numtasks;
T_QUEUE        *T_queues;
PQ             *T_tasks;
void           _t_reschedule( );
void           _t_install( );
void           _t_swap_in( );
void           _t_shazam( );
void           _t_speedup( );
void           _t_slowdown( );

Table 2: Global variables and subroutines used by kernel.h


Starting and Stopping Multitasking

A typical program will first create a few tasks (I'll look at how in a
moment) and then start up the multitasking environment. (Tasks can
create other tasks, too. They don't all have to exist before
multitasking is started.) Multitasking is started with a call to:

int t_start( speedup )
int speedup;

Speedup is a system-clock speedup factor. If it's 0, then the system
will not be preemptive and tasks will have to yield (either with a call
to t_yield() or by waiting for a message) to give up control of the
CPU. A speedup factor of 1 uses the default PC clock rate (roughly 18.2
interrupts/second). A factor of 2 is twice as fast (36.4
interrupts/second). If the factor is too large, the system will
actually slow down because it will start missing interrupts. It's best
if the speedup factor is a power of 2.

At least one task must have been created prior to the t_start() call.
Control passes immediately to the highest-priority task. T_start()
returns either when all tasks are deleted or when t_stop() (discussed
later) is called. TE_NOTASKS is returned if no tasks exist initially;
multitasking is not started in this situation. TE_NOERR is returned
when all tasks have been deleted successfully. No tasks can be waiting
on queues. This is the normal way to return. TE_STACK is returned as
soon as a stack overflow in any task is detected. T_active will point
at the TCB of the offending task.

TE_DEADLOCK is returned when the only active task in the system deletes
itself. Other tasks exist but they're all pending on queues. That is,
there must always be at least one running task in the system. To
accomplish this, a very-low-priority idle task is often created. This
task doesn't do anything but spin around; it's only active when all
other tasks are doing something else, and it deletes itself when it's
the only task left in the system. A typical idle task is shown in
Example 1.

Example 1: A typical idle task

If TE_NOERR is returned, then all memory allocated to the tasks will
have been restored to the heap; otherwise, if one of the above errors
was returned, T_active will point at the TCB of the offending task.
Other return values are possible if a task calls t_stop() directly (see
later).

A panic abort from the multitasking environment can be accomplished
with a call to:

t_stop( errcode )
int errcode;

Multitasking is turned off, and control passes back to the routine that
called t_start() (immediately following the t_start() call). That is,
t_stop() forces t_start() to return; it does not itself return. Errcode
is passed back to the calling routine as the return value of t_start().
The process is analogous to a Unix exit() call, which doesn't return
and whose argument is passed back to a wait() call in the parent
process.

Creating  and  Deleting  Tasks 

Tasks  are  created  with  a  call  to: 

TCB *t_create( subr, tag, priority, stack_size, ..., NULL )
int      (*subr)( );
char     *tag;
unsigned priority;
int      stack_size;

Subr is a pointer to the subroutine that forms the main module for the
task. Tag is a string used to identify the TCB; it's used only for
debugging. Priority is the task's priority: the higher the number, the
higher the priority. Priorities should be in the range 0-255. If more
than one task has the same priority, the tasks are executed in a
round-robin fashion. Stack_size is the stack size (in 16-bit words) for
this task only. Note that a few Microsoft functions (such as printf())
use up inordinate amounts of stack. If you're going to call Microsoft
library routines, you'll need at least 1K stacks (stack_size of 512).

The remaining arguments are a NULL-terminated list of pointer-size
arguments that will be passed to the task at start-up. (More on this in
a moment.)

T_create() forces a reschedule. That is, if the task that you're
creating is of higher priority than the running task, the new task will
get control immediately.

A pointer to the created TCB is returned normally. This pointer is
useful if you want to delete the task later. Error return values are
TE_TOOMANY (maximum number of tasks [32] already exists) and TE_NOMEM
(insufficient memory available to create task).

An example of task creation is shown in Example 2. Here, a single task
called foo() is created in main(). It is of priority 10 and has a
512-word (1K) stack. The remaining arguments are passed to foo() when
the task starts up. Foo() will print its arguments and delete itself.
(It prints "hello world.")

Note that a task should never return in the normal way. It should
always delete itself rather than returning. If a return is executed
(either explicitly or implicitly by falling off the bottom of the
subroutine), then t_stop() is called immediately (with a garbage
argument).

/* Example 1: a typical idle task */
idle()
{
    int t, i;

    do {
        for( i = 1000; --i >= 0 ; )     /* spin for a while */
            ;
        t_cli();                        /* Clear interrupts */
        t = T_numtasks;                 /* t = # of tasks   */
        t_sti();                        /* restore ints.    */
    }
    while( t > 1 );

    t_delete( NULL );                   /* delete self      */
}

A task's priority can be changed at any time with a call to:

int t_chg_priority( tp, new_priority )
TCB *tp;
int new_priority;

Like t_create(), this routine forces a reschedule. If the task was
waiting on a message, it is immediately timed-out and put back onto the
active list. A task may change its own priority. This routine returns
TE_NOERR normally and TE_BADARG if the task doesn't exist or if the
priority of the task is greater than 255.

Tasks are deleted with a call to:

int t_delete( task )
TCB *task;

which also frees all memory associated with task. Note that memory that
the task itself allocates (via a malloc() call or equivalent) is not
freed; only the memory that t_create() allocated (for the stack and so
forth) is released. A task may delete itself with a t_delete(NULL)
call. T_delete() forces a reschedule when the current task is deleted.
Return values are TE_NOERR on success and TE_BADARG if the task doesn't
exist.

Messages 

Three routines are used to create message queues (mailboxes) and for
passing messages between tasks. Queues are created with a call to:

T_QUEUE *t_makequeue( size )
int size;

Size is the maximum number of messages that can be waiting in the
queue. Any number of tasks can wait at a queue, however. Normally a
pointer to the queue is returned, but TE_NOMEM is returned if there's
insufficient memory. The queues are linked into a linear list that is
searched for timed-out tasks on every system clock tick, so it's best
to create the most active queues first.

Messages are sent from a task to a queue with a call to:

int t_send( q, msg )
T_QUEUE *q;
void    *msg;

where q is a pointer returned from a previous t_makequeue() call and
msg is a pointer to the message. The pointer, not the message, is
stored, so the messages can be anything you want. Typically, msg will
be a pointer to a structure or a string. The message itself must be in
static memory.

If no tasks are waiting at the queue, t_send() returns immediately;
otherwise, the message is attached to the task and the scheduler is
called. This means that if the waiting task is of higher priority,
control will be taken away from the task that sent the message and
given to the waiting task. TE_NOERR is returned normally. Error returns
are TE_BADARG on a bad q argument and TE_QFULL if the queue is full.

The other side of the message-passing system is:

void *t_wait( q, timeout )
T_QUEUE *q;
int     timeout;

which is used by a task to wait at a queue for a message to arrive. Q
is a pointer returned from a previous t_makequeue() call and timeout is
a time-out value. The task will only wait for timeout system clock
ticks before it is put back onto the active list. (Remember, though,
the clock won't tick when the scheduler is blocked.) The maximum
time-out is 32,767 clock ticks (that's about half an hour at 18.2
ticks/second). If the time-out is 0, t_wait() returns immediately
(without a reschedule) if no messages are waiting in the queue.

foo( a, b )
char *a, *b;
{
    t_printf("%s %s\n", a, b );
    t_delete( NULL );
}

main()
{
    t_create( foo, "foo", 10, 512, "hello", "world", NULL );
    t_start( 2 );
}

Example 2: An example of task creation

Message requests are queued up in order received, without regard to
priority. I've done this both because it's easy and because, in most
applications, tasks with different priorities will not be pending on
the same queue.

If a message is present in the queue, t_wait() immediately returns the
pointer to the message (the same pointer as was passed into t_send())
without a reschedule; otherwise, the current task is suspended (removed
from the active list) until a message arrives.

Normally, a message pointer is returned. Error return values are
TE_TIMEOUT (on a time-out or if the timeout argument is 0 and no
message is waiting) and TE_NOTASKS (if the current task is the only
running task, which is a guaranteed deadlock).

Blocking  and  Yielding 

Four subroutines are provided to support blocking: void t_cli();, void
t_sti();, void t_block();, and void t_release();.

T_cli() and t_sti() disable and enable interrupts. Use these advisedly.
Interrupts should never be off for extended periods. They should never
be off when you're using the DOS I/O functions.

T_block() and t_release() are more reliable. T_block() disables the
scheduler but not the normal clock interrupt. That is, the current task
retains control of the system once t_block() is called. Because
interrupts are not disabled, this is a much safer routine to use than
t_cli(). There is one caveat, however. If a normal interrupt happens
while interrupts are disabled, the hardware will execute the interrupts
as soon as they're enabled again. This is not the case with t_block().
That is, the scheduler does not know whether an interrupt happened
while it was blocked. Consequently, a tight loop such as this:

while( *str )
{
    t_block();
    putchar( *str++ );
    t_restore();
}

won't work as expected. Because so little time elapses between the
t_restore() and the next t_block() call, the odds are that an interrupt
will never occur while the scheduler is active. That is, the following
will probably work in just the same way as the previous example:

t_block();
while( *str )
    putchar( *str++ );
t_restore();

Note that a t_cli() can disrupt the DOS system clock (time will be
lost) while interrupts are off; t_block() doesn't have this limitation.

The final control-related subroutine is void t_yield();. This routine
puts the current task to sleep and activates the task that has the
next-highest priority. Note that the current task is always suspended,
even if it's the highest-priority task. It will get back control on the
next timer interrupt in this case, however.

T_yield() is used primarily when the scheduler is blocked or when
you're running a nonpreemptive system. It's also useful when a
high-priority task doesn't do much and doesn't want to hog the CPU.
T_yield() returns TE_NOERR when the yield was successful and TE_NOTASKS
if there were no tasks to which to give control.

#include <stdio.h>
#include <tools/video.h>
#include "kernel.h"

#define TEST 3      /* 1 = nonpreemptive test,
                     * 2 = preemptive test: simple timer
                     * 3 = test round-robin scheduling
                     */

T_QUEUE *Queue1;
T_QUEUE *Queue2;

main()
{
    int  status = 0;
    long t_numint(), t_numblk();
    int  sam(), dave(), timer(), idle(), maintask();

    if( !(Queue1 = (T_QUEUE *) t_makequeue( 2 )) )
        printf("Can't make Queue1 queue\n"), exit(1);

    if( !(Queue2 = (T_QUEUE *) t_makequeue( 2 )) )
        printf("Can't make Queue2 queue\n"), exit(1);

#if (TEST == 1)
    status = (int) t_create(sam,  "sam",  100, 512, "SAM",  NULL);
    status = (int) t_create(dave, "dave",  50, 512, "DAVE", NULL);
    status = t_start( 0 );
#endif

#if (TEST == 2)
    status = (int) t_create(timer, "timer", 10, 512, "timer", NULL);
    status = (int) t_create(idle,  "idle",   1, 100, NULL);
    status = t_start( 2 );
#endif

#if (TEST == 3)
    status = (int) t_create(maintask, "maintask", 200, 512, NULL);
    status = t_start( 2 );
#endif

    t_perror( "\ndone: ", status );
    t_sstats();
    printf("%ld interrupts, %ld blocked\n", t_numint(), t_numblk() );
}

Example 3: A program that creates queues, starts multitasking, and
performs three tests

Note that t_yield() normally doesn't return until after it regains
control of the system. That is, the normal control flow goes like this:

1. A task calls t_yield().
2. The second task is activated.
3. The second task is suspended.
4. Control goes back to the original task and t_yield() returns
   TE_NOERR.

Statistics 

Several debugging and statistics routines are provided. Two routines
are useful for debugging:

t_tprint( t )
TCB *t;

t_qprint( q )
T_QUEUE *q;

T_tprint() is passed a TCB pointer, and it prints the TCB in
human-readable form to standard output; t_qprint() does the same for
the queues.

Several statistics routines are also useful: long t_numint(); returns
the number of unblocked interrupts, and long t_numblk(); returns the
number of blocked interrupts. Finally, t_sstats() prints various
scheduler-related statistics (the number of times the scheduler was
called, the number of time-outs, and the number of context swaps done
by the scheduler). This routine should not be called from a task
(because it uses printf()).

Miscellany 

int t_second( )

Returns the number of system clock ticks in a second, given the speedup
factor passed to t_start(). It returns 0 if the speedup factor was 0.
In this case, a t_wait() call will never time-out.
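When a time-out is easier to think about in seconds than in ticks, the conversion can be wrapped up as below. This is a sketch built only from the figures quoted in this column (18.2 ticks/second at a speedup factor of 1, and a maximum time-out of 32,767 ticks); the function names are hypothetical, and the round-down behavior is an assumption, since the way the real t_second() rounds is not specified here.

```c
#include <assert.h>

#define BASE_RATE_X10  182      /* 18.2 ticks/second, scaled by ten */
#define MAX_TIMEOUT    32767    /* largest time-out t_wait() accepts */

/* Whole ticks per second for a speedup factor (rounded down).
 * A factor of 0 means no timer at all, hence 0 ticks.
 */
int ticks_per_second(int speedup)
{
    assert(speedup >= 0);
    return (int)((long)BASE_RATE_X10 * speedup / 10);
}

/* Convert seconds to clock ticks, clamped to the maximum time-out. */
int seconds_to_timeout(int seconds, int speedup)
{
    long ticks;

    assert(seconds >= 0 && speedup >= 0);
    ticks = (long)BASE_RATE_X10 * speedup * seconds / 10;
    return ticks > MAX_TIMEOUT ? MAX_TIMEOUT : (int)ticks;
}
```

A ten-second wait at the default rate comes out as 182 ticks, while an hour exceeds the limit and is clamped to 32,767 ticks (about half an hour).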

char *t_errlist[ ]

Works like the normal errlist[ ] that's supported by most C compilers.
Indexed by error code, it evaluates to a string that holds an
appropriate error message.

t_iserr(x)

Returns true if x is an error code and false otherwise.

t_perror( str, errcode )
char *str;

Prints an error message. If errcode is an error code, it prints an
appropriate message; otherwise, it prints status XXX, where XXX is
errcode represented as a decimal number.

t_printf( fmt, ... )
char *fmt;

Works like printf() except that it blocks before printing anything.

sam( arg )
char *arg;
{
    int err;

    t_printf( "In sam(%s), yielding: no messages\n", arg );
    t_yield();

    t_printf( "sam: back from yield, sending messages\n" );

    if( err = t_send( Queue1, "1st message") )
        t_perror( "Foo: sending 1st message", err );

    if( err = t_send( Queue1, "2nd message") )
        t_perror( "Foo: sending 2nd message", err );

    if( err = t_send( Queue1, "3rd message") )
        t_perror( "Foo: sending 3rd message", err );

    t_printf("Foo: yielding again\n");
    t_yield();

    t_printf("Foo: Returned from 2nd yield, deleting self\n");
    t_delete( NULL );
}

Example 4: The task sam()


dave( arg )
char *arg;
{
    char *s;

    t_printf( "In dave(%s), about to wait for message\n", arg );

    if( t_iserr(s = t_wait(Queue1, 100)) )
        t_perror("dave: first wait call", (int) s );
    else
        t_printf( "dave: got <%s> from Queue1\n", s );

    if( t_iserr(s = t_wait(Queue1, 100)) )
        t_perror("dave: first wait call", (int) s );
    else
        t_printf( "dave: got %s from Queue1\n", s );

    t_printf("dave: yielding\n");
    t_yield();

    t_printf("dave: deleting self\n");
    t_delete( NULL );
}

Example 5: The task dave()


In sam(SAM), yielding: no messages
In dave(DAVE), about to wait for message
sam: back from yield, sending messages
dave: got <1st message> from Queue1
dave: got 2nd message from Queue1
dave: yielding
Foo: yielding again
dave: deleting self

Foo: Returned from 2nd yield, deleting self
done: No error

Scheduler called 0 times: 0 tasks timed-out, 0 context swaps
0 interrupts, 0 blocked

Example 6: Output from test 1 of Example 3


T_PRIORITY(a,b)
TCB *a, *b;

This macro, which is in kernel.h, compares the priorities of the two
tasks. It returns a negative number if a's priority is less than b's, 0
if the priorities are the same, and a positive value if task a is of
higher priority than task b. Note that a task's priority takes into
consideration both the priority passed to t_create() and the time at
which the task was last suspended. That is, if two explicit priorities
are the same, then the task that was suspended most recently is
considered to be of lower priority than the earlier task. This
mechanism makes round-robin scheduling easy to implement because tasks
will move to the bottom of the list as they're suspended.
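The tie-breaking rule behind T_PRIORITY can be written as an ordinary comparison function. What follows is a sketch of the rule as described, not the macro from kernel.h; the task structure, its suspended_at field, and task_compare() are hypothetical names, and the real TCB may record suspension order differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields of a TCB that matter here. */
typedef struct task {
    unsigned      priority;       /* explicit priority, 0-255 */
    unsigned long suspended_at;   /* clock tick of the last suspension */
} task;

/* Returns <0 if a ranks below b, 0 if they rank equally, and >0 if a
 * ranks above b. Explicit priority decides first; on a tie, the task
 * suspended more recently ranks lower.
 */
int task_compare(const task *a, const task *b)
{
    assert(a != NULL && b != NULL);

    if (a->priority != b->priority)
        return a->priority < b->priority ? -1 : 1;

    if (a->suspended_at != b->suspended_at)
        return a->suspended_at > b->suspended_at ? -1 : 1;

    return 0;
}
```

Sorting the active list with this comparison puts a just-suspended task behind its equal-priority peers, which is all that round-robin scheduling requires.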

Some  Examples 

Let's look at a few examples. First, main() must create a couple of
queues and start up multitasking. I'll use a common main() module for
all the examples, which is shown in Example 3. One of three tests is
performed, depending on the value of the TEST macro. Two queues, both
of length 2, are created at the top of the subroutine. Then a few tasks
are created and multitasking is started up with a t_start() call.
Finally, after control returns from t_start(), various statistics are
printed.

Now let's look at the individual tests. Test 1 exercises the
nonpreemptive mode and demonstrates message passing. The two tasks,
sam() and dave(), are shown in Examples 4 and 5. They both print their
start-up arguments (from the original t_create() call) and then pass
several messages back and forth. Note that, though I'm passing strings
as the messages, you could pass pointers to anything. Both the t_wait()
calls and the t_yield() calls cause control to be passed between tasks.
Also, sam() intentionally tries to enqueue too many messages to see if
the error mechanism is working properly. The output from the program is
shown in Example 6. Note that all the statistics at the bottom of the
output are 0 because the system isn't preemptive.

The next test is a small timer. It uses the idle() task that I looked
at earlier. The timer() task is shown in Example 7. It prints a
start-up message and then enters a for loop that executes five times.
The t_wait() call times-out after one second because nobody's sending
any message to the queue. Consequently, the timer will print five
exclamation points, one every second, and then delete itself.

The third test is shown in Example 8. This test demonstrates several
things. First of all, main() creates only one task, maintask(), which
in turn creates two more tasks. Note, however, that both of these tasks
use the same code! This is possible because every task has its own
stack. Several tasks can share the same code, provided that they don't
use static variables (the local variables will be on physically
distinct stacks). Here, the task identifies itself by looking at its
argument and it prints that argument every so often. The dv_putchar()
subroutine is a direct-video output function; any such function will
do. I generally use direct video in multitasking applications because
going through DOS is so unpredictable (it messes with the interrupt
system). Because the two tasks are of the same priority, they'll be
executed in a round-robin fashion: five seconds of alternating 1s and
2s will be printed on the screen.

So that's how to use the subroutines. I'll look in depth at how they
work next month.

Meaner  Than  Ever 

Kevin  Jennings  writes: 

“I read with some interest your column in the May 1987 issue of DDJ
regarding the calculation of sample means. There is indeed a simple
algorithm for computing the true mean of a set of data points on the
fly. I've included both a block diagram [Figure 1, page 142] as well as
a C function that implements it [Example 9, page 142; I've taken the
liberty of making Kevin's code a little more efficient. Allen]. Call
mean() with reset true the first time; thereafter, call it with reset
false and data holding the current sample. The subroutine returns the
running mean.

“The data comes in, one sample at a time, and is represented in the
block diagram as Z(t). The algorithm computes a gain to apply to the
difference between this measurement and what the filter thinks the
measurement should be. The difference, scaled by this gain [K(t)], is
then added to the previous estimate to give a new estimate. If
equation 2 is rearranged as:

    xhat(t) = (1 - K(t)) xhat(t-1) + K(t) Z(t)        (3)

and you look at what K(t) does as the samples come in, you should be
able to convince yourself that this algorithm does indeed return the
sample mean of all data received from the time that the filter is reset
up to the current measurement. I don't recommend that you actually
implement equation 3, however, because it's inherently less accurate


timer()
{
    int i;

    t_printf( "Starting up timer\n" );

    for( i = 5; --i >= 0 ; )
    {
        t_printf( "!" );
        t_wait( Queue2, t_second() );
    }

    t_printf( "Deleting timer\n" );
    t_delete( NULL );
}

#endif


Example 7: The timer() task


timer( arg )
{
    /*  I'm assuming that pointers & ints are
     *  the same size here.
     */

    int i;

    while( 1 )
    {
        for( i = 10000; --i >= 0 ; )
            ;                       /* delay between characters */

        t_cli();
        dv_putchar( arg + '0' );
        t_sti();
    }
}

maintask()
{
    TCB *t1, *t2;

    t1 = t_create( timer, "1st timer", 100, 512, (void*)1, NULL );
    t2 = t_create( timer, "2nd timer", 100, 512, (void*)2, NULL );

    t_perror( "task1:", (int)t1 );
    t_perror( "task2:", (int)t2 );

    t_wait( Queue2, 5 * t_second() );

    t_delete( t1 );
    t_delete( t2 );
    t_delete( NULL );
}


Example 8: Test 3, used in Example 3



than is equation 2. Statisticians will recognize this algorithm as a
least-squares estimator. It's also called a Kalman filter in digital
signal processing.

“Note that it's more convenient to update the reciprocal of the gain
[1/K(t)] than to update K(t) directly. Because the reciprocal takes on
only positive integer values, it's tempting to declare ki as unsigned
int, but then ki would wrap to 0 when it reached the maximum value for
unsigned int (65,535, given a 16-bit int). You'd have to correct for
the wraparound by holding ki at its maximum value once it's reached.
Thereafter, the algorithm would no longer give you the exact mean, but
no other algorithm would either. You're just trying to process too much
data. I snuck around this difficulty by declaring ki as double.
(Doubles typically don't wrap when they overflow.) In any event,
because you have to do a floating-point divide to calculate the mean,
there's no advantage in storing ki in an unsigned int because the
compiler would have to convert it to double to do the arithmetic.”

Kevin doesn't point out that the algorithm as implemented requires
several floating-point operations, so it's slower than the exponential
smoothing algorithm I presented in May, which used nothing but shifts
and addition. On the other hand, the Kalman filter has an appealing
simplicity and is more appropriate


1/K(t)  = 1 + 1/K(t-1)                         : 1/K(-1)  = 0    (1)

xhat(t) = xhat(t-1) + K(t) (Z(t) - xhat(t-1))  : xhat(-1) = 0    (2)


Figure 1: Block diagram of an algorithm that computes the true mean of
a set of points on the fly


double mean( reset, data )
int    reset;
double data;
{
    static double xhat, ki;

    return reset ? (ki = xhat = 0)
                 : (xhat += (data - xhat) / ++ki)
                 ;
}


Example 9: A C function that implements the algorithm shown in Figure 1


than exponential smoothing in many applications. It's certainly more
accurate for small numbers of samples. Don't be tempted, by the way, to
change the doubles into floats. Most C compilers convert all floats to
doubles whenever they're used in an expression. Consequently, it takes
longer to multiply two floats than it does to multiply two doubles
because you have to convert them to doubles, do the arithmetic, and
then truncate back down to floats to store the result. Moreover, any
space savings that you get by using floats are usually lost in the
additional code needed to do the type conversion.
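
To see equations 1 and 2 at work without the static variables, here is a minimal sketch that keeps the filter state in a structure, so several running means can be tracked at once. The type and function names (running_mean, running_mean_reset, running_mean_add) are invented for this illustration and are not part of Kevin's code or of Example 9.

```c
#include <assert.h>
#include <stddef.h>

/* Filter state: xhat is the current estimate of the mean, and ki holds
 * 1/K(t), which is simply the number of samples seen since the reset. */
typedef struct {
    double xhat;
    double ki;
} running_mean;

/* Plays the role of calling mean() with reset true. */
void running_mean_reset(running_mean *m)
{
    assert(m != NULL);
    m->xhat = 0.0;
    m->ki   = 0.0;
}

/* Equation 1: 1/K(t) = 1 + 1/K(t-1).
 * Equation 2: xhat(t) = xhat(t-1) + K(t) * (Z(t) - xhat(t-1)).
 * Returns the mean of every sample added since the last reset. */
double running_mean_add(running_mean *m, double z)
{
    assert(m != NULL);
    m->ki   += 1.0;
    m->xhat += (z - m->xhat) / m->ki;
    return m->xhat;
}
```

Feeding in the samples 2, 4, and 9 after a reset gives running means of 2, 3, and 5, which are the ordinary averages of the first one, two, and three samples.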

Availability 

All the source code for the multitasking kernel described both this
month and next (including the priority-queue stuff) is available for
$30 on an IBM-PC 5 1/4" disk from Software Engineering Consultants,
P.O. Box 5679, Berkeley, CA 94705. Include local sales tax if you're
ordering from California. In addition to the kernel code and the
priority queue routines, the disk includes an enhanced version of the
curses window I/O package described in the July 1987 C Chest. Because
the enhanced curses uses direct video reads and writes rather than
going through DOS, it's useful in multitasking applications that can't
use the DOS I/O functions. This version of curses supports overlapping
windows (though you can only write to the top one) and lets you delete
and move windows. In addition, it lets you create boxed windows (Unix's
curses doesn't).

DDJ 

(Listings  begin  on  page  110.) 



THE  FORTH  COLUMN 

New Forth Sources, a Bibliography,
and String Extensions for Forth-83

by Martin Tracy

In the news this month is FORCE, Harris Semiconductor's version of the
Novix NC4016 Forth processor. FORCE stands for Forth-optimized RISC
computing engine. Chuck Moore's original hardware design has been
copied into the Harris cell library, where it can be rapidly moved to
various semiconductor technologies. Harris has fixed the NC4016 bugs
and has added some extra features, such as byte addressability. FORCE
is intended to be the smart heart of custom VLSI chips for knotty
problems with high-speed solutions.

According to Dave Williams of Harris, a FORCE-based, real-time control
processor (RTCP) will be available in the first quarter of 1988. The
RTCP chip will contain an interrupt processor, 256-word data and return
stacks, and a 16 X 16 hardware multiply. (Rumor has it that it will be
offered on an IBM PC card shortly thereafter.) Dave says the chip will
run at 15 MHz, or more than 15 million instructions per second. FORTH
Inc. intends to support the chip with a variation of the same polyFORTH
it developed for the Novix NC4016. This Forth includes an optimizing
compiler that can pack sequential instructions together so that they
run simultaneously. It also includes extensive fixed, fractional, and
floating-point math support and a flexible nonlinear curve fitter (the
mathematical equivalent of a monkey wrench).

Speaking of Forth chips, Dr. C. H. Ting's More on NC4000, Volume 5, is
now available. This newsletter contains 80 pages of technical
nitty-gritty on the Novix NC4016 (originally numbered the NC4000).
Volume 5 contains a reprint of "The FORCE Toolbox" by Dave Williams of
Harris Semiconductor. You can order More on NC4000 for $15 from Offete
Enterprises Inc. at (415) 574-8250. Back issues are still available.

Also available from Dr. Ting at Offete is the F83 Reference Manual
(1987). This is strictly a reference manual and is meant to accompany
the popular public-domain Laxen/Perry F83 Forth. Words are arranged by
topic, and each word has a one-line description. A total of 400 words
are described in 40 pages, which include an index and a complete
catalog of Offete's other publications, all this for only $10.

What’s  surprising  is  not  that  400 
words  are  included  but  that  more 
than  600  words  are  left  out!  (The 
Forth-83  Standard  only  requires 
about  200  words.)  To  quote  Dr.  Ting: 
“The  only  problem  with  F83,  like 
any  good  looking  and  hard  working 
wife,  is  that  it  is  too  wordy.” 

GENie Forth Forum

The GENie (General Electric Network for Information Exchange) Forth
Forum is now in operation. This new Forth-oriented bulletin board is
sponsored by the Forth Interest Group (FIG). GENie is reportedly the
largest electronic information exchange network, with local-access
phone numbers from most major cities.

Alan Furman writes: “Use 1,200 bps, even parity, 7 bits, 1 stop bit,
half-duplex (echo on). Dial up the sign-on modem line: (800) 638-8369.
It helps to record the session on disk as there is a lot of scrolling
off. After CONNECT, type HHH (without CR) or just wait 5-10 seconds.
The U#= prompt will come on. At the prompt, type XJM11849,GENIE (and a
CR). When the menu comes on, get the local node and billing information
before signing up.”

Signing up with this number results in the FIG member's discount deal:
first three hours free (but regular $18 sign-up fee). The easiest way
to sign up is by credit card number. GENie will make a confirming phone
call a few days later. When they do, have a 3- to 12-character moniker
ready (no embedded spaces). You can change your name later, but it will
cost you $10. The $18 initial fee includes a nicely packaged users'
manual that will arrive in about a week.

Once you are on GENie, type FORTH at the command line to go directly to
the Forth conference. The sysops are Dennis Ruffer (D.RUFFER), Scott
Squires (S.W.SQUIRES), and Gary Smith (GARY-S). Use the ATT command to
see who else has “attended.” The categories are:

1.  FIG  Bulletin  Board 

2.  FIG  Real  Time  Conference 

3.  FIG  Software  Library 

4.  About  the  Roundtable 

5.  Roundtable  News 

You  can  download  a  copy  of 
Laxen/Perry  F83  Forth  or  one  of  the 
many  files  contributed  by  Gerald 
Shifrin  of  the  East  Coast  Forth 
Board.  Some  categories  are  open  to 
FIG  members  only. 

GENie is definitely a non-prime-time activity at $5/hour; prime time
costs $35/hour. User assistance is available until 9:00 pm Pacific Time
at (800) 638-9636.

Forth  Bibliography 

The third edition of A Bibliography of Forth References (1987) is now
available from the Institute for Applied Forth Research, P.O. Box
27686, Rochester, NY 14627. This 2,000-entry bibliography references
articles and books from Forth's dim origins up to January 1987.
Articles are indexed both by subject and author.

Because  many  Forth  programmers 
love  to  write,  the  author  index  reads 
like  a  Who’s  Who  of  Forth.  I  use  the 
subject  index  mostly,  though.  It’s  a 
great  help  in  convincing  software 
managers  or  potential  customers 
that  Forth  has  already  been  used 
successfully  in  their  line  of  work. 

Here  is  an  example  of  using  the 
index: 

Q: What's a QUAN?

A: (433)  The Quan Concept Expanded
   (1120) Code Field Vectoring
   (1538) High Speed, Low Memory Consumption Structures

Q: Has Forth been used for radar antenna programming?

A: (359)  The Development of a Computerized Antenna Range Field
   (879)  Microprocessor Control of Automated Antenna Ranges
   (1010) Ground Control Approach Radar Performance Monitoring

The bibliography was edited by Thea Martin and was generated entirely
with Forth programs provided by Dick and Jill Miller of Miller
Microcomputer Services (MMS). Thea writes:

“We used MMS' DATAHANDLER-PLUS and FORTHWRITE to produce this document.
DATAHANDLER-PLUS is a powerful, flexible database written in Forth that
interfaces to the MMS word processor, FORTHWRITE. All three editions of
the Bibliography have been produced with these systems.”

You  can  order  a  copy  of  the  bibli¬ 
ography  for  $25  directly  from  the 
Institute,  or  you  can  get  it  from  FIG 
([408]  277-0668). 

ANS Forth Meeting

The first meeting of ANSI X3J14 (ANS Forth) was held on August 3 and 4
at CBEMA headquarters in Washington, D.C. In attendance were:

Greg Bailey, Athena Programming (Novix Forth)
Gary Betts, Saba Technology
Ronald D. Braithwaite, The Tools Group
Richard Burton, National Bureau of Standards
Don Colburn, Creative Solutions Inc. (MacFORTH)
Chris Colburn, Creative Solutions Inc. (MacFORTH)
Ted Dickens, The Dickens Co.
John Dorband, NASA GSFC
Ray Duncan, Laboratory Microsystems Inc. (PC/FORTH, UR/FORTH)
Douglas Fishman, National Bureau of Standards
Lawrence P. Forsley, Laboratory for Laser Energetics (Rochester
   Conference, JFAR)
Charlie Keane, PPI
Guy M. Kelly (IEEE representative, chairman of the FST)
Charles H. Moore, Computer Cowboys (inventor of Forth)
Mike Nemeth, Computer Sciences Corp. (MD FIG, Goddard FUG)
David C. Petty, Digitel
James Rash, NASA GSFC
Elizabeth D. Rather, FORTH Inc. (polyFORTH)
Gerald A. Shifrin, MCI Telecommunications (ECFB sysop)
Bill Ragsdale, Dorado Systems (founder of FIG, figForth)
Robert Smith, Maxtor (FST secretary)
Martin Tracy, FORTH Inc. (MasterForth, DDJ)

The editorial comments are claims to fame and not necessarily current
agenda. Ms. Rather served as the acting chair and Mr. Duncan as the
acting secretary. Ms. Cathie Kachurik offered welcome assistance to the
X3J14 Technical Committee (TC) on behalf of CBEMA.

The  complete  unofficial  minutes 
of  this  meeting  are  available  on  the 
MCI  ANS  Forth  bulletin  board,  with 
a  copy  on  the  East  Coast  Forth 
Board  ([703]  442-8695).  (Minutes 
become  official  when  approved  at 
the  next  meeting.)  Unofficially,  here 
are  some  of  the  highlights: 

•  Ms. Rather observed that a Forth standard should identify and
document accepted practice and not be an instrument to advance the
state of the art. She hoped that an ANS Forth would be a clear and
complete statement of what Forth is and does, that it would require
minimal changes to existing systems, that it would be universally
accepted, and that it would lead to the acceptance of the Forth
language as a professional instrument.

•  The Forth-83 Standard, Chapters 1-12, without the appendices, was
approved as the Basis Document. (The Basis Document is successively
modified until it becomes the Draft Proposal.) All members of the TC
will receive a Basis Document with all paragraphs numbered for their
comments and review.

•  A proposal to add floating-point math to the Scope of Work was
defeated.

•  Volunteers were solicited for officer positions. The ANSI SMC
selects officers from this pool of volunteers. Ad hoc committees were
created to research existing practice and identify major areas of
noncompliance to Forth-83. (The Research Committee has since mailed a
questionnaire to all identified producers of Forth.)

The next ANS Forth meeting is scheduled for November 11 and 12, just
prior to the Forth Convention. By the time you read this, both will be
over. To be a voting member on the ANS Forth TC, you must pay CBEMA a
$200 fee and be prepared to attend four three-day meetings each year.
You can follow the progress of the dpANS (draft proposal, pronounced
“de-pants”) on the MCI ANS Forth bulletin board, GENie, or through this
column. For now, any technical proposals should be sent to the acting
secretary (me) at FORTH Inc., 111 N. Sepulveda Blvd., Manhattan Beach,
CA 90266.

Some  Controlled  Words 

In the October 1987 issue of DDJ, I presented a Forth-83 software
prelude for writing portable source code. You will need to add the
words presented in that article to your Forth-83 Forth to enjoy the
fruits of this column. Example 1, page 148, contains typical
definitions, which you can adjust to fit your Forth.

Now  we  can  define  a  (small) 
number  of  additional  words  that 
have  come  into  general  usage: 

: \  >IN @ 64 + -64 AND >IN ! ;  IMMEDIATE
( "backslash" comment to the end of line.)

: THRU  ( n n2)  1+ SWAP DO I LOAD LOOP ;
\ LOAD blocks n through n2.

These  words  are  defined  early  on 
so  that  we  can  use  them  to  compile 
other  useful  words.  The  "backslash” 
comment  has  two  cousins: 

: \\  1024 >IN ! ;  IMMEDIATE
\ comment to end of source block.

: \IF  ( f )  0= IF [COMPILE] \ THEN ;  IMMEDIATE
\ conditional comment or interpret line.

The first word, \\, stops interpreting or compiling a block
immediately. Most Forths can use EXIT for this function. The second
word, \IF, is both a convenient comment and a key for conditional
execution. I'll discuss how you can use \IF in a moment.

In  the  meantime,  consider  another 
useful  word: 

:  2*  (  n  -  n')  DUP  +  ; 

Many programmers assume this word is required by the Forth-83 Standard.
(It isn't, but 2/ is.) By defining 2* as high-level Forth, we guarantee
that we can use it in any Standard program. But chances are excellent
that 2* is already in our dictionary as a CODE definition. If we
redefine it, the new 2* will be much slower than the original. What we
would really like to do is to define 2* only if it isn't already in the
dictionary:

: NEED  ( - f)
   \ true if the following word is not already
   \ in the dictionary.
   32 ( ie blank) WORD FIND SWAP DROP 0= ;

Now  we  can  have  our  cake  and  eat 
it  too: 


NEED 2*  \IF : 2* ( n - n') DUP + ;

NEED  looks  2*  up  in  the  dictionary. 
If  it  isn't  there,  we  define  it  in  high- 
level. 

There are a few caveats to this approach. \IF can only be used within a
source block. NEED returns either of two truth values, 1 or -1, so be
careful if you use it in logical expressions. Finally, we assume that
if 2* is already in the dictionary, its function is to double a number
and not to, say, print two stars (**) on the terminal.

Fortunately, the Forth-83 Standard has provided for this eventuality by
including 2* in its Controlled Reference Words with the appropriate
definition. Controlled Reference Words are not required by the
Standard, but if they do appear, they must have the prescribed
definition. Example 2, below, contains some other Controlled Reference
Words.

Unfortunately, there are not very many words in the Controlled
Reference Set. (The Standard also has Uncontrolled Reference Words,
which are exactly that.) Of course, we can define or redefine any word
we want to. This is one of Forth's

( Moves NEXT address to and from stack.)
: I>  ( - a)  COMPILE R> ;  IMMEDIATE
: >I  ( a )   COMPILE >R ;  IMMEDIATE

( Even address alignment, if required.)
: ALIGN  HERE 1 AND ALLOT ;
: REALIGN  ( a - a')  DUP 1 AND + ;

( Hides number of bytes per word.)
2 CONSTANT CELL
: CELL+  ( n - n')  2+ ;
: CELLS  ( n - n')  2* ;

( Compiles self-reference.)
: RECURSE  ( ... ) ;  IMMEDIATE

( Forces interpretation of the input stream.)
: INTERPRET  ( ... ) ;

( Discards return stack overhead of DO ... LOOP.)
: UNDO  I> R> R> 2DROP >I ;


Example 1


NEED D2*    \IF : D2*  ( d - d')  2DUP D+ ;
NEED HEX    \IF : HEX  ( DECIMAL )  16 BASE ! ;
NEED C,     \IF : C,  ( n )  HERE 1 ALLOT C! ;
NEED BL     \IF 32 CONSTANT BL  ( a blank)
NEED ERASE  \IF : ERASE  ( a n)  0 FILL ;
NEED BLANK  \IF : BLANK  ( a n)  BL FILL ;
NEED .R     \IF : .R  ( n w)  >R DUP 0< R> D.R ;


Example 2


: 2>R  ( n n2)
   \ pushes a pair on the return stack.
   COMPILE SWAP COMPILE >R COMPILE >R ;  IMMEDIATE

: 2R>  ( - n n2)
   \ pops a pair from the return stack.
   COMPILE R> COMPILE R> COMPILE SWAP ;  IMMEDIATE

: @EXECUTE  ( ? )  @ EXECUTE ;

: AGAIN
   \ used in a BEGIN ... AGAIN structure.
   0 [COMPILE] LITERAL [COMPILE] UNTIL ;  IMMEDIATE

: DLITERAL  SWAP
   [COMPILE] LITERAL [COMPILE] LITERAL ;  IMMEDIATE

: S>D  ( n - d)  DUP 0< ;
\ single to double number.

: WITHIN  ( n min max - f)
   \ true if min <= n < max.
   OVER - >R - R> U< ;

-1 CONSTANT TRUE


Example 3




greatest benefits: nothing is sacred (except Forth itself). Anytime we
redefine a CODE definition, however, we lose speed needlessly.

The solution is to add a few selected high-performance words to the
Controlled Reference Set. Because the Forth-83 Standard committee is
not meeting at this time, we will have to do it ourselves. So, consider
the words in Example 3, page 148, as belonging to the DDJ Forth Column
Controlled Reference Word Set. These definitions should each be
preceded by the appropriate NEED phrase. Any other definitions we
require, we will simply define (or redefine) in high-level Forth. In
other words, we won't be needing NEED anymore.
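
The WITHIN definition in Example 3 gets by with a single unsigned comparison (U<) instead of two signed ones. For readers who think more easily in C, here is a rough sketch of the same trick, assuming 16-bit cells as in the prelude. The function name within and the use of the stdint types are choices made for this illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if min <= n < max, else 0, using one unsigned comparison.
 * Both differences are taken modulo 65,536, the way a 16-bit Forth
 * cell wraps; a value of n below min turns into a huge unsigned offset
 * and so fails the test. Assumes min <= max. */
int within(int16_t n, int16_t min, int16_t max)
{
    uint16_t span, offset;

    assert(min <= max);
    span   = (uint16_t)((uint16_t)max - (uint16_t)min);   /* OVER -  */
    offset = (uint16_t)((uint16_t)n   - (uint16_t)min);   /* -       */
    return offset < span;                                  /* U<      */
}
```

The payoff is the same as in the Forth version: one subtraction per bound and one comparison, with no branch for the lower limit.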

Strings 

This month's topic of interest is implementing strings in Forth. Let's
see how we can apply our new tools to this issue.

The most enthusiastic string package I have ever seen was written by
George Hawkins and is available as the file FSTRINGS.ARC from the ECFB.
George's package provides 66 definitions on 50 screens of source code,
including a test suite. There are 21 pages of documentation with a
glossary. The package is written in Forth-83 but uses the BRANCH
experimental extension. If your Forth doesn't BRANCH, change the
following definitions on screen 6:

: $,  HERE OVER @ 2 + DUP ALLOT ALIGN CMOVE ;

: (SLIT)  I> DUP DUP CELL + SWAP @ + REALIGN >I ;

: [SLIT]  COMPILE (SLIT) $, ;  IMMEDIATE

Once I made this change, it compiled the first time and added a modest
3.3K to my dictionary.

Although the FSTRINGS package is not a SNOBOL, it is suitable for
medium-strength string processing, such as building concordance tables.
We can approximate many of its functions using materials that lie
readily at hand.

First, we need some way to make strings. In Forth-83, strings are
stored in memory as counted strings. A counted string is referred to by
a single address, which points to the count byte of the string. The
count byte is followed by that many 1-byte ASCII characters. In some
Forths, it may be padded with a blank or zero to the next even address.
You can think of a counted string as a packed string that must be
unpacked to a text string before it can be used. Text strings are
referred to by two arguments: the address of the first character and
the length of the string, with the length on top. Counted strings are
converted to text strings with the COUNT operator. We will also need an
operator to pack text strings into counted strings:

: PLACE  ( a1 n a2)  2DUP C! 1+ SWAP CMOVE ;
\ packs string a1 n into counted string a2.

: ,"  34 ( ie ASCII ") WORD COUNT
   \ compiles following string at HERE.
   DUP >R HERE PLACE  R> 1+ ALLOT ALIGN ;

CREATE TEST ," This is a test."

TEST COUNT TYPE This is a test. ok
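
If the packed-versus-unpacked distinction is new to you, the following C sketch may help. It mirrors what PLACE and COUNT do, under the assumption of a one-byte count followed by the characters. The type text_string and the functions place and count are illustrative names only; they are not part of any Forth system or of Listing One.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A "text string" in the column's sense: address plus length. */
typedef struct {
    const char *addr;
    size_t      len;
} text_string;

/* PLACE: pack a text string into a counted string at dst.
 * dst needs room for len + 1 bytes, and len must fit in one byte. */
void place(text_string src, unsigned char *dst)
{
    assert(dst != NULL && src.len <= 255);
    dst[0] = (unsigned char)src.len;        /* the count byte      */
    memcpy(dst + 1, src.addr, src.len);     /* then the characters */
}

/* COUNT: unpack a counted string into an address and a length. */
text_string count(const unsigned char *counted)
{
    text_string s;

    assert(counted != NULL);
    s.addr = (const char *)(counted + 1);
    s.len  = counted[0];
    return s;
}
```

Packing the 15-character string "This is a test." stores a count byte of 15 followed by the text, and count() hands back a pointer to the "T" together with the length 15.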

We  can  compile  strings  as  literals 
into  colon  definitions  in  this  way: 

: (")  I> COUNT 2DUP + REALIGN >I ;

: "  COMPILE (") ," ;  IMMEDIATE
\ fundamental string compiler.

: TEST  " This is a literal." TYPE ;
TEST This is a literal. ok

I have included COUNT in the run-time action of a string literal.
Several Forths include the " operator, but not all of them COUNT the
string. Nonetheless, because we are redefining it ourselves, it will
behave as we expect.

We can also make a string by EXPECTing it from the user into PAD:

( For example:)

: TEXT  ( - a n)
   PAD 80 EXPECT PAD SPAN @ ;

: NAME?
   CR ." Sign in please: " TEXT
   CR ." Your name is: " TYPE ;


Substrings  are  trivial  in  Forth: 

: /STRING  ( a n n2 - a' n')
   \ shorten string a n by n2 characters.
   ROT OVER + ROT ROT - ;

: TEST  " Catatonic" ;

TEST 6 - TYPE Cat ok
TEST 4 /STRING TYPE tonic ok
TEST 5 /STRING 2- TYPE on ok
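
The reason substrings cost so little is that a text string is only an address and a length, so taking a piece of one never copies a character. A minimal C sketch of the /STRING idea follows; the name slash_string is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* /STRING: drop the first n2 characters of the string (*a, *n) by
 * advancing the address and shrinking the length. Nothing is copied. */
void slash_string(const char **a, size_t *n, size_t n2)
{
    assert(a != NULL && n != NULL && n2 <= *n);
    *a += n2;
    *n -= n2;
}
```

Starting from "Catatonic" with a length of 9, dropping 4 characters leaves the address pointing at "tonic" with a length of 5. Shortening from the right, as in the Cat example, needs no function at all: just subtract from the length.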

We can make the " literal STATE-smart to make testing easier. While
we're at it, let's have a STATE-smart ASCII, too:

: ASCII  ( - c)  BL WORD 1+ C@
   \ value of following character.
   STATE @ IF [COMPILE] LITERAL THEN ;  IMMEDIATE

: (")  I> COUNT 2DUP + REALIGN >I ;

: "  ( - a n)  STATE @
   IF  COMPILE (") ,"
   ELSE  ASCII " WORD COUNT >R
      PAD I CMOVE PAD R>  THEN ;  IMMEDIATE

ASCII A EMIT A ok
" Simplicity" TYPE Simplicity ok

Numbers are easily made into strings with the flexible "sharp"
operators: <#, #, #S, HOLD, SIGN, and #>. Making strings into numbers,
however, is a little harder. We'll use a syntax suggested by Stephen
Pelc ("Proposed Standard Changes," FORML Conference, 1986):

VARIABLE DPL

: VAL?  ( a n - d 2 , n 1 , 0)
   \ string to number conversion.
   PAD OVER - SWAP OVER >R CMOVE
   BL PAD C!  PAD DPL !  0 0 R>
   DUP C@ ASCII - = DUP >R - 1-
   BEGIN  CONVERT DUP C@ DUP ASCII : =
      SWAP ASCII , ASCII / 1+ WITHIN OR
   WHILE  DUP DPL !  REPEAT
   R> SWAP >R IF DNEGATE THEN
   PAD 1- DPL @ - DPL !  R> PAD =
   IF  DPL @ 0< IF DROP 1 ELSE 2 THEN
   ELSE  2DROP 0  THEN ;

When the smoke clears, there will be a 2 on the stack if the number is
a double, a 1 if it is single, and a 0 if it is invalid. The number, if
any, will be on the stack under the flag. A number containing
punctuation from the set , - . / : is a double number, with DPL set to
the number of places to the right of the rightmost punctuation. A
number without punctuation is a single number, and DPL is set to a
negative value.

" 123,456" VAL? . 2 ok
D. 123456 ok
DPL @ . 3 ok
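
The classification rule is easier to follow without the stack juggling. Here is a rough C sketch of the same convention: a flag of 2, 1, or 0, plus a count of digits to the right of the last punctuation mark. It is not a translation of VAL? (it does not use CONVERT or PAD, it ignores overflow, and it treats an empty string as invalid), and the name val_query and its parameters are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Parse s[0..n) as a decimal number with an optional leading minus.
 * Returns 2 if the text contained punctuation from the set , - . / :
 * (a "double"), 1 if it did not (a "single"), and 0 if it is invalid.
 * *dpl gets the number of digits to the right of the rightmost
 * punctuation mark, or -1 when there is none. */
int val_query(const char *s, size_t n, long *value, int *dpl)
{
    size_t i = 0;
    int    negative = 0, digits = 0, places = -1;
    long   acc = 0;

    assert(s != NULL && value != NULL && dpl != NULL);

    if (n > 0 && s[0] == '-') {      /* leading sign, as in VAL? */
        negative = 1;
        i = 1;
    }
    for (; i < n; i++) {
        char c = s[i];
        if (c >= '0' && c <= '9') {
            acc = acc * 10 + (c - '0');
            digits++;
            if (places >= 0)
                places++;            /* digit after punctuation */
        } else if (memchr(",-./:", c, 5) != NULL) {
            places = 0;              /* restart the trailing count */
        } else {
            return 0;                /* anything else is invalid */
        }
    }
    if (digits == 0)
        return 0;
    *value = negative ? -acc : acc;
    *dpl   = places;
    return places < 0 ? 1 : 2;
}
```

Given the text 123,456, this sketch returns 2 with a value of 123456 and a place count of 3, matching the session above.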

Many Forths already support automatic single- and double-number
conversion, but there is no general agreement on the syntax. The
Forth-83 DPL is one of the Uncontrolled Reference Words, which means we
can't count on it. Hopefully, this issue will be taken up by the ANS
Forth technical committee.

Because strings are often converted to double numbers, irrespective of
punctuation, we could use a simpler conversion primitive based on VAL?:

: VAL  ( a n - d f)  VAL? 3 < AND DUP
   \ string to double number.
   \ True if number is valid.
   IF  1 = IF S>D THEN TRUE EXIT  THEN
   0 0 ;

SKIP and SCAN are two important string search primitives that appear in
several Forths as factors of the word WORD:

SKIP  ( a n c - a' n')
\ shortens a string to the first position
\ unequal to c.

SCAN  ( a n c - a' n')
\ shortens a string to the first position
\ equal to c.

SKIP and SCAN mimic the 8086 SCAS instruction. Their high-level
definitions are given in the accompanying source screens (see Listing
One, page 124). SCAN is especially useful for lexing and parsing
strings:

( For example:)

: LEX  ( a n c - a2 n2 a3 n3)
   \ splits string at c, rightmost string
   \ on top. Either string can have 0 length.
   >R 2DUP R> SCAN
   ROT OVER -
   ROT ROT DUP 0> NEGATE /STRING ;

" FORTH.COM" ASCII . LEX
2SWAP TYPE SPACE TYPE FORTH COM ok


-TEXT  ( a n a2 - -1 , 0 , 1)
\ -1 if string a n < a2, 0 if equal,
\ and 1 if >.

COMPARE  ( a n a2 n2 - -1 , 0 , 1)
\ -1 if string a n < a2 n2, 0 if equal,
\ and 1 if >.

-MATCH  ( a n a2 n2 - offset 0 , ? -1)
\ position of string a2 n2 in a n.
\ Offset is 0 if a2 n2 is found in 1st
\ position. True with invalid offset
\ if a2 n2 isn't in a n.

: ANIMAL  " ANIMAL" ;

" ANIMATE" ANIMAL COMPARE . 1 ok
" ANT" ANIMAL COMPARE . -1 ok
ANIMAL " IMA" -MATCH . . 0 2 ok
ANIMAL " XYZ" -MATCH . . -1 ???? ok


Example 4
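
The splitting idea behind LEX carries over directly to any language with address-and-length strings. In the C sketch below, memchr stands in for SCAN, and the right-hand piece starts one character past the delimiter, just as the conditional /STRING arranges in Forth. The names text_string and lex are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A "text string": address plus length, no terminator required. */
typedef struct {
    const char *addr;
    size_t      len;
} text_string;

/* Split (a, n) at the first occurrence of c. *left receives the part
 * before the delimiter and *right the part after it; either piece may
 * have length 0. If c does not occur, *right is empty. */
void lex(const char *a, size_t n, char c,
         text_string *left, text_string *right)
{
    const char *hit;

    assert(a != NULL && left != NULL && right != NULL);
    hit = memchr(a, c, n);              /* plays the role of SCAN */
    left->addr = a;
    if (hit == NULL) {
        left->len   = n;
        right->addr = a + n;
        right->len  = 0;
    } else {
        left->len   = (size_t)(hit - a);
        right->addr = hit + 1;          /* skip the delimiter itself */
        right->len  = n - left->len - 1;
    }
}
```

Splitting "FORTH.COM" at the period yields a left piece of length 5 (FORTH) and a right piece of length 3 (COM); neither piece is copied.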

CTO"" ("C-to-quote") is an often neglected primitive for converting a
character to a string:

: CTO""  ( c - a 1)  CTEMP C! CTEMP 1 ;

ASCII A CTO"" TYPE A ok

The remaining string operators (shown in Example 4) are used to compare
strings and to find the occurrence of one string in another. Their
analogs are often found in line and screen editors.
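
The search operator can be modeled the same way. The sketch below assumes
the convention shown in Example 4, where -MATCH leaves an offset under a
flag that is 0 on success and -1 (true) on failure. The Python name match
and the choice of what to return as the meaningless offset on failure are
assumptions made here, not part of the column's code.

```python
def match(text, pattern):
    """Model of -MATCH: return (offset, flag).

    flag is 0 when pattern occurs in text, and -1 (a Forth true flag)
    when it does not. On failure the offset carries no meaning, so the
    caller must test the flag first.
    """
    n = len(pattern)
    for offset in range(len(text) - n + 1):
        if text[offset:offset + n] == pattern:
            return offset, 0
    return len(text), -1  # arbitrary offset; only the flag is significant


offset, flag = match("ANIMAL", "IMA")
print(flag, offset)
```

Returning the failure flag as true matches the leading minus sign in the
Forth name, which by convention marks a word whose flag is inverted.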

The final string operator, EVAL, is also the most powerful. EVAL
interprets a string. The following definition of EVAL should work on
your Forth-83 system. I say "should" because there is some question as
to whether the Standard words >IN, BLK, TIB, and #TIB control
interpretation or simply describe it. You will need to test this word
on your system.

: EVAL ( a n) \ interprets the string.
   DUP >R TIB SWAP CMOVE R@ #TIB !
   0 >IN ! 0 BLK ! INTERPRET R> >IN ! ;

: TEST " LATER" EVAL ;
: LATER ." A forward reference." ;
TEST A forward reference. ok
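
The forward reference works because the name LATER is looked up only
when TEST runs, not when TEST is compiled. A rough Python analogy follows.
It models only that late name lookup, not the TIB, #TIB, and >IN
manipulation that the Forth definition performs, and the dictionary
`words` and the function `evaluate` are hypothetical names invented for
this sketch.

```python
# Hypothetical miniature dictionary of words; not a Forth system.
words = {}
output = []


def evaluate(text):
    """Interpret a string of space-separated word names.

    Each name is looked up at the moment it is executed, which is the
    property that makes forward references possible.
    """
    for name in text.split():
        words[name]()


# TEST is defined first and mentions LATER only inside a string.
words["TEST"] = lambda: evaluate("LATER")

# LATER is defined afterward, as in the column's example.
words["LATER"] = lambda: output.append("A forward reference.")

evaluate("TEST")
print(output[0])
```

Had TEST called LATER directly at definition time, the lookup would have
failed; deferring it to a string evaluated at run time avoids that.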

This completes the string package, and you will find a complete
listing of it in Listing One. It compiles to about 1K of dictionary
space.

Availability 

Most  of  the  source  code  for  articles 
in  this  issue  is  available  on  a  single 
disk.  To  order,  send  $14.95  to  Dr. 
Dobb's  Journal,  501  Galveston  Dr., 
Redwood  City,  CA  94063,  or  call  (415) 
366-3600,  ext.  216.  Please  specify  the 
issue  number  and  format  (MS-DOS, 
Macintosh,  Kaypro). 

DDJ 

(Listing  begins  on  page  124.) 

Vote for your favorite feature/article.

Circle  Reader  Service  No.  7. 




Dr.  Dobb’s  Journal,  December  1987 


PROGRAMMER'S  SERVICES 

OF  INTEREST 


Operating  Systems 
VenturCom has announced Venix System V 2.3, a real-time Unix operating
system for the IBM PC/AT, HP Vectra Plus, and Compaq Deskpro 386.
Enhancements include improved memory management to minimize I/O; a
more efficiently coded kernel; a larger, 1,024-byte file system; and
an extended buffer cache.

Venix 2.3 also features real-time extensions, adherence to AT&T
standards, EGA support, AT&T binary compatibility, developers' tools
that include a large-model C compiler, and Ethernet TCP/IP support.

The full system is priced at $990 for single-quantity purchases but is
being offered at an introductory price of $600 for a two-user system
until December 31, 1987. It comes with an indexed, four-volume set of
documentation and 60 days of toll-free telephone support. Reader
Service No. 16.

VenturCom  Inc. 

215  First  St. 

Cambridge,  MA  02142 
(617)  661-1230 

PC-MOS/386, Version 1.02, from The Software Link, now supports IBM
PCs; IBM PC/ATs; and compatible 8088-, 8086-, and 80286-based systems.
PC-MOS/386 offers file sharing, user security, print spooling, disk
caching, NETBIOS emulation, EMS emulation, interrupt-driven serial
I/O, enhanced command-line recall and editing, support for very large
disk volumes, an enhanced directory structure, sophisticated command
processing, a full-screen text editor, and on-line help. Features that
are specific to the 80386 CPU, such as support for 32-bit native-mode
applications, are now isolated in a single system driver file.

PC-MOS/386 is available on both 5¼- and 3.5-inch disks. Prices range
from $195 for a 1-user version to $995 for a 25-user version.
Registered users of earlier versions can upgrade to Version 1.02 at no
charge. Reader Service No. 17.

The  Software  Link 
3577  Parkway  Ln. 

Atlanta,  GA  30092 
(404)  448-5465 

Digital Research is now shipping enhanced versions of Concurrent DOS
386 and Concurrent DOS XM (Expanded Memory).

Concurrent DOS 386 supports most PC-DOS/MS-DOS applications and more
than 700 business applications written specifically for Concurrent
DOS. Windowing capabilities are provided, allowing up to four
applications to run from the primary console. It is available in a
three-user system configuration, priced at $395, or in a ten-user
version for $495.

Concurrent DOS XM offers improved EGA support, enhanced support for
the IBM PC/AT keyboard, disk formatting for up to four partitions per
hard disk, and support for the AST-Four Port/DOS card and the
four-port Hostess Multiport Network Adapter. The three-user system is
available for $295 and the six-user version for $395. Reader Service
No. 18.

Digital  Research 
P.O.  Box  DRI 
Monterey,  CA  93942 
(408)  649-3896 

The Hyperspace Z-System for HD64180-compatible and Z280-compatible
microprocessors is now available from Echelon. The Hyperspace Z-System
is a CP/M 2.2-compatible operating system that includes ZCPR 3.3,
ZRDOS 2.0, and a sample BIOS for 64180 machines. The system features a
large free memory area of 57.25K. The retail price is $195. Reader
Service No. 19.

Echelon  Inc. 


885  N.  San  Antonio  Rd. 

Los  Altos,  CA  94022 
(415)  948-3820 

UniPress Software has launched the Unix Training Center, which offers
22 courses to help users and organizations learn to use Unix. The
center personnel can provide needs analyses, curriculum design,
original course development, and instructor training. The Unix
Training Center's classes can be given either at an organization's
facility or at an off-site meeting area determined by a client.
Classes are offered on-site for a fee of $100 per day per student,
with a minimum fee of $1,300 per day. Reader Service No. 20.

UniPress  Software 
2025  Lincoln  Hwy. 

Edison,  NJ  08817 
(201)  985-8000 

The Wendin-DOS Application Developer's Kit is now available from
Wendin. The kit allows application programmers to develop multitasking
applications based on VAX/VMS system services. The new QIO (Queued
I/O) and RMS (Record Management System) services replace antiquated
MS-DOS calls and provide more programming flexibility. Other services
support shared memory for interprocess communication, memory-resident
pipes, semaphores, file locking, file permissions, extended memory
access, swapping, and control of multiple terminals. The kit sells for
$99. Reader Service No. 21.

Wendin  Inc. 

P.O.  Box  3888 
Spokane,  WA  99220-3888 
(509)  624-8088 

Microprocessor Engineering has released its OS-9 implementation of the
MPE/Nautilus Cross Compiler. MPE/Nautilus features interactive
debugging that allows development of the target system in RAM before
committing to ROM. Written in Forth, MPE/Nautilus includes automatic
handling of defining words and support for forward referencing. Reader
Service No. 22.

Microprocessor  Engineering  Ltd. 



133/133a  Hill  Ln. 

Shirley,  Southampton 
England SO1 5AF
0703-631441 

Hardware 

Lloyd I/O is now shipping its OMEGA MC68020-based workstation. The
workstation provides integral floating-point math support via the
MC68881 math coprocessor. It also includes 1 megabyte of
zero-wait-state, nonvolatile static RAM; the OS-9/68K real-time,
multitasking operating system; and a C compiler. The standard system
configuration supports up to four users; further users are supported
by optional I/O expansion boards. The base price for the workstation
is $4,750. Special system configurations, higher clock speeds, and
hardware options are available for an additional cost. Reader Service
No. 23.

Lloyd  I/O  Inc. 

19535  Northeast  Glisan  St. 

P.O.  Box  30945 
Portland,  OR  97230 
(800)  227-3719 

An 80386-based multiuser system running the Xenix System V operating
system, the Altos 386 Series 2000, is available from Altos Computer
Systems. The Series 2000's design features modular architecture that
allows memory and storage to be easily expanded. It also includes an
intelligent file processor subsystem; system memory; a communications
processor; an ESDI hard-disk drive; a 1.6-megabyte, 5¼-inch
floppy-disk drive; a 60-megabyte streaming magnetic tape unit; and an
Altos V terminal. The 80386 processor runs at 16 MHz with a 32K data
and instruction cache, an intelligent file processor, and an 80387
floating-point coprocessor.

The Altos 386 Series 2000 is available in four configurations ranging
in price from $25,000 to $30,000. Reader Service No. 24.

Altos  Computer  Systems 
2641  Orchard  Pkwy. 

San  Jose,  CA  95134 
(408)  946-6700 

DDJ 




FORUM 


SWAINE'S  FLAMES 


Remember the good old days when ordinary people were afraid of
computers? Then high tech became trendy. I wouldn't be surprised if
Helen Gurley Brown's Cosmopolitan editorial this month began, "I just
love operating systems, don't you? They're so earthy and fundamental.
And 1987 was such an OS-some year!"

And Helen would be right; 1987 was OS-some. Microsoft and IBM revealed
the master plan for the timed release of OS/2, the operating system
for the 1990s, complete with developer's kits and seminars, and began
delivering the pieces right on schedule.

Then, just when you thought that DOS was a stable development
environment (OS-sified?), Microsoft and a cast of thousands announced
(1) the Lotus/Intel/Microsoft Expanded Memory Specification (give a
point to Quarterdeck) and (2) Windows/386 (take it away again). Now,
modulo a few fatal bugs that would undoubtedly be fixed in the next
release, you could stick code as well as data up above 640K, run more
and bigger TSRs, switch instantaneously among existing applications;
here was multitasking before OS/2.

It wasn't immediately obvious just what effect the release of
Windows/386 would have on the reception of OS/2. It did seem that the
peculiar benefits of OS/2 would be realized only with the arrival of
OS/2-specific software, emphasizing inter-application communication or
intra-application multitasking. And it depressed the Microsoft OS/2
developers, who longed to forget the 286 processor and get on with the
real thing, OS/2/386, before Intel released the 486 and really
depressed them.

While we were all waiting for OS/2, a few companies went ahead with
386 tools, including Phar Lap, MetaWare, Softguard, and The Software
Link. Compaq offered a version of DOS that eliminated DOS's disk-file
size limitations. And Digital Research delivered Concurrent DOS 386, a
DOS 3.3-compatible multiuser, multitasking windowing operating system
for 386 machines.

In that other universe, Apple released Multifinder, allowing users to
switch smoothly between existing Mac applications. But wait; there's
more.

Meanwhile, a number of developments brought Unix into prominence as an
operating system for personal computers. There was the narrowing power
gap between personal computers and workstations (and the entrenchment
of Unix on engineering workstations). There were the steps toward
standardization seen in X/Open and IEEE Posix, and the movement toward
convergence of the Berkeley and AT&T versions. Although they flaunted
standardization efforts, you wouldn't want to discount IBM's
endorsement of Unix with its AIX (to be available on the RT, PS/2, and
System 370), and Apple's ditto with A/UX. Nor would you want to
discount the influence of Unix supporters Sun and NeXT, either, or all
those sex, drugs, and Unix buttons at the Hackers' Conference. Then
there was the growing support for the Unix graphical interface,
Xwindows.

With the major operating systems all moving toward some form of
multitasking and all providing some sort of windowing user interface,
what the world was coming to need, many believed, were tools to ease
the development process for window-oriented software, and to make it
easier to port code across windowing environments. The Whitewater
Group did well pitching its object-oriented language, Actor, as a tool
to make Windows development easier, and began work on their Mac
version.

Michael Bentley saw the same need. While reviewers gushed over Danny
Goodman's book on Bill Atkinson's HyperCard (a thorough and readable
book about an interesting product), Michael Bentley's The Viewport
Technician, which promised not only to show how to develop code
portable across windowing environments, but how to make that portable
code efficient, got little press. Did the book deliver on its promise?
Only someone who had developed code for the Amiga, GEM, Windows, and
the Mac could say.

Odds  and  ends: 

Contrary to Allen Holub's expectations, Microsoft did not deliver
Quick C when promised, any more than Borland delivered Turbo C when
promised.

The third edition of Daniel Remer and Stephen Elias's Legal Care for
Your Software is now out. I'm not convinced that the authors have done
their job with respect to the tricky area of ownership of reusable
code in work done for hire, but generally this is a good first book on
the legal issues in software development and sale. It's published by
Nolo Press, 950 Parker Street, Berkeley, CA 94710.

Michael  Swaine 
editor-in-chief 




Index 


1  bit,  hidden,  73 
10-MHz  80286, 431 
128-bit  data  bus,  10 
16-bit  Forths,  68 

"16-Bit  Software  Toolbox,"  145,  233,  400,  489,  557,  632;  reference 
to,  748 

16-MHz  386  machine,  650 
20-MHZ 68020, 9
286  Mothercard  5.0,  650 
32-bit 

bus,  10 

personal  computer,  257 

386,  DOS-Extender,  support  for,  817;  Turbo  board,  420 
386LINK,  compatibility  with  Microsoft  LINK,  558 
"3-D  Images  from  Contour  Maps,"  826 
64-bit  data  bus,  10 

6502,  99;  chip,  97;  microprocessor,  8;  systems,  97;  first  256  bytes, 
98;  five  registers,  97;  flags,  97;  nonregular  instruction  set,  97; 
reducing  number  of  bytes,  101;  saving  bytes  using  subroutines, 
100;  saving  bytes,  102 
"6502  Hacks,"  97;  reference  to,  494; 

68000, 16, 48; assembly code, 12; assembly language, 23; assembly-
language programs, 22; cross assembler, 496; environment, 757;
microprocessor, 22; PC-relative JSR, 758; program, 20;
trap mechanism, 23
68008,  260,  646 

68020,  257;  CPU,  9;  20-MHZ,  9 

68020-based  system,  16 

68030,  10;  processor,  10 

6809, 260; processor, 16

6809-based  Tandy  Color  Computer,  16 

680xx,  4;  assembly  language,  72 

"680xx  Computers:  Where  Are  They  Going?,"  8 

68100  CPU  chips,  10 

68200  CPU  chips,  10 

68851  MMU,  9 

68881,  10;  FPU,  9 

68882,  10 

68HC11 floating-point library, 500
79  Standard,  324 
8-MHz  68008,  646 
80286  Architecture,  The,  402 
80286,  425,  916;  resources,  402 

80386,  89,  233,  425,  509;  programming  tools,  557;  resources,  233 

80386/80286  Assembly  Language  Programming,  233 

80386  Hardware  Reference  Manual,  233 

80386  Programmer's  Reference  Manual,  233 

80386:  A  Collection  of  Article  Reprints,  233 

80386:  A  High  Performance  Workstation  Microprocessor,  The,  233 

8080  microprocessor,  8 

8086  Book,  The,  634 

8086,  509;  changes  made  to,  89;  programming,  634 
"8088  Assembly-Language  Programming  Techniques,"  514;  reference 
to,  918 

8088  Book,  The,  825 

8088,  89;  microprocessor,  514,  quirk,  517;  optimization,  letter,  824 
83  Standard,  324 

AAIS  Prolog,  270;  debugger,  271;  user  interfaces,  271 

AALAP,  754 

absence  of  gesture,  347 

absolute  variables,  159 

Access  Associates,  82 

accessing  disk  blocks,  925 


accuracy,  defined,  935 
Ackerman,  Mark  S.,  97 
ACORN,  155 

active  values,  412,  903;  LOOPS,  641 
ACTOR, 78 

Ada,  524,  813;  package,  create  dimensional  data  types,  367; 

programming  language,  366 
adapting  I/O  routines,  235 

adaptors,  IBM-standard,  text  screens  for,  937;  non-MDA,  937 

additions,  188;  pointer,  514 

additive synthesis, 351

address,  storage,  437 

Advanced A.I. Systems, 169

Advanced Micro Devices 2901C chip, 567

Agora  system,  153 

Aho,  Sethi,  and  Ullman,  105 

AI, 76, 164, 166, 241, 640; and backtracking, 589; Architects, 651;
books, new, 157; commercial applications, 150; workstation, 566

AIDS  expert  system,  253 
AJS  Publishing,  909 
Albert,  Douglas  J.,  402 
Alegra,  82 
Alexy,  George,  825 
algorithm composers, 349

algorithms, 188, 189, 348, 398, 443, 666, 826; Bickel's, 856, 857;
boxcar,  392,  393,  395;  contour-analysis,  826;  evaluating,  666; 
comparing  files,  666;  for  drawing  circles,  659;  for  finding 
matches,  666,  668;  hash,  142;  KMP,  766;  recursive  longest 
matching  sequence,  668,  669;  sorting,  482;  transmit  and  receive, 
750 

Alligator Transforms, 81
Allison,  Dennis,  94 
Alloy  Computer  Products,  740 
"Alternative  to  Soundex,  An,"  856 
Altos  Computer  Systems,  1000 
Amdek, 741

American  Computer  &  Peripheral,  420 
American  National  Standards  Institute,  583 
Amiga,  9;  BASIC,  278;  interpreter,  283;  package,  278;  computers, 
22;  event  trapping,  25;  interface  standard,  22;  menus,  23; 
requestors, 23; system routines, 23; text editing, 23, 26; text I/O,
26;  window,  22;  creating  menus  for,  24;  creating  windows  for, 
25;  menu,  26 
AmigaDOS  C,  419 
analog  voltage,  435;  measuring,  434 
analog-to-digital,  434 
analysis,  600 
Analytical  Engine,  346 
AND  clauses,  279 
Anderson's  Modula-2,  162 

Anderson,  Brian  R.,  429;  reference  to  article  by,  738 
ANS  Forth  meeting,  994 
ANS  Forth,  812 

ANSI  C,  510;  libraries,  586;  module-scope  declarations,  585;  name 
space  of  programs,  583;  new  void,  585;  preprocessor,  584; 
standard  for  C,  579 
ANSI,  583,  630;  features,  510 
APA  graphics,  836 
APL*Plus  System,  419 
Apollo  Domain  network,  9 
APPEND  command,  in  MS-DOS  3.30,  632 
Apple,  425;  LaserWriter,  749;  Macintosh,  729;  interpreters, 
compilers,  729 




AppleTalk,  749;  implementation,  751;  Transaction  Protocol  (ATP), 
750;  Async  driver,  installing,  752 
Apple's  Hierarchical  File  System,  9 
Applied Physics, 818

Approximations  for  Digital  Computers,  259 
approximation,  polynomial,  660;  extreme,  661;  unskewing,  661 
ARC,  184 
ARC44,  185 

"ARC  Wars:  MS-DOS  Archiving  Utilities,"  184 
ARCA/ARCE/ARCV,  186 
ARCH,  186 

architecture,  Starplan  II,  151 
arguments,  numeric,  225 
Ariel  Corp.,  335 

arithmetic,  expression  analyzer,  143;  mean,  391;  operators,  66,  67 

Arity  Corp.,  155 

Arnet, 651

array  element,  188 

arrays,  indexing,  188;  ordered,  899;  passing,  333;  searching  through, 
899;  sorted,  485 

Art  of  Computer  Programming:  Volume  3,  Sorting  and  Searching, 
The,  431 

Art  of  Prolog,  The,  276 

Art  of  Software  Testing,  The,  71 

ART  tool,  release  3.0,  155 

artificial intelligence, 74, 78, 149; for Amiga, 278

"Artificial Intelligence," 149, 241, 327, 410, 492, 566, 640, 734, 900

"Artificial  Neural  Network  Experiment,  An,"  262;  reference  to,  507 

artificial  neural  networks,  262,  264 

Ashton-Tate,  812 

Aspen  Scientific,  630 

assembly  code,  68000,  12 

Assembly  Language  Primer  for  the  IBM  PC  and  XT,  398 
Assembly  Language  Safari  for  the  IBM  PC:  First  Explorations,  490 
assembly  language,  20,  429;  routines,  851;  vs.  high  level  language, 
236-237,  559;  680xx,  72;  68000,  23;  programs,  68000,  22; 
routines,  for  interrupt  handling,  851 
AST  Research,  745 
"Async  AppleTalk,"  749 

Async  AppleTalk,  driver,  installing,  752;  Link  Access  Protocol,  754; 
user  interface,  752;  sending  frame,  753;  interrupt  handler,  7; 
users,  future,  756 
AT160F,  249 

Atari  ST,  9;  VCS  2600,  97 
ATC  Software,  574 
ATronics  International,  335 
Atron,  MiniProbe,  631 
attack  transients,  353 

AT&T  Bell  Labs  Technical  Journal,  reference  to,  559 

AT&T,  583,  740 

Augenstein,  Moshe  J.,  398 

AutoBoard  System  II,  741 

AutoCAD,  training  course  for,  420 

Autodesk's  AutoLISP,  336 

Autodesk,  419 

AutoLISP, 419

"Automated Interrupt Handling in C," 851
Automatic  Funds  Transfer  Telex  Reader,  151 
automatic  hyphenation,  nr,  317 
Axelrod,  Robert,  323 
Axlon,  337 

A-to-D,  434;  converter,  434;  software  driver,  435 

Babbage,  Charles,  346 
BACAS,  153 
Bacchus  Software,  352 
Bach,  Maurice  J.,  559 

backtracking;  chronological,  589;  and  AI,  589;  dependency-directed, 
589;  types,  589 
"Backtracking,"  589 
Backus,  John,  84 
Bain,  William  A.,  154 
BAISC09,  20 
Baker  &  Rabinowitz,  82 
bandwidth,  3,  84,  170,  173,  174,  253,  826,  907 
Barber,  Gerry,  244 
bar-and-note  storage,  for  MIDI,  358 


BASIC  interpreter,  278-279;  Program  Analyzer,  500;  programs, 
readability  of,  333;  program,  from  Radio  Shack,  259 
BASIC09,  16,  19;  project,  343 
BASICA,  238,  240,  573 

BASIC,  16,  20,  84,  89,  247,  649,  757;  Amiga,  278;  CET,  169; 
popular  dialects,  333;  Quick,  238;  Quick,  Turbo,  True,  497; 
similarities  between  True,  Quick,  and  BASICA,  238-240; 
translating  to  C,  637;  translation  between  True,  Quick,  and 
BASICA,  238;  True,  167,  238;  Turbo,  169 
BASTOC,  637 

batch  file,  change  directories,  402 

BBS and Unix, 451

BBS command, 451; advantages, 451

BBS  directory  structure,  451 

Bell  Laboratories,  346 

Bell,  Gordon,  84 

Benchmark  Modula-2,  817 

benchmark tests, 757; PROLOG, 277; using 5-MHz Z100, 185;
Sieve, 637; TDebug, 185
Bering  Industries,  421 
Berkeley  Decision/Systems,  82 
Berkey,  Robert,  70 

BetterBASIC,  247,  416,  417,  498;  module  in,  248;  numeric  data 
types,  573;  strings  in,  573 
Bibliography  of  Forth  References,  A,  993 
Bickel's algorithm, 856, 857; compared to Soundex algorithm, 858
Bickel,  Michael,  algorithm,  856;  limitations,  856 
Biggerstaff,  Ted  J.,  399 

Binary  and  Continuous  Activation  System  (BACAS),  153 
binary  file  transfers,  453 

binary  trees,  430;  clustered,  in  V.I.P.,  731;  modification  to,  731-32; 
of  XOR,  672;  actions,  672;  defining,  672;  deleting  item,  674; 
inserting,  674;  traveling  in,  672 
binary  words,  unsigned,  437 
binary-to-integer  conversion  routine,  142 
binary-tree  traversal  routine,  62 
BIOS  port,  444 
BIOS  support,  12 
bit  flags,  98 

BlackStar  C  Function  Library,  909 
Blinn,  James  F.,  659 
block  range  optimization,  599 
blocking,  988 

Bluestreak  Plus,  Version  2.00,  81 
Bobrow,  Dan,  244 
BOCARAM,  419 
Boca Research, 419
Bolt,  Beranek,  and  Neumann,  155 
books,  new  AI,  157 
"Books,"  418,  572 
Boolean  argument,  842 

Borland  International,  78,  169,  447;  real-time  conference  on 
CompuServe,  74 
Borland  PROLOG,  7;  SIG,  164 
Borland,  146;  Turbo  C,  833,  930 
bottlenecks,  514;  information-transmission,  173 
bottom node, 180

Bowman,  Charles  F.,  589,  765;  reference  to  article  by,  916 

boxcar  algorithm,  392,  393;  average,  394,  395 

Boyer-Moore  routine,  146 

branching,  99 

Bratko,  Ivan,  276 

Breakthru,  80 

Breeden,  J.  Brooks,  845 

Bresenham's  algorithm,  664;  improved,  664 

BRIEF,  2.0,  818 

Brodie, Leo, 11

Brothers,  Dennis,  94,  95 

Brown,  Richard  R.,  749 

Brown,  Robert  Jay,  262,  430 

browsers,  LOOPS,  Lattice,  640 

Browser,  Class  Hierarchy,  734;  Disk,  735 

buckets,  104 

buffer  name,  842 

buffered  interrupt-driven  input,  442 

buffer,  single  character,  443 

bug  report,  for  June  1987,  808 

bug,  in  Microsoft  C  Compiler  Version  4.0,  395 




buses, wider, 517

BusMate  AT,  818;  PC/XT,  818 

bus,  32-bit;  VME,  9 

Butrick,  Richard,  518 

Butterfly  Machine,  155 

Byte  Information  Exchange  (BIX),  233 

bytes,  6502,  101,  102,  98;  saving  in  6502  using  subroutines,  100 

C  386  Compiler,  251 
C  Answer  Book,  The,  398 

"C  Chest,"  February,  bug,  631;  reference  to  May  1987,  816;  reference 
to  May  1987  issue,  991 

"C  Chest:  A  Preemptive  Multitasking  Kernel  and  More  Mean 
Subroutines,"  983 

"C  Chest:  Curses:  Unix-Compatible  Windowing  Output  Functions," 
550

"C Chest: Language Wars over Cs," 806

"C Chest: Nroff: Hashing, Expression, and Roman Numerals," 139
"C Chest: NR: A C Implementation of Nroff, Part 2," 223
"C Chest: NR: A C Implementation of Nroff, Part 3," 316
"C Chest: Priority Queues," 482
"C Chest: Shrinking .EXE File Images," 61
"C Chest: Statistical Applications of Digital Low-Pass Filters, Exec
Bug in Microsoft C," 391

"C  Chest:  Subroutines  with  A  Variable  Number  of  Arguments,"  627 
"C  Chest:  The  Ultimate  Metronome:  Writing  Interrupt  Service 
Routines  in  C,"  721 

"C  Chest:  Using  the  Unix/ANSI  Time  Functions,"  893 
C  Companion,  The,  398 
C  compilers,  557;  new,  806;  Microsoft,  61 
C  languages,  583,  825;  programmers,  88;  programming  language, 
579 

C  Programming  Language,  The,  398 
C Puzzle Book, The, 398
C Toolbox, The, 399
C Ware Corp., 250
C86PLUS, 169
cable,  749 

cache,  routines,  using,  927;  size,  application,  926;  theory,  925;  disk, 
925;  processing  freed  records,  928;  closing,  929;  variable  length 
records,  9;  RAM  10;  variables  associated  with  efficiency,  926; 
multiple,  926 
CADcard  Model  1040,  336 
calculator,  four-function,  158 
calendar,  893 
calendar.c, 895
Callihan,  Hubert  D.,  840 
calls,  DOS,  146 
Campbell,  Joe,  399 

capabilities  of  file  comparison  programs,  592 

Capouch,  Brian,  4,  16,  343 

Carnegie-Mellon University, 153

Caro  Research,  740 

Carr,  Bob,  561 

Casio's CZ-101, 362

cast  operator,  232 

CD  ROM  technology,  164 

Central Coast Software, 419

CET  BASIC,  78,  169 

CET  Technology,  78,  169 

CGA,  845 

Chalcedony Software, 270
Chavez,  Lori,  757 
Chignell,  Mark,  74,  164-166 
child  nodes,  178,  180 
chip  components,  smaller,  4 

chips, 517; support, 9; 6502, 97; Advanced Micro Devices 2901C,
567; MC68881, 73
CHKDATE.C,  490 
Chowning,  John,  346 
chronOS,  80 

circles,  algorithms  for  drawing,  659 

circuits,  sequential,  350 

Claff,  William,  233 

Clancy,  Mike,  398 

clarification  in  ANSI  C,  586 

Clark,  K.L.,  276 

Class  Hierarchy  Browser,  734 


clauses,  AND,  279 
clauses,  OR,  280 
ClickStart,  740 

click,  programmable  metronome,  722-724;  programmable 
metronome,  implementation  of,  725-728 
client  programs,  919;  relations,  323 
Clocksin,  W.F.,  276 
CMS  Enhancements,  740 
Cobalt  Blue,  500 

code optimization, Forth, 759; clear, 366; easily understood, 366;
locational, 179, 180; sloppy, 259
CodeView debugger, 486, 631
Codeworks,  82 
cognoscenti, Forth, 11
Color  Graphics  Adapter,  845 
color  map,  178 
columns,  concatenating,  235 
COM  port,  key  functions,  443;  technical  issues,  443 
Comdex,  November,  1986,  233 
Comer,  Douglas,  399 
command  names,  224 
Command  Plus,  400,  421 

commands,  and  options  of  file  comparison  programs,  593;  dot,  225, 
226;  nr,  225;  strings,  92;  structure,  in  nr,  224 
commercial  applications  of  AI,  150 
common  LISP,  243,  244,  327,  328,  329,  568 
CommonLoops, 243, 329; compared to LOOPS, 640; kernel of,
329; method combinations, 330; multimethods, 330
communication,  client  with  peer,  750;  peer,  750 
Compaq  386  Deskpro,  557,  513 
comparison  and  swap  function,  486 
compiled word processors, 139
compilers,  C,  557;  Pascal,  557;  FLINT'S,  13 
compiling  source  code,  for  creating  386  programs,  509 
Complete  Book  of  Macintosh  Assembly  Language  Programming, 
Volume  II,  The,  419 
components,  chip,  4 
composite  objects,  243 
compound  facts,  519 

compound fact/consequent representation, 518
"Compressing Image Data with Quadtrees," 178
compression, data, 184
Computer  Crossware  Labs,  499 
Computer  Innovations,  169,  598 

Computer  Professionals  for  Social  Responsibility,  175-176,  574 

computer  science,  and  programming,  differences,  261 

Computer  Systems  Documentation,  79 

Computer  Technology  Associates,  83 

computer  vision  systems,  262 

computerized  recording,  363 

computers,  Amiga,  22;  application  to  musical  problem,  342; 
Macintosh,  22;  using  6502  chip,  97;  32-bit,  257;  any  well- 
defined,  345 
Computervision,  741 
concatenating  tables,  columns,  235 
Concurrent  DOS  386,  999;  XM,  999 
configuration  features,  for  MS-DOS  3.30,  632 
configuration, nr, 226
Connection Machine, The, 572
Connections,  574 
consistency  in  programming,  563 
constant  propagation,  599 
Consulair  Corp.,  335 

contour analysis, algorithm, 826; how it works, 828; problems, 828;
trace, 831; using the program, 829
contour maps, 826; transform into 3-D views, 827; steps to, 827;
dimensions, 826
contour search, 830, 831
contours, nests, 830
control flow, nr, 316
control  registers,  846 
control  routines,  181,  182 
control structures in PC Scheme 2.0, 411
controllers,  using  6502  chip,  97 
conventional  databases,  tying  PROLOG  to,  76 
converter,  A-to-D,  434 
Cooper,  Doug,  398 
Copam,  650 




copy  propagation,  599 

Cornerstone  Technology,  741 

Cortesi,  David  E.,  96,  437,  671,  919 

CPU  chips,  68100,  10;  68200,  10 

CPU,  68020,  9;  Xerox,  567 

Crane, Clark Allen, 431

Creating  C  Tools  for  the  IBM  PC,  399 

Creative  Computer  Software,  169 

Cricket  Graph,  420 

Cricket  Software,  420 

CRON program, on Unix Clone, 331

Cruise  Control,  234,  235 

Csharp  PC  Drivers  Package,  250 

CSSL,  249 

curses  library,  550 

curses, erasing, 553; formatting output, 553; implementing, 554;
initializing  windows,  551,  552;  keyboard  input,  553;  553; 
responding  to  typed  characters,  551;  scrolling,  553;  single 
characters,  writing  strings,  553 
curses-compatible  I/O  functions,  550 
curses-subset  package,  630 

Custom Software System, implementation of Unix vi editor, 395
Cytek,  251 

C++  Programming  Language,  The,  572 
C++,  78 

C,  9,  89,  323,  425,  438,  509,  627;  ANSI  standard,  579; 

implementing  device  driver  in,  676;  macro  processing,  90; 
Microsoft,  930;  Borland  Turbo,  930;  optimization,  598;  type 
checking,  90 

C:  A  Reference  Manual,  398 

data,  bus,  64-bit,  10;  compression,  184;  hiding,  813;  segment 
address,  storing,  852;  segment,  64;  sharing,  528;  space,  64 
Data  Structures  and  Program  Design,  398 
Data  Structures  Using  Pascal,  398 
data  structure,  primary,  180;  visual  objects,  938 
Databasefns,  569 
databus,  128-bit,  10 
Datacopy Corp., 80
Datalight,  499,  598 
DataWindows, 251
data-encoding  encryption,  575 
data-transfer  rate,  disk  drive,  257 
dateh,  893 
dateh.c,  896 
db/LIB,  909 

"DDJ  On  Line,"  245,  331 

"DDJ  On  Line;  PROLOG  and  the  Future  of  AI,"  74,  164 

DDJFORUM,  426 

dead  assignment  elimination,  600 

dead  code  elimination,  600 

dead  variable  elimination,  600 

debugger,  345;  AAIS  prolog,  271;  ExperProlog-II,  271; 

MacPROLOG, 271; PROLOG/m, 271
Debugger,  The,  817 
Debugging  C,  489 
debugging  facilities,  757 
decay  timings,  353 
decision-making,  649 

declarations,  232;  and  definitions,  in  one  file,  322 

declarative  programming,  74;  vs.  procedural  programming,  75 

decoding,  run-length,  71 

DEC-20-type  PROLOG,  7 

deductive-axiomatic  method,  77 

Definicon, 651

defining  an  item,  in  XOR  chain,  438 

definitions,  232;  and  declarations,  in  one  file,  322;  theory  of,  77 

Deikman,  Alan,  925 

deleting  items,  in  XOR  chain,  441 

Dell  Computer  Corporation,  650 

Delta  Research,  817 

demonstration  program,  for  Graphics  Toolbox,  839 

Design  of  the  UNIX  Operating  System,  The,  559 

Design  Software,  249 

"Designing  a  Music  Recorder,"  350 

designing  programs,  564 

Desqview,  Version  2.0,  651 

"Developing  80386  Applications  . . .  Today,"  509 


device  descriptor  modules,  18 

device driver, 922; facility, 676; modification, 442; features, 442;
OS/2,  922;  writing,  922;  format,  676;  purpose,  676;  start-up 
code,  677;  strategy  function;  implementing  in  C,  676; 
implementing,  678;  return  info,  679;  linking,  679; 
shortcomings, 

device  support,  in  MS-DOS  3.30,  633 
DeVoney,  Chris,  145 
Devpac  Amiga,  499 
differential  equations,  662 

DIFF,  enhancement,  597;  functional  description,  595;  operation, 
594;  output,  593 
DigiDesign,  352 
digital  low-pass  filter,  391 
Digital  Research,  505,  999 
Digitalk,  499,  734 
DigitTronix,  82 

dimensional  data  types,  Ada  package  that  creates,  367;  need  for,  how 
they  help,  366 

"Dimensional  Data  Types,"  366;  reference  to,  581 
dimensional  units,  366 
Dinfo,  568 

directories,  changing  in  batch  file,  402 
Discovery  Systems,  336,  419 
disk  blocks,  accessing,  925 
Disk  Browser,  735 
disk  cache,  925 

disk  drive  data-transfer  rate,  257 
disk,  fill,  663 

display  memory,  finding,  937 

display  methods,  in  MIDI,  related  to  storage  methods,  361 
Disque, Tom, 514; reference to, 824
diversions,  with  nr,  229 
division,  integer,  67 

documentation,  system,  WEB,  87;  in  programming,  562 

dogStar Software, 83, 420

Door  into  Summer,  The,  84 

DOS  calls,  146 

DOS  Critical  Flag,  146 

DOS  locate  utility,  930-932 

DOS-2-DOS,  419 

dot  commands,  224,  225,  226 

double-precision,  numbers,  66;  operators,  68 

doubly  linked  list,  437 

Drafix  1/Atari  ST,  908 

Drexler,  Eric,  4 

DSBACKUP+,  249 

DSI-780 68020/68881 card, 651

DSP-16, 335

DSRECOVER,  249 

Duncan,  Ray,  92,  145,  233,  400,  489,  632 

Dunford,  Chris,  95,  400 

Duntemann,  Jeff,  741 

Dvorak,  John  D.,  657 

DWIM,  569 

DX/TX  FM  synthesizers,  347 

dynalink, descriptors, in OS/2, 922; libraries, 919; using, 920

"Dynamic  Linking  in  OS/2,"  919 

dynamic  linking,  919;  benefits,  920 

Dynamic  Memory  Overlays  for  Turbo  Pascal,  447 

Dynaperspective,  420 

Dynapro  Systems,  80 

Dynaware,  420 

East Coast Forth Board, 811
Eastman  Communications,  419 
Echelon,  999 

Edinburgh  syntax,  of  PROLOG,  272 

editorial,  reference  to  September  1987,  916 

editors,  87;  simple,  14;  text,  92,  95;  text:  modeless,  93;  simple,  14; 

text:  new,  95;  piano-scroll,  362;  SEdit,  568;  structure,  568 
"Efficient Algorithm for Large Priority Queues, An," 430
EGAPaint,  846 

EGA,  845;  programming,  846;  sample  program,  847 
EGA-16, 740

Electronic  Engineering  Times,  268 
electronic  mail,  450 
Electronic  Specialists,  420 




Elements  of  Programming  Style,  The,  398 
element,  array,  188 
EMS,  745 

end-point-relative  storage,  for  MIDI,  357 

engine, in PC Scheme 2.0, 411

Enhanced  Graphics  Adapter  (EGA),  845 

enhancement,  for  DIFF,  597 

environment,  mechanism,  230;  MS-DOS,  Unix,  550 

Enyart,  Bob,  257 

EPROM-based  applications,  930 

equations,  differential,  662 

erasing  in  curses,  553 

Erdelsky,  Philip  J.,  345 

escape  sequences,  224,  225;  nr,  320 

ESP  Frame-Engine,  78 

ESP  Software  Systems,  400,  421 

event  trapping,  for  Macintosh  and  Amiga,  25 

Everex  Systems,  80 

Evolution  of  Cooperation,  The,  323 

exclusive-OR  function,  437 

excom,  and  CrossTalk,  445 

excom,  bug,  445;  on  IBM  PC,  445;  problem,  445;  sections,  443 
Expanded  Memory  Specification,  745 
expanding  MYCIN,  281 
ExperCommon  Lisp,  35 

Experiments  in  Artificial  Intelligence  for  Microcomputers,  78 
ExperOPS5-Plus,  155 

ExperProlog-II,  271;  debugger,  271;  user  interfaces,  271 
Expert  Development  Package,  155 
Expert  Information  Systems,  500 
Expert  Systems  International,  78 

expert systems, 75, 76, 164, 166, 253; AIDS, 253; in BASIC as
opposed  to  PROLOG,  278 
ExperTelligence, 155, 335

exponential smoothing function, 391; implementing, 392
expression  analyzer,  arithmetic,  143 
expressions,  144 

"Extended IBM PC COM Port Driver, An," 442; reference to, 648,
906 

extended  representation  (EXT),  73 
externally  accessible  routine,  180 
extreme  approximation,  661 
EXT,  73 

F83  Reference  Manual,  993 

fact statements, 518

"Factoring in Forth," reference to, 581

Fail-Safe,  249 

Fall  Joint  Computer  Conference,  84 
Farbware,  169 

"Fast  Forth  for  the  68000,  A,"  757 
features,  in  text  editors,  93 
Feuer,  Alan  R.,  398 
FIFOs,  528 

Fifth  Generation  Computer  Project,  164,  425 
FIG,  761,  809 

"File  Comparison  Algorithms,"  666 

file  comparison  programs,  592;  capabilities,  592;  commands  and 
options,  593 

file  comparison  utility,  669 
file manager modules, 18
fill  disk,  663 
find/replace  utility,  638 
first-in/first-out  queue,  430 
Fisher,  Roger,  323 
FiveStar  Electronics,  335 
fixed-increment  correction,  267 
Fixed-Point  Math,  810 
fixed-point  numbers,  72 
flags,  324;  6502,  97 
Flambeaux  Software,  574 
Flannery, Brian P., 418
Flash-Up  Windows,  249 
Flavors,  daemon  methods,  242 
Fletcher, G. Yates, 4, 11

FLINT, 4, 11; compiler, 13; interpreter, 13; structured loop control,
14;  system's  inner  shell,  13 
FLOAT,  72,  73 


floating point, math chip, MC68881, 73; representation, 72;

numbers,  72 
Flores,  Fernando,  164 

"Flotsam  and  Jetsam,"  64,  232,  322,  398,  486,  556,  724 

flow  of  activities  in  programming,  565 

Floyd,  Edwin  T.,  104,  343 

FM  synthesis,  351 

FORCE,  993 

Ford Aerospace, 151

Foresight  Resources  Corp.,  908 

Fork's  operation,  breakdown,  526 

formats,  numeric,  72 

formatting  output  in  curses,  553 

FORML,  809 

FORTH  Inc.,  66 

Forth  83  Standard,  interpreting  a  string,  763;  recursive  definitions, 
761;  compiler  words,  762 
Forth cognoscenti, 11
"Forth  Column,  The,"  809,  993 
Forth  Dimensions,  809 
Forth  Forum,  GENie,  993 

Forth  implementations,  758;  optimization  techniques,  758 

Forth  Interest  Group  (FIG),  761,  809,  993 

Forth  Model  Library,  761,  809 

Forth  routines,  to  load  EGAPaint,  845 

Forth  stack,  70 

"Forth  Standard  Prelude,  A,"  761 
Forth  systems,  68 

Forth, 11, 15, 65, 66, 72, 323, 561, 757; code optimization, 759;
description of, 757; fragility, 70; implementing strings in, 996;
"no  frills,"  11;  strength,  70;  string  search,  763;  structured 
control,  13;  subroutines  vs.  pointer  threading,  757 
Forth-83  Standard,  761 
Forth-83,  controlled  words,  994 
Forth-like interpreter, 4, 11
Forth-optimized  RISC  computing  engine,  993 
Forths,  16-bit,  68 
FORTRAN-based  MUSIC  V,  346 
Fort's  Software,  79 
forward  differences,  660 
"Four  PROLOGS  for  the  Macintosh,"  270 
Fourier analysis, 351
four-function  calculator,  158 
FPU,  68881,  9 
Fractal  Magic-EGA,  402 
FTL  Modula-2,  78 
Fujitsu  America,  249 
full screen editor, in PC Scheme 2.0, 411
function  range  optimization,  599 
functional  description  of  DIFF,  595 

function,  comparison  and  swap,  486;  exclusive-OR,  437;  hashing, 
107;  hash,  quick  to  compute,  105;  poor  hash,  105 

Gabriel,  Richard,  243,  244 
Garvin,  Mark,  350 
Gates,  Bill,  505 
Genera  Release  7.0,  328 

General  Electric  Network  for  Information  Exchange,  993 

General  Motors,  900 

generated  voltage,  435 

generator,  screen,  630 

GENie  Forth  Forum,  993 

geometry,  for  contour  maps,  826 

Getting  to  Yes,  323 

Gimple,  Scott  E.,  398 

global  common  subexpression  elimination,  600 

global  variables,  nr,  230 

GMX  Micro-20,  16 

Goedel,  Kurt,  414 

Gold  Hill  Computers,  155 

Grammer Engine, 818

grammar, PROLOG, 277

Grandmaster,  156 

graphical  image,  178 

Graphics  Software  Systems,  908 

"Graphics  Toolbox  for  Turbo  C,  A,"  833,  840 

"Graphics  Toolbox  for  Turbo  C,  A — Part  2,"  937 


graphics,  APA,  836;  basic  library,  833,  pixel-oriented,  833; 

programming,  935;  sampler,  839 
Grappel,  Robert  D.,  188 
Great  Western  Software  Company,  The,  741 
Greenleaf Software, 251
Grigonis,  Richard  W.,  278 
Guide,  87, 421 
GX4330,  909 
GX4342,  909 

Hackers,  566 
half  interval,  663 

halting  problem,  7,  345,  414;  relation  to  other  problems,  414;  solved 
in  theory,  415;  solving,  345 
Hammond,  Nick,  740 
Ham,  Michael,  65,  323,  561,  582 
handler's  resident  C  code,  852 
Harbison,  Samuel  P.,  398 

hardware,  interrupts,  434;  standards,  350;  designing  for  MIDI,  353 
Harrington,  Jan  L.,  4,  22,  450 
Harris  Semiconductor,  993 
Harris,  Bennette  R.,  671 

hash,  algorithm,  142;  code,  343;  function,  poor,  105;  quick  to 
compute, 105; pointer table, 104; strategies, 141
"Hashing  for  High-Performance  Searching,"  104;  reference  to,  343 
hashing,  function,  107;  open,  104;  strategy  for  table  maintenance, 

140 

Hastings,  Cecil  H.,  259 
hDC,  740 

header,  request,  676 
heap representation, 485
heaps,  64;  uses,  482 
Heinlein,  Robert,  84 
Heitzo,  6 

Henderson,  Thom,  184 
Hester and Hirschberg, 107

heuristic search, new algorithms for, 898; methods, 899
Hewlett-Packard,  900 
hidden  1  bit,  73 

high  level  language,  vs.  assembly  language,  236-237 
highspeed  serial  data  capture,  442 
High-C, 510; installation, 510-11; performance, 511
high-end  workstations,  9 

high-level,  interrupt  handler,  853;  languages,  429,  496,  846,  851 

high-level  vs.  assembly  language,  559 

Hilbert,  David,  414 

Hillis,  W.  Daniel,  572 

hindsight,  in  programming,  562 

Hinton,  Geoffrey,  268 

Hirschberg  and  Hester,  107 

HiSoft,  499 

Hitachi,  741 

Holub,  Allen,  61,  139,  223,  261,  316,  342,  391,  398,  482,  550, 

627,  721,  806,  816,  893,  983;  reference  to  "Viewpoint,"  738 
Hot Shot, 421

"How  Many  Ways  Can  You  Draw  a  Circle?,"  659 
Howell,  Jim,  856 

Huffman  encoding  data-compression  algorithm,  482 
Hunt,  William  J.,  399 
Hyman, Michael I., 489
Hyperspace  Z-System,  999 

IBM PC from the Inside Out, The, 145
IBM-PC,  interfacing  to,  550 
IBM,  425,  583 

IBM-PC  8088  Macro  Assembler  Programming,  145 
icon, Xerox 1186, 568
identifier,  104 

"Identifying  the  Video  Adaptors  of  IBM  PCs,"  837 

image,  buffer,  deallocate,  842;  recognition,  262;  graphical,  178 

implementation,  AppleTalk,  751 

implementing, automated interrupt handling, 851; curses, 554
"In  Search  of  Sine,"  176,  177;  reference  to,  259 
Inboard  386/PC,  915 

incremental  garbage  collection,  in  LISP,  567 
incremental  rotation,  660 
indexing  an  array,  188 


Inference  Corp.,  155 

inference  engine,  903 

Information  Appliance  Inc.,  95 

information-transmission  bottlenecks,  173 

initialization,  module,  168,  404 

initializing,  curses,  551;  windows,  in  curses,  552 

InitOverlay,  how  it  works,  449 

inner  shell,  13,  14;  FLINT  system's,  13 

input,  nr,  318 

insert  and  delete  operations,  486 
inserting  scans,  in  XOR  chain,  440 
Inside  the  80286, 402 
Instant  C,  78 
Instant  Replay,  83 

Institute  for  Applied  Forth  Research,  993 
instruction  set,  6502,  97 

integer,  division,  67;  multiplication,  188;  root  square,  935;  square 
root  routine,  935 
"Integers  Don't  Float,"  935 
integers,  simple,  72 

Intel  80386  Programmer's  Reference,  634 
Intel  80386,  233,  438,  509 
Intel 80C51 microcomputer, 267
Intel  Corp,  252 
Intellicorp,  156 
Intelligence  Ware,  156 
Intelligence/Compiler,  156 
Intelligent  Data  Systems,  650 
Intelligent  Graphics  Corp.,  336 
Intel,  233,  745,  915;  manuals,  634 
interactive  language,  757 

interface,  standard  Amiga,  22;  standard  user,  Macintosh,  22 
interfacing  to  IBM  PC,  550 

INTERLISP-D,  568;  syntax  of,  568;  structure  editors  in,  568 
intermediate  optimizations,  598 
internal  storage  methods,  for  MIDI,  356 
interpreted  language,  72 

interpreter, 12; Amiga BASIC, 283; FLINT's, 13; Forth-like, 11;

Forth-like,  4 
interrupt  function,  854 

interrupt  handlers,  529,  851,  753;  low-level,  852;  high-level,  853 
interrupt,  instruction,  102;  processing,  529;  service  routine,  for 
MIDI,  354;  vector  swapping  routine,  854;  hardware,  434 
intersystem  file  transfers,  450 
intertask  communication  and  synchronization,  528 
intrinsic  functions,  510 
Introduction  to  the  80386,  233 
invalid  input  in  programming,  563 
invisible  pointers,  in  LISP,  567 
IOTools,  421 

iterative  approach,  in  programming,  562 

I/O,  bandwidth,  84;  intensive  applications,  925;  routines,  adapting, 
235 

James,  John,  68 

Japanese  computer  companies,  425 

Jasik  Designs,  817 

Jasik,  Steve,  422 

Jet  Reader,  80 

JForth,  817 

JMI  Software,  637 

Jones, Do-While, 366

Jourdain,  Robert,  145 

JRAM-RT,  79 

JUDGE,  154 

Kahn,  Phillipe,  74,  164-166,  652 
Karplus,  Kevin,  348 
Karplus-Strong,  348 

Kawai  K-3  synthesizer,  patch  editor  for,  362 

KEDIT, 344

KEE  system,  156 

KeepTrack,  400 

KEEworlds,  156 

Kemeny,  John,  167 

kernel, minimal, 12; FLINT, 12; multitasking, 524, 983; properties
of,  524;  of  CommonLoops,  329;  OS-9,  20;  user's  manual,  985 




Kernighan, Brian, 232, 398
Key  Software  Products,  250 
keyboard  input,  in  curses,  553 
Kildall,  Gary,  425 
King,  Richard  Allen,  145 
Klein,  Andy,  676 

KMP  algorithm,  766;  modification,  767 

knowledge  media,  149 

knowledge  representation  in  PROLOG,  518 

Knuth, Donald, 87, 911

Knuth-Morris-Pratt  algorithm,  766 

Kraft,  Larry,  74,  164 

Krantz,  Don,  592 

Kruse,  Robert  L.,  398 

Krutch,  John,  78 

Kurtz,  Thomas,  167 

Kyan Pascal, 251

Kyan  Software,  169,  251 

Laboratory  Microsystems,  323 
Lafore,  Robert,  398 
Lammers,  Susan,  561 
Laney Systems, 500
Language  Processors,  817 

language,  assembly  vs.  high-level,  559;  assembly,  20,  429;  high- 
level,  429;  interpreted,  72 
Lang-Allan, 81
LaserWriter,  Apple,  749 
Lattice  Browser,  640 
Lattice  make  (LMK),  397 

Lattice, 78, 419; version of Unix make utility, 395-6

Laxen/Perry  f83  Forth,  993 

leaf  nodes,  178 

Learning  Machines,  267 

LeBan,  Roy,  94 

LeBrun,  Marc,  347 

"Letters,"  89,  259,  343,  427,  507,  581,  657,  747,  825,  915 
Levitt,  David,  346 
Levy,  Steven,  566 
librarians,  247,  351 

libraries,  dynalink,  919;  in  ANSI  C,  586;  routines,  to  create  graphics, 
937;  utility,  184 
Lifeboat  Associates,  78 
lifetime analysis, 600
Ligett,  Steve,  749 
Lightspeed  Pascal,  80 
Lindley,  Craig  A.,  524 
line  drawings,  659 
linear array, 104
linker, Phar Lap, 511

LISP,  75,  242;  common,  243,  244,  327;  implementation,  peculiar 
aspect  of,  566;  incremental  garbage  collection,  567;  inexpensive 
version  for  PC,  features,  410;  invisible  pointers,  567;  machine, 
566;  programming,  492;  storage  scheme,  567;  system 
requirements, 566; Xerox 1185, 640
Lloyd  I/O,  1000 
LMK  (Lattice  make),  397 
load  time  linking,  920 
loader,  MS-DOS,  61 
locate  utility,  DOS,  930,  931,  932 
locational  codes,  179,  180 
Lockheed,  using  PROLOG,  76 

"Logic  and  Knowledge  Representation  in  PROLOG,"  518 

logical  structures,  92 

Logitech,  158 

loop  constructs,  649 

loop  induction,  601 

loop  invariant  code  motion,  600 

LOOPS,  640;  active  value,  642;  AI  applications,  644;  browsers, 

640;  compared  to  CommonLOOPS,  640;  control  structures, 

643;  design,  644;  library,  644;  environment,  640;  hierarchies  of 
classes,  640;  methods,  641;  multiple  inheritance,  643;  rules,  64; 
unrolling,  99;  virtual  copies,  644 
Lotus 1-2-3, 819

Lotus  Development  Corporation,  341,  742,  819 

Lotus,  745;  suit  against  Paperback  Software,  Mosaic  Software,  341 

low-level  interrupt  handler,  852 


low-pass filter, cutoff frequency, 393
LPI-BASIC, 817
LPI-COBOL, 817
LPI-FORTRAN,  817 
LPI-PASCAL,  817 
LPI-PL/I,  817 
LPI-RPG  II,  817 

M2505a  WORM,  249 
MAC  OS 

MacBus/RTI-800, 421 
MacC  Jr.,  335 
Mach2,  757,  758,  759 

"Macintosh  Buttons  and  Amiga  Gadgets,"  22 
Macintosh,  655,  749;  Apple,  729;  computers,  22;  creating  menus 
for,  24;  creating  windows  for,  25;  event  trapping,  25;  menus, 
23;  operating  system,  9;  products  available  for,  270;  SE,  757; 
standard  user  interface,  2;  system  routines,  23;  text  editing,  26, 
27;  text  I/O,  26;  standards,  23;  window,  22 
Macintosh/Amiga  interface  programming,  646 
MacMemory,  909 

MacNosy  V2  Documentation  V2.50,  422,  817 
MacPROLOG,  270;  debugger,  271;  simple  syntax,  273;  standard 
syntax, 272; user interfaces, 271
macro  language,  extensive  nr  support  of,  140 
macro  processing  in  C,  90 
MacroMind,  421 
macros,  630;  with  nr,  229 
Macworld  Expo,  253 
MagicDrive,  249 
mail,  private,  454 
Mainstay,  729 
malloc(),  556 
Mandelbrot  image,  402 
Mansfield  Software  Group,  344 
map,  color,  178 
map-mask/bit-mask,  846 
Mariella,  Ray,  935 
Marshal  Language  Systems,  78 
Marshal  Pascal,  78 
MASM,  v.  4.0,  185 
Masterscope,  569 
Math  Box,  The,  810 

mathematical  reasoning,  comment  on  letter,  748 

mathematics,  relation  to  programming,  261 

Mathews,  Max,  346 

Mattel,  337 

May,  William  D.,  826 

Maze  Wars+,  421 

MC68000,  514 

MC68881 chip, 73

MC68881 floating point math chip, 73
McCabe, F. G., 276
McMahon,  Steve,  447 
measuring  analog  voltage,  434 
mechanism,  environment,  230 
media,  knowledge,  149 
Mellish,  C.S.,  276 

memory  overlay  scheme,  cautions,  449 
memory,  170 

memory-resident utilities, 851
MemOvrly.Inc,  using,  448;  customizing,  449 
menu  bars,  building,  938 

menus,  Amiga,  23;  creating  for  Macintosh  and  Amiga,  24; 

Macintosh,  23 
merging queues, 431
messages,  987;  public,  454 
Metacomco,  419 
MetaWare,  510,  557 

method  combination,  in  CommonLoops,  330 

methods,  LOOPS,  641;  deductive-axiomatic,  77 

metronome program, 721

Micro Channel board, 651

Micro Display Systems, 741

Micro  Enhancer,  80 

micro  implementation  of  PROLOG,  166 
Microfield  Graphics,  252 




Micrografx,  420 

Microprocessor  cycle,  extra  in  6502,  100 
Microprocessor  Engineering,  999 

microprocessors, Motorola, 16; 6502, 8; 68000, 22; 8080, 8

Microsoft  C,  930;  review,  807 

Microsoft C Compiler, 61, 631

Microsoft  C  4.0,  637,  936 

Microsoft  C  Compiler  Version  4.0,  bug,  395,  486 

Microsoft  Corp.,  83 

Microsoft  LINK,  compatibility  with  386LINK,  558 
Microsoft  Macro  Assembler,  147;  V.4,  correction,  6 
Microsoft  MASM,  148 
Microsoft  OS/2,  505,  632 
Microsoft  QuickBASIC  compiler,  167 

Microsoft  Word,  search  and  replace  problem,  635;  Version  3.0,  139 

Microsoft,  283,  425,  557,  573,  583,  745,  915,  919 

MicroSolutions  Computer  Products,  818 

Microware  Systems  Corp.,  16,  817 

Microware,  4,  6 

micro-PROLOG,  276 

MIDI, 348, 346, 350; advanced features, 360; bar-and-note storage,
358;  capabilities,  352;  commands,  350;  composition,  352; 
designing  hardware  for,  353;  designing  sequencer  for,  353; 
display  methods,  361;  end-point-relative  storage,  357;  hardware 
specifications,  353;  internal  storage  methods,  357;  interrupt 
service routine, 354; OUT ports, 353; parameters, 352; pilot
track  for,  359;  products,  350;  quantization  for,  358;  record  and 
intercept  musical  events,  352;  resources,  365;  single-point 
storage  for,  357;  software,  362;  software,  testing,  362; 
specification, 350, 356; speed, 353; time stamps, 356, 357;
timing  notes,  355;  types  of  software  clocks,  356;  vendors,  365 
Mighty  Meg,  335 
Miller  Microcomputer  Services,  68 
Miller,  Ron,  851 
MINIBUG, 511, 557
"Mini Forth for the 68000, A," 11
minimal  kernel,  12 
MiniProbe, Atron, 631
miscellany,  989 

MIT  LISP  Machine  group,  328 
MMSForth,  68;  utilities  option,  68 
MMU, 68851, 9
Model  730,  80 
Modgraph, 741

modification,  of  device  driver,  442 
modularity,  16 

Modula-2,  65,  91,  429,  496,  524,  813;  Anderson's,  162;  compiler, 
169;  open  loops,  160;  reference  to  Mike  Suman's  viewpoint, 
427,  428;  strings,  158;  translating  to,  158;  translating  to,  160; 
vs.  True  BASIC,  406,  407 

module,  device  descriptor,  18;  file  manager,  18;  in  BetterBASIC, 
248;  in  True  BASIC,  167;  initialization,  168;  initialization, 
404; range optimization, 599; ROMed, 17
module-scope  declarations,  ANSI  C,  585 
Moore,  Charles,  561 
More  on  NC4000,  Volume  5,  993 
Morse,  Stephen  P.,  402 
Mosaic Software, copyright infringement, 341
MotherCard  5.0,  79 
Motorola  68000  Microprocessor,  515 
Motorola  MC68000,  MC68010,  81 
Motorola,  8,  10;  microprocessors,  16 
mouse,  139;  and  pointer,  94 
moving  cursor,  in  curses,  553 
MPE/Nautilus  Cross  Compiler,  999 
ms  macro  package,  316 
MSA  Group,  741 

MS-BASIC,  573;  converting  to  C,  limitations,  639;  source  code, 
converting  to  C,  637 

MS-DOS, 2.11, upgrade, 6; 3.0, problem, 491; 3.10, 6; 3.30, 632;
CD  ROM  extension,  83;  environment,  550;  implementation  of 
Turbo  Pascal,  534;  loader,  61;  locator,  930;  stack  command, 
145;  user-friendly  shell,  400 
MS-DOS  Handbook,  The,  145 
MS-DOS  Papers,  The,  234 
MTF,  107 

multigigabyte  address  spaces,  509 
multimethods,  243;  of  CommonLoops,  330 


multiple  inheritance,  LOOPS,  643;  ObjectLISP,  328 
multiple  statement  macros,  724 
multiplication,  188;  integer,  188 
multipliers,  negative,  190;  positive,  190 

multitasking  kernel,  983;  features  of,  530;  routines,  525;  properties 
of,  524 

multitasking,  524;  examples,  990;  what  it  is,  983;  approaches,  984; 

starting  and  stopping,  985 
Multitasking with Turbo Pascal, 524
multiwindow  environment,  734 
Murray, William H., III, 233

music,  algorithms,  346;  programming,  342;  software,  346;  and 
scientific  programming,  subroutines  for  both,  391 
Musicware,  335 

Musselman,  John,  434;  reference  to  article,  747 
Mustang-020,  9 

MYCIN,  278,  279,  280;  expanding,  281 
"MYCIN-Like  Expert  Systems,"  278 
Myers,  Glenford  J.,  71 
Myers,  Terry,  745 

NAG  Fortran  Library,  499 

Name  Binding  Protocol  (NBP),  750 

names,  command,  224 

nanotechnology,  4 

Naro,  Rick,  930 

National  Forth  Convention,  809 

National  Instruments,  421 

negation logic, for PROLOG, 519

negation-by-failure,  for  PROLOG,  519 

negative  literals,  519;  multipliers,  190 

Nelson,  Russell,  184 

network,  artificial  neural,  264;  layering,  749;  neural,  174 
neural  networks,  174;  artificial,  262 
"Neural  Research  Yields  Computer  That  Can  Learn,"  268 
Neuron  Data,  156 

neurophysiology,  262;  neural  network,  268 
new  account,  in  Unix  BBS,  455 

new  BASIC  subroutines,  416;  fundamental  data  types,  573 

new  data  types,  deriving  from  numeric  types,  367 

New  Flavors,  328;  development  tools,  329;  generic  functions,  328 

new  void,  ANSI  C,  585 

Newton's  method,  935 

Nexpert  Object  system,  156 

NeXT  and  Adobe  Systems  Inc.,  910 

next  generation  machines,  257 

node,  180,  672;  adding,  181;  bottom,  180;  child,  178,  180;  inserting 
in  tree,  485;  leaf,  178;  new,  181;  parent,  178;  priority,  430; 
root,  180;  types,  181 
NOMAD2,  854 

nondeterministic  programming  (NDP),  589 

non-MDA  adaptors,  937 

Norton,  Peter,  490 

Nostradamus,  83 

Novation,  574 

Novix  NC4016  Forth  processor,  993 

nr,  139,  140,  316;  as  compiler-like  text  formatter,  223;  as 

implementation  of  the  Unix  nroff  text  formatter,  223;  automatic 
hyphenation,  317;  command  structure,  224;  commands,  225; 
control  flow,  316;  dot  commands,  224;  escape  sequences,  320; 
input and output, 318; output line numbering, 318; support
routines,  140;  tabs  and  leaders,  316;  text  formatter,  223;  three- 
part  titles,  317 
nroff,  Unix,  223 
NSI  Logic,  252 
number registers, 230, 231

numbers,  144;  double-precision,  66;  fixed-point,  72;  floating-point, 
72;  single-precision,  66 
number-input  word,  323 
numeric,  arguments,  225;  formats,  72 
numeric  types,  deriving  new  data  types  from,  367 
Numerical  Algorithms  Group  (NAG),  499 
Numerical  Recipes  in  C,  738 

Numerical  Recipes:  The  Art  of  Scientific  Computing,  418;  reference 
to,  738 
NVRD,  79 

088,  250 




ObjectLISP,  327;  for  object-oriented  extension,  327;  multiple 
inheritance, 328; primitive functions, 327; shadowed
functions,  327 

object-oriented,  extension,  ObjectLISP,  327;  language,  242; 

paradigm,  149,  900;  classes  in,  902;  programming  in  AI,  241; 
programming,  for  AI,  900;  systems,  241;  systems,  advantages, 
241;  systems,  problem,  244;  tools,  900 
object-oriented  LISPs,  327;  future  directions,  330;  problem,  243 
"Of  Interest,"  78,  249,  335,  419,  499,  574,  650,  740,  817,  908,  999 
Offete  Enterprises  Inc.,  993 
Office  Automation  Toolkit,  156 
Oh!  Pascal ,  398 
Okahata,  Darryl,  93 
Okidata,  740 

OMEGA  MC68020-based  workstation,  1000 

Omnitronix,  421 

onion-skin  principle,  750 

OPAL,  169 

opaque  data  types,  813 

opaque types, Turbo BASIC, QuickBASIC, True BASIC, 814
Opcode  Systems,  335 
open  loops,  Modula-2,  160 

operating  system,  22;  Macintosh,  9;  OS-9,  16;  OS-9,  4 
Operating System Design, the Xinu Approach, 399
Operating  Systems:  Design  and  Implementation,  559 
operation  of  DIFF,  594 
operations,  insert  and  delete,  486 

operators,  66,  144;  arithmetic,  66,  67;  cast,  232;  double-precision,  68 

optimization,  of  Bickel’s  algorithm,  857;  of  contour  analysis 

algorithm,  832;  techniques,  515;  in  Forth,  758;  for  speed,  758; 
modern techniques, 599; program range, module range, function
range,  block  range,  599;  Richard  Relph's  definition  of,  598; 
intermediate,  598 

"Optimizing  Compilers  for  C,"  598 

"Optimizing  Integer  Multiplications  by  Constant  Multipliers,"  188 
Optimum-C,  499 

Optimum-C,  Version  3.0,  review,  806 
OR  clauses,  280 
Orchid,  651 

ordered  arrays,  899;  searching  through,  899 
orthogonal  lines,  drawing,  836 

OS-9,  6,  9,  16,  18,  19;  kernel,  20;  operating  system,  4,  16;  shell,  20 
"OS-9,  Operating  System,  The,"  16;  reference  to,  343 
OS-9/68020  C  compiler,  817 
OS/2  Graphics  Development  Toolkit,  908 
OS/2,  505,  915,  919,  1000;  mechanisms  included  in,  919;  dynalink 
descriptors,  922,  errors  and  exit  lists,  922,  device  monitors  a;  for 
80286,  80386,  42;  run-time  linking,  923,  coding  dynalink,  923; 
shared data segments, 921; I/O privilege and devices, 921
OS/286,  651 
OS/386,  651 

output  line  numbering,  nr,  318 
output,  of  DIFF,  593;  nr,  318 

overlay facility, in Turbo Pascal, 447; group, setting up, 448; how
they work, 447; scheme, in Turbo Pascal, 447; system, in
Turbo Pascal, 447
OWL  International,  87,  421 
Oxxi,  817 

paint  programs,  845 
Palantir  Software,  740 

Paperback  Software,  156;  copyright  infringement,  341 
Pappas,  Chris  H.,  233 

paradigm,  object-oriented,  149,  900;  classes  in,  902 
Parallel  Distributed  Processing,  268 
parent  node,  178 
Parrot  1200,  574 

Pascal,  89,  104,  158,  509,  649;  compiler,  557;  data  structures,  104; 
functions,  898;  opaque  types,  815;  UCSD,  249;  version  2.0, 

169 

passing  arrays,  333 
PASTE,  235 

patch editor, 351; for Kawai K-3 synthesizer, 362; programs, compared
to patch librarians, 351

"Pattern  Matching  Using  Finite  State  Machines,"  765 
pattern  matching,  problem,  765 
pause,  in  multitasking  kernel,  528 


Pax,  82 

PC  Scheme,  156;  2.0,  full  screen  editor,  411;  2.0  engine,  411;  2.0, 
410,  413;  2.0,  compared  with  CommonLISP,  410;  2.0,  control 
structures in, 411; review of, 492
PC  Tech  Report,  233 

PCBOARD Forth-only bulletin-boards, 811
PCKARC/PKXARC,  186 
PC’s  Limited,  650 

PC-MOS/386,  650;  reference  to,  738 
PC-MOS/386,  Version  1.02,  999 
PC/Assembler,  79 
PDP-11/44, 442
Peacenet,  175 

Pecan  Software  System,  249 

peer  communication,  750 

Pegasus, 741

perform  interrupt,  102 

Performance  Analysis  Tool  Box,  83 

Performance  and  Evaluation  of  LISP  Systems,  244 

Personal  Computer  Support  Group,  80 

Personal  Consultant  Plus,  156 

Personal Designer, 741

Peter  Norton's  Assembly  Language  Book  for  the  IBM  PC,  490 
Phar  Lap,  80386  assembly  language  development  package,  557; 
assembler,  features  of,  558;  linker,  511,  557;  tools 
compatibility,  557;  performance,  558 
Phoenix  Technologies,  575 
piano-scroll  editors,  362 
Pierson,  Dan  L.,  270 
pilot  track  for  MIDI,  358 
Pioneering  Controls  Technologies,  250 
Pippenger, Nicholas, 911
pixel  density,  Xerox,  567 
pixel-based  techniques,  663 
pixel-oriented  graphic,  833 
PKARC,  186 
PKXARC,  186 
Plauger,  P.J.,  398 
PLS  PROLOG,  7 
plug-in  board,  335 
PL/PC,  169 
PML  Systems,  335 
PML86  board,  335 
pointer  addition,  subtraction,  514 
Polymake,  397 

polynomial,  approximation,  660;  rational,  661 

polyrhythms,  721;  problem,  722 

PolyShell,  420 

Polytron  Corp.,  420 

Polytron  Software,  397 

pop-up  windows,  938 

portability  of  contour  analysis  algorithm,  832 

Portable  C  and  Unix  Programming,  740 

portable  masks,  556 

Porter,  Kent,  833,  937 

positive  multipliers,  190 

PostScript,  910 

Pournelle,  Alex,  94

Power  System  Professional  Pak,  249 

PQ.C  routines,  482 

Practical  Modem  1200  SA,  83 

Practical  Peripherals,  83 

pragmas,  510 

prefixes,  67 

"Preparing  for  ANSI  C,"  583 

preprocessor,  in  ANSI  C,  584 

Press,  William  H.,  418 

PRIDE,  expert  system,  644;  system,  152 

primary  data  structure,  180 

Prime  Factor  FFT,  81

primitive  functions,  ObjectLISP,  327 

principle,  onion-skin,  750 

priority  node,  430 

priority  queue,  430,  482,  485;  representation  by  binary  tree,  484; 
scheme  in  real-time  system,  430;  routines,  487;  simple 
implementation,  430 
private  mail,  454 
problem,  halting,  7 




procedural  attachments,  903 
procedures,  recognizer,  279 
ProCED,  400 

processor,  68030,  10;  6809,  16 

Procrustes,  518

Professional  Image  Board,  335 

Professional  Programmer  Dept.,  reference  to,  258 

program  libraries,  in  Unix  BBS;  commands,  453 

program  range  optimization,  599 

programmers,  C,  88 

Programmers  at  Work,  561

Programmers  Journal,  233 

Programmer's  Introduction  to  C,  A,  82 

Programmer's  Problem  Solver  for  the  IBM  PC,  XT,  and  AT,  145 
programming,  8086,  634;  and  computer  science,  differences,  261;
consistency,  563;  declarative  vs.  procedural,  75;  declarative,  74;
designing,  564;  documentation,  562;  flow  of  activity;  hindsight,
562;  in  AI,  object-oriented,  241;  iterative  approach,  562;  relation
to  mathematics,  261;  safe  exits,  564;  sloppy,  5;  with  people,  323

Programming  in  Prolog,  276 
Programming  the  Intel  80386,  634 

programs,  68000  assembly-language,  22;  calendar,  dateh,  893;
capabilities,  592;  commands  and  options,  593;  design  and  code,
5;  file  comparison,  592;  real-time,  430;  root-finding,  637;
speed  gun,  368;  that  process  symbolic  information,  104;
translation  of,  158;  Unix  COMPACT,  482
Prolog  Programming  for  Artificial  Intelligence,  276 
PROLOG  AAIS,  7,  74,  75,  77,  164,  165,  270,  337;  AAIS  debugger,
271;  AAIS  user  interfaces,  271;  as  side-effect  free,  276;
benchmarks,  277;  Borland,  7

PROLOG,  control  structures,  276;  DEC-20-type,  7;  dialects  of,  271;
Edinburgh,  272;  history,  272;  for  Macintosh,  169;  grammar,
277;  in  Japan,  76;  knowledge  representation  in,  518;  micro
implementation  of,  166;  negation  logic,  519;  negation-by-failure,
519;  PLS,  7;  programming  system,  499;  syntax  of,
518;  text  and  manuals  for,  77;  Turbo,  166;  tying  to  conventional
databases,  76;  used  by  Lockheed,  76
PROLOG-based  expert  systems,  developed  for,  76 
Prolog-II,  as  a  different  language,  274;  other  features,  274;
terminology,  274
PROLOG/m,  270 
propagation,  constant,  copy,  599 
protocol,  749;  of  AppleTalk,  750 
prototypes  in  ANSI  C,  586 
Pro-C,  740 
pruning,  590 
pseudocode,  757 
PS/2,  740 

public  message,  454 
pull-down  windows,  938 
"Pushing  the  Sound  Envelope,"  346 
"Putting  ROM  Code  in  Its  Place,"  930 
P-tral,  499 

QDOS,  9 

Quadram  Corp,  651,  335 

quadtrees,  178,  179;  in  pointer  form,  182;  pruning,  181 
Quadtree,  81

quantificational  level,  representation  at,  520 
quantificational  statements,  standard,  523
quantization,  for  MIDI,  358 
Quantum  QL,  9 

Quarterdeck  Systems,  557,  745,  651

queues,  merging,  431;  priority,  482,  485;  priority,  first-in/first-out,
430;  priority,  simple  implementation,  430 
Quick  C,  review,  807 
quick  resident  utility,  851 
QuickBASIC  2.0,  497

QuickBASIC,  238,  247,  573,  649;  compiler,  Microsoft,  167;
implementation,  813;  opaque  types,  814;  subroutine  parameters,
333;  subroutines  implemented  in,  333;  subroutines,  334

Rabbit  Industries,  249 
Rabbit,  740 

Radio  Shack,  BASIC  program,  259 
RAM  cache,  10 


"RAM-Cache  Manager  in  C,  A,"  925 

random-access  disk  request,  925 

RapidFile,  812 

Raskin,  Jef,  96 

Raster  Technologies,  909 

rational  polynomials,  661 

Rational  Systems,  78 

Rational  Visions,  499 

RBS,  rules  of,  278 

readability  of  BASIC  programs,  333 

reader  survey,  annual,  823 

reading  back  from  screen  in  curses,  553 

Real  BASIC,  499 

Real-Time  Computer  Science  Corp.,  419 

real-time  programs,  430 

Real-Tools,  250

receive  algorithm,  750 

recognition,  image,  262 

recognizer,  procedures,  279;  syntax-directed,  279 

recording,  computerized,  363 

Rector  and  Alexy,  634 

Rector,  Russell,  825 

recursive  definitions,  Forth  83  Standard,  761;  theory  of,  77 

register  allocation,  600 

registers,  6502,  97;  number,  230,  231

reheaping  process,  484 

relations,  client,  323 

Relph,  Richard,  509,  583,  598;  review,  comment  on,  748

representation,  extended  (EXT),  73;  floating  point,  72 

request  header,  676 

requestors,  Amiga,  23 

Rhoads  Software,  421 

rhythmic  patterns,  721

"Right  to  Assemble,  The,"  72

Ritchie,  Dennis,  398 

RIX  Softworks,  846 

RLL  386  Relocation,  Linkage,  and  Library  Tools  package,  251-2 

robotic  control  signals,  337 

robot  vision  system,  262 

Rochester  Forth  Convention,  1987,  810 

Rocket  286,  335 

Rodime,  740 

Rollins,  Dan,  145 

ROM,  22 

ROM  BIOS,  interrupt  10h,  833;  video  services,  functions,  833;  calls
from  Turbo  C,  833 
ROM  code,  930 
ROMed  modules,  17 
ROMulator,  818 
root  node,  180
root-finding  program,  637 

routine,  binary-to-integer  conversion,  142;  Boyer-Moore,  146;  control,
181,  182;  externally  accessible,  180 
RS-232,  420;  modem  protocol,  350 
RTC  Plus,  500 
RTX286,  419 
rule  statements,  518
rules,  LOOPS,  643;  of  RBS,  278 
rule-based  systems,  903 
RUN386,  511,  557;  performance,  512

"Running  Light,"  88,  174,  258,  342,  426,  506,  580,  656,  824,  915 
run-length  decoding,  71
Russell's  paradox,  414 

safe  exits,  in  programming,  564 
safeguards,  71 
Sage  II,  12 

Sams  &  Co.,  Howard  W.,  78 
Sapiens  Software  Corp.,  421 
Sapiens  V8, 421 
Sargent,  Murray,  III,  145 
scanning  list,  in  XOR  chain,  438 
Scholastech  Telecommunications,  450 

scientific  programming,  342;  and  music,  subroutines  for  both,  391 
SCOOPS,  410,  411,  412;  programming  in,  493
screen  generator,  630 
screen  weirdness,  245-246 




SCRIPT,  400 
scrolling  in  curses,  553 
search  routine,  symbol,  106 
Sedgewick,  Robert,  398 
SEdit  editor,  568 
semaphores,  528 

sequencers,  352;  designing  for  MIDI,  353 
sequences,  escape,  224,  226 
sequential  circuits,  350 
sequential  closure,  901 
serial  data,  442 

serial  ports,  442;  functions  of,  442
SETL,  590 
sets,  159 

shadowed  functions,  ObjectLISP,  327 

Shammas,  Namir  Clement,  158,  238,  404,  637,  729,  813,  898 

Shapiro,  Ehud,  276 

shell  scripts  in  Unix  BBS,  452 

shell,  18;  for  Amiga,  419;  inner,  13,  14;  OS-9,  20;  188 

shift-add,  189 

shift-subtract,  189 

Shoemaker,  Richard  L.,  145 

SideKick,  146 

Sieve  benchmark,  637 

Sigma  Designs,  908 

SigmaVGA,  908 

signature,  standard  time,  721 

SIG,  DDJ,  on  CompuServe,  426 

Silicon  Beach  Software,  910 

SILOAM  (simple  image  learning  on  adaptive  machinery),  262 
SILOAM,  limitations,  267;  versions,  267 
Simon,  Paul,  71 

simple  image  learning  on  adaptive  machinery  (SILOAM),  262 
simple  integers,  72 
Sinclair,  9 

Sine,  reference  to,  570 

sine-wave  components,  351

single  character,  buffer,  443;  in  curses,  553 

single-chip  EGA,  908 

single-point  storage,  for  MIDI,  357 

single-precision  numbers,  66 

Sintar  Software,  402 

sixers,  4 

sixth  generation  computers,  253 
SK  DOS,  420 
Skolem  functions,  521 
Skolem,  Thoralf,  521 
SLD-resolution,  518 
Smalltalk,  242,  329 

Smalltalk/V,  734,  735;  dialect  of,  734;  examples,  736;  inspectors, 
735;  graphics,  735;  subroutines,  736;  release  1.2,  499;
windows,  735 
Smart  Key,  740 
SMART,  252 
Smith  and  Johnson,  634 

SMPTE  (Society  for  Motion  Picture  and  Television  Engineers)  code, 
363 

SMPTE-MIDI  conversion,  364 
Socha,  John,  490 

Society  for  Motion  Picture  and  Television  Engineers  (SMPTE)  code, 
363 

Softguard,  511,  521

Software  Bottling  Company,  The,  249 

Software  Factory,  The,  169 

Software  Link,  738 

Software  Link,  The,  557,  650,  999 

Software  Masters,  574 

software,  music,  346;  standards,  350 


Solution  Systems,  818 
Solutions  International,  818 
Sophisticated  Software,  79 
sorted  array,  485 
sorting  algorithm,  482 
SOTA  Technology,  650 
Sota  technology,  79 
source  code,  compiling,  509 


spaces,  144;  data,  64;  stack,  64 

speed,  gun  program,  368;  in  text  editors,  93 

Sperry,  Tyler,  505,  506,  580,  655,  656,  745,  746,  824,  915 

Spinoza,  77;  PROLOG,  77 

SPY  window,  in  Xerox  environment,  569 

SQRT,  663 

Squish,  249 

SSP/PC  library,  78 

stack,  64,  98,  230,  627;  command,  MS-DOS,  145;  Forth,  70;  frame, 
525,  627;  managing,  590;  manipulations,  requirements,  854; 
segment,  64;  space,  64 
standard  deviation,  391
standard  quantificational  statements,  523 
standard  time  signature,  721;  information  given,  721 
standard  user  interface,  Macintosh,  22 
Standard,  83,  79,  324 
"Standard  #include  Files,"  486 
standard,  Amiga  interface,  22 

Stanford's  Center  for  Computer  Research  in  Music  and  Acoustics
(CCRMA),  346
Starplan  II  architecture,  151 
Starting  Forth,  11
Star-K  Software  Systems,  420 

state  machines,  765;  attributes,  765;  basic  operation;  extensions,
769;  implementation,  767;  portability,  768;  use  and
effectiveness,  765

"State  of  BASIC,  The,"  167,  247,  333,  416,  497,  573,  649;  reference
to,  570 

statements,  fact,  rule,  518 

statistics,  989 

Steele,  Guy  L.  Jr.,  398 

Stefik,  Mark,  149 

Steppe,  Tom,  666 

stepping  scan,  in  XOR  chain,  439 

Sterling  Castle  Software,  740,  909 

Sterling,  Leon,  276 

storage  address,  437 

Storage  Dimensions,  249 

storage  scheme,  in  LISP,  567 

storage  technique,  671

Strauss,  Ed,  402 

Stride,  12 

strings,  implementing  in  Forth,  996;  manipulation,  158;  Modula-2, 
158;  with  nr,  229 
Strong,  Alex,  347 
Stroustrup,  Bjarne,  572
StruBAS  2.0,  500 

structure  editors,  in  INTERLISP-D,  568 
structured  loop  control,  FLINT,  14 

"Structured  Programming,"  65,  158,  238,  323,  404,  561,  637,  72, 
813,  898 

structures,  logical,  92 
STSC,  419 
subareas,  178 

subroutine,  279,  284,  333;  for  music  and  scientific  programming, 
391;  implemented  in  QuickBASIC,  True  BASIC,  333; 
parameters,  in  QuickBASIC,  333;  QuickBASIC,  TrueBASIC, 
334;  threading,  advantages  of,  759;  vs.  pointer-threading,  in 
Forth,  757;  with  a  variable  number  of  arguments,  627 
subtraction,  pointer,  514 
subVol,  83 
suffixes,  67 
Suman,  Mike,  91 

Suman,  Mike,  reference  to  February  "Viewpoint,"  647;  reference  to 
Modula-2  viewpoint,  427,  428 
Summit  Software,  416 
Sun  systems,  9 
SunDog  Software  Corp.,  249 
Super  3D,  910 
SuperGlue,  818 
support  chips,  9 
support  routines,  nr,  140 
Surf3-D/Surf  87,  420

"Swaine’s  Flames,"  84,  170,  253,  422,  501,  575,  652,  742,  819, 
991,  1000 

Swaine,  Michael,  3,  74,  84,  87,  164-166,  170,  173,  253,  257,  341, 
422,  425,  501,  505,  575,  579,  652,  742,  819,  823,  911,  1000




swapping  routine,  interrupt  vector,  854 
SwyftCard,  96 
symbol  search  routine,  106 
symbol  tables,  104 

symbolic  information,  programs  that  process,  104 
Symbolics  3600  series,  328 
Symbolics  Flavors,  328 
Symbolics'  Flavors  system,  242 
Syncra  PC,  419 
syntax  of  PROLOG,  518
syntax-directed  recognizers,  279 
synthesis,  additive,  351;  FM,  351;  phase-distortion,  351 
synthesizer,  350;  choosing,  362;  linking  to  computer  equipment,  350;
programming,  as  an  art,  351

Syscall,  20 

System  Enhancement  Associates  (SEA),  184 
system,  functions,  for  MS-DOS  3.30,  633;  monitoring,  in  Unix 
BBS,  455 

system  routines,  Amiga,  23;  Macintosh,  23 
Systems  Guild,  250 
Systems  Software  Tools,  399 

systems,  6502,  97;  object-oriented,  241;  operating,  22;  poorly 
designed,  565 
SystemV,  650 

T8,  252 

table  maintenance,  hashing  strategy  for,  140 

tables,  concatenating,  235;  hash  pointer,  104;  symbol,  104 

Tall  Tree  Systems,  79 

Tandy  Color  Computer,  6809-based,  16 

Tanenbaum,  Andrew  S.,  559 

task,  control  block,  524;  creating  and  deleting,  986 

TDebug  benchmark,  185

TECH  Help!  The  Electronic  Manual,  574

technical  issues  of  handling  COM  port,  443 

Tecmar,  651

TEFT,  420 

telecommunications,  Unix,  450 
Telenix,  650 
TeleStar,  650 

teletype  machines,  825,  905 
TeleVideo,  650 

Tello,  Ernest  R.,  88,  149,  241,  327,  410,  492,  566,  640,  734,  900 
temporary  latches,  846 
Tenenbaum,  Aaron  M.,  398 

Terminate  and  Stay  Resident  utilities,  146,  147,  234,  632
test  suite,  510 
Teukolsky,  Saul  A.,  418 
Texas  Instruments,  156,  410

text  editing,  functions,  Macintosh,  27;  Macintosh  and  Amiga,  26;
standards,  Macintosh  and  Amiga,  23
"Text  Editors:  In  Matters  of  Taste,"  92 
text  editors,  92;  features,  93;  modeless,  93;  new,  95;
non-WYSIWYG,  95;  speed,  93;  transparent,  93
text  formatter,  316;  nr,  223 
text  I/O,  Amiga,  26;  Macintosh,  26 
text  screens,  for  IBM-standard  adaptors,  937 
theorem-proving  algorithms,  164 
theory  of  definition,  77 
theory  of  recursive  definition,  77 
THEOS,  650 
THINK  Technologies,  80 
Thomas,  Levi,  92,  426 
three-part  titles,  nr,  317 
threshold  logic  unit  (TLU),  263 
throughput,  925 
timbre  software,  gesture,  347 
time,  signature,  721;  stamps,  for  MIDI,  356 
time-stamping,  357 
timing  notes,  for  MIDI,  355 
Ting,  C.H.,  993 

TLU  (threshold  logic  unit),  263;  computing  output,  263;  pattern 
dichotomizer,  264 
token  interactive,  14 
token,  12 

Tondo,  Clovis  L.,  398 
Totem,  421 


Tracy,  Martin,  761,  809,  993

transients,  347 

translation  of  programs,  158 

Translator,  158 

transmit  algorithm,  750 

trap  mechanism,  68000,  23 

traps,  with  nr,  228 

traversal  routine,  binary-tree,  62 

trigonometry,  659 

TRON,  425 

True  BASIC,  167,  238,  247,  334,  404,  497,  649;  module,  examples, 
404;  modules  in,  167,  404;  opaque  types,  814;  subroutines
implemented  in,  333;  subroutines,  334;  support,  498;  variables 
and  functions,  573;  Version  2.0,  167;  vs.  Modula-2,  406 
Tseng  Labs,  741

TSR  extension,  in  MS-DOS  3.30,  632 
Tsu,  Lao,  518 

Turbo  BASIC,  169,  497,  573,  649;  opaque  types,  814;  similarity  to 
QuickBASIC,  417 

Turbo  C,  350,  930;  by  Borland,  833;  function,  834;  review,  807 
Turbo  Pascal,  840;  limitation,  447;  nonstandard  procedures,  840;
managing  memory,  842;  overlay  scheme,  447,  advantages  of,
447,  disadvantages  of,  447;  overlay  system,  447;  procedures,
525;  programs,  translating,  158,  160;  saving  screen  regions,
840;  testing  CRT  mode,  841;  restoring  reg;  storing  regions  on
disk,  843

Turbo  Pascal-Turbo  BASIC-QuickBASIC,  84 

Turbo  Power  Software,  185 

Turbo  PROLOG,  78,  166 

Turbo  SE,  909 

Turbo  Techniques,  741 

TurboCAD,  741 

TurboMAGIC,  79 

TurboPower  Software,  500 

Turing  Machine,  345 

Turing,  Alan,  7,  345,  911

Turner,  Nick,  4,  8,  72,  88,  92,  174,  258 

Twin,  The,  341 

"Two-Bit  Analog-to-Digital  Conversion,"  434 
type  checking  in  C,  90 
typed  characters,  curses,  551
type-ahead  serial  data  capture,  442 
T-DebugPLUS,  500 

UART  status  registers,  444
UCSD  Pascal,  249 

Understanding  Computers  and  Cognition,  164
Uniform  PC,  Version  2,  818 
UniPress  Software,  252,  999 
Unisys,  650 

United  States  Software  Corp,  500 
University  of  Essex,  UK,  153 

Unix,  9,  16,  425;  and  BBS,  45;  and  bulletin  boards,  450;  and 
"restricted  shell,"  45;  BBS,  new  account,  455;  COMPACT 
program,  48;  compatibility,  48;  environment,  55;  nroff,  22; 
organizational  information,  455;  system  monitoring,  45; 
telecommunications  environment,  45;  text  formatter,  13;  time 
functions,  893,  classes,  89;  time  functions,  893 
"Unix  BBS  Using  Shell  Scripts,  A,"  450 
Unix  Training  Center,  999 
Unix  V,  630 
unknown  voltage,  435 
unsigned  binary  words,  437 
unskewing  approximation,  661 
Ury,  William,  323 
UR/FORTH,  323 

user  commands,  augmented  in  MS-DOS  3.30,  632 

user  interface  standard,  similarity  between  Macintosh  and  Amiga,  22 

user  interfaces,  561;  AAIS  prolog,  271;  Async  Appletalk,  752;
ExperProlog-II,  271;  MacPROLOG,  271;  PROLOG/m,  271
user-created  libraries,  with  Turbo  C,  834 
user-defined  functions,  497 
user-friendly  MS-DOS  shells,  400 
"Using  EGA  Graphics  Screens  in  Your  Programs,"  845 
Using  PC-DOS,  145 
Utilities  option  for  MMSForth,  68 

utility,  file  comparison,  669;  find/replace,  638;  library,  184 




U.S.  Geological  Survey,  827 


variables,  absolute,  159;  arguments,  627;  global,  230 
VAX,  9 

Vector  Automation,  741 
Veloz,  82 

vendors,  for  MIDI,  365 
Venix  System  V  2.3,  999 
VenturCom,  999 
Verbatim,  740 

Version  2.0  of  True  BASIC,  167 
Vetterling,  William  T.,  418 
video  adapter,  835 
Video  Seven,  908 
"Viewpoint,"  91,  261,  345,  429 
viPLUS,  252 

virtual  memory  system,  Xerox,  567 

Virtual  Terminal  Adapter,  651 

Visible  Computer  (TVC):  8088,  The,  574 

vision  system,  robot,  262 

vision,  computer,  262 

Vista  1600,  741 

Visual  Interactive  Programming  (V.I.P.),  729 
VITA,  9 
VME  bus,  9 

VME  International  Trade  Association  (VITA),  9 
VM/RUN,  511,  512;  performance,  512
voltage,  generated,  435;  unknown,  435 
Vorkoetter  Software,  S.M.,  420 
VP-Expert,  156 
VP-Planner,  341

V.I.P.,  arrays,  729;  clustered  binary  trees,  731;  decision-making 
constructs,  729;  editor,  debugger,  730;  supports,  729;
user-defined  routines,  7

Waite  Group,  233 

wait,  in  multitasking  kernel,  527 

Wallace,  Bob,  93,  96 

Ward,  Robert,  489 

Watkins,  Don,  95 

waveforms,  351

WEB  documentation  system,  87 
Weiner,  Alan,  740 
Wendin,  999 

Wendin-DOS  Application  Developer's  Kit,  999 
West  Coast  Computer  Faire,  170 
West  Coast  Forth  Board,  811
Weston,  Dan,  419 
"What's  the  DIFF,"  592 


"What's  Wrong  with  High-Level  Languages,"  reference  to,  429 

Whitewater  Group,  The,  78 

White,  Ronald  G.,  178 

Widrow,  Dr.  Bernard,  507 

Windows  Draw,  420 

windows,  creating  for  Macintosh  and  Amiga,  25;  Macintosh  and 
Amiga,  22;  pop-up,  pull-down,  938;  Xerox,  568 
Winograd,  Terry,  164 
Woodchuck  Industries,  499 
word  distribution  pattern,  106 
word  processor,  compiles,  139 

WordStar,  imprinted  programmers,  429;  imprinting,  494 
Workman  &  Associates,  78 
workstations,  high-end,  9 
writing  macros,  724 

"Writing  MS-DOS  Device  Drivers  in  C,"  676 
writing  strings,  in  curses,  553 
Wyse,  650 

WYSIWYG  (what  you  see  is  what  you  get),  94 

x  range,  663 
Xenix  System  V,  1000 
Xerox  1185  Lisp,  640

Xerox  1186,  566;  pixel  density,  567;  CPU,  567;  virtual  memory
system,  567;  window,  icon,  568 
Xerox  environment  SPY  window,  569 
Xerox  PARC,  149,  152,  329 
Xerox  Reprographics  Business  Group,  152 
Xerox,  329

"XOR  Chain  Revisited,  The,"  671

"XOR  Chain,  The,"  437;  reference  to,  657-658,  671 

XOR,  437 

XOR,  binary  trees,  672 
XOR,  tricks,  671
X-Tree,  400 

Yamaha,  347 
yielding,  988 

yield,  in  multitasking  kernel,  526 
Z80,  8,  260 

Zimniewicz,  Thomas  A.,  442
Zipf,  George,  106 
Zoom  Telephonics,  82
Zoom/Modem  PC  1200,  82 
Zoom/Modem  PC  2400,  741 
ZOO,  186 




More  Software  Tools 
from  M&T  Books 


Programming  Languages 

C  Chest  and  Other  C  Treasures  from  Dr.  Dobb's  Journal 
Edited  by  Allen  Holub 
Item  #40-2  $24.95  (book) 

Item  #49-6  $39.95  (book/disk) 

This  comprehensive  anthology  contains  the  popular  "C  Chest"  columns  from  Dr. 
Dobb's  Journal  of  Software  Tools,  along  with  the  lively  philosophical  and  practical 
discussions  they  inspired,  in  addition  to  other  information-packed  articles  by  C  experts. 
The  software  in  the  book  is  also  available  on  disk  with  full  source  code.  MS-DOS 
format. 

Turbo  C:  The  Art  of  Advanced  Program  Design,  Optimization,  and 
Debugging 

Stephen  R.  Davis 

Item  #38-0  $24.95  (book) 

Item  #45-3  $39.95  (book/disk)

Overflowing  with  example  programs,  this  book  fully  describes  the  techniques 
necessary  to  skillfully  program,  optimize,  and  debug  in  Turbo  C.  All  programs  are 
also  available  on  disk  with  full  source  code.  MS-DOS  format. 

C  Programming  for  MIDI 

Jim  Conger 

Item  #86-0  $22.95  (book) 

Item  #90-9  $37.95  (book/disk)

For  musicians  and  programmers  alike,  here  is  the  source  that  will  help  you  write 
programs  for  music  applications.  The  author  begins  by  outlining  the  features  of  MIDI 
(Musical  Instrument  Digital  Interface)  and  its  support  of  real-time  access  to  musical 
devices.  An  introduction  to  C  programming  fundamentals  as  they  relate  to  MIDI  is 
also  provided.  The  author  fully  demonstrates  these  concepts  with  two  MIDI 
applications:  a  patch  librarian  and  a  simple  sequencer. 

Dr.  Dobb’s  Toolbook  of  C 

Editors  of  Dr.  Dobb's  Journal 
Item  #89303-615-3  $29.95 

From  Dr.  Dobb’s  and  Brady  Communications,  this  book  contains  a  comprehensive 
library  of  valuable  C  code.  Dr.  Dobb’s  most  popular  articles  on  C  are  updated  and 
reprinted  here,  along  with  new  C  programming  tools.  Also  included  is  a  complete  C 
compiler,  an  assembler,  text  processing  programs,  and  more! 


Dr.  Dobb’s  Toolbook  of  Forth 

Edited  by  Marlin  Ouverson 
Item  #10-0  $22.95  (book) 

Item  #57-7  $39.95  (book/disk)

This  comprehensive  collection  of  useful  Forth  programs  and  tutorials  contains 
expanded  versions  of  DDJ's  best  Forth  articles  and  other  material,  including  practical
code  and  in-depth  discussions  of  advanced  Forth  topics.  The  screens  in  the  book  are 
also  available  on  disk  as  ASCII  files  in  the  following  formats:  MS/PC-DOS,  Apple 
II,  Macintosh,  or  CP/M:  Osborne  or  8"  SS/SD. 

Dr.  Dobb's  Toolbook  of  Forth,  Volume  II 

Editors  of  Dr.  Dobb's  Journal 
Item  #41-0  $29.95  (book) 

Item  #51-8  $45.95  (book/disk)

This  complete  anthology  of  Forth  programming  techniques  and  developments  picks  up 
where  the  Toolbook  of  Forth,  First  Edition  left  off.  Included  are  the  best  articles  on 
Forth  from  Dr.  Dobb's  Journal  of  Software  Tools,  along  with  the  latest  material  from 
other  Forth  experts.  The  screens  in  the  book  are  available  on  disk  as  ASCII  files  in 
the  following  formats:  MS-DOS,  Macintosh,  and  CP/M:  Osborne  or  8"  SS/SD. 

The  New  BASICs:  Programming  Techniques  and  Library  Development 

Namir  Clement  Shammas 
Item  #37-2  $24.95  (book) 

Item  #43-7  $39.95  (book/disk)

This  book  will  orient  the  advanced  programmer  to  the  syntax  and  programming 
features  of  The  New  BASICs,  including  Turbo  BASIC  1.0,  QuickBASIC  3.0,  and 
True  BASIC  2.0.  You'll  learn  the  details  of  implementing  subroutines,  functions,  and 
libraries  to  permit  more  structured  coding.  Programs  and  subroutines  are  available  on 
disk  with  full  source  code.  MS-DOS  format. 

The  Turbo  Pascal  Toolbook 

Edited  by  Namir  Clement  Shammas 
Item  #25-9  $25.95  (book) 

Item  #61-5  $45.95  (book/disk)

This  book  contains  routines  and  sample  programs  to  make  your  programming  easier 
and  more  powerful.  You'll  find  an  extensive  library  of  low-level  routines;  external 
sorting  and  searching  tools;  window  management;  artificial  intelligence  techniques; 
mathematical  expression  parsers,  including  two  routines  that  convert  mathematical 
expressions  into  RPN  tokens;  and  a  smart  statistical  regression  model  finder.  More 
than  800K  of  source  code  is  available  on  disk  for  MS-DOS  systems. 

Programming  Tools  and  Utilities 

The  Small-C  Handbook 

James  E.  Hendrix 

Item  #8359-7012-4  $17.95  (book) 

Item  #67-4  $37.90  (book  and  CP/M  disk)


The  Small-C  Handbook  with  MS/PC-DOS  Addendum  and  Disk 

James  E.  Hendrix 

Item  #76-3  $42.90  (book/addendum/MS-DOS  disk)

Also  from  DDJ  and  Brady  Communications,  the  handbook  is  a  valuable  companion  to 
the  Small-C  compiler,  described  below.  The  book  explains  the  language  and  the 
compiler,  and  contains  entire  source  listings  of  the  compiler  and  its  library  of 
arithmetic  and  logical  routines. 

Small-C  Compiler 

James  E.  Hendrix 
Item  #01-1  $19.95 

Like  a  home  study  course  in  compiler  design,  the  Small-C  Compiler  and  The  Small-C 
Handbook  provide  all  you  need  to  learn  how  compilers  are  constructed,  as  well  as
teaching  the  C  language  at  its  most  fundamental  level.  Full  source  code  is  included  on 
disk  in  both  CP/M  and  MS/PC-DOS  versions. 

A  Small  C  Compiler:  Language,  Theory,  and  Design 

James  E.  Hendrix 
Item  #88-7  $23.95 

A  full  presentation  of  the  design  and  theory  of  the  Small  C  compiler  (including  source 
code)  and  programming  language.  The  author  has  implemented  many  features  in  this 
compiler  that  make  it  an  excellent  example  for  learning  basic  compiler  theory.  Some 
of  these  features  are:  recursive  descent  parsing,  one-pass  compilation,  and  the
generation  of  assembly  language.  Here  is  a  look  into  a  real  compiler  with  the 
opportunity  for  hands-on  experience  in  designing  one. 

Small-Windows:  A  Library  of  Windowing  Functions  for  the  C 
Language 

James  E.  Hendrix 
Item  #35-X  $29.95

Small-Windows  is  a  complete  windowing  library  for  C.  The  package  includes  video 
functions,  menu  functions,  window  functions,  and  more.  The  package  is  available  for 
MS-DOS  systems  for  the  following  compilers:  Microsoft  C  Version  4.0,  Small-C, 
and  Lattice  C.  Documentation  and  full  C  source  code  are  included.

Small  Tools:  Programs  for  Text  Processing 

James  E.  Hendrix 

Item  #78-X  $29.95  (manual/disk)

This  package  of  text-processing  programs  written  in  Small-C  is  designed  to  perform 
specific,  modular  functions  on  text  files.  Source  code  is  included.  Small  Tools  is 
available  in  both  CP/M  and  MS/PC-DOS  versions  and  includes  complete 
documentation. 

Small-Mac:  An  Assembler  for  Small-C 

James  E.  Hendrix 

Item  #77-1  $29.95  (manual/disk)

Small-Mac  is  a  macro  assembler  designed  to  stress  simplicity,  portability, 
adaptability,  and  educational  value.  Small-Mac  is  available  for  CP/M  systems  only  and 
includes  source  code  on  disk  with  complete  documentation. 


NR:  An  Implementation  of  the  Unix  NROFF  Word  Processor 

Allen  Holub 

Item  #33-X  $29.95 

NR  is  a  text  formatter  that  is  written  in  C  and  compatible  with  Unix's  NROFF.  NR 
comes  configured  for  any  Diablo-compatible  printer,  as  well  as  Hewlett  Packard's 
ThinkJet  and  LaserJet.  Both  the  ready-to-use  program  and  full  source  code  are  included. 
For  PC  compatibles. 

Statistical  Toolbox  for  Turbo  Pascal 

Namir  Clement  Shammas 

Item  #22-4  $69.95  (manuals/disks) 

Two  statistical  packages  in  one!  A  library  disk  and  reference  manual  that  includes 
statistical  distribution  functions,  random  number  generation,  basic  descriptive 
statistics,  parametric  and  nonparametric  statistical  testing,  bivariate  linear  regression, 
and  multiple  and  polynomial  regression.  The  demonstration  disk  and  manual 
incorporate  these  library  routines  into  a  fully  functioning  statistical  program.  For  IBM 
PCs  and  compatibles. 

Turbo  Advantage 

Lauer  and  Wallwitz 
Item  #26-7  $49.95 

A  library  of  more  than  200  routines,  with  source  code  sample  programs  and 
documentation.  Routines  are  organized  and  documented  under  the  following  categories: 
bit  manipulation,  file  management,  MS-DOS  support,  string  operations,  arithmetic 
calculations,  data  compression,  differential  equations,  Fourier  analysis  and  synthesis, 
and  much  more!  For  MS/PC-DOS  systems. 

Turbo  Advantage:  Complex 
Lauer  and  Wallwitz 
Item  #27-5  $89.95 

This  library  provides  the  Turbo  Pascal  code  for  digital  filters,  boundary-value 
solutions,  vector  and  matrix  calculations  with  complex  integers  and  variables,  Fourier 
transforms,  and  calculations  of  convolution  and  correlation  functions.  Some  of  the 
Turbo  Advantage:  Complex  routines  are  most  effectively  used  with  Turbo  Advantage. 
Source  code  and  documentation  included. 

Turbo  Advantage:  Display 
Lauer  and  Wallwitz 
Item  #28-3  $69.95 

Turbo  Advantage:  Display  includes  an  easy-to-use  form  processor  and  thirty  Turbo 
Pascal  procedures  and  functions  to  facilitate  linking  created  forms  to  your  program. 
Full  source  code  and  documentation  are  included.  Some  of  the  Turbo  Advantage 
routines  are  necessary  to  compile  Turbo  Advantage:  Display. 

Time  and  Task  Management  with  dBASE  III 

Timothy  Berry 

Item  #09-7  $49.95  (manual/MS-DOS  disk) 

Like  an  accounting  system  for  time  and  tasks,  this  package  helps  users  organize  hours, 
budgets,  activities,  and  resources.  Providing  both  a  useful  time-management  system 
and  a  library  of  dBASE  III  code  and  macros,  this  package  has  practical  as  well  as
educational  value.  To  be  used  with  dBASE  III.  Source  code  and  documentation  are
included.  MS-DOS  disk  format. 


Sales  Management  with  dBASE  III 

Timothy  Berry 

Item  #15-1  $49.95  (manual/MS-DOS  disk)

Sales  management  works  with  dBASE  III  to  provide  a  powerful  information  system 
that  will  help  you  to  keep  track  of  clients,  names,  addresses,  follow-ups,  pending 
dates,  and  account  data.  This  system  organizes  all  the  day-to-day  activities  of  selling 
and  includes  program  files,  format  files,  report  files,  index  files,  and  data  bases. 
Documentation  and  full  source  code  are  included.

Operating  Systems 

On  Command:  Writing  a  Unix-Like  Shell  for  MS-DOS 
Allen  Holub 
Item  #29-1  $39.95 

Learn  how  to  write  shells  applicable  to  MS-DOS,  as  well  as  to  most  other 
programming  environments.  This  book  and  disk  include  a  full  description  of  a  Unix- 
like  shell,  complete  C  source  code,  a  thorough  discussion  of  low-level  DOS 
interfacing,  and  significant  examples  of  C  programming  at  the  system  level.  All 
source  code  is  included  on  disk. 

/util:  A  Unix-Like  Utility  Package  for  MS-DOS 
Allen  Holub 
Item  #12-7  $29.95 

This  collection  of  utilities  is  intended  to  be  accessed  through  SH  but  can  be  used 
separately.  It  contains  programs  and  subroutines  that,  when  coupled  with  SH,  create  a 
fully  functional  Unix-like  environment.  The  package  includes  a  disk  with  full  C  source 
code  and  documentation  in  a  Unix-style  manual. 

Taming  MS-DOS,  Second  Edition 
Thom  Hogan 
Item  #87-9  $19.95 
Item #92-5 $34.95

Described  by  reviewers  as  "small  in  size,  large  on  content,"  and  "fun."  The  second 
edition  promises  to  be  just  as  readable  and  is  updated  to  cover  MS-DOS  3.3.  Some  of 
the more perplexing elements of MS-DOS are succinctly described here with time-saving
tricks to help customize any MS-DOS system. Each trick is easily
implemented into your existing tools, and for programmers, Hogan includes many
complete  source  code  files  that  provide  very  useful  utilities.  All  source  code  is  written 
in  BASIC. 

Program  Interfacing  to  MS-DOS 

William  G.  Wong 
Item #34-8 $29.95

Program  Interfacing  to  MS-DOS  will  orient  any  experienced  programmer  to  the  MS- 
DOS  environment.  The  package  includes  a  ten-part  manual  with  sample  program  files 
and  a  detailed  description  of  how  to  build  device  drivers,  along  with  the  device  driver 
for  a  memory  disk  and  a  character  device  driver  on  disk  with  macro  assembly  source 
code. 


Tele  Operating  System  Toolkit 

Ken  Berry 

This  task-scheduling  algorithm  drives  the  Tele  Operating  System  and  is  composed  of 
several  components.  When  integrated,  they  form  an  independent  operating  system  for 
any  8086-based  machine.  Tele  has  also  been  designed  for  compatibility  with  MS-DOS, 
UNIX,  and  the  MOSI  standard. 

SK:  THE  SYSTEM  KERNEL 

Item #30-5 $49.95 (manual/disk)

The  System  Kernel  contains  an  initialization  module,  general-purpose  utility 
functions,  and  a  real-time  task  management  system.  The  kernel  provides  MS-DOS 
applications  with  multitasking  capabilities.  The  System  Kernel  is  required  by  all 
other  components.  All  source  code  is  included  on  disk  in  MS-DOS  format. 

DS:  WINDOW  DISPLAY 

Item #32-1 $39.95 (manual/disk)

This component contains BIOS-level drivers for a memory-mapped display,
window management support, and communication coordination between the
operator  and  tasks  in  a  multitasking  environment.  All  source  code  is  included  on 
disk  in  MS-DOS  format. 

FS:  THE  FILE  SYSTEM 

Item  #65-8  $39.95  (manual/disk) 

The  File  System  supports  MS-DOS  disk  file  structures  and  serial  communication 
channels.  All  source  code  is  included  on  disk  in  MS-DOS  format. 

XS:  THE  INDEX  SYSTEM 

Item #66-6 $39.95 (manual/disk)

The  Index  System  implements  a  tree-structured  free-form  database.  All  source  code 
is  included  on  disk  in  MS-DOS  format. 

UNIX  Programming  on  the  80286/80386 

Alan  Deikman 

Item #83-6 $24.95 (book)

Item #91-9 $39.95 (book/disk)

A complete professional-level tutorial and reference for programming UNIX and Xenix
on 80286/80386-based computers. It succinctly covers the UNIX program
environment, UNIX file system, shells, utilities, and C programming under UNIX.
The author also delves into the development of device drivers; some examples
of these are video displays, tape cartridges, terminals, and networks.

The  Programmer's  Essential  OS/2  Handbook 

David  E.  Cortesi 

Item  #82-8  $24.95  (book) 

Item #89-5 $39.95 (book/disk)

Here  is  a  resource  no  developer  can  afford  to  be  without!  Cortesi  succinctly  organizes 
the  many  features  of  OS/2  into  related  topics  and  illuminates  their  uses.  Multiple 
indexes  and  a  web  of  cross  referencing  in  the  margins  provide  easy  access  to  all  OS/2 
topic  areas.  Equal  support  for  Pascal  and  C  programmers  is  provided.  The  essential 
reference  for  programmers  developing  in  the  OS/2  environment. 


Dr.  Dobb's  Toolbook  of  80286,  80386  Programming 

Editors  of  Dr.  Dobb's  Journal 
Item  #42-9  $24.95  (book) 

Item #53-4 $39.95 (book/disk)

This  toolbook  is  a  comprehensive  discussion  on  the  powerful  80X86  family  of 
microprocessors.  The  editors  of  Dr.  Dobb's  Journal  of  Software  Tools  have  gathered 
their  best  articles,  updated  and  expanded  them,  and  added  new  material  to  create  this 
valuable  resource  for  all  80X86  programmers.  All  programs  are  available  on  disk  with 
full  source  code. 

Chips 

Dr.  Dobb’s  Z80  Toolbook 

David  E.  Cortesi 

Item  #07-0  $25.00  (book) 

Item #55-0 $40.00 (book/disk)

This  book  contains  everything  users  need  to  write  their  own  Z80  assembly-language 
programs,  including  a  method  of  designing  programs  and  coding  them  in  assembly 
language  and  a  complete,  integrated  toolkit  of  subroutines.  All  the  software  in  the 
book  is  available  on  disk  in  the  following  formats:  8"  SS/SD,  Apple,  Osborne,  or 
Kaypro. 

Dr.  Dobb’s  Toolbook  of  68000  Programming 

Editors  of  Dr.  Dobb's  Journal 
Item  #13-216649-6  $29.95  (book) 

Item #75-5 $49.95 (book/disk)

From  DDJ  and  Brady  Communications,  this  collection  of  practical  programming  tips 
and  tools  for  the  68000  family  contains  the  best  68000  articles  reprinted  from  DDJ 
along  with  much  new  material.  The  book  contains  many  useful  applications  and 
examples.  The  software  in  the  book  is  also  available  on  disk  in  the  following  formats: 
MS/PC-DOS,  Macintosh,  CP/M  8",  Osborne,  Amiga,  and  Atari  520ST. 

X68000  Cross  Assembler 
Brian  R.  Anderson 
Item  #71-2  $25.00 

This  manual  and  disk  contain  an  executable  version  of  the  68000  Cross  Assembler 
discussed  in  Dr.  Dobb's  Toolbook  of  68000  Programming,  complete  with  source  code 
and  documentation.  The  Cross-Assembler  requires  CP/M  2.2  with  64K  or  MS-DOS 
with  128K.  The  disk  is  available  in  the  following  formats:  MS-DOS,  8"  SS/SD,  and 
Osborne. 

Interfacing  to  S-100/IEEE  696  Microcomputers 

Mark  Garetz  and  Sol  Libes 
Item  #85-2  $24.95 

This  book  helps  S-100  bus  users  expand  the  utility  and  power  of  their  systems.  It 
describes  the  S-100  bus  with  unmatched  precision.  Various  chapters  describe  its 
mechanical  and  functional  design,  logical  and  electrical  relationships,  bus 
interconnections,  and  busing  techniques. 


General  Interest 


Public-Domain  Software:  Untapped  Resources  for  the  Business  User 

Rusel  DeMaria  and  George  R.  Fontaine 
Item  #39-9  $19.95  (book) 

Item #47-X $34.95 (book/disk)

Organized  into  a  comprehensive  reference,  this  book  introduces  the  novice  and  guides 
the  experienced  user  to  a  source  of  often  overlooked  software — public  domain  and 
Shareware.  This  book  will  tell  you  where  it  is,  how  to  get  it,  what  to  look  for,  and 
why it's for you. The sample programs and some of the software reviewed are available
on  disk  in  MS-DOS  format.  Includes  $15  worth  of  free  access  time  on  CompuServe! 

Dr. Dobb's Journal Bound Volume Series

Each  volume  in  this  series  contains  a  full  year's  worth  of  useful  code  and  fascinating 
history  from  Dr.  Dobb's  Journal  of  Software  Tools.  Each  volume  contains  every  issue 
of  DDJ  for  a  given  year,  reprinted  and  combined  into  one  comprehensive  reference. 


Volume 1: 1976     Item #13-5    $30.75
Volume 2: 1977     Item #16-X    $30.75
Volume 3: 1978     Item #17-8    $30.75
Volume 4: 1979     Item #14-3    $30.75
Volume 5: 1980     Item #18-6    $30.75
Volume 6: 1981     Item #19-4    $30.75
Volume 7: 1982     Item #20-8    $35.75
Volume 8: 1983     Item #00-3    $35.75
Volume 9: 1984     Item #08-9    $35.75
Volume 10: 1985    Item #21-6    $35.75
Volume 11: 1986    Item #72-0    $35.75
Volume 12: 1987    Item #84-4    $39.95

To  order  any  of  these  products,  send  your  payment  and  the  item  number(s),  along  with 
$2.25  per  item  for  shipping,  to  M&T  Books,  501  Galveston  Drive,  Redwood  City, 
California  94063.  California  residents,  please  include  the  appropriate  sales  tax.  Do 
not forget to include your return address. Or, you may call us toll-free at 800-533-4372
(in California 800-356-2002) Monday through Friday between 8 A.M. and 5 P.M.
PST.  When  ordering  disks,  please  indicate  format. 


The Dr. Dobb's Bound Volume Series




Volume  1  -  1976 

Chronicles  the  advent  of  the  microcomputer  era  in  1976.  Always 
pertinent  for  bit  crunching  and  byte  saving,  home-brew  computer 
projects,  and  the  technical  history  of  home  computing.  Topics 
include:  Tiny  BASIC,  the  first  words  on  CP/M,  speech  synthesis, 
floating  point  routines,  the  6502  disassembler  for  the  Apple,  and 
much  more. 


Volume  7  -  1982 

Examines  the  potential  of  powerful  16-bit  micros,  including  the 
birth  of  the  IBM  PC  in  1982.  Topics  include:  in-depth  coverage  of 
Forth,  68000,  8088  programming,  C  software  tools,  and  utilities 
for  CP/M — including  a  spelling  checker,  using  bulletin  boards, 
and  more. 


Volume  2  -  1977 

The  small  computer  emerges  as  a  powerful  tool  for  the  modern 
age in these issues from 1977. Topics include: Lawrence Livermore
Lab’s BASIC, Dr. Starkweather’s PILOT, using a modem,
string-handling  techniques,  ciphers,  Turtle  graphics,  and  micro 
utilities. 


Volume 3 - 1978

Explores the foundation of the mass-market computer industry in
1978. Topics include: programming in BASIC, PILOT, and Pascal;
RAM memory testers; Apple utility programs; the S-100 bus
standard; STRUBAL; and a structured BASIC compiler.


Volume  8  -  1983 

DDJ turns pro. Some of the most powerful professional programmer’s
tools ever published in a magazine are in this 1983 volume,
including  Small-C,  the  RED  editor,  and  an  Ada  subset. 


Volume  9  -  1984 

Shaping things to come. In 1984 DDJ examined new programming
environments: Prolog, expert systems, Modula-2, and Pascal.
Other topics include GREP, UNIX internals, and two encryption
systems.


Volume  4  -  1979 

Heralds  the  widespread  acceptance  of  the  microcomputer  in 
America.  Innovative  ideas  and  articles  guide  the  reader  through 
the  age  of  discovery  in  1979.  Topics  include:  selecting  business 
software, microcomputer speech and music, information networks,
and interfacing techniques.


Volume 10 - 1985

The year of living dangerously. In 1985 DDJ added more memory,
an SCSI port, and a hard disk to the Macintosh; challenged the
UNIX establishment with plans for a free UNIX; and asked hard
questions about privacy and control in the Information Age.
Includes software tools in Modula-2, Forth, Pascal, assembly
language, and Prolog.


Volume  5  -  1980 

Focuses on the technological promise of the modern microcomputer
and the creative challenge facing programmers in 1980.
Topics include: the revolutionary impact of CP/M, C programming
and the UNIX operating systems, a survey of computer networks,
software portability, introduction to Forth, and compiler
writing. 


Volume 11 - 1986

The  promise  of  power.  Desktop  computers  began  to  rival  minis 
and  mainframes  in  power.  DDJ  covered  1986’s  changes  with 
issues  on  the  68000,  parallel  processing,  artificial  intelligence,  the 
80386,  and  multitasking.  DDJ  also  supported  the  new  chips  with 
assemblers,  translators,  and  other  development  tools. 


Volume 6 - 1981

The microcomputer enters the mainstream. These issues assess
the revolutionary impact of microcomputers on the individual
and on society in 1981. Topics include: computer conferencing,
the power of Forth, 8- and 16-bit technology, Rubik’s cube simulator,
and an adventure game development system.


Volume 12 - 1987

Building better brains. DDJ broke new ground in 1987 with a
neural network implementation, monthly coverage of artificial
intelligence and object-oriented programming techniques, and
reviews of implementations of Prolog, LISP, and Smalltalk.


M&T  BOOKS 


M&T  Publishing,  Inc. 
501  Galveston  Drive 
Redwood  City,  CA  94063 


ISBN 0-934375-84-4


